
Introduction to Communication Sciences and Disorders
The Scientific Basis of Clinical Practice

Gary Weismer, PhD


David K. Brown, PhD

Plural Publishing, Inc.
5521 Ruffin Road
San Diego, CA 92123

e-mail: information@pluralpublishing.com
Website: https://www.pluralpublishing.com

Copyright © 2021 by Plural Publishing, Inc.

Typeset in 10/12 Palatino by Flanagan’s Publishing Services, Inc.


Printed in Canada by Friesens Corporation

All rights, including that of translation, reserved. No part of this publication may be reproduced, stored in a
retrieval system, or transmitted in any form or by any means, electronic, mechanical, recording, or otherwise,
including photocopying, recording, taping, Web distribution, or information storage and retrieval systems
without the prior written consent of the publisher.

For permission to use material from this text, contact us by


Telephone: (866) 758-7251
Fax: (888) 758-7255
e-mail: permissions@pluralpublishing.com

Every attempt has been made to contact the copyright holders for material originally printed in another source. If any have
been inadvertently overlooked, the publishers will gladly make the necessary arrangements at the first opportunity.

Disclaimer: Please note that ancillary content (such as documents, audio, and video, etc.) may not be
included as published in the original print version of this book.

Library of Congress Cataloging-in-Publication Data

Names: Weismer, Gary (Professor Emeritus), author. | Brown, David K. (Professor of audiology), author.
Title: Introduction to communication sciences and disorders : the scientific basis of clinical practice / Gary Weismer, David K. Brown.
Description: San Diego, CA : Plural Publishing, Inc., [2021] | Includes bibliographical references and index.
Identifiers: LCCN 2019029827 | ISBN 9781597562973 (paperback) | ISBN 1597562971 (paperback)
Subjects: MESH: Communication Disorders | Voice Disorders | Hearing Disorders | Language Development | Speech — physiology | Hearing — physiology
Classification: LCC RC423 | NLM WL 340.2 | DDC 616.85/5 — dc23
LC record available at https://lccn.loc.gov/2019029827

Contents

Preface
Acknowledgments
Reviewers

1  Introduction to Communication Sciences and Disorders
    Introduction: Communication Sciences and Disorders as a Discipline
    Communication Sciences and Disorders: The Whole Is Greater Than the Sum of Its Parts
    An Interdisciplinary Field
    Translational Research
    Does the Basic Science Work? Does the Clinic Work?
    Evidence-Based Practice
    A Typical Undergraduate Curriculum
    Who Are the Professionals in Communication Sciences and Disorders?
    Preparation for, and the Profession of, Speech-Language Pathology
    Preparation for, and the Profession of, Audiology
    Order of Chapters in the Text
    Chapter Summary
    References

2  The Nervous System: Language, Speech, and Hearing Structures and Processes
    Introduction
    Central and Peripheral Nervous Systems
    The Neuron
    The Synapse
    Tour of Gross Neuroanatomy
    Frontal Lobe
    Occipital Lobe
    Temporal Lobe
    Parietal Lobe
    Hidden Cortex
    Subcortical Nuclei
    Brainstem, Cerebellum, and Spinal Cord
    The Auditory Pathways
    The Dominant Hemisphere and the Perisylvian Language Areas
    Arcuate Fasciculus (Dorsal Stream) and Ventral Stream
    Functional Magnetic Resonance Imaging and Speech and Language Brain Activity
    Functional Magnetic Resonance Imaging
    Diffusion Tensor Imaging
    Chapter Summary
    References

3  Language Science
    Introduction
    What Is Language?
    Language: A Conventional System
    Language: A Dynamic System
    Language Is Generative
    Language Uses Mental Representations
    Language Is Localized in the Brain
    Components of Language
    Form
    Social Use of Language (Pragmatics)
    Language and Cognitive Processes
    Why
    How
    When
    Chapter Summary
    References

4  Communication in a Multicultural Society
    Introduction
    Why It Matters
    Difference Versus Disorder
    Standardized Testing and Language Difference Versus Disorder
    Accent, Dialect, and Culture
    Accent
    Dialect
    Code Switching
    Foreign Accent
    Bilingualism and Multilingualism
    Chapter Summary
    References

5  Preverbal Foundations of Speech and Language Development
    Introduction
    Preparatory Notes on Developmental Chronologies
    0 to 3 Months: Expression (Production)
    0 to 3 Months: Perception and Comprehension
    3 to 8 Months: Production
    3 to 8 Months: Perception and Comprehension
    8 to 12 Months: Production
    8 to 12 Months: Perception and Comprehension
    Gesture and Preverbal Language Development
    Chapter Summary
    References

6  Typical Language Development
    Introduction
    12 to 18 Months
    18 to 24 Months
    Three Years (36 Months)
    Multiword Utterances, Grammatical Morphology
    Expanding Utterance Length: A Measure of Linguistic Sophistication
    Grammatical Morphology
    Typical Language Development in School Years
    Metalinguistic Skills
    Pragmatic Skill: Discourse
    Complex Sentences
    Sample Transcript
    Chapter Summary
    References

7  Pediatric Language Disorders I
    Introduction
    Specific Language Impairment/Developmental Language Disorder
    Language Characteristics of Children with SLI/DLD
    Summary of the Language Disorder in SLI/DLD
    What Is the Cause of SLI/DLD?
    The Role of Genetics in SLI/DLD
    Language Delay and Autism Spectrum Disorder
    Language Characteristics in ASD
    Language Delay and Hearing Impairment
    Epidemiology of Hearing Loss
    Language Characteristics in Hearing Impairment
    Speech and Language Development and Hearing Impairment
    Chapter Summary
    References

8  Pediatric Language Disorders II
    Introduction
    Criteria for a Diagnosis of ID
    Down Syndrome (DS): General Characteristics
    Epidemiology and the DS Phenotype
    Language Characteristics in DS
    Fragile X Syndrome: General Characteristics
    Epidemiology of FXS
    Language Characteristics in FXS
    Chapter Summary
    References

9  Language Disorders in Adults
    Introduction
    Review of Concepts for the Role of the Nervous System in Speech, Language, and Hearing
    Cerebral Hemispheres
    Lateralization of Speech and Language Functions
    Language Expression and Comprehension Are Represented in Different Cortical Regions of the Left Hemisphere
    Connections Between Different Regions of the Brain
    Perisylvian Speech and Language Areas of the Brain
    Adult Language Disorders: Aphasia
    Classification of Aphasia
    Aphasia Due to Stroke: A Summary
    Traumatic Brain Injury and Aphasia
    Nature of Brain Injury in TBI
    Language Impairment in TBI
    Dementia
    Brain Pathology in Dementia
    Language Disorders in Dementia
    Chapter Summary
    References

10  Speech Science I
    Introduction
    The Speech Mechanism: A Three-Component Description
    Respiratory System Component (Power Supply for Speech)
    The Respiratory System and Vegetative Breathing
    Speech Breathing
    Clinical Applications: An Example
    The Larynx (Sound Source for Speech)
    Laryngeal Cartilages
    Laryngeal Muscles and Membranes
    Phonation
    Characteristics of Phonation
    Clinical Applications: An Example
    Upper Airway (Consonants and Vowels)
    Muscles of the Vocal Tract
    Vocal Tract Shape and Vocalic Production
    Velopharyngeal Mechanism
    Valving in the Vocal Tract and the Formation of Speech Sounds
    Coarticulation
    Clinical Applications: An Example
    Chapter Summary
    References

11  Speech Science II
    Introduction
    The Theory of Speech Acoustics
    The Sound Source
    The Sound Filter
    Vowel Sounds Result From the Combination of Source and Filter Acoustics
    Resonant Frequencies of Vowels Are Called Formants: Spectrograms
    The Tube Model of Human Vocal Tract Makes Interesting Predictions and Suggests Interesting Problems
    A Spectrogram Shows Formant Frequencies and Much More
    Speech Synthesis
    Speech Recognition
    Speech Acoustics and Assistive Listening Devices
    Speech Perception
    The Perception of Speech: Special Mechanisms?
    The Perception of Speech: Auditory Theories
    Motor Theory and Auditory Theory: A Summary
    Top-Down Influences: It Is Not All About Speech Sounds
    Speech Intelligibility
    Chapter Summary
    References

12  Phonetics
    Introduction
    International Phonetic Alphabet
    Vowels and Their Phonetic Symbols
    Consonants and Their Phonetic Symbols
    Clinical Implications of Phonetic Transcription
    Chapter Summary
    References

13  Typical Phonological Development
    Introduction
    Phonetic and Phonological Development: General Considerations
    Phonetic and Phonological Development
    Phonetic Development
    Phonological Development
    Typical Speech Sound Development
    Determination of Speech Sound Mastery in Typically Developing Children
    Possible Explanations for the Typical Sequence of Speech Sound Mastery
    Phonological Processes and Speech Sound Development
    Phonological Development and Word Learning
    Chapter Summary
    References

14  Motor Speech Disorders in Adults
    Introduction
    Classification of Motor Speech Disorders
    Dysarthria
    Subtypes of Dysarthria
    The Mayo Clinic Classification System for Motor Speech Disorders
    The Dysarthrias: A Summary
    Apraxia of Speech
    Chapter Summary
    References

15  Pediatric Speech Disorders I
    Introduction
    Speech Delay
    Diagnosis of Speech Delay
    Quantitative Measures of Speech Delay and Speech Intelligibility
    Speech Delay: Phonetic, Phonological, or Both?
    Additional Considerations in Speech Delay and Residual and Persistent Speech Sound Errors
    Speech Delay and Genetics
    Childhood Apraxia of Speech
    CAS Compared With Adult Apraxia of Speech (AAS)
    CAS: Prevalence and General Characteristics
    CAS: Speech Characteristics
    CAS and Overlap With Other Developmental Delays
    CAS and Genetics
    Chapter Summary
    References

16  Pediatric Speech Disorders II
    Introduction
    Childhood Motor Speech Disorders: Cerebral Palsy
    Subtypes of Cerebral Palsy
    Dysarthria in Cerebral Palsy
    Childhood Motor Speech Disorders: Traumatic Brain Injury and Tumors
    Traumatic Brain Injury
    Brain Tumors
    Treatment Options and Considerations
    Chapter Summary
    References

17  Fluency Disorders
    Introduction
    Incidence and Prevalence of Stuttering
    Genetic Studies
    Diagnosis of Developmental Stuttering
    The Natural History of Developmental Stuttering
    Stage I: Typical Dysfluencies
    Stage II: Borderline Stuttering
    Stage III: Beginning Stuttering
    Stage IV: Intermediate Stuttering
    Stage V: Advanced Stuttering
    Recovery of Fluency
    Possible Causes of Stuttering
    Psychogenic Theories
    Learning Theories
    Biological Theories
    Acquired (Neurogenic) Stuttering
    Symptoms of Neurogenic Stuttering Compared With Developmental Stuttering
    Treatment Considerations
    Chapter Summary
    References

18  Voice Disorders
    Introduction
    Epidemiology of Voice Disorders
    Initial Steps in the Diagnosis of Voice Disorders
    Case History
    Perceptual Evaluation of the Voice
    Viewing the Vocal Folds
    Measurement of Basic Voice Parameters
    Classification/Types of Voice Disorders
    The Hypo-Hyperfunctional Continuum
    Phonotrauma
    Organic Voice Disorders
    Functional Voice Disorders
    Neurological Voice Disorders
    Pediatric Voice Disorders
    Prevalence of Childhood Voice Disorders
    Types of Childhood Voice Disorders
    Treatment of Childhood Voice Disorders
    Chapter Summary
    References

19  Craniofacial Anomalies
    Introduction
    Definition and Origins of Craniofacial Anomalies
    Embryological Development of the Upper Lip and Associated Structures
    Embryological Errors and Clefting: Clefts of the Lip
    Embryological Errors and Clefting: Clefts of the Palate
    Cleft Lip With or Without a Cleft Palate; Cleft Palate Only (Isolated Cleft Palate)
    Epidemiology of Clefting
    Speech Production in CL/P and CPO
    Diagnosis of VPI
    VPI and Hypernasality
    VPI, Consonant Articulation, and Speech Intelligibility
    Clefting and Syndromes
    Cleft Palate: Other Considerations
    Chapter Summary
    References

20  Swallowing
    Introduction
    Anatomy of Swallowing
    Esophagus
    Stomach
    The Act of Swallowing
    Oral Preparatory Phase
    Oral Transport Phase
    Pharyngeal Phase
    Esophageal Phase
    Overlap of Phases
    Breathing and Swallowing
    Nervous System Control of Swallowing
    Role of the Peripheral Nervous System
    Role of the Central Nervous System
    Variables That Influence Swallowing
    Bolus Characteristics
    Development
    Age
    Measurement and Analysis of Swallowing
    Videofluoroscopy
    Endoscopy
    Client Self-Report
    Health Care Team for Individuals With Swallowing Disorders
    Chapter Summary
    References

21  Hearing Science I: Acoustics and Psychoacoustics
    Introduction
    Oscillation
    Waveform
    Spectrum
    Waveform and Spectrum
    Resonance
    Psychoacoustics
    Pitch
    Loudness
    Sound Quality
    Chapter Summary
    References

22  Hearing Science II: Anatomy and Physiology
    Introduction
    Temporal Bone
    Peripheral Anatomy of the Ear
    Outer Ear (Conductive Mechanism)
    Middle Ear
    Inner Ear (Sensorineural Mechanism)
    Chapter Summary
    References

23  Diseases of the Auditory System and Diagnostic Audiology
    Introduction
    Hearing Evaluation
    Case History
    Otoscopy
    Immittance
    Tympanometry
    Acoustic Reflex Threshold
    Audiometric Testing
    Physiological Responses
    Vestibular Assessment
    Audiometric Results
    Type, Degree, and Configuration of Loss
    Hearing and Balance Disorders
    Patient Examples
    Chapter Summary
    References

24  Assistive Listening Devices
    Introduction
    Hearing Aids
    Steps in Selecting and Fitting a Hearing Aid
    Types of Hearing Aids
    Hearing Aid Components
    Auditory Implantable Devices
    Bone-Anchored Implant
    Middle Ear Implant
    Cochlear Implant
    Chapter Summary
    Hearing Aids
    Auditory Implantable Devices
    References

25  Aural Habilitation and Rehabilitation
    Introduction
    Aural Habilitation
    Assessment of Communication Needs in Children
    Pediatric Intervention
    Components of a Family-Centered Intervention
    Auditory Training in Aural Habilitation
    Communication Options
    Outcome Measures for Children
    Aural Rehabilitation
    Assessment of Communication Needs in Adults
    Adult Intervention
    Auditory Training in Aural Rehabilitation
    Communication Strategies
    Speechreading
    Outcome Measures for Adults
    Group Aural Rehabilitation
    Chapter Summary
    Aural Habilitation
    Aural Rehabilitation
    References

Index

Preface
Introduction to Communication Sciences and Disorders: The Scientific Basis of Clinical Practice is a textbook designed and written for undergraduate students who enroll in a course that lays out the scientific foundations for the clinical disciplines of speech-language pathology and audiology. The great majority of departments in our field that offer an undergraduate major have a regularly taught introductory course among their course offerings. Introductory courses in any field, whether in psychology, anthropology, linguistics, or communication sciences and disorders (hereafter, CS&D), are survey courses in which nearly all aspects of a field are presented. For academic disciplines that have many aspects — and most do — breadth of coverage takes precedence over depth of coverage. Simplification of complicated material is inevitable, and long-standing, ongoing debates in a field cannot be described in detail. An introductory course in CS&D is subject to these characteristics, and these constraints. That being said, we have attempted to provide a carefully measured depth in each chapter, in the hope of conveying the sense of excitement in the continuing expansion of the scientific basis of clinical practice in CS&D.

This textbook is organized with a general plan of matching individual chapters to individual lectures, or perhaps to one-and-one-half lectures. The textbook is written to give the instructor the option of not including selected chapters in the classroom lectures, or not assigning them as required reading material, if that is desired. For example, there are two chapters that present information on pediatric language disorders, and two chapters that present information on pediatric speech sound disorders. For each pair of chapters, one chapter presents information on two or three disorders, and the other presents information on two or three other disorders. An instructor who decides to present examples of a particular pediatric language or speech sound disorder can surely choose one chapter for a lecture and assign (or not) the other chapter for reading. The same can be said of several other chapters in the textbook. In this sense, we believe the textbook is a flexible instructional companion for both instructors and students.

The graduate training of speech-language pathologists (SLPs) and audiologists (AuDs) is a significant mission of CS&D departments. Communication Sciences and Disorders is, at its core, a clinical discipline. But if a clinical endeavor is to be disciplined, the core must include material that supports and motivates clinical practice with knowledge that has emerged from the research laboratory. This text is primarily concerned with the scientific basis of clinical practice, the former being a first step to qualify for the latter professional skill.

Clinical information is not ignored in the textbook. In fact, all chapters that present the nature of language, speech, and hearing disorders include some information on diagnosis and treatment of communication disorders. In some chapters, this information is integrated with the presentation of the main material; in others, a brief section describes clinical issues relevant to the communication disorder(s) under discussion. A fixed formula is not used for the inclusion of clinical information in various chapters of the textbook; rather, in each chapter that presents information on communication disorders, the clinical information is placed in the location that seemed (in our opinion) to make the most sense.


Curricula in departments of CS&D are structured to include classes on typical and disordered language, on typical and disordered speech, and on typical and disordered hearing. This is to say that language, speech, and hearing occupy three different categories of coursework. The categories are organized more for the structure of a curriculum than from a belief that language, speech, and hearing processes are separate. They are not. The integrated nature of language, speech, and hearing processes, whether typical (normal) or disordered, is known by all clinicians and scientists concerned with communication sciences and disorders. For example, a child who is seen in the clinic for a delay in the mastery of speech sounds often has delays in language acquisition as well, and is at risk for reading delays. Similarly, an American child who is born deaf may have delays in oral language development but have typical language development in American Sign Language (ASL).

This textbook follows the approach of separating language, speech, and hearing chapters. But we ask students to keep in mind that this is a teaching decision (much like the organization of courses, as stated earlier), not a statement that the areas are separate. Language chapters are presented first, followed by speech chapters and then hearing chapters; this sequence is arbitrary. One of us (GW) taught the introductory course in the University of Wisconsin–Madison CS&D department for 20 years, changing the order of the language, speech, and hearing categories several times to see if one sequence was more effective than others; the order did not seem to make a difference.

The textbook covers a lot of information; this is a necessary feature of a text designed to be the primary reading material for a survey course in communication sciences and disorders. Some areas of the field may be mentioned only briefly, which does not mean we believe they do not merit careful discussion. Decisions were made to limit discussion of certain areas to a minimum to accommodate the goal of a compact textbook.

Two final comments are in order. First, the use of pronouns is an efficient and straightforward way to construct sentences in a textbook with frequent references to people. In most cases, we have chosen to limit pronouns to “he” and “she,” and to alternate between the two when the reference is to a person who is (for example) a clinician or a person seeking services. Second, the pattern and extent of citations vary across chapters. Every effort has been made to provide interested students and instructors with up-to-date references, and with review papers that provide overviews of the current state of both the research and clinical aspects of a topic under study.

We hope the textbook and the course are effective in creating an enhanced understanding of the importance of successful communication, and of the need to understand the impact of a communication disorder on every aspect of an individual’s life.

Happy learning!
Acknowledgments
Kalie Koscielak, Valerie Johns, and Angie Singh, we are indebted to you for years of support and encouragement.

Susan Ellis Weismer had a profound influence on the shaping of Chapters 3, 5, 6, 7, and 8. Professor Ellis Weismer read and reread successive drafts of these chapters, each time making spot-on suggestions for revision. We cannot thank her enough.

Once again, as it is with previous textbooks, Maury Aaseng’s beautiful artwork is a defining feature of this textbook. Thanks, Maury.

Thanks to Professor Susan Thibeault and Eileen Peterson for their gracious offer and preparation of images for Chapter 18.

Thanks to Denny and Shelley Weismer for the photo of Friday, their African gray.

Thanks to Professor Jenny Hoit for her enormous and generous influence on several parts of this textbook.

Anna Ollinger read drafts of several chapters and made excellent suggestions for clarification of concepts and organization.

Thanks to Professor Steven Kramer for his influence on the audiology portions of this textbook.

The people named are not responsible for any errors that may exist in the textbook; whatever errors exist are solely our responsibility.

Reviewers
Plural Publishing, Inc., and the authors would like to thank the following reviewers for taking the time to provide their valuable feedback during the development process:

Gretchen Bennett, MA, CCC-SLP
NYS Licensed Speech-Language Pathologist
Coordinator of Speech-Language Clinical Services
Clinical Associate Professor/Supervisor
SUNY at Buffalo Speech-Language and Hearing Clinic

Kate Bunton, PhD, CCC-SLP
Associate Professor
Speech, Language, and Hearing Sciences
University of Arizona

Jaime Fatás-Cabeza, MMA
Associate Professor
Director of Translation and Interpretation
Department of Spanish and Portuguese
University of Arizona

Vicki L. Hammen, PhD, CCC-SLP
Professor and Program Director
Communication Disorders
Indiana State University

Jennifer M. Hatfield, MHS, CCC-SLP
Speech-Language Pathologist
Clinical Assistant Professor
Indiana University, South Bend

Rachel Kasthurirathne, MA, CCC-SLP
Indiana University, Bloomington

Breanna Krueger, PhD, CCC-SLP
University of Wyoming

Florence Lim-Hardjono, MA, PhD (ABD), CCC-SLP
Mount Vernon Nazarene University

Avinash Mishra, PhD, CCC-SLP
University of Connecticut

Elisabeth A. Mlawski, PhD, CCC-SLP
Assistant Professor
Monmouth University

Nikki Murphy, MS, CCC-SLP
University of Nevada, Reno

Kelly S. Teegardin, MS, CCC-SLP, LSLS Cert AVT
Instructor I
Communication Sciences and Disorders
University of South Florida

Angela Van Sickle, PhD, CCC-SLP
Texas Tech University Health Sciences Center

Jason A. Whitfield, PhD, CCC-SLP
Bowling Green State University

For Susan

For Dianne

1  Introduction to Communication Sciences and Disorders

We would build a profession independent of medicine or psychology or speech, based in colleges and public schools.
— Van Riper, 1981

Introduction: Communication Sciences and Disorders as a Discipline

This is how Charles Van Riper, one of the pioneers of the field of Communication Sciences and Disorders, remembered the early 20th-century beginnings of the discipline. From the time he began to speak as a child, Van Riper had a severe stuttering problem. In young adulthood, he continued to stutter and desperately sought a “scientific” explanation for his problem. He reasoned that if an explanation could be identified through a program of systematic discovery — a program of scientific research — treatment methods would follow from the explanations, perhaps leading to a cure for stuttering.

Van Riper interacted with a small group of individuals, several of whom were also people who stuttered; jointly they decided to break away from the domination of medical and Freudian perspectives on speech disorders. In 1925, approximately 25 individuals established an independent society called the American Academy of Speech Correction. This society was intended as a research organization. One of the charter members of this organization was Dr. Sara Mae Stinchfield, who was the first person in the United States to be awarded a PhD (from the University of Wisconsin) in the field of Speech Pathology. In 1929, the organization changed its name to the American Society for the Study of Disorders of Speech. The word “Study” in the organization’s new name highlighted the scientific goals of the group. This contrasted with the more practical but (in the opinion of some of the founding members of that society) less lofty goal of treating communication disorders. “Speech teachers,” or people who attempted to help individuals with problems such as stuttering, articulation disorders, language delay, speech and language problems associated with neurological disease, or unintelligible speech resulting from absence or loss of hearing, were well known in society but certainly not professional mainstays in schools and hospitals.

The newly minted American Society for the Study of Disorders of Speech struggled a bit because of small membership and some disagreements among members. As recounted by Van Riper (1981), several of the influential members wanted the group to focus on scientific investigation of stuttering, but others saw the world of Communication Sciences and Disorders more broadly. Pauline Camp, who was serving as the head of speech correction in the State of Wisconsin, proposed that the field could grow by establishing speech correction clinics in universities. These clinics would train future “speech correctionists” as well as scientists interested in the nature and cause of speech disorders. As trained clinicians found employment in public schools and demonstrated their ability to help children with speech problems, the need for additional trained professionals would increase, and the American Society for the Study of Disorders of Speech would grow.

influential members wanted the group to focus on sci- founded in 1988, in recognition of the need for an orga-
entific investigation of stuttering, but others saw the nization whose primary purpose would be serving the
world of Communication Sciences and Disorders more profession of Clinical Audiology. Many of the 12,000+
broadly. Pauline Camp, who was serving as the head of Audiologists who are members of ASHA are also mem-
speech correction in the State of Wisconsin, proposed bers of the American Academy of Audiology.
that the field could grow by establishing speech cor- There is a difference between the perspectives of
rection clinics in universities. These clinics would train ASHA and AAA on the right to practice Clinical Audi-
future “speech correctionists” as well as scientists inter- ology. ASHA currently argues that a Clinical Audiolo-
ested in the nature and cause of speech disorders. As gist must have a Certificate of Clinical Competence in
trained clinicians found employment in public schools Audiology (CCC-A), issued by ASHA, as the proper
and demonstrated their ability to help children with credential for the practice of audiology. AAA’s posi-
speech problems, the need for additional trained pro- tion is that the CCC-A is not necessary for the practice
fessionals would increase, and the American Society of audiology; what is required is that students-in-
for the Study of Disorders of Speech would grow. training in audiology have a sequence of courses that
Camp’s proposed strategy for growing the pro- is recognized as the foundation for training profes-
fession was right on target. University programs were sional audiologists, and that a year of professional
developed, with the training of “service providers” work (much like an internship) follows the completion
(clinicians) and scientists conducted in the same envi- of the coursework training. In the view of AAA, this
ronment. The guiding principle of this training concept training prepares the student for state licensure as a
was the presence of clinicians and scientists in a com- Clinical Audiologist, which when obtained provides
mon environment, teaching each other and enhancing the “legal” right to practice clinical audiology. The
their respective knowledge and performance. Scien- different perspectives on the credentials needed by
tists formulated more specific and worthy research trainees to practice clinical audiology are complicated;
questions by obtaining information about the clinical readers are encouraged to visit https://www.audiology​
details of communication problems in actual patients, .org/publications-resources/document-library/audi-
and clinicians sharpened their diagnostic procedures ology-licensure-vs-certification. There is a concerted
and practice techniques by learning from the research. effort among several different associations, including
This training model has persisted until the present day, ASHA and AAA, to resolve these different perspec-
and has been successful. tives (https://www.asha.org/uploadedFiles/Aligned-
In 1934, the young speech organization, much Sense-of-Purpose-for-the-Audiology-Profession.pdf).
larger than it was in 1930, was reconstituted under a
third name: the American Speech Correction Association.
This name stuck until 1947, when the association was Communication Sciences and
renamed the American Speech and Hearing Association, Disorders: The Whole Is Greater
or ASHA. In 1978, the group was renamed the Ameri- Than the Sum of Its Parts
can Speech-Language-Hearing Association, to recognize
the equivalent importance of language function (as When Van Riper remembered the early vision of a dis-
compared to the act of producing speech, or the ability cipline “independent of medicine or psychology or
to hear) in the understanding of normal and disorder- speech,” he was not thinking of abandoning the con-
ed communication function. The association has tent of these other fields of study. Rather, he imagined
retained this name to this day but is still referred to an academic and clinical field with a separate identity,
as “ASHA.” forged from the concepts and facts of medicine, psy-
As of 2018, ASHA reported a membership (including chology, and other disciplines, but clearly something
student members) of 203,945 individuals (https://www​ different and new — a field with its own identity, able
.asha.org/uploadedFiles/2018-Member-Counts​.pdf). to stand on its own merits. It is comically ironic (to
Among the members of ASHA are 12,480 who this author, at least) that over the past 10 to 15 years,
have their primary training in Audiology and practice two buzzwords on college campuses have been “inter-
as Clinical Audiologists. Many of these professionals disciplinary research” and “translational research.”
are also members of the American Academy of Audiol- The field of Communication Sciences and Disorders
ogy (AAA), an organization whose mission is to define embraced these two activities — in fact, defined itself by
the training and practice guidelines for professionals an interdisciplinary and translation mentality — long
who work as clinical audiologists (https://www.audi- before they became fashionable and fundable claims in
ology.org/about-us/academy-information). AAA was university settings.
1  Introduction to Communication Sciences and Disorders 3

An Interdisciplinary Field

Communication Sciences and Disorders is a field practiced and studied by individuals with expertise in a variety of academic and clinical disciplines. It is truly interdisciplinary, the product (but not merely the sum) of many different areas of knowledge. Speech is produced by moving structures of the respiratory system, larynx, and vocal tract (the latter sometimes referred to as the “upper articulators,” including the tongue, lips, and jaw). Scientists and clinicians who are interested in communication disorders must understand the anatomy (structure) and physiology (function) of these body parts. When a person speaks, air pressures and flows are generated throughout the speech mechanism, and an acoustic signal (what you hear when someone talks) is emitted from the lips and/or nose. An understanding of these aerodynamic and acoustic phenomena of speech requires at least a foundation of knowledge of basic physics.

When the acoustic signal emerges from the talker’s mouth (or nose), it is metaphorically “aimed” at another person who receives it through his or her auditory mechanism. This makes it clear that the anatomy and physiology of the auditory system must be mastered by the person specializing in Communication Sciences and Disorders. As with the process of speech production, hearing and comprehending acoustic signals involve complex mechanisms understood properly only with a decent amount of knowledge in the areas of anatomy, physiology, and physics (and other areas as well).

Of course, when talkers produce speech, they want to communicate a message. The nature and structure of the message — what is being communicated, and the form it takes when it is spoken — is determined by linguistic-cognitive processes. For example, linguistic-cognitive processes are set into motion by the simple act of asking someone to have coffee. An idea must be developed and structured in linguistic terms according to the intent and wishes of the person doing the asking. The idea is something like, “I want to spend time with this person and suggesting we have coffee at a comfortable café seems like a good approach,” but the manner in which this “want” is structured as a message can vary wildly, depending on many factors. “Would you like to have coffee?” “Hey, how ’bout we grab some coffee?” “I’m really sleepy, let’s stop at Completely Wired and get some coffee.” “I’d really like to talk to you over coffee.” “Let’s have a no-obligation date over coffee.” “Coffee?” These different ways to convey the same message reflect variation in underlying cognitive processes and linguistic structure, both of which are critical to language usage. The clinician and scientist in Communication Sciences and Disorders deal with disorders of language structure and usage, and must therefore have expertise in the broad areas of hearing, cognition, and linguistics.

The term “cognitive-linguistic” refers to psychological processes applied to the use of language forms. “Cognition” refers to several psychological processes, including memory; executive function (e.g., planning behavior, connecting current behavior with future consequences); the development, refinement, and stabilization of mental representations; brain computation speeds; and transfer of information from one type of memory (e.g., short-term memory) to another (e.g., long-term memory). These various aspects of cognition are listed here as separate processes but in fact may overlap and in some cases be different reflections of a single psychological process. “Linguistic” refers to any aspect of language form — sounds, words, sentences, tone of voice, and so forth. The term “cognitive-linguistic” is used here to indicate that the psychological processes previously listed (among others) are applied to language forms and therefore to communication. The same cognitive processes are applied to other forms of knowledge, as well (such as spatial reasoning or mathematics).

We are not done. Because speech and language develop throughout infancy and childhood and may change throughout the lifetime and especially in old age, expertise in Communication Sciences and Disorders requires a solid knowledge of child development and aging. Most obvious, perhaps, is the need to have a broad and deep expertise concerning the many diseases and conditions associated with speech, hearing, and language disorders. Extensive medical knowledge is absolutely necessary to function as an effective specialist in Communication Sciences and Disorders. This knowledge ranges from how surgeries on structures of (for example) the brain, tongue, and ear affect speech, hearing, and language function, to how pharmaceutical interventions (such as drugs for Parkinson’s disease, or schizophrenia, or even chronic arthritis) may change a patient’s ability to communicate.

Finally, legal and technical issues are relevant to the profession of Communication Sciences and Disorders. These issues concern a person’s right to receive the proper services when he or she has a speech, hearing, or language disorder, as well as the requirements for professional accreditation as someone who can provide services or train people to provide services, or the requirement of extensive training in research to mentor students who intend to devote their careers to research. Our field has been fortunate to have professional leaders who can lay claim to both clinical and research expertise.

Table 1–1 provides a partial summary of the areas of knowledge and, in many cases, expertise, required of the professional in Communication Sciences and Disorders. This list includes the areas previously mentioned and adds a few more for good measure. There are (at least) two ways to react to this list. One is to feel intimidated by the need to know so much about so many areas. The other is to look at the combination of these different types of knowledge as something special, as an opportunity to be informed about many different areas of study and, most importantly, to employ an integrated and synthesized fund of this information in an understanding of the most human of behaviors, communication. Of course, a single individual is not likely to be an accomplished expert in each of these areas, but a commitment to learn the basic principles of each of the disciplines listed in Table 1–1, to use this knowledge when providing clinical services to a person with a communication disorder, to function as an effective member of a clinical or research team, or to develop an answer to a research question, is genuinely exciting. Communication Sciences and Disorders is the original, lifelong learning discipline.¹

¹ As a university professor in Communication Sciences and Disorders, I more than once told students that it was hard to believe someone was willing to pay me to come to my office every day, learn new things in many different areas, and use this information in my research, in the classroom, and in mentored teaching (one-on-one instruction, as with graduate students training to be researchers).

Table 1–1. Some Areas of Knowledge Required for People to be Effective Professionals in the Field of Communication Sciences and Disorders

Neuroscience
    Brain anatomy (structure)
    Brain physiology (function)
    Neuropharmacology (chemicals and their role in brain function)
    Motor control (how brain controls movement)
    Sensory function (how brain processes sensation)

Anatomy and Physiology of the Speech Mechanism (muscles, ligaments, membranes, cartilages, etc., associated with the respiratory, laryngeal, and upper airway system, which collectively are called the “speech mechanism”)

Anatomy and Physiology of the Hearing Mechanism (bones, membranes, ligaments, special structures of the ear)

Child Development

Aging

Diseases of the Head, Neck, Respiratory System, Auditory System, and Brain

Syndromes

Physics
    Aerodynamics
    Acoustics
    Movement

Cognition
    Memory and Processing
    Planning
    Manipulation and Use of Symbols

Linguistics
    Phonetics and Phonology
    Morphology
    Syntax
    Semantics
    Pragmatics

Translational Research

Researchers and clinicians are often trained in the same department and yet do not interact professionally to a significant degree. This has been a concern in various branches of medicine, as well as in departments such as Psychology, and Communication Sciences and Disorders. Many scientists in these professions are trained to do something they understand as “basic science.” In basic science, research questions are asked for the sake of improving the knowledge base in a field, or to address purely theoretical questions. An assumption of this approach to research has been that basic science, if done well, will eventually have an effect on clinical practice. In this way of thinking, “basic science” does not need to be motivated or prompted by immediate clinical concerns; any improvement in knowledge of the world must have implications for the betterment of humankind.

Let’s consider an example of a possible link between basic science and clinical application. A fair number of scientists have investigated birdsong and its relationship to the evolution of human language (reviews can be found in Fitch, 2000, 2006, and Deacon, 1998). Much of this work has been funded by a federal agency, the National Institutes of Health (NIH), whose primary mission is to sponsor research that ultimately improves health care in the United States. The research on birdsong (and vocalizations produced by other, nonhuman species) has been “sold” to the federal agency by claiming potential links between, on the one hand, an understanding of why and how birds sing, and on the other hand, a better understanding of speech and language capabilities in humans. The link between birdsong and human communication is evolutionary, in which birdsong is a “step” along the evolutionary path to human vocalization for purposes of communication. The reasoning is extended by arguing that a better understanding of the basic “mechanisms” of vocal communication, which can be studied in birds using techniques that cannot be used in humans,² should eventually lead to a better understanding of the partial or complete failure of similar mechanisms in humans. A better understanding of disease-related problems in human vocalization should, this reasoning concludes, result in better ways to diagnose and treat human vocalization disorders.

² Such as creating a small area of brain damage to see how it affects the development of birdsong, or depriving a newborn bird of exposure to his or her species’ song to determine if, as the baby bird matures, the song develops in the same way as in birds who are exposed to their song from birth.

Basic science such as work on birdsong has been criticized for occupying federal funds that might be used to fund “applied” research. “Applied science” is research with more immediate clinical consequences, research with less distance between the results of a study and its potential use in clinical settings. For example, funding could be provided for a research program in which participants with healthy voices are enrolled in a vocal exercise regime (like the kind of warm-up exercises used by many professional singers) and compared to a group of participants who do not engage in this exercise (a “control group”). The applied research question is, do nonspeech vocal exercises generalize, or translate, to the use of the voice in everyday speech? Perhaps the effect of the vocal exercise could be evaluated by having listeners judge the quality of participants’ voices, with the critical comparison being the “goodness” (pleasing quality?) of voices pre- versus postexercise. This is basic, nonclinical research — nonclinical because the participants do not have voice disorders — but a positive result, where exercise produces a more pleasing voice, points more directly to a specific clinical application in patients with voice problems.

The relatively new buzzword for applied science is “translational research,” or research in which the results of basic science can be translated relatively quickly to clinical application. The hypothetical vocal exercise study is one example of translational research; many others have been proposed (see Ludlow et al., 2008; Raymer et al., 2008). The National Institutes of Health (NIH), the federal agency having the mission of funding and setting priorities for health-care-related research activities in the United States, published in 2008 the following text on its website concerning translational research:

To improve human health, scientific discoveries must be translated into practical applications. Such discoveries typically begin at “the bench” with basic research — in which scientists study disease at a molecular or cellular level — then progress to the clinical level, or the patient’s “bedside.”

Scientists are increasingly aware that this bench-to-bedside approach to translational research is a two-way street. Basic scientists provide clinicians with new tools for use in patients and for assessment of their impact, and clinical researchers make novel observations about the nature and progression of disease that often stimulate basic science. See https://nexus.od.nih.gov/all/2016/03/25/nihs-commitment-to-basic-science/ for a summary of the benefits of funding both kinds of research.

The National Institute on Deafness and Other Communication Disorders (NIDCD), the NIH institute that is the primary funder of research in Communication Sciences and Disorders, has a specific funding program for translational research (as of 2017). This funding mechanism is called the Research Grants for Translating Basic Research into Clinical Tools. The stated objective and requirements of these grants are as follows:

[T]o provide support for research studies that translate basic research findings into better clinical tools for human health. The application should seek to translate basic behavioral or biological research findings, which are known to be directly connected to a human clinical condition, to a practical clinical impact. Tools or technologies advanced through this FOA [Funding Opportunity Announcement] must overcome existing obstacles and should provide improvements in the diagnosis, treatment or prevention of a disease process. For the purposes of this FOA, the basic science advancement must have previously demonstrated potential for clinical impact and the connection to a human clinical condition must be clearly established. The research must be focused on a disease/disorder within one or more of the NIDCD scientific mission areas: hearing, balance, smell, taste, voice, speech, or language. Research conducted under this FOA is expected to include human subjects. Preclinical studies in animal models are allowed only for a candidate therapeutic that has previously demonstrated potential for the treatment of communication disorders. The scope of this FOA allows for a range of activities encouraging the translation of basic research findings to practical impact on the diagnosis, treatment, and prevention of deafness and other communication disorders. [https://grants.nih.gov/grants/guide/pa-files/PAR-17-184.html]

The first statement presents the issue of “translational research” with molecular or cellular work as the basic science, but basic science exists at the behavioral level of analysis, as well. This is why the NIDCD description mentions a “range of activities” in its mission to fund translational research in Communication Sciences and Disorders.

Both of these NIH statements imply that it is the basic scientist’s obligation to show how laboratory results can be “translated” to clinical settings. This is in contrast to earlier models of the basic science/applied science dichotomy, in which the basic scientist might have said, “I’ll do the bench work (very basic science) and down the road, perhaps way down the road, clinicians can figure out how to use my findings when they diagnose and treat patients.” In this view, the clinician, not the scientist, has the primary responsibility for translating the basic science to clinical application. The second paragraph of the statement sounds remarkably similar to the concept, described previously, of training “speech correctionists” in university settings where clinical practice informs the direction of research programs, and research findings enhance clinical practice. Pauline Camp suggested this concept in 1934, and our discipline has been guided by the “two-way street” philosophy since that time. As a field, we have understood the potential value of “translational research” for a long time.

Does the Basic Science Work? Does the Clinic Work?

It is all well and good to claim that people in the field of Communication Sciences and Disorders understood the value of interdisciplinary work, and practiced translational research well before the concept was so christened and attained the status of an official movement on 21st-century university campuses and in government funding agencies. It is quite another thing to claim scientific success as the result of interdisciplinary efforts, or to show that basic science has indeed been translated to clinical application. A major goal of this text is to present introductory information on normal and disordered communication processes in a way that highlights previous, and the latest, scientific findings that have emerged from interdisciplinary thinking. For the time being, the reader is asked to trust the claim that the growth of the scientific basis of normal communication processes, and Communication Sciences and Disorders, has been nothing short of spectacular over the last 50 years. None of this would have been possible if speech, language, and hearing scientists had not been open to the influences and thinking of scientists in areas such as linguistics, physiology, neuroscience, and psychology (among others). Most importantly, the openness of these scientists to the experience and knowledge of clinical speech-language pathologists and audiologists has made a huge difference to the growth of the scientific knowledge base in normal and disordered communication.

It is not a goal of this text to present detailed information on therapy (management) techniques for persons with speech, language, and/or hearing disorders. Readers will learn a great deal about speech, language, and hearing disorders, but a full treatment of clinical processes and procedures is a topic for a more advanced course of study, typically in graduate programs (see later in the chapter).

An aspect of the clinical process that is discussed throughout this text is the diagnosis of speech, language, and/or hearing disorders. Technically, diagnosis involves the identification and determination of the nature and cause of a disorder. Notice the inclusion of “nature” in this definition. Proper techniques must be employed to describe a disorder and to document the characteristics of a communication disorder that make it different from other communication disorders. A good part of this text is therefore devoted to descriptions of how we know a specific speech, language, and/or hearing disorder is “x” and not “y.”

This text does not shy away from controversies in our field about the nature and causes of certain communication disorders. As in any health-care-related field, many diagnoses remain unclear and are the subject of ongoing debate. In the best of all worlds (sorry, Voltaire), we would welcome absolute certainty concerning the diagnosis of human diseases and conditions. The world-as-is, however, does not allow such certainty, but let’s not regard the gray areas as defeats; they are opportunities. Uncertainty and controversy have always been the engines of scientific advancement. Not knowing, or disagreement about what we do know, pushes science forward. Diagnosis, then, is a critical part of the scientific underpinnings of a health-care-related discipline such as Communication Sciences and Disorders. In many cases, questions concerning clinical diagnosis and the basic science foundation of our field are completely intertwined.

do know, pushes science forward. Diagnosis, then, is a critical part of the scientific underpinnings of a health-care-related discipline such as Communication Sciences and Disorders. In many cases, questions concerning clinical diagnosis and the basic science foundation of our field are completely intertwined.

The second part of the heading for this section asks, “Does the Clinic Work?” Do speech-language pathologists and audiologists make a difference in the lives of people with communication disorders? Although this text does not present detailed information about treatment of communication disorders, there is widespread evidence for treatment success.

It is important for the reader to know that many of the services offered by clinicians in our field have been documented as being effective. In the absence of such documentation, the entire enterprise of training clinicians to treat communication disorders could be questioned. Fortunately, our interdisciplinary and translational approach to understanding communication disorders has produced diagnosis and management techniques that are effective for many patients. A selective sampling of publications in which this clinical success is reviewed includes results for voice therapy (Angadi, Croke, & Stemple, 2019; Desjardins, Halstead, Cooke, & Bonilha, 2017; Ramig & Verdolini, 1998; Ruotsalainen, Sellman, Lehto, Jauhiainen, & Verbeek, 2007), hearing disorders (Ferguson, Kitterick, Chong, Edmonson-Jones, Barker, & Hoare, 2017; Kaldo-Sandström, Larsen, & Andersson, 2004; Mendel, 2007), stuttering (Baxter et al., 2015; Ingham, Ingham, Bothe, Wang, & Kilgo, 2015; Tasko, McClean, & Runyan, 2007), childhood articulatory disorders (Gierut, 1998; Wren, Harding, Goldbart, & Roulstone, 2018), and childhood language disorders (Law, Garrett, & Nye, 2003; Tyler, Lewis, Haskill, & Tolbert, 2003). Students who obtain undergraduate and graduate degrees in our field learn the scientific basis and technical details of these successful clinical strategies. This is not to say that we have conquered all, or even many, of the communication disorders affecting people around the world. Indeed, there is a substantial amount of disagreement concerning precisely what constitutes therapy “success” for people with communication disorders, and a specific therapy technique may work for some patients but not others. But the articles listed previously show a pattern of success for many communication disorders; continuing research will add to this list.

Evidence-Based Practice

Although this text does not present detailed information on management (treatment) of communication disorders, the concept of evidence-based practice (EBP) and its role in speech, language, and/or hearing therapy is integral to an understanding of how knowledge of typical and disordered communication is related to treatment of communication disorders.

EBP, a movement with roots in the medical world, takes as its central concept that any treatment approach should be supported by scientifically based evidence of the treatment’s effectiveness. (The term “efficacy” is often used to refer to the effectiveness of a therapy procedure, but the technical sense of “efficacy” is an experimental demonstration that a particular clinical technique shows promise as an effective management tool; it is like a first step in the determination of a treatment’s real-world effectiveness.) The need to formalize such a notion may at first glance seem surprising, for should a treatment not be administered in the absence of solid evidence that it works? Again, in the best of all worlds this would be so, but in much of medicine and the behavioral sciences, including Communication Sciences and Disorders, the effectiveness of treatments is often unknown or only partially supported by research data.

EBP must be based on proper outcome measures. Evidence for the success of a therapeutic approach requires the measurement of one or more variables after (or sometimes during) the treatment. Outcome measures should have the best possible face validity, meaning that the measures provide good indices of the phenomena they are supposed to represent. An example from basketball helps to explain the face validity of outcome measures. If an outcome measure is desired for a player’s in-game shooting accuracy following several months of intense practice of nongame, unguarded shooting, the percentage of shots made over 100 attempts has good face validity if the measure is taken during games. The measure has much poorer face validity if it is taken over 100 shots attempted during multiple games of HORSE. Shooting percentage during games is a much better outcome measure for “real-world” shooting than shooting percentage during games of HORSE.

An example from health care, closer to the concerns of this textbook, is drug treatment for epilepsy, for which there may be multiple potential outcome measures with face validities that are only subtly different. The question is, after 6 months of drug treatment, are there fewer seizures as reported by the patient (one potential outcome measure)? As reported by the patient, are there no seizures over the same time period (a second potential outcome variable)? After 6 months of drug treatment, can a seizure be induced in the clinical setting by very bright flashing lights (a third potential outcome variable)? Or, after 6 months, are the blood levels of the drug in the “correct” range based on values reported in the scientific literature (a fourth
potential outcome variable)? At first glance, the first two outcome measures have the best face validity — the best evidence for reduction of seizures is a report from a patient that seizure episodes have been reduced or eliminated. Some clinicians and scientists, however, may think that patient-reported data are unreliable because they are subject to the notorious uncertainties of memory or even a patient’s misrepresentation of seizure history. Measures such as inducement of a seizure by flashing lights or drug blood levels are regarded as more objective (and have a clearly quantitative basis) and therefore may seem more reliable than the patient reports of seizure history. Yet, from the perspective of the patient, inducement of a seizure in a controlled clinical setting or “good” drug blood levels mean very little when he or she is losing consciousness two or three times a week or even having many episodes of preseizure activity.

The choice of a proper outcome measure (or measures) is not straightforward and is often the subject of considerable debate. The debate is lively and even heated when the behaviors of speech, language, and hearing disorders are evaluated for their response to therapy. Readers may want to keep this in mind when considering the concept of EBP.

The concept of EBP has taken on a life of its own as an academic discipline, and there is no end to the debate about precisely what serves as “good” scientific evidence for the efficacy of a treatment. Table 1–2 presents a six-level EBP model of “goodness” of evidence, with the “best” evidence at the top (Level I) and the worst at the bottom (Level VI). This simplified model of EBP serves the purposes of this discussion well and has been presented several times in the Communication Sciences and Disorders literature (Dodd, 2007; Dollaghan, 2004; Moodie, Kothari, Bagatto, Seewald, Miller, & Scollie, 2011).

Levels of Evidence

Level I and II evidence are usually based on large numbers of participants to generate the most reliable statistical results. In Table 1–2, Level I evidence is summarized as “systematic reviews” or “meta-analyses” of RCTs. An RCT is an experiment in which each individual from an initial, large pool of participants is randomly assigned to one of two (or more) treatments. Ideally, neither the experimenters nor the participants have knowledge about which treatment has been assigned to any participant in the study. The participants (and in many cases, the experimenters) are “blind” to which participants have been assigned to which treatments, and the participants are “blind” to the status of their treatment condition (real treatment group, or placebo group). This is an example of a “double-blind” experiment.
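The mechanics of random assignment and blinding can be made concrete with a small illustration. The following sketch, written in Python purely for illustration (the participant labels, group sizes, and coded arm labels are invented here and are not drawn from any actual trial protocol), assigns a hypothetical pool of participants to two arms and hides the meaning of the group labels until the study ends:

    import random

    # Hypothetical pool of 100 participant IDs (invented for illustration).
    pool = ["P%03d" % i for i in range(1, 101)]

    random.shuffle(pool)                        # random assignment
    treatment, placebo = pool[:50], pool[50:]   # two equal-sized arms

    # Blinding: a third party keeps the only copy of this key. Everyone
    # running the study sees only the neutral labels "A" and "B"; which
    # label means "treatment" is revealed only after all outcome data
    # have been collected ("unblinding").
    key = {"A": set(treatment), "B": set(placebo)}

    def arm_of(participant_id):
        # Returns the coded arm label, not the treatment status.
        return "A" if participant_id in key["A"] else "B"

In a true double-blind design, even the clinicians who measure the outcomes work only with the coded labels, so their expectations cannot influence the measurements.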
A systematic review is the organization and evaluation of data from many different, individual RCTs, and a “meta-analysis” is a quantitative (statistical) analysis of the data from many such studies. A meta-analysis of the results of many different studies can only be done when the data from each study are sufficiently comparable — as when the same pretreatment and outcome measures were used in the different studies (such as number of seizures per week), the same blinding conditions, the same dosage levels, and so forth.

Level II evidence is the result from a single RCT. Level II evidence is high-level scientific evidence but is not as trustworthy as having many different demonstrations, from different laboratories and different scientists, of the same outcome. In other words, when Level II evidence is replicated several times, Level I evidence has been produced.
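A minimal sketch can show what the “quantitative (statistical) analysis” of several comparable, replicated RCTs looks like. One common approach (an assumption here, not the only method) is inverse-variance weighting, in which each study’s effect estimate is weighted by its precision. The effect sizes and standard errors below are invented for illustration; each effect could be, for example, a treatment-related reduction in seizures per week:

    # Invented results from four comparable studies: (effect, standard error).
    studies = [(-1.8, 0.6), (-2.3, 0.9), (-1.2, 0.5), (-2.0, 0.7)]

    # Weight each study by its precision (1 / variance).
    weights = [1.0 / (se ** 2) for _, se in studies]

    pooled = sum(w * effect
                 for (effect, _), w in zip(studies, weights)) / sum(weights)
    pooled_se = (1.0 / sum(weights)) ** 0.5

    print("pooled effect = %.2f (SE = %.2f)" % (pooled, pooled_se))

Note that the pooled standard error is smaller than that of any single study. This is the statistical sense in which several replications of Level II evidence add up to the stronger Level I evidence described above.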

Table 1–2.  Levels of Evidence Applied to Evidence-Based Practice: A Simplified Model

Level I:  Systematic reviews and meta-analyses of randomized controlled trials (RCTs)
Level II:  A single RCT
Level III:  Nonrandomized, controlled (well-designed) treatment studies
Level IV:  Nonexperimental studies
Level V:  Case reports and/or narrative literature reviews
Level VI:  Expert/authority opinion

Level I and II Evidence in Communication Sciences and Disorders.  In Communication Sciences and Disorders, it is relatively difficult to obtain Level I and II evidence. How easy is it to find, for example, 100 people who have a similar stuttering problem, or 100 people who have had a stroke and who have very similar problems with expressing or comprehending speech? How easy is it to find 100 children with autism, who all have the same communication challenges and similar characteristics in noncommunication domains? In each of these cases, the answer is: Not easy at all. In addition, it is unusual for different laboratories that study communication disorders, and even a single communication disorder such as stuttering (as one example), to use the same measure of stuttering
frequency (perhaps number of stuttered words per 100 words produced). For these reasons and others, RCTs are unusual in our field.

Many RCTs in medical fields contrast an experimental group that receives a trial drug for a condition or disease with a control group that receives a placebo. Both groups take pills on a schedule but do not know whether they are taking the experimental medication or sugar pills. Such experiments in Communication Sciences and Disorders raise an ethical question: How do you withhold treatment from a group of individuals with a communication disorder?

RCTs are also difficult to execute because of many factors, not the least of which is assembling an initial participant pool, with the same kind and degree of speech/language/hearing challenges, from which random assignment to different treatment types is possible. Perhaps this explains why some introductory texts in Communication Sciences and Disorders (e.g., Justice & Redle, 2014) choose to talk broadly about EBP from sources external to scientific pursuits. These sources include patient values and preferences, and clinician expertise (Sackett, Rosenberg, Gray, Haynes, & Richardson, 1996). These factors are considered jointly with scientific data as contributions to EBP in the life of a speech-language pathologist or audiologist. The absence of solid Level I and II “high-level” evidence in our field places greater weight on the other factors (patient preference, clinical experience) in treatment decisions made by speech-language pathologists and audiologists.

Level III Evidence.  The description of Level III evidence in Table 1–2 is “nonrandomized, controlled (well-designed) treatment studies.” As in the case of RCTs, two groups are typically studied and compared, one receiving Treatment X, the other Treatment Y (or no treatment). Level III evidence does not involve randomization from a pool of eligible subjects but must be well controlled in other ways.

Studies that produce Level III evidence are relatively common in the Communication Sciences and Disorders literature. Level III evidence often comes from studies with a relatively small number (e.g., 10 to 20) of participants in each group, certainly smaller than the group numbers in (for example) drug trials. In addition to the absence of randomization of participants to treatment conditions, the relatively small number of participants in Level III studies renders them less powerful statistically and, therefore, less “valued” than RCTs.

Level IV Evidence.  Level IV evidence is produced when a study is performed in the absence of proper experimental controls. The lack of a control group whose performance can be compared to an experimental group is a common problem in experiments that align with Level IV evidence.

Level IV-type evidence is found in the speech, language, and hearing literature. Treatments are applied to a group of individuals with communication disorders, in the absence of proper controls. People with communication disorders improve following the treatment, and a conclusion is reached that the specific treatment is to be valued for its positive effect on the communication impairment. In the absence of controls, however, any form of treatment, not the specific treatment employed, may have improved the communication skills of a group of persons with a communication impairment.

Levels V and VI Evidence.  Levels V and VI are types of evidence considered to be poor support for a treatment approach in any field. Case reports, which consider the outcome of a specific treatment applied to a single patient, or to a series of patients with similar characteristics, lack controls and cannot be generalized to a larger group of patients. The absence of experimental controls and the study of only a single or few individuals contribute heavily to the evaluation of this kind of evidence as “poor quality.” Even so, case reports are common in the health care literature, including the treatment literature in Communication Sciences and Disorders.

An argument can be made that case reports gain value when they are organized and synthesized in a single publication, with conclusions drawn from the careful analysis of results across reports. The problem with this line of thinking is that the primary problem of lack of experimental controls in each case report is not solved by accumulating many case studies. The shared flaw of most case studies, of no experimental controls, means that a summary of many cases for the purpose of providing evidence to support a treatment approach is a summary of many flawed experiments.

Another type of Level V evidence is the narrative literature review. Narrative literature reviews are publications in which a large number of research papers, most often those that provide Level III evidence, are organized and evaluated for the purpose of drawing qualitative conclusions about a focused issue. Narrative reviews are popular in Communication Sciences and Disorders and are published in leading journals. Narrative reviews have poor evidence quality for the purpose of supporting a treatment approach, because ultimately they are position papers, like editorials, with a primary
aim of persuading readers that their conclusion(s) is (are) preferable to alternate conclusions.3

3 The author feels free to point to the evidentiary weakness of narrative reviews because he has published several of them. Conversely, narrative reviews may organize the literature in a way that is useful for clinicians and scientists as they pursue their professional goals.

The narrative review, with its aim to persuade by summaries of existing research findings and theoretical issues, is a more scholarly version of the lowest evidence level, that of expert/authority opinion. Anyone can have an opinion that is stated as the likely truth. When “anyone” turns out to be an authority in a discipline, and asks that his or her position be accepted not on the basis of published data but on his or her authority, the evidence has little or no value.

The concept of EBP is firmly grounded in the interaction and co-dependency of laboratory experiments and clinical practice. Scientists construct experiments to generate results in support of proper diagnosis and effective clinical management, and clinicians apply the findings to their patients and evaluate their real-world results. On the basis of those clinical results, scientists may adjust their experiments to provide additional and improved data for EBP.

A Typical Undergraduate Curriculum

Table 1–3 shows the undergraduate major curriculum in Communication Sciences and Disorders at the University of Wisconsin–Madison. This sequence of courses is more or less representative of curricula in any department in the United States that offers an undergraduate degree in our field (some variation will occur from department to department). The course for which this text was written is shown in parentheses because it is not a requirement in the UW–Madison department for an undergraduate major in the field. Rather, this course is taken each semester by a large number of students to satisfy a breadth requirement in the College of Letters and Science. Many students who choose to major in Communication Sciences and Disorders at UW–Madison do take the introductory course, and in many cases, the exposure to the field provided by the class is the reason they choose Communication Sciences and Disorders as their major.

A group of courses in the curriculum (Speech Science; Hearing Science; Neural Bases of Speech, Hearing, and Language; Speech Acoustics and Perception; Language Development in Children and Adolescents; the Phonetic Transcription module of Phonological Development and Disorders) establishes a solid scientific foundation for normal (typical) processes of communication. Other courses (the second part of Phonological Development and Disorders; Voice, Craniofacial, and Fluency Disorders; parts of Neural Bases of Speech, Language, and Hearing Disorders; Auditory Rehabilitation; Child Language Disorders: Assessment and Intervention) provide basic information on the classification, causes, and nature of the many diseases and conditions associated with communication disorders. Some curricula may have a course called “Preclinical Observation,” in which students are introduced to the clinical process by observing clinical sessions, rather than being directly involved in diagnosing or treating communication disorders.

Who Are the Professionals in Communication Sciences and Disorders?

Students obtain undergraduate and graduate degrees in preparation for a job. In the field of Communication Sciences and Disorders, this preparation is for employment as a speech-language pathologist or audiologist in an educational or health care setting. Or, a student may prepare for a career as a professor in a college or university setting. At the undergraduate level, training is not differentiated across these different career paths. Nearly everyone who intends to be a professional in Communication Sciences and Disorders learns a common scientific foundation for the field, as summarized in Table 1–3.

Preparation for, and the Profession of, Speech-Language Pathology

The requirements to practice as a speech-language pathologist (SLP) include coursework that furnishes a knowledge base specified by ASHA, completion of a master’s degree, a clinical fellowship, and successful performance on a national exam. The information presented here is based on ASHA’s published certification standards as of 2014, as well as some revisions and amendments to these standards published in 2016. ASHA documents are available at https://www.asha.org

Students finishing an undergraduate major in Communication Sciences and Disorders apply to master’s degree training programs in the fall semester of their senior year (or later, if they decide to take a year or two off before beginning graduate school).
Table 1–3.  The Undergraduate Curriculum for a Major in Communication Sciences and Disorders at University of Wisconsin–Madison

(Introduction to Communication Sciences and Disorders)
    Survey of field

Speech Anatomy and Physiology (speech science)
    Anatomy and physiology of speech mechanism (respiratory system, larynx, upper articulators)

Hearing Anatomy and Physiology (hearing science)
    Anatomy and physiology of hearing mechanism; basic acoustics

Neural Bases of Speech, Language, and Hearing Disorders
    Basic neuroanatomy and diseases of nervous system that affect communication

Language Development in Children and Adolescents
    Typical language development from infancy through adulthood, with information on atypical language development

Speech Acoustics and Perception
    Speech acoustics, speech perception, role of speech acoustics in understanding articulation processes and understanding the speech signal

Phonological Development and Disorders
    Basic phonetics; typical development of the speech sound systems of languages; definition, causes, and nature of developmental speech sound disorders

Voice, Craniofacial, and Fluency Disorders
    Classification, causes, and nature of disorders of the larynx (voice disorders), syndromes and other related genetic/embryological disorders affecting the speech mechanism, and fluency disorders such as developmental stuttering

Introduction to Audiology
    Hearing science, approaches to evaluation of auditory disorders, interpretation of frequently used hearing tests

Preclinical Observation
    Introduction to clinical issues via lectures and observation of clinical interactions

Auditory Rehabilitation
    Principles and techniques for auditory training of individuals with hearing impairment

Child Language Disorders: Assessment and Intervention
    Child language disorders in various populations, their classification, nature, and causes, plus the scientific basis of assessment

There are many such training programs in the United States, as well as about 10 in Canada. Clinical training programs, either established or under development and based closely or more generally on the ASHA model, are available in Australia, Brazil, Belgium, China, England, Finland, Germany, Hong Kong, Ireland, Italy, the Netherlands, New Zealand, Scotland, South Africa, South Korea, Sweden, and Taiwan, among others.

Many opportunities exist to find a program that fits an individual student’s needs. Master’s degree training typically includes 2 years of advanced coursework designed to build on the foundation developed by the undergraduate course of study. Coursework at the undergraduate and master’s degree levels is designed to meet certain training standards established by ASHA (for American universities) or Speech-Language and Audiology Canada (SAC, for Canadian universities). A critical component of training at the master’s level is direct clinical experience in the diagnosis and treatment of speech-language disorders. ASHA standards require students to obtain 400 hours of direct clinical experience by the end of their master’s program. The
knowledge and skills derived from coursework and clinical experience are one component of eligibility for certification by ASHA.

ASHA sets certification standards for students in training, and develops other documents and standards for professional SLPs (described in full detail at https://www.asha.org/Certification/2014-Speech-Language-Pathology-Certification-Standards/). The steps toward clinical certification are summarized in Table 1–4.

SLPs work in a variety of settings. Hospital clinics, private medical practices, public and private schools, rehabilitation centers, and nursing homes are common work sites for SLPs. SLPs also work in private practice, setting up businesses that offer on-site diagnosis and therapy (like a physician’s private practice) or contracting their services to other sites. A significant number of SLPs work in university settings, supervising the training activities of future SLPs. Finally, SLPs who earn a PhD typically work on a university faculty where they teach and do research. These individuals — probably like your instructor in this course — continue their schooling past their master’s training for (on average) between 3 and 5 years to earn the PhD degree. At most universities, faculty members are expected to perform and publish research, provide classroom instruction at the undergraduate and graduate levels, serve on committees, and mentor students in laboratory or clinical settings.

SLPs diagnose and treat a wide variety of speech and language problems. These clinical activities range from early intervention with a child who has delayed language development or is showing an early form of stuttering, to diagnosis and treatment of voice problems, to serving as a member of a team of health care professionals who provide services to children with cleft palates. There are too many areas of professional speech-language pathology involvement to mention here, but most are discussed in this text. The scope of speech-language pathology practice is published at https://www.asha.org/policy/SP2016-00343/ (2016 revision).

Preparation for, and the Profession of, Audiology

The entry-level degree for clinical practice in audiology is the Doctor of Audiology (AuD). This is a professional doctorate, analogous to the professional doctoral degree required for the practice of (for example) optometry and pharmacy.4

At the current time (based on 2012 standards), an individual enrolled in an AuD program and who seeks the Certificate of Clinical Competence in the area of Audiology (CCC-A) must obtain a minimum of 75 credit hours of postbaccalaureate (graduate-school) study. Like the master’s degree in Speech-Language Pathology, students enrolled in an AuD program take academic courses and are engaged in clinical practica.

Table 1–4.  Steps to Obtaining Certificate of Clinical Competence in Speech-Language Pathology (CCC-SLP) or Audiology (CCC-A)

CCC-SLP:  Undergraduate degree (or equivalent) in accredited university program → Clinical master’s degree from accredited university program (2 years of coursework + total of 400 hours of supervised clinical training) → Clinical Fellowship Year (36 weeks, full-time clinical work, supervised by professional who already holds CCC-SLP) → Pass national exam

CCC-A:  Undergraduate degree (or equivalent) in accredited university program → AuD degree from accredited university program (3 years of coursework + 1 year of full-time clinical practice, supervised by professional who already holds CCC-A) → Pass national exam

4 The reader may wonder if there has been a similar movement among SLPs to require a professional doctorate in speech-language pathology for clinical practice. There have been several attempts to specify standards and training objectives for something that might be called the “SplD” — Doctor of Speech-Language Pathology — but the movement has never been sufficiently focused or persuasive to prompt a serious consideration of abandoning the clinical master’s degree in favor of a professional doctorate as the “entry-level” degree to practice clinical speech-language pathology. Professional doctorates in Speech-Language Pathology are offered at some universities. One model for this degree and its requirements is the program at the University of Pittsburgh (https://www.shrs.pitt.edu/CScD).
The clinical practica in an AuD program are designed to prepare students to diagnose and treat hearing and balance disorders, and to counsel patients about these disorders and their ongoing management. Unlike master’s-level training for SLPs, the final year of the AuD program is spent in full-time clinical practice under the supervision of an individual who has the CCC-A. In total, AuD students obtain a minimum of 1,820 hours of clinical practicum before they receive the degree.5

Full details of ASHA requirements for earning an AuD are published at https://www.asha.org/Certification/2012-Audiology-Certification-Standards/

The AuD program typically requires 3 or 4 years of postbaccalaureate study, including the final year of full-time clinical practice. Currently, there are approximately 75 AuD programs in the United States.

Audiologists work in many of the same settings as SLPs, with some exceptions. For example, audiologists may work for hearing aid companies, contributing to the design and use of hearing aids and dispensing them to patients. The fitting of hearing aids is done by finding the best style and amplification characteristics for the specific characteristics of a patient’s hearing loss; an important component of the AuD training involves hearing aid fitting (see Chapter 24). Audiologists are also regularly employed by otologists (physicians who deal with diseases of the ear). In addition to testing hearing and fitting hearing aids, audiologists assess balance problems (the mechanisms for balance and hearing are closely related), help patients manage excessive production of earwax, and train people with hearing loss to enhance their communication skills. The full scope of practice for persons with an AuD is published at https://www.asha.org/policy/SP2004-00192/

Order of Chapters in the Text

This text is organized into three general areas: language, speech, and hearing. Each of these general areas is initiated with a chapter (or two) on normal processes. These normal processes are presented to support an instructional philosophy that disordered language, speech, or hearing are understood best by reference to “normal” processes and behavior.

Pauline Camps’ 1934 plan to train SLPs for work in the public schools looked like prophecy when, in 1975, Public Law 94-142 was enacted. Public Law 94-142 is now called the Individuals with Disabilities Education Act (IDEA) and is linked with the Americans with Disabilities Act (ADA), both originally formalized as laws in 1990. These laws had the specific purpose of protecting the rights of children with disabilities (and their parents) and guaranteeing access to a public education and the special services required to make that education effective. Specifically, the law had four purposes: (a) “to assure that all children with disabilities have available to them . . . a free appropriate public education which emphasizes special education and related services designed to meet their unique needs,” (b) “to assure that the rights of children with disabilities and their parents . . . are protected,” (c) “to assist States and localities to provide for the education of all children with disabilities,” and (d) “to assess and assure the effectiveness of efforts to educate all children with disabilities.” The law provides that each child with a disability who attends public school be provided with an Individualized Education Program (“IEP” in the jargon of school officials) designed by a team of specialists. These specialists include (among others such as occupational therapists, physical therapists, and special education teachers) SLPs and audiologists. To meet the needs of children with disabilities, these professionals must be employed by public school systems. The specifics of IDEA have undergone some changes since the original 1975 enactment of the law, but children with disabilities are still guaranteed, by law, a public education and a specially designed educational plan. Public school systems in the United States therefore offer many employment opportunities for SLPs and audiologists. More information on PL 94-142, its history, and specifics can be obtained by typing “IDEA” into any search engine on the Web. The curious Web surfer will find the roots of PL 94-142, ADA, and IDEA in the Civil Rights Act of 1964.

5 The difference between this requirement for an AuD degree and the degree requirement for clinical training of SLPs is more logistical than substantial. As described in the text, students who complete an accredited master’s degree in speech-language pathology must have a 36-week, full-time clinical experience before they are eligible for the CCC-SLP. This experience is obtained after the degree is completed, usually in the form of a job. AuD programs incorporate this requirement into their degree programs.
Two important aspects of the material in this text should be kept in mind when reading the chapters. Although the text is arranged in the order of language, speech, and hearing, the sequencing of material and the content of each chapter are somewhat arbitrary. The separation of Communication Sciences and Disorders into language versus speech versus hearing disorders, and the separation of their normal processes, is not realistic for real-world clinical and scientific settings. The material in this textbook is separated in this way for instructional purposes, but the reader should keep in mind the interconnected processes and disorders of the three areas. When appropriate, the reader is reminded of these interconnections.

Chapter Summary

Communication Sciences and Disorders was formalized as an academic discipline in the 1920s and 1930s, and has enjoyed enormous growth since that time.

The field grew because of its interdisciplinary roots and interactions but forged a separate identity. The field is also committed to taking the results of laboratory research and “translating” them to the clinic for both diagnostic (determining what the problem is) and treatment purposes.

Clinical and academic degrees are available in the areas of speech-language pathology and audiology.

Most of the regulations that govern the training and conduct of professionals in Communication Sciences and Disorders are developed and overseen by the American Speech-Language-Hearing Association (ASHA).

This text is written to introduce students to Communication Sciences and Disorders; the text surveys a wide range of normal processes of language, speech, and hearing, and the way in which these processes can be affected by disease and/or conditions to affect communication.

In this textbook, emphasis is placed on what we know about normal (typical) communication processes as well as the nature of language, speech, and hearing disorders that affect an individual’s ability to communicate.

References

Angadi, V., Croke, D., & Stemple, J. (2019). Effects of vocal function exercises: A systematic review. Journal of Voice, 33, 124.e13–124.e34.

Baxter, S., Johnson, M., Blank, L., Cantrell, A., Brumfitt, S., Enderby, P., & Goyder, E. (2015). The state of the art in non-pharmacological interventions for developmental stuttering. Part 1: A systematic review of effectiveness. International Journal of Language and Communication Disorders, 50, 676–718.

Deacon, T. W. (1998). The symbolic species. New York, NY: W. W. Norton.

Desjardins, M., Halstead, L., Cooke, M., & Bonilha, H. S. (2017). A systematic review of voice therapy: What “effectiveness” really implies. Journal of Voice, 31, e13–e32.

Dodd, B. (2007). Evidence-based practice and speech-language pathology. Folia Phoniatrica et Logopaedica, 59, 118–129.

Dollaghan, C. A. (2004). Evidence-based practice in communication disorders: What do we know and when do we know it? Journal of Communication Disorders, 37, 391–400.

Ferguson, M. A., Kitterick, P. T., Chong, L. Y., Edmonson-Jones, M., Barker, F., & Hoare, D. J. (2017). Hearing aids for mild to moderate hearing loss in adults. Cochrane Database of Systematic Reviews. doi:10.1002/14651858.CD012023.pub2

Fitch, W. T. (2000). The evolution of speech: A comparative review. Trends in Cognitive Science, 4, 258–267.

Fitch, W. T. (2006). The biology and evolution of music: A comparative perspective. Cognition, 100, 173–215.

Gierut, J. A. (1998). Treatment efficacy: Functional phonological disorders in children. Journal of Speech, Language, and Hearing Research, 41, S85–S100.

Ingham, R. J., Ingham, J. C., Bothe, A. K., Wang, Y., & Kilgo, M. (2015). Efficacy of the Modifying Phonation Intervals (MPI) stuttering treatment program with adults who stutter. American Journal of Speech-Language Pathology, 24, 256–271.

Justice, L. M., & Redle, E. E. (2014). Communication sciences and disorders: A clinical evidence-based approach (3rd ed.). Boston, MA: Pearson.

Kaldo-Sandström, V., Larsen, H. C., & Andersson, G. (2004). Internet-based cognitive-behavioral self-help treatment of tinnitus: Clinical effectiveness and predictors of outcome. American Journal of Audiology, 13, 185–192.

Law, J., Garrett, Z., & Nye, C. (2003). Speech and language therapy interventions for children with primary speech and language delay or disorder. Cochrane Database of Systematic Reviews. doi:10.1002/14651858.CD004110

Ludlow, C. L., Hoit, J., Kent, R., Ramig, L. O., Shrivastav, R., Strand, E., . . . Sapienza, C. M. (2008). Translating principles of neural plasticity into research on speech motor recovery and rehabilitation. Journal of Speech, Language, and Hearing Research, 51, S240–S258.

Mendel, L. L. (2007). Objective and subjective hearing aid assessment outcomes. American Journal of Audiology, 16, 118–129.

Moodie, S. T., Kothari, A., Bagatto, M. P., Seewald, R., Miller, L. T., & Scollie, S. D. (2011). Knowledge translation in audiology: Promoting the clinical application of best evidence. Trends in Amplification, 15, 1–18.

Ramig, L. O., & Verdolini, K. (1998). Treatment efficacy: Voice disorders. Journal of Speech, Language, and Hearing Research, 41, S101–S116.

Raymer, A. M., Beeson, P., Holland, A., Kendall, D., Maher, L. M., Martin, N., . . . Gonzalez Rothi, L. J. (2008). Translational research in aphasia: From neuroscience to neurorehabilitation. Journal of Speech, Language, and Hearing Research, 51, S259–S275.
Ruotsalainen, J. H., Sellman, J., Lehto, L., Jauhiainen, M., & Verbeek, J. H. (2007). Interventions for treating functional dysphonia in adults. Cochrane Database of Systematic Reviews. doi:10.1002/14651858.CD006373.pub2

Sackett, D. L., Rosenberg, W. M., Gray, J. A., Haynes, R. B., & Richardson, W. S. (1996). Evidence based medicine: What it is and what it isn’t. British Medical Journal, 312, 71–72.

Tasko, S. M., McClean, M. D., & Runyan, C. M. (2007). Speech-motor correlates of treatment-related changes in stuttering severity and speech naturalness. Journal of Communication Disorders, 40, 42–65.

Tyler, A. A., Lewis, K. E., Haskill, A., & Tolbert, L. C. (2003). Outcomes of different speech and language goal attack strategies. Journal of Speech, Language, and Hearing Research, 46, 1077–1094.

Van Riper, C. (1981). An early history of ASHA. ASHA Magazine, 23, 855–858.

Wren, Y., Harding, S., Goldbart, J., & Roulstone, S. (2018). A systematic review and classification of interventions for speech-sound disorder in preschool children. International Journal of Language and Communication Disorders, 53, 446–467.
2
The Nervous System:
Language, Speech, and Hearing
Structures and Processes

Introduction

This chapter presents an overview of the nervous system, and how it functions in language, speech, and hearing. The chapter has been written specifically to support brain-related material covered in subsequent chapters. Students interested in detailed presentations on brain anatomy and function, including information relevant to speech, language, and hearing, are encouraged to consult the outstanding texts by Kandel, Schwartz, Jessell, Siegelbaum, and Hudspeth (2012), and Bear, Connors, and Paradiso (2015). Additional information on brain function for speech, hearing, and language can be found in Kent (1997), Bhatnagar (2013), and Hoit and Weismer (2016).

In the current chapter, the anatomy and physiology of the basic unit of the nervous system — the neuron — are presented first. A quick tour of gross neuroanatomy follows. The topic of gross neuroanatomy applies to the structural components of the nervous system; the term neurophysiology denotes the study of brain function. Knowledge of both neuroanatomy and neurophysiology is relevant to the role of the nervous system in speech, hearing, and language. Selected examples of neuroanatomical and neurophysiological topics that are matched — that is, function associated with specific structures — are presented in Table 2–1.

Central and Peripheral Nervous Systems

The nervous system includes all neural tissue in the body. The nervous system has two subcomponents, the central nervous system and the peripheral nervous system. The central nervous system (CNS) includes the cerebral hemispheres and their contents — the “brain” housed inside the skull — and the entire mass of tissue beneath the hemispheres (including the cerebellum, brainstem, and spinal cord; see later in the chapter). The peripheral nervous system (PNS) includes the many nerves extending from the CNS to innervate (control the function of) various parts of the body. For example, nerves in the foot run up the leg and into your spinal cord; this

Table 2–1.  Selected Examples of Neuroanatomy and Neurophysiology Topics in the Study of the Nervous System

Neuroanatomy:  Structure of neuron membrane
Neurophysiology:  Nature of electrical impulses conducted by neurons

Neuroanatomy:  Structure of nerve attachment to muscle
Neurophysiology:  Release of neurotransmitter and its effect on muscle fibers

Neuroanatomy:  Clusters of cells in brainstem that send nerve fibers to muscles of the tongue
Neurophysiology:  Movement of tongue when part of these brainstem cells are affected by disease

Neuroanatomy:  Relative volume of auditory cortex in left versus right hemisphere of brain
Neurophysiology:  Observation via functional brain imaging of difference between left hemisphere and right hemisphere auditory cortex when speech is presented to listener

Note.  The examples are meant to clarify the difference between the two aspects of studying the nervous system.

nerve is part of the PNS. When the fibers of the nerve enter the spinal cord, they are in the CNS. Figure 2–1 provides an image of these two broad components of the nervous system; structures labeled “nerves” are part of the PNS, and the remaining neural structures are part of the CNS. Figure 2–2 shows a simple way to understand the distinction between the CNS and PNS.

Nervous System Cells

Neurons are not the only cell type in the nervous system. The nonsignaling cells serve the purpose of providing structural and metabolic support to neurons. What is meant by “nonsignaling”? Neurons communicate with each other; they pass information from one neuron to other neurons. The nonsignaling cells in the CNS do not transmit information in the brain but nonetheless have critical functions. Most of these cells are called glial cells, and they are more numerous than neurons. Glia is the Greek word meaning “glue,” a fitting name for cells that hold neurons together. Glial cells also support neurons by “feeding” them with nutrients and oxygen. Some tumors of the brain have their beginnings in these glial cells, rather than in the neurons.

The Neuron

Figure 2–3 shows a schematic drawing of two neurons. The neuron is the basic cell unit of the nervous system. The neuron has a cell body, with a nucleus at its center. This cell body issues a long fiber called an axon.

When a brain is removed from an organ donor, fixed in a special solution, and dissected, two shades of tissue are seen. These two types of tissue are called the gray matter and the white matter of the brain. Gray matter consists of clusters of cell bodies, and white matter is composed of bundles of axons.

Groups of cell bodies in the nervous system are organized together for a specific function. Similarly, bundles of axons are pathways that connect groups of such specialized cell bodies in one part of the nervous system to another group of cell bodies elsewhere in the nervous system. The concepts of gray matter and white matter are critical to understanding both the anatomy and physiology of the brain.

Axons have a whitish appearance because they are covered with a substance called myelin. Myelin functions like an electrical insulator, allowing axons to conduct electrical impulses at high speeds. Some axons, fewer than the myelinated ones, lack myelin and conduct electrical impulses at relatively slow speeds. Axons may lose myelin as a result of disease, as in multiple sclerosis.
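The practical consequence of myelin’s insulating effect can be appreciated with simple arithmetic. The conduction speeds in the sketch below are rough, commonly cited physiology-textbook values rather than figures from this chapter, and the computation simply converts speed into travel time over a 1-meter pathway (roughly the distance from the foot to the spinal cord and beyond):

    # Rough, commonly cited conduction speeds (meters per second);
    # illustrative values only, not measurements from this chapter.
    speeds = {"large myelinated axon": 60.0, "unmyelinated axon": 1.0}

    distance = 1.0   # meters; roughly a foot-to-spinal-cord pathway

    for axon_type, speed in speeds.items():
        time_ms = distance / speed * 1000.0   # travel time in milliseconds
        print("%-22s %7.1f ms" % (axon_type, time_ms))

Under these assumed speeds, the unmyelinated pathway takes on the order of a full second to cover a distance that the myelinated pathway covers in tens of milliseconds, which is the intuition behind the delayed reactions described in the Myelin Trivia box later in this chapter.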
Figure 2–1.  The divisions between the central nervous system (CNS) and peripheral nervous system (PNS). Structures of the CNS include the cerebrum (cerebral hemispheres), the diencephalon, the brainstem (midbrain, pons, and medulla), the cerebellum, and the spinal cord. Structures shown in the PNS include the cranial nerves (red and light green) and the spinal nerves. The cranial nerves emerge from the brainstem; the spinal nerves emerge from the spinal cord. See text for additional details.
Figure 2–2.  Simple summary of the components of the CNS and PNS. CNS: cerebral hemispheres, brainstem, cerebellum, and spinal cord. PNS: nerves to and from the brainstem, nerves to and from the spinal cord, and sensory receptors and motor endplates.

Figure 2–3.  Two neurons. The image shows the cell bodies and their nuclei, axons, dendrites, axon terminals, and a single synapse.

Figure 2–3 shows spiny-like projections from the cell bodies of the two neurons. At the ends of the axons, there are long projections ending in button-like structures. The spiny projections are called dendrites, and the projections from the end of the axons are called terminal buttons (or simply, terminal segments). Both projections are specialized structures that allow information to be received or sent to other neurons; they are the places where one neuron “talks” to another.

Neurons conduct electrical impulses that originate in the cell body and travel like electrical current down the axon to the terminal buttons. When the electrical energy reaches the projections at the end of the axon, a small amount of a chemical substance, called a neurotransmitter, is released into the space between two or more axons. A portion of the chemical is deposited on the dendrites of another neuron’s cell body, which causes the membrane to change its electrical properties. If everything goes well (and it usually does), the changes in the membrane’s properties cause this neuron’s cell body to “fire” an impulse, which repeats the process of conducting electrical energy down the axon and releasing more chemical to affect another neuron’s cell body.
Myelin Trivia

Here are two interesting myelin facts for your next trivia contest. First, many axons that conduct pain and temperature impulses from various parts of the body to the CNS are unmyelinated. The relatively slow “conduction time” of these pathways explains why reactions to extreme temperatures and painful stimuli often seem to take a long time. Most of us have experienced this when touching something very hot but apparently not realizing it for a second or two. Second, there are specific processes that “wrap” myelin around neurons, a process that has a long course of development, continuing well into the later teenage and young adult years. Perhaps this explains why toddlers seem to react so slowly to events that look as if they should be painful. Just about everyone has seen a small child fall to the ground and hesitate for what seems like a very long time before crying. One interpretation of the hesitation is that the child is waiting for an adult’s reaction, to see if she should be crying. Another interpretation is that the pain information moves at a slow rate in the toddler’s nervous system because of the relative lack of myelinization.

Neurons talk to each other in this electrochemical way: electrical energy turned into chemical energy, turned again into electrical energy, and so forth. The transfer of information in the nervous system depends critically on this chain of events. Diseases that affect either the dendrites or the terminal buttons can have a significant effect on information transfer within the PNS and/or CNS and, therefore, on the behavioral function of human beings.
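The “charge up, fire, reset” cycle just described can be caricatured in a few lines of code. The sketch below is a deliberately crude simplification (a so-called leaky integrate-and-fire model, a standard teaching device rather than anything from this chapter); the threshold, leak, and input values are arbitrary numbers chosen for illustration, not physiological measurements:

    def simulate(inputs, threshold=1.0, leak=0.9):
        # Accumulate synaptic input each time step; when the accumulated
        # "membrane" value crosses threshold, emit a spike (1) and reset.
        membrane, spikes = 0.0, []
        for x in inputs:
            membrane = membrane * leak + x   # leaky accumulation of input
            if membrane >= threshold:
                spikes.append(1)             # the neuron "fires"
                membrane = 0.0               # and resets
            else:
                spikes.append(0)
        return spikes

    print(simulate([0.3, 0.3, 0.3, 0.3, 0.0, 0.6, 0.6]))
    # -> [0, 0, 0, 1, 0, 0, 1]

Small inputs that arrive close together in time add up to a spike, whereas inputs spread far apart leak away. Real neurons are vastly more complex, but this accumulate-and-threshold behavior is the core of the electrochemical signaling described above.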

The Synapse

As illustrated in Figure 2–3, a synapse is the space between the projections at the end of the axon (terminal buttons) and the spiny projections (dendrites) from the cell body of an adjacent neuron. The synapse also includes connections between the terminal buttons and dendrites, as illustrated by the dashed lines in Figure 2–3. Thus, the termination of the axon and the dendrites of an adjacent cell body are joined together, linking them as a unit.

Figure 2–3 does not do justice to the incredibly complicated structure and function of synapses throughout the nervous system. Two observations illustrate this complexity. First, within the CNS there are between 10 and 100 billion neurons. It really is not important if the actual number is 10, 20, 70, or 100 billion; it is sufficient to note the tremendous number of neurons packed into a relatively small space. With so many neurons in such a small volume, the size of the synaptic “spaces” between adjacent neurons is very, very tiny (roughly 20 nanometers, or 20/1,000,000,000th of a meter). The cell body of a single neuron is contacted by the projections from many axons, and the terminal buttons of one neuron contact the dendrites of multiple cell bodies. The pattern of connections among the huge number of neurons within the CNS is highly overlapped and dense. Second, the transmission of information between neurons depends on several different neurotransmitters, which are critical for normal brain function. When there is too much or too little of a specific neurotransmitter, signs of neurological disease may appear. Communication problems in neurological disease can be related to deficiencies or excesses of neurotransmitters.

Tour of Gross Neuroanatomy

Gross anatomy is the study of the large-structure components of a human body part. What follows is a brief tour of the gross anatomy of the human nervous system.

Figure 2–4 shows the cerebral hemispheres of the CNS as seen from above. The front of the cerebral hemispheres is to the left of the image. From this vantage point, the cerebral hemispheres are seen to consist of two symmetrical halves. These halves are the right (top of image) and left (bottom of image) cerebral hemispheres. The right and left hemispheres are grossly symmetrical, but in certain cases, parts of one side are bigger or differently shaped than the analogous part on the other side. More importantly, perhaps, the two halves of the brain are not symmetrical in their functions. The left hemisphere is specialized for certain functions and the right hemisphere for other functions. Some of these specializations are discussed in greater detail later in this chapter.

The visible tissue in Figure 2–4 is the cortex. The cortex is composed of densely packed cell bodies — gray matter — that perform the most complex functions of an organism. The highly developed and extensive cortical tissue in humans is responsible for the wide range of human abilities that are far more sophisticated than those of other animals.
Figure 2–4.  View from above of a fixed human brain, with the front of the brain to the left of the image. Note the two hemispheres of the brain (right hemisphere, top half of image; left hemisphere, lower half of image). The gyri are the hills of tissue, the sulci the fissures (grooves) between the gyri.

The two hemispheres are connected by a massive bundle of axons (white matter) called the corpus callosum. Axons arising from cortical cells in one hemisphere travel to the other hemisphere, where they make synapses with other cortical cells. There are roughly 200 million axons in the corpus callosum; the connections between hemispheres are fine-grained and dense. One hand knows what the other hand is doing.

The surface of the brain appears as thick, humped ridges separated by deep fissures. The ridges are called gyri (singular = gyrus), and the fissures are called sulci (singular = sulcus) or fissures. The gyri and sulci give the cortex its characteristic appearance.

Humans have much more cortical tissue than even the most advanced primates, such as chimpanzees. As suggested previously, the volume and complexity of human cortical tissue constitute one reason for the huge cognitive “edge” we enjoy relative to other members of the animal kingdom. This edge almost certainly includes, and perhaps is defined by, our ability to use speech and language in creative and novel ways.

Figure 2–5 shows an artist’s rendition of a side view of the left hemisphere of the brain, plus parts of the brainstem and spinal cord (discussed later). Each hemisphere has four lobes: the frontal, parietal, temporal, and occipital lobes. The lobes labeled in Figure 2–5 for the left hemisphere can also be identified in the right hemisphere.

Frontal Lobe

The frontal lobe, as suggested by its name, is the front part of the cerebral hemispheres. It is separated from the back part of the brain by a long, deep sulcus running down the side of the brain. This central sulcus is a dividing line between the front and back of the brain. The frontal lobe is separated from the temporal lobe below by the sylvian fissure (also called the lateral sulcus; see Figure 2–5).

The frontal lobe has many functions, including executive function. Executive function includes the skill of planning actions, connecting current behavior with future consequences of that behavior, and imposing organization on the tasks of everyday life. Diseases that affect the areas of the frontal lobe that control executive function have a major impact on behavior, including communication.

The frontal lobe contains a gyrus, directly in front of the central sulcus, called the primary motor cortex (Figure 2–5, shaded blue). This gyrus extends from the top of the brain to the sylvian fissure and contains the cell bodies of neurons that control muscle contractions in all parts of the body. The arrangement of these cells is systematic, from the top to bottom of the primary motor cortex.
Figure 2–5.  Side view of the left hemisphere, showing the four lobes as well as the brainstem, cerebellum, and upper part of the spinal cord.

Muscles toward the bottom of the body, such as the lower leg and foot, are controlled by cells at the top of the primary motor cortex. Muscles of the face, tongue, and lips, on the other hand, are controlled by cells at the bottom of the primary motor cortex, close to the sylvian fissure. The top-to-bottom arrangement of cells in the primary motor cortex is therefore an inverted representation of muscles in the body. This systematic arrangement of cells in the primary motor cortex is called somatotopic representation.

Other parts of the frontal lobe take part in the planning of action. Even a simple action, such as reaching for a doorknob to open a door, requires a complex and properly sequenced set of muscular contractions. The force of contraction and the timing of the sequence of muscles to be used must be based on a plan. The frontal lobe plays a major role in the learning and activation of these plans. Such plans are essential to speech production. To speak the word “production,” the contraction patterns of the many muscles of the head, neck, and respiratory system must be planned for accurate articulation of speech sounds, their sequencing, and the prosody of the three-syllable utterance. A disorder called apraxia of speech affects this planning of speech sequences and may have an underlying cause of damage to the frontal lobe of the left hemisphere.

Broca’s area is also located in the frontal lobe, just in front of the primary motor cortex (see previous description). It is known to be a “speech center” of the brain, containing tissue that is essential for speech production. Broca’s area is discussed in greater detail in the section “The Dominant Hemisphere and the Perisylvian Language Areas.”

Occipital Lobe

The occipital lobe forms the back of each cerebral hemisphere. The occipital lobe has front boundaries with both the parietal and temporal lobes (see Figure 2–5). The sulci (plural of sulcus) separating the occipital lobe from the parietal and temporal lobes are not as easily seen as the more dramatic central sulcus and sylvian fissure. The primary function of the occipital lobe is to process visual stimuli.

Temporal Lobe

The temporal lobe forms much of the lateral (side) part of the cerebral hemispheres and is separated from the frontal and parietal lobes by the sylvian fissure. The temporal lobe plays an important role in hearing. In Figure 2–5, three gyri of the temporal lobe can be seen,
each one oriented more or less horizontally, with a slight upward tilt from the forward tip of the lobe to the back boundary adjoining the occipital lobe. The top-most gyrus contains the primary auditory cortex (see Figure 2–5), which contains neurons that receive auditory impulses originating in the sensory end organ of the ear (called the cochlea, covered in more detail in Chapter 22).

The temporal lobe also plays an important role in the lexicon (words and their meanings stored in the cerebral hemispheres) and in the relations between words. It also is important to aspects of speech and language perception. The temporal lobe also plays an important role in memory and emotion.

Parietal Lobe

The parietal lobe extends from the central sulcus back to the front boundary of the occipital lobe. A portion of the parietal lobe is separated from the temporal lobe by the sylvian fissure. The parietal lobe shares boundaries with the other three lobes of the cerebral hemispheres.

The parietal lobe contains a gyrus, immediately in back of the central sulcus, called the primary sensory cortex (Figure 2–5, shaded yellow). This gyrus parallels the course of the primary motor cortex of the frontal lobe, extending from the top of the brain to the sylvian fissure. The cell bodies of the primary sensory cortex can be thought of as the final station in the brain for the collection of tactile (touch) information (other types of sensory information also find their way to the primary sensory cortex). Like the primary motor cortex, cells in the primary sensory cortex are arranged somatotopically, with touch information from the lower part of the body represented at the top of the gyrus, and touch information from the face, tongue, and lips represented toward the bottom of the sensory cortex.

In addition to the primary function of touch sensation, the parietal lobe integrates large amounts of sensory data and plays an important role in coordinating various sources of information critical to cognitive functions, including language. Extensive connections exist between the parietal lobe and each of the other three lobes, allowing the integration of visual, auditory, and touch sensations, as well as motor control information. This integration is fundamental for higher-level control and cognitive functions.

Hidden Cortex

The four lobes in each hemisphere are seen easily on the surface of the brain. There is more to the cortex than just these surface features, however, and especially so in humans. The evolution of the human brain required fitting the enormous computing power of cortical neurons into a relatively small space (the skull). Clearly, there is room for just so much area on the surfaces shown in Figure 2–5.

Human evolution took a novel approach to packing more cortex into this small space by "burying" millions of cortical cells in the deep sulci separating the many gyri that define the surface of the brain. In a "fixed" brain (one that has been hardened somewhat with a special solution, prior to removal from the skull), like the one shown in Figures 2–4 and 2–5, adjacent gyri can be pulled apart to reveal previously hidden, interior walls of cortical cells. The cortex is not simply the surface of the brain and its thickness of cell bodies, but also the cell bodies buried within the deep sulci of the human brain. When a human brain is compared to that of other animals, a striking feature is the greater complexity and much deeper sulci of the human version. In so-called lower animals, the surface of the brain may look positively smooth compared to the human brain. The smoothness reflects the absence of deep sulci and, therefore, the absence of extra cortical cells.

In the human brain, there is also hidden cortex beneath the temporal, frontal, and parietal lobes. In a fixed human brain, the front end of the temporal lobe can be pulled away from the rest of the brain to expose additional cortical gyri. These gyri make up the insular cortex (or, simply, the insula). Many scientists believe the insular cortex plays an important role in speech, as well as in memory and emotional functions.

Subcortical Nuclei

Figure 2–6 shows a view of the cerebral hemispheres in which the cortical surface and white matter in the cerebral hemispheres have been made transparent, revealing structures inside the hemispheres. As previously described, clusters of cell bodies within the brain are called nuclei; the structures shown in Figure 2–6 are several of the subcortical nuclei (below the cortex, within the hemispheres).

Subcortical nuclei include the basal ganglia (sometimes called the basal nuclei) and the thalamus. The basal ganglia include five separate nuclei, which for simplicity are shown in Figure 2–6 as a single, but complex, structure.

The basal ganglia play an important role in control of movement, including movements of the speech mechanism. Damage to the basal ganglia may impair a person's ability to produce speech. In addition, the basal ganglia play a role in language.
Although the language ability of humans is often thought of as powered by cortical cells, the basal ganglia and cortex are extensively interconnected and communicate with each other. Many scientists believe these subcortical nuclei have an important role in human communication.

Figure 2–6. Selected subcortical nuclei, contained within the cerebral hemispheres. Structures of the basal ganglia and thalamus are included in this picture.

The thalamus, the egg-like structure in the middle of the hemispheres (shaded pale red in Figure 2–6), is a collection of nuclei that serves as the main connection between the basal ganglia and cortex. It is also the main "relay station" for the transmission of sensory events (e.g., touch, vision, auditory) from the outside world to the cortex.

The Basal Ganglia and Parkinson’s Disease


Various neurological diseases, such as Parkinson’s disease,
Tourette’s syndrome, and Huntington’s disease, are known to be related
to damage to structures of the basal ganglia. In Parkinson’s disease, for
example, patients show several signs (evidence of a disease that can be
observed on clinical examination), including rigid (stiff) limbs, slow or
absent movement, and tremors. These signs are evidence of disruption
of basal ganglia physiology, mostly having to do with a deficiency of a
neurotransmitter called dopamine. Some of the lost dopamine can be
replaced with drugs. Another form of therapy involves insertion of a tiny
electrode in the brain to provide electrical stimulation to structures in the
basal ganglia. The therapeutic procedure, called deep brain stimulation
(DBS), often provides some relief from the signs noted above.
Figure 2–7 shows another view of the brain that illustrates the relationship of subcortical to cortical structures. This is a frontal cut through the cerebral hemispheres of a human brain, as if the hemispheres are facing you and cut through to separate them into front and back halves; the interior of the brain is exposed as the front or "face" of the back half.

Figure 2–7 shows the difference between gray and white matter in the cerebral hemispheres. The deep sulcus in the middle and toward the top of the image is the dividing line between the left and right hemispheres of the brain. Note the grayish, top layer of the brain, having a thickness somewhat like the rind of a watermelon. Note also the deep sulci in which the gray matter forms interior "walls" of cortex. Immediately below the rind of cortical cell bodies there is whitish tissue — the white matter of the brain. These are bundles of axons — fiber tracts — running between different groups of cell bodies throughout the CNS. Any "chunk" of white matter within the cerebral hemispheres is a densely interwoven mesh of all these different connections.

The individual structures and their names are not the point of Figure 2–7. Rather, the image presents a good orientation to the distinction between gray and white matter. In addition, it shows the cortical gray matter as well as the subcortical gray matter (subcortical nuclei). Finally, the image provides a good look at the dense network of white matter within the cerebral hemispheres.

Brainstem, Cerebellum, and Spinal Cord

The part of the central nervous system extending below the cerebral hemispheres includes the brainstem, cerebellum, and spinal cord. These parts of the brain also play important roles in communication and its disorders.

Brainstem

In Figure 2–5, a stalk-like structure descends from the cerebral hemispheres. Its top part is the brainstem; its downward continuation is the spinal cord. The brainstem is divided into three major parts, including the midbrain (hidden in Figure 2–5 by the lower edges of the temporal lobe), pons, and medulla. The brainstem serves a host of functions including regulation of blood pressure, breathing, production of saliva and perspiration, and level of consciousness. Very serious damage to the brainstem is, in most cases, not consistent with the maintenance of life.

Figure 2–7. Image of brain structures as if the brain is cut into front and back halves; the perspective is looking toward the "face" of the back half. Gray and white matter is shown, including the gray matter of the cortex and of subcortical nuclei (labeled structures include the white matter, caudate, thalamus, putamen, globus pallidus, subthalamic nucleus, and substantia nigra). White matter is seen directly below the cortex, and surrounding the subcortical nuclei.
The brainstem also contains motor neurons — neurons specialized for motor control, such as muscle contraction — that control muscles in the head and neck. There are also sensory neurons that receive sensation from head and neck structures. The cell bodies of these neurons are organized into dense, small nuclei in various parts of the brainstem. The motor neurons in the brainstem nuclei receive commands from cortical motor neurons, and together the cortical and brainstem motor neurons exert control over muscles of the larynx, throat, tongue, lips, and jaw. The motor neurons in the brainstem send out fiber tracts that exit the CNS as cranial nerves (see Figure 2–1). The term "cranial" nerve is used to distinguish these from nerves running to and from the spinal cord ("spinal" nerves), as explained later in this chapter.

There are 12 paired cranial nerves (CNs). They are referred to by Roman numerals (e.g., CN V; CN XII) as well as by their names (e.g., CN V = trigeminal nerve; CN XII = hypoglossal nerve). Five of the cranial nerves carry information from brainstem motoneurons to muscles of the head and neck that control movements of the jaw, tongue, soft palate, pharynx, and larynx during speech. One of the nerves serves the sense of hearing, and another nerve plays a role in breathing during speech.
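Because each nerve pairs a Roman numeral with a name and a function, the relationships lend themselves to a small lookup table. The sketch below uses standard anatomical pairings, but the selection of nerves and the one-line role descriptions are our simplification for illustration, not an exhaustive account of each nerve's functions:

```python
# An illustrative lookup table of cranial nerves most relevant to speech
# and hearing; the roles are simplified summaries, not complete lists.
CRANIAL_NERVES = {
    "CN V":    ("trigeminal",        "jaw muscles; facial sensation"),
    "CN VII":  ("facial",            "lip and facial muscles"),
    "CN VIII": ("vestibulocochlear", "hearing (and balance)"),
    "CN IX":   ("glossopharyngeal",  "pharynx"),
    "CN X":    ("vagus",             "larynx and soft palate"),
    "CN XII":  ("hypoglossal",       "tongue muscles"),
}

for number, (name, role) in CRANIAL_NERVES.items():
    print(f"{number} ({name}): {role}")
```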
The brainstem and the cranial nerves attached to it play a critical role in the control of the speech mechanism and in hearing. Damage to brainstem nuclei or the cranial nerves results in weakness or paralysis of speech muscles as well as loss of sensation in head and neck structures.

Cerebellum

The cerebellum is the large mass of tissue immediately behind the brainstem (see Figure 2–5). It is recognizable in fixed brain preparations because of its size, unusual appearance — somewhat like a cauliflower — relative to other surface features, and because of its location at the back and base of the cerebral hemispheres. Like the cerebral hemispheres, the cerebellum contains gray and white matter.

The cerebellum has extensive connections with the cerebral cortex, brainstem, spinal cord, and basal ganglia.¹ The cerebellum has been likened to a computer that integrates vast amounts of information about muscle contraction, signals coming into the brain from the outside world including the body surface, the location of the head relative to the body, and the general state of brain activity. This integration produces the smooth, coordinated movement of everyday life, from walking to making a jump shot while surrounded by nine other players. The cerebellum is important for other brain functions as well, but its role in movement coordination is very prominent. Patients with diseases of the cerebellum often lose the ability to produce smooth movement patterns.

¹ Connections between the cerebellum and basal ganglia have not always been included in neuroanatomy textbooks, but research in the past 10 years has identified these connections (Bostan, Dum, & Strick, 2010, 2018).

Spinal Cord

The spinal cord extends from the bottom of the brainstem down the back, terminating a little below the waist. The spinal cord contains cell bodies that control muscles of the arms, legs, chest, and other parts of the torso, as well as cell bodies that receive sensory information from those structures. The nerves that run to and from the spinal cord are called, not surprisingly, spinal nerves. Of primary importance for the purposes of this text are the spinal nerves that serve muscles and structures of the respiratory system. These muscles are located between the ribs, in the abdomen, and in the neck. Because the respiratory system plays a critical role as the "power supply" for speech production (Chapter 10), damage to cell bodies within the spinal cord or to the spinal nerves can result in speech breathing problems.

The Auditory Pathways

The nervous system pathway that connects the sensory organ for hearing to the auditory cortex is specialized for auditory analysis. The auditory pathways are shown schematically in Figure 2–8.

The bilateral auditory pathways begin in the cochlea (marked number 1 on Figure 2–8), which is the sensory end organ of hearing. "Bilateral" means the pathways are the same on both sides of the head. The auditory nerve (CN VIII) (number 2) emerges from the cochlea and transmits auditory information to the brainstem. The nerve enters the CNS, and its fibers make synapses with several nuclei in the lower brainstem (numbers 3 and 4). At brainstem location #4, the pathways make an interesting turn.

Roughly 75% of the ascending fibers cross over to the other side of the brainstem (red pathway) where they make a synapse with another nucleus; the remaining 25% of the fibers (green pathway) make synapses with cells of the corresponding nucleus on the side of entry in the brainstem. As the auditory fibers ascend, they make synapses in more brainstem nuclei (represented by number 5) before ascending to the thalamus (number 6). The final destination of the auditory pathway is the auditory cortex (number 7).

Figure 2–8. A simplified view of the peripheral and central auditory pathways, from cochlea to cortex.

At each succeeding level along the auditory pathway, the analysis of the auditory information becomes more complex and sophisticated. The crossing of auditory fibers from one side of the brainstem to the other (point number 4 in Figure 2–8) means that auditory analysis in the cortex of one hemisphere is primarily from the ear on the opposite side.
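The consequence of this crossing pattern can be worked out with simple arithmetic. The sketch below assumes only the approximate 75%/25% split described above; the real pathway, with its multiple synapses, is far more intricate:

```python
# A toy sketch of the ~75/25 crossing (decussation) in the ascending
# auditory pathway; the percentages follow the text, everything else
# is a simplification for illustration.
CROSSED, UNCROSSED = 0.75, 0.25

def cortical_input(left_ear_signal, right_ear_signal):
    """Return the share of each ear's signal reaching each auditory cortex."""
    left_cortex = UNCROSSED * left_ear_signal + CROSSED * right_ear_signal
    right_cortex = CROSSED * left_ear_signal + UNCROSSED * right_ear_signal
    return {"left auditory cortex": left_cortex,
            "right auditory cortex": right_cortex}

# A sound presented only to the right ear is analyzed mostly (75%)
# by the left auditory cortex:
print(cortical_input(left_ear_signal=0.0, right_ear_signal=1.0))
# {'left auditory cortex': 0.75, 'right auditory cortex': 0.25}
```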
The Dominant Hemisphere and the Perisylvian Language Areas

The concept of specialization of function is that within the brain there are certain clusters of cells and their connections to other clusters of cells that are specialized or even dedicated to certain functions, behaviors, or processes. Two well-known examples of such specialization are the gyri in the left frontal lobe called Broca's area, and the area of tissue in the top gyrus of the left temporal lobe and perhaps extending to a small area of the parietal lobe called Wernicke's area. These two areas of a preserved human brain are shown in the left image of Figure 2–9. The central sulcus (see Figure 2–5) is shown for orientation.

Broca's area is thought to be specialized for the production of speech, and Wernicke's area for the reception (comprehension) of speech and language. Of particular importance is the specialization of these areas in the left hemisphere of the brain. Broca's and Wernicke's areas are therefore not only specialized, they are also lateralized. Lateralization of function is the term used to denote brain tissue specialized for a specific function, but only on one side of the brain. According to research estimates, approximately 90% of humans have speech and language functions lateralized to the left hemisphere (Bear, Connors, & Paradiso, 2015).
Figure 2–9. Left, Broca's area and Wernicke's area are shown on a preserved brain; the central sulcus is shown as a landmark separating the frontal and parietal lobes. Right, two pathways connecting Wernicke's and Broca's areas — the arcuate fasciculus and the ventral stream.

How was this specialization and lateralization of function for speech and language discovered? Paul Broca (1824–1880), a 19th-century French physician who followed up on similar work of other physicians, had a patient who suffered brain damage resulting from syphilis and was almost completely unable to produce speech. The patient's only speech consisted of one syllable, "tan," which he repeated over and over. The patient seemed to understand speech well. Broca suggested the lesion, wherever it was in the brain, affected speech production, but not speech perception and language comprehension. "Lesion," from the Latin laedere ("to injure"), means damaged or destroyed tissue.

Following the patient's death, an autopsy showed a large lesion in the lower part of the left frontal lobe, adjacent to and just forward of the primary motor cortex. The location of this lesion is indicated in Figure 2–9 as "Broca's area." Broca, as well as other physicians, saw additional patients with the same lesion location and the same speech symptoms. Broca concluded that this part of the brain was specialized for the articulation of speech. The lateralization of this speech articulation center to the left hemisphere was discovered in the following way: patients whose brains showed lesions in the same location as Broca's area, but in the right hemisphere, did not have speech articulation deficits. The brain was therefore not specialized for articulation in a symmetrical way. For these reasons, the term "Broca's area" is reserved for the left hemisphere.

A similar story can be told for Wernicke's area. Carl Wernicke (1848–1905) was a Prussian (his birthplace now part of Poland) physician who was interested in the relationship between brain function and speech and language. Wernicke published a book in 1874 in which he described a link between lesions in the top gyrus of the left temporal lobe, close to where the temporal lobe meets the parietal lobe, and difficulty comprehending speech and language but little difficulty articulating speech. However, the speech produced by these patients often conveyed jumbled ideas and even jargon — speech that sounded like strings of words but lacked meaning. The lesion location described by Wernicke (Wernicke's area) is indicated in the left image of Figure 2–9.

Wernicke's area was thought of as the language comprehension center of the brain. Like the lateralized function of Broca's area, the language comprehension problems emerged with damage to the left hemisphere. Damage to the same location in the right hemisphere did not result in language comprehension problems.

Research to the present day confirms, in general ways, the specialization of Broca's area for speech articulation and Wernicke's area for language comprehension. In the great majority of humans, this specialization is lateralized to the left hemisphere. The left hemisphere is often referred to as the dominant hemisphere because it houses the specialization for speech and language.
There is a good deal of controversy concerning the specifics of Broca's and Wernicke's areas, including their exact boundaries and the restriction of one area to speech articulation (Broca's) and the other to language comprehension (Wernicke's). For example, damage to Broca's area may result in certain language comprehension problems, and damage to Wernicke's area can produce problems with speech articulation. More is said about these issues in Chapter 9.

Arcuate Fasciculus (Dorsal Stream) and Ventral Stream

Not surprisingly, Wernicke's area and Broca's area are connected by thick bundles of axons. One tract, shown on the right side of Figure 2–9, is called the arcuate fasciculus ("fasciculus" is a bundle of fibers; "arcuate" is descriptive of the arch-like configuration of the tract). The arcuate fasciculus connects cell bodies in Wernicke's area to cell bodies in Broca's area. The tract in Figure 2–9 is superimposed on the surface of the left hemisphere but actually runs deep to (beneath) the cortex within the temporal, parietal, and frontal lobes.

A fiber tract that connects Broca's and Wernicke's areas seems to make perfect sense. For example, our ability to repeat what someone says must involve some auditory analysis of speech, in and around Wernicke's area, and the transfer of that analysis to Broca's area where the heard speech is readied for production. In fact, scientists have argued that the arcuate fasciculus is the pathway for connecting the auditory analysis of incoming speech sounds (Wernicke's area) to the articulatory characteristics that produced the sound (Broca's area). The arcuate fasciculus is discussed further in Chapter 9.

Speech perception must lead to meaning, however, so the arcuate fasciculus cannot account for linking sounds to words; it is mainly for the identification of speech sounds. The ventral stream pathway runs between Wernicke's area and cortical regions adjacent to (or in) Broca's area, and is thought to connect sounds with word meanings. The arcuate fasciculus is considered the "upper loop" (dorsal stream), connecting auditory analysis areas in the temporal lobe to articulatory areas (Broca's area) in the frontal lobe. The ventral stream runs between Wernicke's area and the lower part of the temporal lobe before it connects to frontal lobe regions in and around Broca's area. The upper-loop/lower-loop model of integrating auditory analysis with articulation (upper loop) and auditory analysis with meaning (lower loop) is called the dual-stream model (Fridriksson et al., 2018).
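The dual-stream model can be summarized as a tiny two-route map. The sketch below encodes only the broad logic described in this section; the station labels are simplified descriptions, not precise anatomical claims:

```python
# A schematic of the dual-stream model as a tiny graph: both streams start
# with auditory analysis in and around Wernicke's area, but the dorsal
# stream maps sound onto articulation while the ventral stream maps sound
# onto word meaning.
DUAL_STREAMS = {
    "dorsal stream (arcuate fasciculus)": [
        "auditory analysis (Wernicke's area)",
        "articulatory planning (Broca's area)",
    ],
    "ventral stream": [
        "auditory analysis (Wernicke's area)",
        "lower temporal lobe (word meanings)",
        "frontal regions near Broca's area",
    ],
}

for stream, stations in DUAL_STREAMS.items():
    print(f"{stream}: " + " -> ".join(stations))
```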
Notice in Figure 2–9 how the arcuate fasciculus (dorsal stream) and ventral stream form a flattened loop around the sylvian fissure. In a landmark study on brain activity and speech and language, the Canadian neurosurgeon Wilder Penfield (1891–1976) used electrical currents to stimulate the cortex in and around the tissue enclosed in this flattened loop. The patients on the operating table were having tissue excised because of severe epilepsy. Penfield was able to stimulate the perisylvian areas with electrical currents while the brain was exposed. Penfield found that almost any stimulus around the sylvian fissure of the left hemisphere evoked some form of language behavior in his awake patients (Penfield & Roberts, 1959). Autopsies of patients who had suffered strokes and who had speech and language problems also revealed frequent damage in this loop of tissue. The part of the cerebral hemispheres enclosed within the dorsal and ventral streams is referred to as the perisylvian cortex ("peri" being a prefix meaning "around" or "enclosing").²

Speech and Language, Together Always

Readers may notice the use of "speech" and "language" as different components of the communication process (see Chapter 3). To some extent they are. "Speech" usually refers to the planning and production of the sound sequences that form words. "Language" refers to the representations and the "public" usage of those symbols to communicate. Some examples of differences between speech and language are (a) a stroke survivor who produces speech with no errors but with no meaning — the speech sounds and words are correct but the use of the words does not fit the situation (such as a response to a specific question) and are combined in meaningless ways (as in the previous example of damage to Wernicke's area); (b) a 9-year-old child with cerebral palsy who has difficulty producing clear speech sounds as a result of a speech motor disorder, but who has an age-appropriate vocabulary, sequences words for sentences like a typically developing 9-year-old child, and comprehends speech perfectly; and (c) a typically developing 4-year-old child who has age-appropriate speech production but has not mastered morphemes, the minimal units of meaning discussed in Chapter 3. Of course, speech and language skills interact, but examples like those presented show how they can be separable.
The perisylvian cortex is assumed to be important to normal speech and language functioning, and damage to this region of the brain is likely to result in a communication impairment, ranging from mild to severe depending on the extent of the damage.

Functional Magnetic Resonance Imaging and Speech and Language Brain Activity

The identification of speech and language areas of the brain is often based on an approach of linking parts of the brain with specific functions. A patient is seen, her symptoms documented, and if the patient passes away her brain may be autopsied to locate lesions that might explain the symptoms. If a sufficient number of patients are studied with the same symptoms and lesion location, the damaged part of the brain is thought to be critical to the normal behavior compromised by the neurological disease. For example, if patients who have suffered strokes share a common symptom of reading problems, and later autopsies reveal that all these patients had lesions in the same part of the parietal lobe, that part of the brain is thought to be critical to normal reading ability.
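The reasoning behind this classic lesion-overlap approach can be expressed compactly. In the sketch below, the patients and region names are invented for illustration; the point is the intersection step:

```python
# Hypothetical data: each set lists the damaged regions in one patient,
# and all three patients share the same symptom (reading problems).
patients_with_reading_problems = [
    {"parietal-A", "frontal-B"},   # patient 1
    {"parietal-A", "temporal-C"},  # patient 2
    {"parietal-A"},                # patient 3
]

# The region damaged in every patient is the candidate "critical" region.
common_damage = set.intersection(*patients_with_reading_problems)
print(common_damage)  # {'parietal-A'}
```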
This approach to identifying functions of the brain is limited. It is unusual to find a sufficiently large group of patients with exactly the same symptoms, and precisely the same lesion location. Thus, the ability to generalize between lesion locations and brain function is limited by the lack of a sufficiently large sample of cases to "prove" the point that brain structure "x" causes behavior "y."

The last half-century has seen a technological revolution in the ability to generate images of the brain in living individuals. This revolution includes imaging techniques to identify specific brain structures and make precise measures of their length, width, volume, and tissue type. Enhancements of these techniques are also available to monitor the activity of specific brain structures as a person performs different tasks. This chapter closes with a brief discussion of how one technique to monitor brain activity during speech and language tasks has enhanced understanding of speech and language functions of the brain.

Functional Magnetic Resonance Imaging

Magnetic resonance imaging (MRI) is a technique that was developed in the 1970s and 1980s to produce very detailed images of body structures. A strong magnetic field surrounding the body part of interest reacts to properties of biological tissue that affect the magnetic field. These reactions are reconstructed as an image having exquisite detail. Figure 2–10 shows an MR image of the right side of the head and neck. The image was programmed to show structures of the medial (inside) wall of the left hemisphere, as if the other hemisphere has been removed from the view. The gyri and fissures that define the surface and thickness of the cortex are easy to see, as is the white matter below the cortex. The cerebellum, midbrain, pons, and medulla of the brainstem, and the cervical (neck) part of the spinal cord are also imaged clearly. The thick, arch-like band of white matter just below the cortical tissue and in the center of the image is the corpus callosum, the fiber tract that connects the two hemispheres.

Figure 2–10. MR image of the medial surface of a cerebral hemisphere, brainstem, and spinal cord.

² The term "perisylvian language areas" typically refers to the tissue in the left, speech- and language-dominant hemisphere. This is consistent with the facts in a general way, but of course, when you start digging it is more complicated. For example, the arcuate fasciculus (dorsal stream) is thought to be strongly lateralized to the left hemisphere. In contrast, the ventral stream, which connects auditory analysis to word meaning, is thought to be active in both hemispheres (Fridriksson et al., 2018).
About 30 years ago, MRI technology was enhanced to monitor brain activity as an individual performed a task. The technology is called functional MRI (abbreviated as fMRI). fMRI shows which parts of the brain "light up" for different tasks. Neurons use more oxygenated blood when active, as compared with resting. Neurons in a region of more heavily oxygenated blood emit a different magnetic signal compared with neurons in a less heavily oxygenated blood region. With the proper equipment and software, an MRI scanner can detect these oxygen differences and show locations in the brain that are presumed active during the performance of specific tasks.
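A deliberately tiny sketch of this logic follows: compare each brain region's signal during a task with its signal at rest, and flag regions whose task signal clearly exceeds the resting variability. Real fMRI analysis involves hemodynamic modeling and statistics across many scans; the numbers below are invented:

```python
import statistics

def active_regions(task, rest, threshold=2.0):
    """Flag regions where (task - rest) exceeds threshold * resting variability."""
    spread = statistics.stdev(rest.values()) or 1.0
    return [region for region in task
            if task[region] - rest[region] > threshold * spread]

# Made-up signal values for four regions, at rest and during a speech task.
rest = {"broca": 100.2, "wernicke": 99.8, "occipital": 100.1, "motor": 100.0}
task = {"broca": 103.5, "wernicke": 103.1, "occipital": 100.2, "motor": 102.9}
print(active_regions(task, rest))  # ['broca', 'wernicke', 'motor']
```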
A lesson learned from fMRI studies of speech and language behavior is that many areas of the brain "light up" for these tasks. Broca's and Wernicke's cortical areas are far from the complete story of how the brain functions for human communication. Figure 2–11 shows data from multiple studies on brain region activation for oral language tasks (i.e., speech production) (Ardila, Bernal, & Rosselli, 2016a). The surface of the left hemisphere is shown on the left, the medial wall of the left hemisphere is shown on the right.

On the surface of the left hemisphere, Broca's area is active for oral language (purple) but frontal lobe regions adjacent to but more forward than Broca's area are also active (red). The blue areas include not only the motor cortex in the frontal lobe (gyrus immediately anterior to the central fissure) but cortical tissue in the region of Wernicke's area. Activity is also seen on the middle wall of the left hemisphere, both in cortical and subcortical structures.

Figure 2–11. Summary of areas of cerebral cortex of left hemisphere that are active during oral language. Note activity in multiple areas. See text for additional details. Reproduced with permission from Ardila, A., Bernal, B., & Rosselli, M. (2016a). How localized are language brain areas? A review of Brodmann areas involvement in oral language. Archives of Clinical Neuropsychology, 31, 112–122.

The "big picture" conclusion from these analyses is that the brain areas that "control" speech production (or, perhaps more generally, language expression) are more widespread than Broca's area. The same conclusion has been reached about Wernicke's area — more brain regions are active for language comprehension than the region around the upper temporal lobe (Ardila, Bernal, & Rosselli, 2016b). Because these different areas "light up" together during communication behavior, it appears they are connected as a network. It is the network that is important, not simply brain centers such as Broca's and Wernicke's areas.

Diffusion Tensor Imaging

If communication is controlled by a network in which clusters of cell bodies (such as in the cortex) are connected, it follows that the proper connections — fiber tracts — must exist. Computed tomography (CT), MRI, and fMRI images show gray and white matter but do not show specific fiber tracts that connect specific groups of cell bodies such as Wernicke's and Broca's areas. An MR technique called diffusion tensor imaging (DTI) creates images of specific fiber tracts (such as the arcuate fasciculus) that connect cell bodies.
At least two important lessons have been learned from DTI research. First, the necessary connections exist. Second, certain brain diseases have as much or more damage in the fiber tracts as they do in cortical structures. In these diseases, the disruption of connections is as important to the disease process as is damage to gray matter. A good example is Alzheimer's disease, in which dementia is a primary symptom. Dementia is almost always associated with a communication disorder. The presence of white matter disease in dementia-associated communication disorders shows that the planning, production, and comprehension of language are dependent as much on the connections (white matter) between nuclei and cortical cell bodies as they are on the cell bodies themselves (gray matter). This recently acquired knowledge lends further support to the idea of a brain network for the production and perception of speech and language.

Chapter Summary

The nervous system includes the CNS and PNS.

The CNS includes the cerebral hemispheres and their contents, the brainstem, cerebellum, and spinal cord; the PNS includes all the nerves connected to the brainstem and spinal cord, these nerves carrying information from the CNS to different parts of the body (motor), or conveying information from different parts of the body to the CNS (sensory).

The basic cellular unit of the nervous system is a neuron. The neuron has a cell body and an axon. Clusters of cell bodies make up the gray matter of the brain, and bundles of axons the white matter.

The basic function of a neuron is to conduct electrical impulses from the cell body via its axon to other neurons; neurons communicate with other neurons at synapses, where electrical energy is converted into chemical energy and then back into electrical energy.

The electrical energy of one neuron causes a neurotransmitter to be released at the end of its axon, which affects the cell body of another neuron by causing it to "fire" (that is, conduct an electrical impulse).

The surface of the cerebral hemispheres shows ridges (gyri) and deep fissures (sulci); the surface is called the cortex, a thick layer consisting of gray matter that covers the cerebral hemispheres somewhat like the rind of a watermelon.

Many millions of neuron cell bodies are packed into a relatively small volume in the human brain, partly by "hiding" additional cortical surfaces in the walls of the sulci.

Clusters of cell bodies within the cerebral hemispheres but below the cortex are called subcortical nuclei; the cortex and subcortical nuclei are connected by fiber tracts.

Both cerebral hemispheres are organized into four lobes: the frontal, parietal, occipital, and temporal lobes. The left hemisphere is specialized for speech and language in about 90% of people.

Broca's area, in the frontal lobe, appears to be strongly associated with the production of speech, and Wernicke's area, mainly in the temporal lobe but also involving a small region of parietal cortex, is strongly associated with comprehension of speech/language.

The specialization of these areas in the left, and not right, hemisphere was originally discovered in patients who had suffered strokes; damage in one of these areas in the left hemisphere resulted in speech and language problems, whereas damage to the same areas in the right hemisphere did not have the same effect on communication abilities.

In the left hemisphere, cortical tissue surrounding the sylvian fissure (perisylvian language areas) plays a prominent role in speech and language. This tissue includes Broca's and Wernicke's areas, other parts of the cortex, and fiber tracts that connect different cortical areas.

The brainstem is the part of the CNS that contains nuclei for the control of head and neck muscles — that is, the muscles that control speech articulation and phonation; the brainstem also contains nuclei for sensation from head and neck structures.

The spinal cord contains analogous nuclei, for the control of muscular contraction of, and sensation from, respiratory structures and other important body structures (such as the arms and legs); the spinal cord is important for breathing in general, and breathing for speech in particular.

The basal ganglia, a group of subcortical nuclei, play an important role in movement.

The thalamus, a subcortical nucleus consisting of many smaller nuclei, is the main sensory relay to the cortex; the information from almost all sensory stimuli makes a final synapse in the thalamus before being sent to the cortex.

The cerebellum communicates with all parts of the brain and is important for coordination of movement.

Modern techniques of imaging the brain allow for precise identification of the size and activity levels of specific brain structures. Studies using these techniques suggest that the function of the brain is very complex, and that a network of many different brain structures is involved in speech and language function.
References

Ardila, A., Bernal, B., & Rosselli, M. (2016a). How localized are language brain areas? A review of Brodmann areas involvement in oral language. Archives of Clinical Neuropsychology, 31, 112–122.

Ardila, A., Bernal, B., & Rosselli, M. (2016b). How extended is Wernicke's area? Meta-analytic connectivity study of BA20 and integrative proposal. Neuroscience Journal, https://doi.org/10.1155/2016/4962562

Bear, M. F., Connors, B. W., & Paradiso, M. (2015). Neuroscience: Exploring the brain (4th ed.). Philadelphia, PA: Wolters Kluwer.

Bhatnagar, S. C. (2013). Neuroscience for the study of communicative disorders (4th ed.). Philadelphia, PA: Wolters Kluwer/Lippincott Williams & Wilkins.

Bostan, A. C., Dum, R. P., & Strick, P. L. (2010). The basal ganglia communicate with the cerebellum. Proceedings of the National Academy of Sciences, 107, 8452–8456.

Bostan, A. C., Dum, R. P., & Strick, P. L. (2018). Functional anatomy of basal ganglia circuits with the cerebral cortex and the cerebellum. Progress in Neurological Surgery, 33, 50–61.

Fridriksson, J., den Ouden, D.-B., Hillis, A. E., Hickok, G., Rorden, C., Basilakos, A., . . . Bonilha, L. (2018). Anatomy of aphasia revisited. Brain, 141, 848–862.

Hoit, J. D., & Weismer, G. (2016). Foundations of speech and hearing: Anatomy and physiology. San Diego, CA: Plural Publishing.

Kandel, E. R., Schwartz, J. H., Jessell, T. M., Siegelbaum, S. A., & Hudspeth, A. J. (2012). Principles of neural science (5th ed.). New York, NY: McGraw-Hill Medical.

Kent, R. D. (1997). The speech sciences. San Diego, CA: Singular Publishing.

Penfield, W., & Roberts, L. (1959). Speech and brain mechanisms. Princeton, NJ: Princeton University Press.
3
Language Science

Introduction

What is language? In lay terms, it is the "thing" we use to communicate. Ask the average person on the street this question, and he or she may respond, "You know, it's the words, the sentences, stuff like that." And indeed, it would be difficult to argue with this answer, because the words and sentences and stuff like that are all clearly important parts of language. Why is language so much more than stuff like that?

For those of us who as children had a typical history of speech and language development and who have been fortunate enough to avoid illnesses associated with communication disorders, language does not seem like a very big deal, at least in its daily use. For humans, language is a bit like bipedal locomotion, sleeping, and enjoying chocolate cake: it comes naturally and seems like it should not be any other way. We are all aware, at some level (often implicit), of the profound way in which language defines us as a species. After all, you do not hear zoo animals asking each other for another piece of chocolate cake. Because speech and language are so intertwined with being human and are so natural for us, we sometimes run the risk of treating speech and language as our old, uninteresting friend, someone we know everything about and who does not surprise us.

When the author was a doctoral student at University of Wisconsin (UW)–Madison, he visited his home on the East Coast after his first semester, full of (what he thought of as) the wonder of speech and language production, perception, and comprehension. One of his brothers asked him, "So, exactly what is it that you are studying?" When the author answered this question with an excruciatingly boring monologue about the intricacies of tongue movement, the nature of the speech acoustic signal, the fascinating changes in air pressures and air flows within the speech mechanism, and their role in producing a series of speech sounds to create words and convey meaning, his brother responded, "I don't get it, what's the big deal, you open your mouth and you talk."

Indeed, you do just "open your mouth and talk," or at least most of us do. For those who cannot, however, either because of developmental problems or as a result of a stroke, a degenerative neurological disease, or a structural problem within the speech and/or auditory mechanism, the deficit or loss of this natural ability has devastating consequences. It is precisely because language is so naturally human that its disruption is such a big deal.

What Is Language?

Language is studied by scientists from many different disciplines, including speech-language pathology, linguistics, cognitive psychology, general medicine, neuroscience, computer science, and even engineering. Individual scientists may disagree on the details
of exactly what they study as "language," but we can offer a fairly broad and noncontroversial definition to organize our discussion of language science.

Language is a conventional, dynamic, and generative system of components; the relationships between these components are used to express ideas, feelings, and facts in communication with other people. Language also uses mental representations to guide linguistic behavior. And, in a general sense, language is controlled by a network of specific, connected regions of the brain. Each of these claims is discussed briefly, below.

Language: A Conventional System

Language is conventional because, to a large extent, its use is based on arbitrary specifications and rules. The arbitrary characteristic of language components is not a problem, provided that a group of people agree on their use for communication. People adopt and agree upon arbitrary language characteristics as conventions to be followed for maximum benefit to the group using them. In this case, the benefit is communication.

The most obvious (but hardly the only) example of the arbitrary nature of language usage is found in words and their meanings. Different words are used in different languages to mean the same thing. Speakers of English understand the sequence of sounds forming the word "home" because they have agreed, even if implicitly, on its meaning as a place where people live. There is nothing in the sequence of "h," "o," and "m" sounds that captures the concept of a place of your own, where you eat, sleep, and raise children, any more so than the sequence of "k," "ah," "s," and "ah" ("casa": Spanish), or "j," "ee," and "p" ("jeep": Korean). Users of an imaginary language may call the place where you live a "glerkin"; if everyone agreed on this use, "glerkin" would induce exactly the same warm feelings as "home," "casa," and "jeep."

Words are not the only arbitrary characteristic of languages. The phonetic contrasts used to make differences in word meanings vary widely across languages (see Chapter 12), as does word order within a sentence. In English, for example, the subject-verb-object word order is required ("The dog chased the rabbit," not "chased the rabbit the dog" or "rabbit the dog chased"). In Russian, word order for these kinds of sentences is optional, unless the larger conversational context of the sentence requires a specific order. Even social uses of language may be arbitrary. For example, the characteristics of conversational turn-taking vary quite a bit across different ethnicities and cultures. In some cultures, it is imperative as an indicator of good manners for a listener to wait until the speaker has paused sufficiently for the talking turn to shift to the listener. Other cultures may allow lots of verbal interruptions, indeed may encourage them in a kind of Darwinian struggle to be heard, with no suggestion of lack of manners.

Language is conventional. One language's conventions are no better or worse than those of another language. When a group of people agree to a set of conventions, whatever they might be, communication happens. English is referred to a great deal throughout the following discussion, but only because it is the language of instruction for this course. Examples from other languages are provided to emphasize specific points.

Language: A Dynamic System

Language is said to be dynamic because it changes over time; language evolves. Language usage is not the same from generation to generation, or even within generations as individuals move through stages of life. The claim of language as an evolving system is easy to verify by, for example, comparing dialog from films or TV shows from the 1950s to those of the present. Certain aspects of language — usually words — may become more or less "extinct" over time; certain others are adaptable and are rarely pushed to the margins by contemporary culture. For example, if a sophomore attending college in 2018 was asked if he wanted to attend a football game and responded, "That would be swell," other sophomores overhearing the response may ask where this student's spaceship is parked and how it escaped the gravitational pull of his home planet. On the other hand, if the student said "cool," few of his contemporaries would take notice. The interesting thing about these two words is that "swell" succumbed to linguistic evolutionary pressures after frequent use in the 1950s (watch a Leave It to Beaver or Father Knows Best rerun, and you will probably hear the word used in a context similar to the one given above). "Cool," on the other hand, adapted and maintained its usefulness, beginning its widespread usage in English as a marker of the beatnik/jazz culture of the early 1950s and then making its way through hippie culture, hair bands, the frighteningly empty 1980s, and all the way to the present. Cool is cool, then and now. Somewhere in between these extremes along the continuum of verbal viability, we used copasetic, groovy, far out, solid, tight, rad, and awesome to mean basically the same thing. For an interesting dissection of the origins of "cool" as we now understand it to mean hip, good, great, and many other shades of "okay," see the Slate column (http://www.slate.com/articles/life/cool_story/2013/10/cool_the_etymology_and_history_of_the_concept_of_coolness.html) on the long history and evolution of the word.
Language is also dynamic because it is used as a group marker to indicate, "I belong to this group," without saying so explicitly. High school students and young college-age students typically sound different from their parents, instructors, and other adults from whom they are one or two generations removed. That is to say, high school students, within the same broad culture (e.g., American culture), have subtly different accents and not-so-subtle differences in dialect from their parents and grandparents. Everything from the choice of specific words to prosody is often used as a group marker. A 55-year-old male and a 20-year-old male may both use the word "cool" as a response without attracting attention, but the 55-year-old who explains how he likes to fish by saying, "That's how I roll on the weekend," will attract attention because of the mismatch between his age and language usage; so, too, the contemporary 20-year-old who says, "I'm attending a rock and roll concert tonight."

The dynamic nature of language is inextricably tied in with cultural shifts. When your author was a teenager, certain words spoken in public created a major, scandalous incident. The open-air utterance of at least one of these words suggested a deep character flaw on the part of the speaker, like dipping your face into a mound of mashed potatoes and shouting, "Look, no hands!" at an upscale restaurant to which your future in-laws have taken you to get to know you better. In contemporary culture, those words — and in particular, that one — have more or less entered the mainstream, as judged by their frequent use in public, on TV shows, and in films, from the mouths of people from all generations.

An interesting, mostly nontechnical survey of language evolution is found in David Crystal's The Stories of English (2004).

Language Is Generative

Speech and language define the human species. The thoughtful reader may question this claim by pointing to various animal languages or, in some cases, an apparent animal ability to produce speech. Honey bees have an elaborate communication system for guiding their fellow workers to sources of food, chimps have been taught rudimentary sign-language skills, and African Grey parrots in captivity learn and produce an enormous number of phrases over the course of their 50- or 60-year life span. How are these examples different from human communication?

Human language differs from animal languages by virtue of its creative and consistently novel characteristics. Humans take a small number of language rules and use them to combine and recombine words to produce interesting and, typically, never-before-heard or -spoken utterances. This is the generative nature of human language: its ability to produce new utterances by applying conventional rules and word meanings to the needs of a communication goal. African Grey parrots produce lots of different utterances, apparently because of their outstanding mimicry and memory skills, but evidence for their ability to generate novel phrases is scant to nonexistent.¹ The evidence for novel phrase production is somewhat more compelling in primates who have been taught sign language and who have been reported to combine individual signs in unique ways to achieve communication goals. Nevertheless, even if one accepts these reports as accurate (and not everyone does), the tremendous effort required to teach primates sign language is in stark contrast to the human infant/early toddler's ability to generate new and useful phrases with a small set of words in the absence of formal instruction; to a significant degree, it just happens. Certainly it "just happens" for little humans in an environment rich with linguistic stimulation; put a chimp in the same rich linguistic environment and she will not produce spontaneous two- or three-word utterances, vocal or signed, around 2 years of age. There is something radically different about the human child's language capabilities and potential from those of a primate, even one who has been intensively instructed.

¹ Dr. Irene Pepperberg has claimed that the famous, late African Grey, Alex, invented new phrases by combining previously unrelated words.
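Generativity is easy to demonstrate in miniature: a handful of words plus a single combination rule already yields dozens of distinct sentences, most of which a given speaker may never have produced before. The toy grammar below is our invention, not a serious model of English syntax:

```python
import itertools

# One subject-verb-object rule applied to a tiny vocabulary.
SUBJECTS = ["the dog", "the rabbit", "a child"]
VERBS = ["chased", "saw", "followed"]
OBJECTS = ["the dog", "the rabbit", "a child"]

sentences = [" ".join(parts)
             for parts in itertools.product(SUBJECTS, VERBS, OBJECTS)]
print(len(sentences))   # 27 distinct sentences from one rule
print(sentences[1])     # "the dog chased the rabbit"
```

Adding one more word or one more rule multiplies the number of possible utterances, a hint of how even a toddler's small vocabulary can support novel phrases.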
The generative nature of human language is often taken as proof of its genetic legacy. Clearly, if the generative nature of language is what makes it peculiarly human, the human genome must have a lot to do with our language ability (Fisher & Vernes, 2015; Fisher, 2017). But it would be a mistake to discount environmental influences on human language development and skills. Human language characteristics reflect an interaction between the environment and the human genetic endowment for sophisticated linguistic behavior.

Language Uses Mental Representations

Language learning and usage are based on the development and manipulation of mental representations. This
is a central idea in much of contemporary cognitive psychology, an idea we will adhere to in this chapter, even though there are scientists who do not believe in mental representations. A mental representation is an idea or image in the mind that has information content. Many language scientists believe there are mental representations of the sounds that are used contrastively in a language. According to this view, in English there is a mental representation of the vowel category "ee" (in phonetic symbols, /i/), which can replace other vowel categories in various syllable forms to create alternative word meanings. For example, the vowel in the word "heat" distinguishes the word for a temperature-based concept from the word "hat"; the words differ only by the vowel separating the "h" from the "t." The mental representation of "ee" is built up and stabilized by exposure to the language.

In contrast, in English there is presumably no mental representation — or at least not a linguistic one — for the vowel "ee" produced with the lips rounded. This is because an "ee"-like vowel made with rounded lips does not distinguish words in English. "Heat" spoken in English with the lips spread (the typical production) or with the lips rounded would strike a listener as the same word, even if the rounded version sounds odd or like a poor attempt to imitate a foreign accent (such as Swedish). The rounded "ee" is one of the variants of the mentally represented category of "ee" in the English-speaker's mind. This means it is recognized as a member of the "ee" phoneme, albeit one with a slightly different sound. A similar example can be made for semantic categories. Language users have a mental representation for the category "dog," which is developed from exposure and contact with many dogs. This mental representation contributes to a listener's understanding of the kind of animal that has just relieved himself on the front lawn when someone says, "A dog just made a mess on our lawn." This mental representation and its connection to language prevent the listener from imagining the offending animal to be, say, a lion, a horse, a cat, or some other type of animal with fur, primarily quadrupedal locomotion, a tail, and so forth.

The mental representations for language units are thought to be powered by a variety of psychological processes. These processes create, maintain, adjust, and store the representations.

Language Is Localized in the Brain

There is good evidence that speech and language capabilities are controlled by a specific network of brain regions and their connections. In classical terms, language is said to be "localized" in the nervous system. When neuroscientists talk about "localized function," they refer to a region or regions of the brain, and connections between these regions, in which the tissue is specialized for a specific function, or at least has developed a critical role in a specific function. There are many debates about brain localization for specific functions, especially in the case of speech and language. It is safe to conclude, however, that in the overwhelming majority of people, language is localized to tissue in the left hemisphere of the brain. We know about the localization of speech and language from natural diseases (e.g., stroke) that affect specific parts of the brain, from brain imaging studies of speech and language activities, and from surgical procedures in which parts of the brain are stimulated or rendered nonfunctional for a brief (not permanent) time. Left-hemisphere brain localization for speech and language is yet another argument for the species-specific nature of speech and language. The brain regions associated with speech and language are discussed in greater detail in Chapter 2.

Components of Language

Language is made up of well-defined components which are grouped into three categories. These categories are form, content, and use.

Form

The form category includes the components of language that are referred to as "structural." The three subcategories of form are phonology, morphology, and syntax.

Phonology

Phonology is the study of the sound system of a language. The phonological component of a language includes all the sounds used by a language (the phonetic inventory), the phoneme categories of the language (the sounds that are "contrastive"), and the rules for sound sequences that can create words (phonotactic rules).
store the representations.
Phonetic Inventory and Phonemes. Imagine an
expedition to Madison, Wisconsin, from some far-
Language Is Localized in the Brain away planet where the language is entirely different
from English. These space travelers have a scientific
There is good evidence that speech and language capa- tradition of language study, like our own, and resolve
bilities are controlled by a specific network of brain to gather information on the native tongue of Madiso-
regions and their connections. In classical terms, lan- nians. Like any good phonetician, the space travelers
guage is said to be “localized” in the nervous system. use their highly evolved recording instruments which
have become a permanent part of their brains; their ears serve as microphones, covered by only weakly effective, cartilaginous windscreens, to collect a large speech sample from several native speakers. Because there is a coffee shop on every corner of Madison, Wisconsin, they have no trouble finding people willing to talk at length and provide an extensive sampling of the sounds used in their language. The space travelers examine the collected speech samples carefully, using narrow phonetic transcription to identify every sound heard in the recordings. "Narrow" transcription means they record very fine details of the sound production. These aliens have superhuman skills in phonetic transcription — they will get it right.

When the space travelers have finished their transcriptions and analyzed them, they construct a chart showing all the sounds they have transcribed. The chart is a record of the phonetic inventory of the language spoken in Madison, Wisconsin. The chart records about a dozen vowel sounds, as well as glides, stops, fricatives, and affricates (see Chapter 12). An interesting feature of their analysis is the case of stop consonants (such as "p," "t," "k"), for which the space travelers find very small but noticeable differences for a specific sound. A good example is the "t" sound, which sometimes sounds as if it is produced with a strong burst of air; they transcribed this sound from words like "top," "type," and "attack," but they do not know the words yet; they are just listening to the sounds. At other times the "t" sounds as if it lacks this burst (words like "mitt" and "stop"). They also record a stop sound that sounds as if the speaker suddenly closed the vocal folds for a short time before suddenly releasing a puff of air. This sound resembles "t" in some ways, especially ones without a burst of air, but the space travelers have sufficiently good transcription skills and can hear the sound as slightly different from the other "t" sounds.

Sounds that change the meaning of a word when they are exchanged for one another are phonemes. In English, word pairs such as "coat" and "goat" demonstrate the phonemic status of "g" and "k." To offer an example with another sound pair, in English the "s" and "sh" sounds are phonemes, as easily shown by such word pairs as "so"-"show" and "sip"-"ship." The space-traveling phoneticians learn the changes in word meanings with these sound exchanges by asking the humans what the words mean when the sounds are exchanged; this is how they learn the phonemes of Madison English — identifying the sounds from the phonetic inventory that result in the change of a word meaning.

The space travelers jet off to Amsterdam, in the Netherlands, to identify the phonetic inventory and phonemes of the local Dutch dialect. They use the same methodology as in Madison — sitting in coffeehouses and engaging people in conversations — to determine if the phonetic patterns of Dutch are the same as or different from those in English. Dutch, they learn, has the "k" and "s" sounds, but these do not contrast with "g" and "sh," respectively. Although these sounds may sometimes be heard in Amsterdam Dutch, there are no "k"-"g" or "s"-"sh" exchanges in Dutch that change word meanings. That is, "g" and "sh" do not function as phonemes in Dutch. When they occur, they are phonetic variants of the "k" and "s" phonemes.
sound. A good example is the “t” sound, which some- Consideration of these examples leads to an inter-
times sounds as if it is produced with a strong burst of esting insight: the phonetic inventory and phonemes
air; they transcribed this sound from words like “top,” of a language are not the same thing. Here is another
“type,” and “attack,” but they do not know the words example, based on the space travelers’ recording of
yet, they are just listening to the sounds. At other times Madison English “t” sounds. In a word such as “light,”
the “t” sounds as if it lacks this burst (words like “mitt” the space travelers may hear the word-final “t” pro-
and “stop.” They also record a stop sound that sounds duced with (a) no burst of air, (b) a burst of air, or (c) a
as if the speaker suddenly closed the vocal folds for a sound made as if the vocal folds were being closed
short time before suddenly releasing a puff of air. This tightly for a brief time, and then released. These pho-
sound resembles “t” in some ways, especially ones netic variants can be heard with training, yet the word
without a burst of air, but the space travelers have suffi- “light” can be spoken with each of the variants with-
ciently good transcription skills and can hear the sound out changing its meaning. The variants are separate
as slightly different from the other “t” sounds. entries in the phonetic inventory but are all versions of
When the phonetic inventory is established, the the phoneme category “t.” Phonetic variants of a single
space travelers must determine which of those sounds phoneme category are called allophones. Languages dif-
function as phonemes. A phoneme can be defined as a fer widely in their phonetic inventories as well as in the
speech sound category that can change the meaning of way those sounds are used as phonemes.
a word, when it is exchanged for another speech sound Here is a very important point. Phonemes are the
category. The most straightforward example of this minimal sound unit that can change word meaning,
sound exchange is when they both appear in exactly but phonemes have no meaning of their own. Pho-
the same location within the words. Two simple exam- nemes are therefore not the minimal unit of meaning
ples will make this definition clear. In English, the “k” in languages. Minimal units of meaning are discussed
and “g” sounds are both phonemes, because they can in the section titled, “Morphemes.”
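The discovery procedure just described, in which word pairs are sought whose transcriptions differ by exactly one sound in the same position, is mechanical enough to sketch in code. In the sketch below (not part of the original text), the tiny lexicon and its one-symbol-per-sound transcriptions are invented for illustration; a real analysis would work over a large transcribed sample like the space travelers’.

```python
from itertools import combinations

def minimal_pairs(lexicon):
    """Find word pairs whose transcriptions differ in exactly one sound.

    `lexicon` maps each word to a tuple of sound symbols (one symbol
    per sound, not per letter). Every pair found is evidence that the
    two exchanged sounds are contrastive, that is, separate phonemes.
    """
    pairs = []
    for (w1, t1), (w2, t2) in combinations(lexicon.items(), 2):
        if len(t1) != len(t2):
            continue                  # need a one-for-one sound exchange
        diffs = [(a, b) for a, b in zip(t1, t2) if a != b]
        if len(diffs) == 1:           # exactly one position differs
            pairs.append((w1, w2, diffs[0]))
    return pairs

# A toy sample of Madison English (transcriptions greatly simplified).
lexicon = {
    "duck": ("d", "uh", "k"),
    "dug":  ("d", "uh", "g"),
    "so":   ("s", "oh"),
    "show": ("sh", "oh"),
}

for w1, w2, (s1, s2) in minimal_pairs(lexicon):
    print(f"{w1}-{w2}: '{s1}' and '{s2}' are separate phonemes")
```

Run on this toy lexicon, the sketch reports the “duck”-“dug” and “so”-“show” pairs, exactly the evidence the space travelers used to call “k”/“g” and “s”/“sh” phonemes of Madison English.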
Phonotactic Rules.  Phonotactic rules specify the allowable sequences of phonemes for word formation, as illustrated by the three following examples. First, in English, words cannot start with a velar nasal sound (the sound at the end of the English word “sang”), but they can in several languages of the world (e.g., Burmese: Ladefoged, 2001). Second, English allows syllables to be initiated by a fricative followed by a stop (as in the word “stop”), but many languages (e.g., Japanese: Avery & Ehrlich, 1992) do not allow consonant clusters (such as the “st” in “stop”) to initiate syllables. Third, English syllables can take a variety of forms, including ones that end with a final consonant (CVC [consonant-vowel-consonant] form, as in the word “kick”). In Mandarin Chinese, however, all syllables end in vowels — there are no CVC, or closed syllable, forms (i.e., syllables “closed” with a consonant2). These examples illustrate phonotactic rules (also known among linguists as phonotactic constraints) in different languages, or restrictions on the sequences of sounds that are permitted to form words.

2 There is an exception to this phonotactic rule in Mandarin: CV (consonant-vowel) syllables can end in the velar nasal sound mentioned in the text.
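Because phonotactic rules are restrictions on sound sequences, they can be written as simple predicates over sequences of sound symbols. The sketch below is a toy rendering of two of the rules just mentioned: the English ban on word-initial velar nasals, and the Mandarin restriction on syllable-final sounds (including the footnoted velar nasal exception). The sound symbols are invented, and the phonotactics of real languages involve many more rules than these.

```python
# Phonotactic rules as predicates over tuples of sound symbols ("ng"
# stands for the velar nasal). Two toy rules mirroring the text; real
# languages have many more.

VOWELS = {"a", "e", "i", "o", "u"}

def legal_english_word(sounds):
    """English words may not begin with the velar nasal."""
    return sounds[0] != "ng"

def legal_mandarin_syllable(sounds):
    """Mandarin syllables end in a vowel or (per the footnote) 'ng'."""
    return sounds[-1] in VOWELS or sounds[-1] == "ng"

print(legal_english_word(("ng", "a")))           # False: bad onset
print(legal_english_word(("s", "a", "ng")))      # True: "sang" is fine
print(legal_mandarin_syllable(("m", "a")))       # True: open syllable
print(legal_mandarin_syllable(("m", "a", "k")))  # False: closed by "k"
```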
Morphology

Morphemes are the smallest meaningful units in language. Morphology is the study of the rules applying to morphemes and how they are used and modified in communication. There are free morphemes and bound morphemes. Table 3–1 shows examples of both types. A free morpheme is a minimal unit of meaning that stands alone; words such as “dog,” “wait,” “run,” “hoot,” and “giraffe” are examples of free morphemes. A bound morpheme is a minimal unit of meaning that cannot stand alone but must be attached to a free morpheme to implement its meaning. As shown in Table 3–1, the plural “s” is a bound morpheme whose meaning is “more than one” when attached to a free morpheme such as “dog.” Similarly, the past tense “ed” is a bound morpheme whose meaning is “occurred in the past” when attached to a free morpheme such as “wait.” The words “dogs” (dog + s) and “waited” (wait + ed) therefore consist of two morphemes, one free and one bound. Other familiar bound morphemes in English include “-ing” (running), “-er” (taller), “-able” (laughable), “pre-” (prenuptial), “un-” (unusual), and “-ish” (foolish).

Table 3–1.  Free Morphemes and Bound Morphemes

Free Morphemes     Bound Morphemes
dog                -s
wait               -ed
perfect            im-
certain            un-; -ly; -ty
run                -ing
laugh              -able
fool               -ish

Morphemes are important in language development, showing a typical pattern (on average) of mastery as children develop and gain levels of language sophistication. Morphemes also have significance in certain delays and disorders of language development, as discussed in Chapters 7 and 8.
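Because bound morphemes attach to free morphemes in largely regular ways, a first approximation of morpheme segmentation can be sketched as suffix-stripping with the bound morphemes of Table 3–1. The sketch below is not from the text and is deliberately naive: it ignores spelling changes (such as the doubled “n” in “running”) and will mis-segment words that merely happen to end in a listed letter sequence.

```python
# Naive morpheme segmentation by suffix-stripping, using bound
# morphemes from Table 3-1. Deliberately simplified: no spelling
# rules, so "running" (run + -ing with a doubled "n") would fail.

SUFFIXES = ("-s", "-ed", "-ing", "-er", "-able", "-ish")

def segment(word):
    """Return [stem, suffix] if a listed bound morpheme matches,
    otherwise [word], treated as a single free morpheme."""
    for suffix in SUFFIXES:
        ending = suffix.strip("-")
        if word.endswith(ending) and len(word) > len(ending):
            return [word[: -len(ending)], suffix]
    return [word]

print(segment("dogs"))     # ['dog', '-s']     -> two morphemes
print(segment("waited"))   # ['wait', '-ed']   -> two morphemes
print(segment("foolish"))  # ['fool', '-ish']  -> two morphemes
print(segment("giraffe"))  # ['giraffe']       -> one free morpheme
```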

Syntax

Syntax specifies the rules for ordering of words to form “legal” sentences. For example, in English, “The cat ate the mouse” is a legal sentence, whereas “Ate mouse the cat” is not, because the verb must follow the subject of the sentence. This is true for English even though the meaning of “Ate mouse the cat” can be worked out. As noted, some languages have very strict rules about the positions of verbs, adjectives, and nouns in sentences, whereas others do not.

Another example of a syntactic rule in English is the requirement for noun phrases to link with adjectives by means of the “to be” verb and its variants. “John is happy” is legal; “John happy” is not.
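The two English ordering rules just described (the verb follows the subject, and an adjective links to a noun through a form of “to be”) can be turned into a toy word-order check. The sketch below, including its little part-of-speech dictionary, is invented for illustration; genuine syntactic analysis requires far richer grammars than two rules.

```python
# A toy check of the two ordering rules in the text. The tiny
# part-of-speech dictionary is invented; real grammars are far richer.

POS = {"the": "DET", "cat": "N", "mouse": "N", "ate": "V",
       "john": "N", "happy": "ADJ", "is": "BE"}

def legal(sentence):
    tags = [POS[w] for w in sentence.lower().split()]
    if "V" in tags:
        return tags.index("N") < tags.index("V")   # subject before verb
    if "ADJ" in tags:
        # an adjective must be linked to the noun by a "to be" form
        return "BE" in tags and tags.index("BE") < tags.index("ADJ")
    return True

print(legal("The cat ate the mouse"))  # True
print(legal("Ate mouse the cat"))      # False: verb precedes subject
print(legal("John is happy"))          # True
print(legal("John happy"))             # False: missing "to be" link
```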

Content

The content component of language refers to meaning. Morphemes have already been identified as the minimal units of meaning in language, but here we are concerned with the nature and organization of all meaning units. The branch of language science concerned with meaning is called semantics.

Semantics includes not only the meaning of words, and how sets of words may be similar or different, but also the meaning of phrases. As noted earlier, words and their meanings are established arbitrarily and become useful when the linguistic community agrees on the correspondence between them. Semanticists are interested in how speakers employ and organize the words they produce, and how listeners interpret those words. Readers may wonder, “Why should there be a difference in the meanings of a word or phrase when users of a language agree on them?” This is one of the many interesting aspects of semantics — that language users may agree in general on semantics but differ on specifics.

Big Bits of Language

The example in the text of the bound morpheme “-able” in the word “laughable” (which transforms the verb “laugh” to an adjective, as in “His novel is laughable” or “What a laughable novel”) offers an opportunity to illustrate the complex nature of morphology, and the tricky relationship in English between orthography (the printed representation of a word) and its spoken version. In “laughable,” the same orthographic sequence (able) is also a free morpheme, but pronounced differently: “uh-bl” when it functions as a bound morpheme, “ay-bl” for the free morpheme; the meaning of both the bound and free versions is essentially the same. In “laughable,” the addition of the bound morpheme changes the verb (laugh can also be a noun) to mean “able to evoke laughter (or laughs).” When the free morpheme “able” (“ay-bl”) is added to the free morpheme “laugh,” note how its pronunciation changes from the free-morpheme version. If you told someone, “I’m uh-bl (able) to meet that deadline,” you would receive a strange look; your listener might not understand what you mean. You may get an equally strange look if you said, “That’s laugh-ay-bl,” but in this case, your listener is likely to know what you mean, even if recognizing there is something wrong with your pronunciation. These changes in sound when morphemes are combined are called morphophonemic alternations. They are often rule based (the same change from “ay” to “uh” occurs for words such as “portable,” “notable,” “changeable,” and “stackable”). A nicely complicated example of morphophonemic alternation is the spoken versions of the word “harmony” expanded to “harmonic,” “harmonious,” and “harmonizing.” As the morphology changes, notice how the sound “ee” after the “n” in the “base” morpheme (“harmony”) changes to “ih” in “harmonic,” back to “ee” in “harmonious,” and to “eye” in “harmonizing”; in “harmonious” the sound after “m” changes to “oh.” An additional piece of this morphology puzzle is that the stressed syllable in each word (shown in capital letters in the list that follows) depends on the morphological structure. These shifts in stress can affect the pronunciation of the vowel following the “n.”

The phonetic transcriptions of this sequence of morphophonemic alternations have been included, with a guide to sound symbols that are different from English orthography.

HARmony       /hɑrməni/       phonetic symbol /ɑ/ = “ah”; /i/ = “ee”; /ə/ = very short “uh” (schwa)
harMONic      /hɑrmɑnɪk/      phonetic symbol /ɪ/ = “ih”
harMOnious    /hɑrmoniəs/     phonetic symbol /o/ = “oh”
HARmonizing   /hɑrmənɑɪzɪŋ/   phonetic symbol /ɑɪ/ = “eye”; /ŋ/ = “ing”
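The “ay”-to-“uh” alternation in the Box is hard to compute without a full pronouncing dictionary, but the rule-based character of morphophonemic alternations can be illustrated with a classic case that the text does not discuss: the spoken English plural, whose form depends on the final sound of the stem. The rule itself is standard phonology; the sound symbols and mini-lexicon below are invented for the example.

```python
# One classic rule-based morphophonemic alternation: the spoken form
# of the English plural depends on the stem's final sound. Words are
# tuples of sound symbols, as in the earlier sketches; the rule is
# standard phonology, but the mini-lexicon is invented.

SIBILANTS = {"s", "z", "sh", "zh", "ch", "j"}
VOICELESS = {"p", "t", "k", "f", "th"}

def plural(stem):
    last = stem[-1]
    if last in SIBILANTS:
        return stem + ("ih", "z")     # "horses": adds a syllable
    if last in VOICELESS:
        return stem + ("s",)          # "cats": voiceless ending
    return stem + ("z",)              # "dogs": voiced ending

print(plural(("k", "ae", "t")))       # cat   -> ends in "s"
print(plural(("d", "aw", "g")))       # dog   -> ends in "z"
print(plural(("h", "or", "s")))       # horse -> ends in "ih", "z"
```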

Some scientists believe these subtle differences in word meanings can wreak havoc in certain communication settings. For example, most people would agree on the general meaning of the word “fine” as a response to the question, “How do I look?,” but the more specific interpretation of the response may depend on the gender of the question-asker (Tannen, 1994). A more subtle example, one in the arena of the social use of language (discussed later in this chapter), is the meaning of a word such as “adorable.” The obvious content of this word is clear, as applied to a child, perhaps, or a small, cuddly pet, but a deeper level of meaning may be implied by the gender of the speaker. Semanticists, therefore, explore all sorts of meaning aspects of words and phrases, and the nuances of meaning depending on who is speaking, the context of the conversation, and so forth (Tannen, 1994).

In speech and language development, vocabulary is of interest as children begin to understand the hierarchical structures of meaning. “Hierarchical” means that certain meanings are subsumed, or embedded, within larger meanings, like subcategories of more general categories. At an early stage of development, “dog” may mean only Muffy, the family dog, but as the child accumulates life and language experience, “dog” takes on a more global lexical status. “Dog” may include any of those sniffing four-legged creatures attached to humans by a rope, and especially that nasty slobbering Spike who scares poor Muffy when they meet on the street. Lexical (vocabulary) development is covered in more detail in Chapter 6.
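Hierarchical, or embedded, meanings can be pictured as a tree in which each category points to a more general parent category. The toy taxonomy below reuses Muffy and Spike from the text but is otherwise invented; it is meant only to show what “subsumed within larger meanings” looks like structurally.

```python
# Each category points to its more general parent, so whether one
# meaning is "embedded" in another can be checked by walking upward.
# The little taxonomy is invented for the example.

PARENT = {"Muffy": "dog", "Spike": "dog", "dog": "animal",
          "cat": "animal", "animal": None}

def is_a(item, category):
    """True if `category` lies at or above `item` in the hierarchy."""
    while item is not None:
        if item == category:
            return True
        item = PARENT.get(item)
    return False

print(is_a("Muffy", "dog"))     # True: Muffy is subsumed under "dog"
print(is_a("Muffy", "animal"))  # True: and under the larger "animal"
print(is_a("dog", "cat"))       # False: sister categories do not subsume
```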
Social Use of Language (Pragmatics)

Social use of language is distinguished from the form and content components of language partly because it does not have easily identified units (such as phonemes, morphemes, words) or rules (as in the case of syntax), and partly because it concerns communication on the “big stage” of social interaction. Social use of language is sometimes discussed under the term pragmatics, sometimes under the term speech acts. Even if units and rules of language pragmatics are hard to specify because they are less obvious than the units and rules for form and syntax, language pragmatics is not random or unguided by convention. For the following discussion, the term “pragmatics” will be used to designate the social use of language.

Pragmatics often involves aspects of communication that are implicit in the language user’s communication skills. People know successful communication when they see it, but if asked to identify characteristics of successful communication, they are not necessarily able to verbalize them in explicit terms. Nevertheless, pragmatics are an essential part of successful communication.

Conversations are maintained between two or more people by observing these implicit rules. Speakers engaged in a conversation cooperate to maintain it by sticking to the topic under discussion or using the appropriate means to change the topic, by observing turn-taking rules, and by speaking in a voice appropriate to the setting (e.g., not overly loud). Cooperation and politeness in conversation seem like obvious requirements for communication success, but we all know people who have trouble with these requirements. It is as if these individuals do not recognize the mismatch between their conversational techniques and those of the majority of people with whom they communicate. Pragmatics, like other aspects of language, must be learned. These learned pragmatic skills can be undone by certain neurological diseases. The deterioration of pragmatic skills is part of a language disorder.

Pragmatics is a complex, culture-bound aspect of language. The domain of pragmatics extends well past actual spoken conversation to such things as body language, gestures, patterns of eye contact during conversation, distances between speakers engaged in a conversation, and choice of vocabulary. For example, the simple act of waving a friendly goodbye to someone with an open hand, fingers spread, and palm facing the person may be interpreted in Greece as an insult. In certain cultures, the way in which someone leaves a room after concluding a conversation with a superior (a boss, a teacher, a parent) may differ in accordance with pragmatic rules (e.g., backing out of a room versus turning and leaving). Just as pragmatic rules may be variable across different cultures, where cultures are defined geographically, the rules may vary within cultures according to age group, ethnic and racial background, and so forth.

Language and Cognitive Processes

This chapter reviews the components of language as separate entities. The presentation seems to make sense — an adult can accept the idea of phonemes, morphemes, words, syntax, and even pragmatics as components of language. The definition of these components, however, does not include discussion of how the components interact in both language development and communication among people with well-developed language skills (e.g., individuals at least 5 years of age). It also does not discuss why language develops so rapidly, how language develops, and when components of language and their interactions are mastered throughout language development. Each of these is considered in turn in the following sections.

As with the components of language, the why, how, and when of language development and mastery are not independent. The discussion of “how” and “when” is expanded in Chapter 6. These questions are important because they have direct implications for the understanding, diagnosis, and treatment of speech and language disorders.
Why

The question of why language develops, and so rapidly, is answered in different ways depending on (among other things) the initial premise of the question. If the premise is the existence of an innate speech and language property of the human brain, language develops because it is driven by brain tissue dedicated to it. In this view, there is a human brain mechanism for speech and language, different from the communication brain mechanism of any other animal. At some point in the first year of life, the mechanism is “turned on,” to initiate speech and language development. Language advances so rapidly from first words through morphology, vocabulary, and sentence structure (syntax), presumably because it is coded to do so by this mechanism. This view has been championed by the famous linguist Noam Chomsky (Chomsky, 1957, 1975). He called this brain mechanism a language acquisition device (LAD).

The idea of an innate, human-specific language mechanism has been disputed and rejected by many scientists. One compelling argument against an innate mechanism for language is the large variability among typically developing children in the rate of language learning. Across typically developing children, first words may be spoken over a wide range of months; at 2 years of age, there is significant variability across children in vocabulary size and utterance length (e.g., single-word versus multiword utterances). If an innate mechanism guides language development, why is there so much age variability among children in the mastery of language development?

Many scientists argue that the human brain does not need a specialized mechanism for language development. The brain has an extraordinary number of interconnections between its approximately 100 billion neurons (brain cells that transmit information); the brain can function like a supercomputer. The mastery of language is powered by this massive biological computing ability, which processes an enormous volume of speech and language data, organizes patterns, and learns from these data how to use language components for effective communication (see Kidd, Donnelly, & Christiansen, 2018, for a review of these issues).

How

As discussed more completely in Chapters 5 and 6, language development moves from a preverbal stage (prior to first words) through simple language skills (single words, “dog”), which are later expanded into multiword utterances (“dog eat”), to use and mastery of bound morphemes (“want”-“wanted”), through more sophisticated syntax, and on through more complex levels of language usage. Development of the form, content, and usage (pragmatics) components of language is most notable between ages 1 and 9 or 10 years, with especially rapid advances in the first five years after the first word. Included in this rapid advance is complete mastery of the sound system of the language by no later than 8 or 9 years of age. The succeeding levels of language development overlap in time; vocabulary, for example, grows rapidly as utterance length (multiword sentences) expands.

Language development includes both comprehension and production (expression) skills. Language comprehension and expression are interdependent processes in language development. Comprehension and expression can also be understood as partially (or sometimes wholly) independent. For example, in the preverbal stage of language development, comprehension skills are more advanced than skills of expression. The preverbal stage for expression is largely phonetic (as in babbling), rather than phonemic (because there are no true words), but comprehension of language and sophisticated speech perception skills for phonemic contrasts are present. For example, a 6-month-old infant is typically on the verge of comprehending her first words but is (on average) an additional 6 months away from producing a first word.

Language development proceeds well into the teenage years, and even into adulthood (e.g., expansion of vocabulary, or understanding the meaning of subtle language use as in humor). Reading skills, as one example of language development, are affected significantly by language comprehension skills. The development of language may therefore be influenced by cross-modality skills (in this case, auditory and visual modalities; see Hogan, Adlof, & Alonzo, 2014).

When

The “when” of language development is different for each child. Average-age benchmarks for mastery of different components of speech and language are summarized here and discussed more fully in Chapters 5, 6, and 13.

First words are expected around 1 year of age, a 50-word vocabulary at 18 months of age, two-word utterances at age 2 years, and longer utterances between the ages of 2 and 3 years, as well as the comprehension and expression of adjectives, verbs, and free morphemes. Bound morphemes are usually mastered by age 4 or 5, and longer, more complex utterances appear from age 5 years onward.
Complete mastery of phonology ranges between the ages of 5 and 8 years. Finally, production and comprehension of complex language usage — jokes, irony, and other abstract language phenomena — may not be mastered before the teenage years. These age benchmarks for language components, and the very large variation across children in the time of their appearance, are discussed more fully in Chapters 5 and 6.
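For orientation, the average-age benchmarks just listed can be collected into a single lookup, as in the sketch below. The code simply restates the text’s numbers; as the text stresses, these are averages with very large variation across children, so such a table is descriptive, not diagnostic.

```python
# The average-age benchmarks from this section, restated as a lookup
# (ages in months). Averages only; individual variation is very large.

MILESTONES = [
    (12, "first words"),
    (18, "about a 50-word spoken vocabulary"),
    (24, "two-word utterances"),
    (36, "longer utterances; adjectives, verbs, free morphemes"),
    (60, "bound morphemes mastered (usually by age 4 or 5)"),
    (96, "complete mastery of phonology (between ages 5 and 8)"),
]

def expected_by(age_months):
    """List the benchmarks typically reached, on average, by an age."""
    return [label for age, label in MILESTONES if age <= age_months]

print(expected_by(30))   # what an "average" 2.5-year-old has reached
```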
This short summary of “when” is introduced in this chapter because of the systematic nature of language development across the childhood years. Why are first utterances single words? Why are most of these initial words nouns? Why does vocabulary development accelerate dramatically when two-word utterances appear in the child’s expressive skills? A person who subscribes to Chomsky’s position imagines these sequential elaborations of language form, content, and usage as a set of switches that are turned on throughout development. The switches are a characteristic of the language acquisition device.

A person who views learning as the basis for language development is interested in the development of cognitive skills as a foundation for increasingly sophisticated language skills. Nonlanguage cognitive skills (memory, organization, speed of processing, and so forth) mature throughout childhood. If nonlinguistic cognitive skills are employed to power the development of language skills, the schedule for increasing language sophistication can proceed in parallel with, and in fact influence, nonlanguage cognition, which in turn influences language development. Brain development continues well into childhood and the teenage years, providing increasing brain power to develop very sophisticated language skills. In this view, special brain mechanisms are not required for language development.

Chapter Summary

Human language is unique not only because of its conventional, dynamic, and generative nature, but also because it seems to require and make use of specific brain regions.

It is convenient to separate the components of language into the categories of form, content, and use.

Form includes the phonological, morphological, and syntactic components of language; content, the semantic component; and use, the pragmatic component (social use of language).

Both comprehension and expression (production) of speech and language are important skills in the mastery of communication.

One view of speech and language development is that there is a language acquisition device specific to the human brain, that the device “turns on” around 1 year of age and initiates a series of increasingly sophisticated language steps.

Another view is that language development is guided by cognitive processes (e.g., memory, processing speed, attention) that become increasingly skilled and complex with development.

These cognitive processes are not specific to language but are well suited to the organization and manipulation of the massive amount of speech and language data to which a child is exposed.

Three questions can be asked about the development of language: why, how, and when.

“Why” is answered by language in the service of communication: language develops to serve a critical need of humans, to communicate complex ideas and actions with flexible use of semantics, sentences, and pragmatics.

“How” language develops is by starting out very simply (babbling and first words) and adding new layers of complexity to language skills as a child develops.

“When” language develops refers to an age-related sequence of steps of language mastery at succeeding levels of complexity; mastery of each stage of language proceeds through systematic phases for the “average,” typically developing child, but from child to child there is substantial variation in the ages at which mastery of a specific aspect of language is accomplished.

References

Avery, P., & Ehrlich, S. (1992). Teaching American English pronunciation. Oxford, UK: Oxford University Press.
Chomsky, N. (1957). Syntactic structures. The Hague, Netherlands: Mouton.
Chomsky, N. (1975). Reflections on language. London, UK: Fontana.
Crystal, D. (2004). The stories of English. New York, NY: Overlook Press.
Fisher, S. E. (2017). Evolution of language: Lessons from the genome. Psychonomic Bulletin and Review, 24, 34–40.
Fisher, S. E., & Vernes, S. C. (2015). Genetics and the language sciences. Annual Review of Linguistics, 1, 289–310.
Hogan, T. P., Adlof, S. M., & Alonzo, C. N. (2014). On the importance of listening comprehension. International Journal of Speech Language Pathology, 16, 199–207.
Kidd, E., Donnelly, S., & Christiansen, M. H. (2018). Individual differences in language acquisition and processing. Trends in Cognitive Sciences, 22, 154–169.
Ladefoged, P. (2001). Vowels and consonants: An introduction to the sounds of languages. Oxford, UK: Blackwell.
Tannen, D. (1994). Gender and discourse. New York, NY: Oxford University Press.
4
Communication in a Multicultural Society

Introduction

If asked to define “culture,” how would you respond? “Culture” is a concept most people understand but find difficult to define in precise terms. Here are a few (admittedly academic) definitions of “culture” selected from a search on the Internet:

•  “a set of learned beliefs, values and behaviors; the way of life shared by members of a society”
•  “The accumulated habits, attitudes, and beliefs of a group of people that define for them their general behavior and way of life”
•  “understandings, patterns of behavior, practices, values, and symbol systems that are acquired, preserved, and transmitted by a group of people . . . ”
•  “Learned behavior of a group of people, which includes their belief systems and languages, their social relationships, their institutions and organizations . . . ”

These four statements define culture in similar ways. The common threads among them are that culture is learned, shared among a group of people, and determines the way they behave and construct their societies. It is easy to see how communication, and more specifically speech and language, fit with these concepts of culture. The first two definitions imply the major role of communication in culture with the phrases “a set of learned behaviors . . . ,” “way of life shared by members of society . . . ,” and “accumulated habits . . . that define for them their general behavior and life.” The final two definitions make an explicit link between culture and communication by saying that culture consists of “symbol systems that are acquired, preserved, and transmitted by a group of people . . . ,” and “includes their . . . languages . . . ” (emphasis added). Language, as shown in Chapter 3, is a conventional set of symbols used by members of a community to communicate. “Conventional,” in this description of language, means “arbitrary” but agreed upon by members of a group.

Consider the following anecdote, from the author’s experience. I came to Madison, Wisconsin, in August 1972, never having ventured west of State College, Pennsylvania, where I earned my undergraduate and master’s degrees. When admitted to the doctoral program at the University of Wisconsin–Madison, I was fortunate enough to have been awarded a fellowship. This award required me to fill out some paperwork at a university administration building, which, after arriving
in town, I located (with difficulty). I walked into the lobby and had no idea where to go, but there, at an information desk, sat an obviously sightless gentleman, wearing dark glasses, who was likely to know where I should go.

I approached him and asked, “Can you tell me where I can find Window 20?”

He responded, “Walk down the hallway and turn right at the bubbler.”

A bubbler? I had no idea what a “bubbler” was, but I imagined a decorative, in-ground water feature — I knew water must be involved — perhaps with small stone figures of angels or ducks bathing in gently percolating water emerging from an elegant spout in its center. I walked down the long hall and saw nothing so peaceful or watery, not even anything vaguely close to this image that had popped into my head when the gentleman said, “bubbler.”

Back at the information desk, head bowed and feeling, well, incompetent, I said to the gentleman, “Okay, I’m really sorry, I don’t see a bubbler, or maybe I don’t even know what that is.”

The sightless man, who, I would learn through repeated contacts over the next several years, was not a smiler, grinned ever so slightly and said, “Son, where are you from?”

“Philadelphia, sir.”

He allowed the grin to turn a tad more obvious and said, “Down the hall, right at the drinking fountain.”

I immediately, of course, made the connection between “bubbler” and “drinking fountain.”1 I had never heard the word “bubbler” used in this way; even the term “drinking fountain” was slightly foreign sounding — in Philadelphia, we called these things “water fountains.” The point is not that the use of the term “bubbler” was an impossible hurdle to my understanding of Wisconsin talk — like most adults, I learned what this new label meant after a single trial. Rather, this strange spoken label for “water fountain” made me feel — different. Over the next several weeks, I would have this same experience many times, learning terms such as “stop-and-go light” and “hoser” as alternate expressions for the vocabulary items I used to designate a traffic signal and a gentle insult directed at males, respectively. I had never considered the words I grew up with for these items or people as merely a collection of arbitrary communication signs. I was in a different culture, where communication was subtly and sometimes not-so-subtly different from the culture I knew — a culture I had been unaware of as being defined by the four bullet points previously presented.

1 To the best of the author’s knowledge, the term “bubbler” was coined by one Harlan Huckleby, whose idea for a drinking machine that shot water upward toward the drinker’s mouth was patented in 1888 by the Kohler company of Kenosha, Wisconsin. The word may also be used in parts of New England, Michigan, and Australia.

Dictionary of American Regional English

The “Dictionary of American Regional English” (DARE: http://dare.wisc.edu/, now in six volumes) lists the many regional variants of words and phrases that people in specific geographical regions of the United States have agreed to understand as having a specific meaning, and that people in other regions of the country are likely not to understand. This dictionary is like a manual for the concept of the arbitrary meanings of words.

The linkage between culture and communication has been studied for many years. The American anthropologist Ruth Benedict, in her famous (and still controversial) 1934 book Patterns of Culture, noted the arbitrary nature of cultural customs and the error of thinking, common among anthropologists who were her contemporaries, that Western society was a “reference” culture to which all others could and should be compared (see Huntington, 1996, for an extended consideration of this idea). For Benedict, and for many subsequent scholars of linguistics and anthropology, this idea of a “reference” culture extended to speech and language. Benedict pointed out that the speech sounds used by a specific language are a very small subset of the total number of speech sounds that can be produced by the human speech mechanism. The speech sounds used in a specific language serve the functional role of communication. These speech sounds and the language forms (e.g., words) they create are a critical component of a culture. There is nothing inherently special or “correct” in the pronunciation of English spoken by Caucasians in, say, Chicago, Illinois. The pronunciation of speech sounds, the word choices, the idioms, even the distance maintained by a talker from his or her listener — these communication behaviors reflect culture, something shared among a group of people that reflects their way of life.
The biological anthropologist Terrence Deacon summed up this issue by saying, “If symbols ultimately derive their representation power, not from the individual, but from a particular society at a particular time, then a person’s symbolic experience of consciousness is to some extent society-dependent — it is borrowed” (Deacon, 1997, p. 452). The “symbols” include the arbitrary linguistic forms and usages previously mentioned. More importantly, Deacon claimed that an individual’s language experience is really a cultural experience of his or her society. Deacon says this experience is “borrowed” because the language/culture matrix is constantly changing, even within the same culture. An individual’s perspective on life is mediated — some may even say dictated — by the language/culture in which he or she is raised.

Many cultural factors influence communication. A partial list is provided in Table 4–1.

Table 4–1.  Selected Cultural Factors That Can Influence Communication

Race and ethnicity
Social class, education, and occupation
Geographical region
Gender
Sexual orientation
Situation or context
Peer group association/identification
First language community/culture
Relationship between speaker and listener

Source:  Based in part on Giri, 2006.

Why It Matters

These introductory comments may seem far afield from the topic of Communication Sciences and Disorders, but in fact they are highly relevant to much of the material in this text. As pointed out in several publications (e.g., Larroudé, 2004), the United States is becoming increasingly diverse in its racial and ethnic identity. Population growth in the United States is disproportionately accounted for by nonwhite persons, pointing to group diversity as a long-term characteristic of American society. This diversity includes a multitude of cultures. If language is a prominent feature of culture and not separable from it, a panorama of communication styles can be expected as part of everyday life in the United States.

Everyday life includes health conditions, many of which affect a person’s ability to communicate. Even perfectly healthy persons who enter the United States for family, work, or educational reasons, and who speak English as a second language, may feel as if their communication skills are impaired. At some point in their lives, many of these individuals may seek the services of a speech-language pathologist (SLP) and/or audiologist (AuD). Here is one good reason for specialists in communication disorders, whether clinicians or college professors who help train clinicians and researchers, to be well versed in the influence of cultural variation on speech and language behavior.

Clearly, an SLP or AuD cannot possess the entire range of multicultural knowledge and skill required for equal effectiveness among the many diverse groups in the United States. It is unreasonable, for example, to expect a professional in Communication Sciences and Disorders to understand all aspects of African American, Asian, Hispanic, and Native American cultures, including their speech and language usage. The sheer variety of cultures and their communication components forms an intimidating body of knowledge; for speech and hearing professionals, “Becoming competent cross-culturally is among the greatest of challenges” (Cheng, 2000, p. 40). Rather than trying to master this impossibly large amount of information, a multiculturally competent professional in speech and hearing should master a “multicultural framework.” This framework focuses on principles that are applicable across persons from the many different cultural groups a speech and hearing professional is likely to encounter in his or her practice (see Cheng, 2001, p. 125). A few of these principles are discussed later in this chapter. An important “super principle” of this framework is discussed here to illustrate why multicultural sensitivity and knowledge are important to the SLP and audiologist. This principle involves the distinction between a difference and a disorder.

Difference Versus Disorder

Multicultural competence for a speech and hearing professional is especially important when a patient has communication behaviors that, in one or more environments (such as the classroom), call attention to themselves. A principle of multicultural competence is the recognition of the distinction between a communication difference and a communication delay or disorder. This distinction can be clarified with an example from the area of typical language development in children.
Language scientists have studied in great depth various aspects of typical language development, including the learning of grammatical rules. As discussed further in Chapter 6, during the course of typical language development, it is not unusual for children aged 2 years (or a little older) to produce simple sentences such as “He running.” Children communicate the idea of a boy who is currently in the act of running with a sentence lacking the “to be” verb (in this case, the word “is” or the contracted form, “He’s”). “He running” is considered a “typical” sentence form for a child in the early stages of language learning, but the same sentence produced by a child aged 4 years may create an impression of language delay. This is because the “typical” course of language development involves a fairly quick transition from “He running” to “He is running” (or, “He’s running”).

This is all well and good, but this “typical” course of verb development does not apply to all dialects spoken in the United States. A relevant case is AAVE,2 in which “He running” is a proper grammatical representation to convey the meaning of a male in the act of running (more precisely, it is “He runnin’”). When a 4-year-old African American child who is a speaker of AAVE says, “He runnin’,” there is no language delay (as evidenced by this particular utterance) because the child uses the correct grammatical form for his cultural/linguistic community (see Rickford, 1997). In other words, the language community in which the child is developing speech and language skills recognizes “He runnin’” as typical development, and as a good match to fully developed language among adult speakers of AAVE.

2 Some refer to this dialect as African American English (AAE), or Black English, but in this chapter AAVE is used. See the Box on Ebonics.

Ebonics

“Ebonics,” a term blending “ebony” and “phonics,” is generally understood today to be a largely historical term for African American Vernacular English (AAVE), the dialect used by many North American African American people. Ebonics became a national issue in 1996 when the Oakland Public Schools, California, passed a resolution recognizing it as a legitimate language system. The resolution had the aim of obtaining federal programs and support for instruction in Ebonics. Learning of standard English, it was argued, could be facilitated by approaching instruction through the child’s “home” language — Ebonics. This resolution, and its educational implications, provoked spirited controversy among public figures, politicians, and linguists. Whatever the fine points of the controversy might be, there is widespread agreement among linguists that AAVE has language forms, content, and use employed in a rule-based system for effective communication. Often these rules are not the same as the rules we associate with “Standard American English” (whatever that might be). McWhorter (2001) and Kretzschmar (2008) have written histories of the controversy, as well as differing opinions about the legitimacy of Ebonics as a language system.

Standardized Tests and Sample Size

In the discipline of statistics, there is the concept of a “population” and of a “sample.” When a scientist chooses a sample of people to participate in an experiment, she hopes the sample is sufficiently representative of the population so that her results can be generalized (that is, not restricted to the specific participants she has studied in her experiment). There are several approaches to creating a sample that is representative of the population. Among these is the selection of a sample consisting of many participants (as compared to relatively few participants). The thinking is, the larger the sample, the more the results approach the “true” population characteristics. Based on this principle, most standardized tests are based on data collected from relatively large numbers of participants. Few people trust the results of an age-normed, standardized test based on data collected from, say, 10 children at each age. The precise number required to make the sample a good estimate of population characteristics depends on many factors, but the larger the sample, the more likely is the generalizability of the results.
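The Box’s central claim, that larger samples give estimates closer to the “true” population values, can be demonstrated in a few lines of simulation. The population below (scores constructed to have a mean of 300) is invented purely for illustration.

```python
import random

# A quick simulation of the Box's point: the larger the sample, the
# closer its mean tends to fall to the "true" population mean. The
# population (100,000 test scores with mean 300) is invented.

random.seed(1)                      # fixed seed so the run is repeatable
population = [random.gauss(300, 60) for _ in range(100_000)]

for n in (10, 100, 1000):
    sample = random.sample(population, n)
    sample_mean = sum(sample) / n
    print(f"sample of {n:>4}: mean = {sample_mean:.1f} (true mean = 300)")
```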
Standardized Testing and Language Difference Versus Disorder

The example in the Box above can be extended to formal, standardized testing of speech and language skills. In the case of speech and language development in children, it is useful to have knowledge of age milestones for specific events. For a typically developing child, how many words are included in the spoken vocabulary at age 2 years? At age 3 years? Or, at age 2 years, which verbs are understood and/or spoken by a typically developing child? The answers to these questions are clearly relevant to clinical diagnoses of delayed or disordered language development. How are answers to these questions determined, and how are they used in clinical settings?

Age-normed, standardized tests provide one way to answer these questions. The purpose of these tests is to generate an accurate estimate of the age at which most typically developing children master a particular speech or language skill, and to express the test scores in a way that permits direct comparisons across ages. The results of an age-normed, standardized test can be used to document the amount of language delay, expressed as the number of years behind the performance expected based on the child’s chronological age.

For example, a four-year-old child who has language scores that are at the mean (average) of the distribution of scores obtained from the three-year-old normative sample is said to be a year behind in language development. And these normative distributions can be used to track a child’s progress during speech-language therapy. In the example immediately above, the desired outcome of the therapy is to move the child’s standardized score in the direction of the age-appropriate distribution (i.e., the four-year-old distribution).
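One common way to express scores “in a way that permits direct comparisons across ages” is the standard score, or z-score: the number of standard deviations a raw score falls from the mean of the same-age normative sample. The sketch below uses invented norm values (they are from no real test) to reproduce the example of the four-year-old scoring at the three-year-old mean.

```python
# The z-score places a raw score on an age-comparable scale: how many
# standard deviations it sits from the same-age normative mean. The
# norm means and SDs below are invented, not from any real test.

NORMS = {3: (40, 8), 4: (60, 10)}    # age in years: (mean raw score, SD)

def z_score(raw, age):
    mean, sd = NORMS[age]
    return (raw - mean) / sd

raw = 40                              # a 4-year-old's raw language score
print(z_score(raw, age=4))            # -2.0: two SDs below same-age peers
print(z_score(raw, age=3))            # 0.0: exactly at the 3-year-old
                                      # mean, the "a year behind" pattern
```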
The standardized scores of typically developing children at any age depend on the nature of the sample used to construct the test. The nature of the sample is critical to understanding the limitations of a standardized test in making clinical decisions; the results of the test, and their interpretation, depend entirely on how the test was normed.

Consider the four-year-old, AAVE-speaking child, described above, who produces verbs consistent with adult use in AAVE (“He runnin’”). Let’s assume we have a standardized test of verb development, based on samples of children living in the state of Wisconsin. The sampling takes account of the proportions of Caucasian, African American, Hispanic, Asian, and other racial and ethnic groups in the state population. The method used to develop this standardized test may follow excellent principles of test development, including the proportionate sampling of children within the state population. However, the norms in the test are not likely to reflect “typical” performance in other states where the proportion of (for example) AAVE-speaking children is much greater than it is in Wisconsin. For example, Illinois and Florida have a substantially higher proportion of African Americans than Wisconsin, and a greater number of speakers of AAVE.

The most obvious use of an age-normed test is clinical, in which a diagnosis can be made concerning a possible delay in speech and/or language development. The issue posed concerning an African American child’s use of AAVE exposes one major, potential problem with age-normed tests: the results, and their interpretation, depend entirely on how the test was normed. Our hypothetical test of verb performance was developed from data collected in Wisconsin, a state with a much lower proportion of African Americans than, say, Illinois or Florida (http://www.wadsworth.com/sociology_d/special_features/ext/census/african.html). If the test considers “standard English grammar” as the requirement for correct verb performance, the norms are likely to misrepresent “typical” language development depending on the geographical location of data collection. A “normative” test can reflect substantial cultural biases.

Here is a hypothetical example of cultural bias in normative testing. An African American child who uses AAVE in the home and with his friends is referred by his preschool teacher to a speech-language clinician for language evaluation. The teacher and the SLP are not well informed about potential cultural bias in standardized testing. These well-meaning individuals were educated many years ago, before our discipline had a clear sense of the professional implications of cultural sensitivity. The 5-year-old child is given standardized tests based on samples from primarily white populations, and his scores are like those of 3-year-old, typically developing children who contributed to the sample on which the test was based. In the absence of cultural sensitivity and an appreciation of the strong link between culture and language, the child’s score may be interpreted as an indicator of developmental language delay. The child is scheduled for sessions with the SLP to address this clinical diagnosis. The child’s parents are confused when informed of this diagnosis, because they have not noticed a problem with the child’s ability to communicate. Their child’s language abilities seem age appropriate, judging from their experience with other children in their family and in their community. Formal or informal testing of this child’s language skills in AAVE would show, in fact, that the child’s language development is age appropriate.

The SLP must have cultural competence, including cultural sensitivity, to determine if a child’s speech and language skills represent a difference from other cultural expectations, or a true delay or disorder. A speech/language difference is not something to be treated (unless the patient or his parents request such treatment; see later in the chapter); a delay or disorder should be treated.

This rather long example is the most obvious case where unintentional cultural biases may influence the interpretation of formal, standardized tests. In the evaluation of all aspects of language, including nonverbal language, social use of language, and many other factors, SLPs and audiologists must have cultural competence to be most effective.

Cultural variations in all components of speech and language are too numerous to document here, or to be known by an individual SLP or AuD. The key is for professionals in our field to be aware that differences from their own ideas of communication normality may be cultural, not clinical. This awareness should include the knowledge and skills to identify sources — in the clinical and research literature, or of the human variety — to assist in the evaluation of communication behaviors as cultural differences versus communication delays/disorders.

A framework for knowledge and skills in cultural sensitivity is now incorporated into the training of SLPs and AuDs. More specifically, when an individual SLP is working with many patients who identify with a variety of cultures — whether African American, Hispanic, Asian, Native American, Deaf, or any other group — development of a solid understanding of communication characteristics within the group is critical. The SLP or AuD may not have a full range of cultural knowledge for each group but, based on the multicultural framework that is part of SLP and AuD training, should know how to identify and obtain the knowledge relevant to any child or adult seeking diagnostic or management (therapy) services.

The questions about speech-language and audiology specialists and the possible intertwining of their professional competencies and cultural identities can be stated more broadly. Can Caucasian SLPs be effective in a school setting where the majority of students use AAVE to communicate, or vice versa? Can an SLP who is not familiar with Hispanic culture and language usage diagnose and treat speech and language delays or disorders in a school where the students are primarily of Hispanic heritage (or vice versa)?

The steps taken by ASHA to ensure that all persons being trained as SLPs and AuDs receive instruction in cultural competence are likely to address these questions in a positive way. ASHA’s perspective on cultural competence is summarized in an overview statement on their Web page (http://www.asha.org/Practice-Portal/Professional-Issues/Cultural-Competence/). Cultural competence is viewed by ASHA as an issue much broader than the few examples provided in this chapter. Cultural competence may involve cultural issues associated with age, gender, and socioeconomic status, to name a few.
4  Communication in a Multicultural Society 51

The six major accent regions identified by Labov and his colleagues are almost certainly an oversimplification of accent variation in the United States. For example, the “North” accent in this system includes speakers in Minnesota, Wisconsin, Chicago, Michigan, and western New York. Many people who have grown up in Wisconsin can hear the difference between a native Wisconsinite and a native Minnesotan, and the Chicago accent is (to the author’s ear) different from the typical Wisconsin accent. Labov and his colleagues include 11 states and parts of two others in their “Western” accent, but speakers from Colorado and Southern California, both included in the “Western” accent group, do not sound alike (at least to these ears).

The accent regions identified by Labov and his colleagues serve an important purpose even though each of the categories contains accent variation. The average person walking on the street in Madison, Wisconsin, who is introduced to someone from Mississippi, Georgia, or South Carolina is likely to hear their accent as “Southern.” The Georgian may hear the difference between her accent and that of the South Carolinian or Mississippian. But it is unlikely for a Georgian without special training to detect the differences between a Wisconsin, Minnesota, and Michigan accent; they all sound “Northern.” So, the variation within any one of the six accent groups is likely to be detected by someone within the accent group but not by someone from a different accent group (see Clopper, Levi, & Pisoni, 2006, for experimental work on perceptual identification and discrimination of regional accents).

Talker accents often highlight “us” and “them” (Müller, Ball, & Guendouzi, 2000). Accents are an identifier of geographical/cultural allegiances and associations. Regional accents can contribute powerfully to how we are perceived, and how we perceive ourselves. When I first came to Madison, Wisconsin, it was not only my initial inability to identify a “bubbler,” or to understand what it meant when someone handed me a hammer and said, “Hold this once,” that made me feel different. I also sounded funny when I spoke. Within 10 minutes of driving into town and interacting with several people as I signed a lease and got something to eat at the (now defunct) Marc’s Big Boy, I was acutely aware of my funny-sounding vowels.

Among all of these many accents, can one be defined as “standard”? And what does it mean to have a standard accent? This question and its possible answer have as much social-political meaning as linguistic importance, if not more. In Great Britain, for example, there has been a long-standing debate concerning the advantages of a standard accent. This accent is defined as that heard in the speech produced by educated people who are likely to be from the London area or have learned London-accented English (Müller et al., 2000). In the United States, the debate is not as public as it is in England, but the tendency among some accent groups to regard other accents as “substandard” or to associate certain accents with certain personality traits cannot be denied (see review in Fridland, 2008). The issue of whether or not a standard accent exists or should exist continues to be a matter of debate among linguists and educators. In this text, the position is taken that accent is such an important part of regional identification and culture that no one accent can be viewed as “better” or more desirable than another. This is consistent with the acceptance of all cultures, and specifically all the subcultures in America, as equally worthy and equal partners in the creation and shaping of the American social landscape.

Dialect

Labov and his colleagues actually called the six accent regions described above “dialect” regions. As mentioned earlier, “accent” refers to an impression of the sound of speech, which is what Labov meant when he enumerated the six “dialect” regions. But the term “dialect” is technically different from “accent” because dialect includes aspects of language that go well past the sound of speech. Dialect is defined as a language variant, typically associated with a geographic region or group of people. A dialect may include unique sound and prosodic characteristics — that is, an accent — but may also include unique vocabulary items, grammatical structure, and even rules for how people communicate. Accent is therefore a component of dialect. The Wisconsin dialect, for example, not only includes different-sounding vowels than the Philadelphia dialect but, as described earlier, has vocabulary items such as “bubbler,” “hoser” (probably borrowed from Canada), and “pop” (as in, soda), which are typically not part of the Philadelphian’s vocabulary.

A less subtle dialect difference in the United States is between AAVE and the several accent and dialect variants of American English spoken by white persons around the country. In addition to the phonological characteristics of AAVE, which are different from most white regional dialects, AAVE has vocabulary items not heard in white American English, and may also have different rules for social communication. AAVE is a dialect of English, not simply an accent difference from the several accent varieties of white American English.

What does it mean to say that a language such as American English has several different dialects? Dialects of a language are typically mutually intelligible; speakers of different dialects can communicate

effectively, even if occasionally they are confused by a word or grammatical usage, or if they find their communication partner’s concept of personal body space a little odd. This idea of mutual intelligibility, even when accent, vocabulary, grammatical form, and other aspects of language usage vary quite a bit, is the usual standard for differentiating a dialect difference from a language difference. When two people talk and cannot understand each other, it is likely they are speaking different languages, not different dialects of the same language.

The mutual intelligibility criterion for distinguishing a dialect difference from a language difference is a technical distinction that does not always fit easily into real-world experience. Most Americans have had the experience of listening to rapidly produced British or Australian English — in a movie theater, for example — and having great difficulty following the dialogue. The languages spoken in various parts of the United Kingdom, Canada, Australia, New Zealand, and parts of India are surely English, technically are dialects of English, but they are not always mutually intelligible. If you travel to Manchester, England, and ask someone on the street for directions and her reply seems unintelligible, is she speaking a dialect of your language? The answer seems to be “no” when “mutual intelligibility” is the criterion for different speech patterns/styles to qualify as dialects of the same language. But the Manchester native is speaking English; how do we resolve this?

Perhaps the resolution is to admit the uncertainty of a dividing line between different dialects and different languages (Backus, 1999). Languages evolve — they are constantly changing — and over time the accumulated variation in a dialect may be sufficient to make it unintelligible to other users of the parent language. The dialect difference then becomes a language difference.

The relationship between dialect and accent is summarized in Figure 4–1. Accent is shown as a component of dialect but as separate from other components of language that contribute to dialect variation (e.g., morphology, discussed in Chapter 3).

Code Switching

When dialect differences exist between two groups of people who have extensive contact as a result of common neighborhood, common workplace, or friendship (among other factors), speakers in one group may develop the skill of switching to the dialect of the other group. This skill is referred to as code switching. Language is a code, and the ability to switch between different versions of the code is valuable. An SLP who can code switch among children with different native dialects or languages has a distinct advantage in the planning and execution of (for example) language therapy. This advantage may enhance the language learning process among the children.

Code switching takes place for all aspects of language: for phonetics, phonemes, lexicon, morphology, word order in sentences, and even pragmatics. The role of code switching in American society is becoming more important as multiple-language homes are increasing, and parents are emphasizing among their language-learning children mastery of the language spoken in the home as well as that of the majority language of the society in which they live.

The Angel’s Share


When whisky is aged in oak barrels, the fluid may fill the cask
nearly to the top. After aging, when the barrel is finally opened for bottling,
the volume of the whisky is less than it was when originally poured for
aging — the fill line has decreased. The evaporated whisky, lost during the
aging process, is called the “angel’s share” — people on earth will have
plenty to drink; they won’t miss this small tribute to those on a different
plane. The 2012 film The Angel’s Share is a story about a whisky heist by a
small group of Scottish men, looking to profit from the removal of a priceless
barrel of aged whisky from a famous distillery. The main actors are Scottish,
and the one British actor does a spot-on Scottish brogue; they all speak
English throughout the film. If you see the film (highly recommended), you
may be surprised to see subtitles. It is English, right? Why the subtitles?
After the first two or three minutes of the film, you understand the use
of subtitles perfectly: the dialogue is nearly unintelligible to the typical
American ear. Dialect or language difference? Watch the film; you decide.

[Figure 4–1.  Dialect and accent. The diagram shows accent as one component of dialect, separate from the other language components that contribute to dialect: grammar, morphology, and vocabulary.]

Language Components

The language units referred to in the text are described in Chapter 3 and appear frequently in subsequent chapters of this textbook. By way of review, single-sentence definitions of each component of language are as follows: (a) phonetics designates the speech sounds of a language; (b) phonemes designate the speech sounds of a language that, when exchanged in the same position of a sequence of sounds (e.g., the /k/ and /g/ of the words “coat” and “goat”), change the word meaning; (c) morphology designates the meaningful units of speech that “inflect” words and change their grammatical identity (e.g., making a word plural as in dog versus dogs, or indicating past tense as in “want” versus “wanted”); (d) vocabulary (the lexicon) includes the word forms people implicitly agree upon as having specific meanings; (e) syntax designates the grammatical rules of language that permit certain word orders but not others for the formation of sentences (e.g., in English, “Big dog” follows grammatical rules but “Dog big” does not); and (f) pragmatics designates the rules for the social use of communication, which depend on factors such as age, gender, ethnic group, and so forth.

Foreign Accent

Scientists and clinicians are interested in the characteristics and possible modification of foreign accent. The characteristics of foreign accent are interesting because they provide phoneticians (people who study the sounds produced in different languages of the world) and phonologists (people who study speech sound systems — the rules governing the use of speech sounds for purposes of communication) with the opportunity to ask, “How do the speech sounds of one language affect a person’s ability to produce the speech sounds of a second language?” To illustrate, consider the Swedish, Greek, and American English languages. Swedish has a more complicated set of vowels as compared to English, which in turn has a more complicated set of vowels than the relatively simple, five-vowel system in Greek. When a Swede and a Greek are attempting to learn American English, does the relative complexity of their native vowel systems affect the way they learn English vowels? The best answer we can give today for this question is: yes, the relationship between the vowel systems of two languages influences the ability to learn the nonnative system, but the specifics of this influence are complicated.
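One way to make “relative complexity” concrete is to compare vowel inventories as sets. The sketch below is illustrative only: the five-vowel Greek inventory follows the text, while the English list is a rough, hypothetical set of monophthongs (real inventories vary by dialect and by analysis).

```python
# Illustrative comparison of vowel inventories as sets (IPA symbols).
# The Greek five-vowel system is from the text; the English list is an
# approximate, dialect-dependent set used only for this sketch.
greek_vowels = {"i", "e", "a", "o", "u"}
english_vowels = {"i", "ɪ", "ɛ", "æ", "ɑ", "ɔ", "ʊ", "u", "ʌ", "ə"}

# Vowel categories an imagined Greek learner of English must acquire:
new_for_greek_learner = english_vowels - greek_vowels
print(sorted(new_for_greek_learner))
# The larger the set difference, the more new vowel categories the
# learner must build: one crude way to think about "relative complexity."
```

Of course, real learning difficulty depends on more than inventory size (acoustic overlap between native and nonnative categories matters too), which is part of why the text calls the specifics of this influence complicated.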

This scientific question, of how the sound systems of two languages influence the ability to learn the sounds of a second language, is relevant to the practical issue of modifying a foreign accent. SLPs with relevant training provide services to people who want to modify a foreign accent. Modification of foreign accent is also a rather controversial aspect of our field (see, e.g., Fitch, 2000; Müller, Ball, & Guendouzi, 2000; Powell, 2000; and Winkworth, 2000). One reason for the controversy is found in the potential for independence between accent severity and speech intelligibility. A speaker may have a substantial accent when speaking a nonnative language — that is, the speaker can clearly be identified as a nonnative speaker by a native speaker — yet be perfectly intelligible. If, for example, a native speaker of Mandarin Chinese produces accented but perfectly intelligible English, why seek modification of the accented English? Intelligibility is often the main concern of people who seek to reduce their foreign accent; after all, being understood is the primary goal of communication.

Some speakers may seek the help of an SLP, or a specialist in English as a foreign language, to reduce their foreign accent simply to make them sound less “different.” Speakers may also recognize that their foreign accent does not compromise their intelligibility but still causes their listeners to work harder to extract the fully intelligible message (Floccia, Butler, Goslin, & Ellis, 2009). Accent reduction therapy is most commonly initiated by the person seeking to reduce an accent (rather than by a health care or education specialist). There may be certain situations, however, in which a speaker’s accent is judged as a potential challenge to his or her professional success, and a recommendation is made to seek accent reduction therapy. For example, students enrolled in a master’s-level SLP training program who are not native speakers of English may have accents that interfere with aspects of therapy for articulation disorders in children or adults. At least part of articulation therapy involves clinician-produced models of the speech sounds being trained. A clinician (the student with the accent) who produces a sound model that is insufficiently native-like may be recommended for accent reduction therapy.

Part of the controversy surrounding the idea of accent (or even dialect) reduction therapy is who decides what the “reference” accent or dialect is, and even if there should be such a reference. The interplay of accent, dialect, culture, and professional training is complicated. A few years ago, the author had a chance to visit a Communication Sciences and Disorders training program at a southern university, where the majority of graduate students in clinical training had a strong southern accent (to these northern ears, and I suspect most northerners). I asked myself how these students, having completed their clinical training program and seeking jobs, might fare in an interview at a school or hospital in a northern state. I reached a tentative conclusion that two candidates for a job as an SLP in a northern school or hospital with equivalent credentials, graduate school success, and comparable letters of recommendation, who both performed well at an interview, but who differed by their regional accents, would not be viewed as equally qualified for the job because of the accent difference. (I suspect the same would apply in the reverse situation — a native northerner applying for an SLP position in a southern school or hospital.)

Bilingualism and Multilingualism

The deep connection between language and culture is a professional challenge when language development takes place in two or more languages. A child who has roughly equal exposure to multiple languages (and, likely, cultures) as she learns language from birth is a “simultaneous” bilingual (or, more rarely, trilingual, or however many languages the child is immersed in). Children who learn one language first and around the age of 3 or 4 years are immersed in a second language, and develop roughly equal competence in both languages, are called sequential bilinguals. Bilingualism is not restricted to oral language. A person who is fluent in both an oral language and American Sign Language (ASL) is considered bilingual.

Bi- or multilingual language development and multilingualism in adults raise questions for the SLP and audiologist. For example, in the early stages of language learning, does equal exposure to two languages affect language development, either in a positive or a negative way (i.e., is typical bilingual language development more or less the same as monolingual language development)? If a child has language delay in one of the languages but not the other, is speech-language therapy appropriate for the language with delay? When speech-language therapy is indicated for developmental delay in both languages, does it make a difference which language is used by the therapist for language stimulation? Many other questions have been asked about the role of multilingualism in language development (see Goral & Conner, 2013, for a review of these issues).

Scientists and clinicians have also been interested in the effect of bilingualism on speech and language perception and comprehension. For example, is a listener’s comprehension (or the cognitive processes that support comprehension) affected when a speaker uses the same language as the listener, but with a mild to moderate accent? Let’s say that a monolingual, English-speaking child listens to a native speaker of Spanish who is speaking accented English; is the listener’s comprehension affected by the accent, compared to listening to a native speaker of English?

At first glance, the role of accented speech in language comprehension may seem no more than an academic, laboratory exercise. The implications for clinical practice, however, are potentially substantial, as illustrated by the following two questions. First, when language stimulation services are provided by an SLP with accented English, for an English-speaking child with language delay or an adult with a comprehension

deficit resulting from a stroke, does the accent result in poorer comprehension as compared to speech without an accent? Second, when an audiologist performs speech perception tests as part of a diagnostic workup for a possible hearing disorder, are the test results affected by the accent of the speaker who produced the words or sentences used in the testing (Shi, 2014)? Based on research to date, the answer to both questions seems to be “yes,” even if all the details of accent influence have not been determined (Harte, Oliveira, Frizelle, & Gibbon, 2016).

The study of the effect of accented speech on language comprehension is worth the effort. It is a significant aspect of consideration of multicultural and multilinguistic influences in a speech and hearing clinic. Accent and its potential relevance to speech and language therapy may also apply to accent variation within a language (e.g., the effect of New England–accented English on comprehension in speakers who hail from the Pacific Northwest, or of Irish-accented English on American Southern–accented English).

Chapter Summary

Culture can be defined in many ways, but each definition mentions beliefs, behaviors, and symbol systems that are shared and agreed upon by members of the cultural community.

Language is intertwined with culture in the agreement among members of the connection between symbols and meaning; language is conventional.

The population change under way in the United States, and in many other countries, requires SLPs and audiologists to understand different cultures and the influence of these cultures on speech and language behaviors.

A major consideration in diagnosing and managing speech, language, and hearing disorders is recognition that there is no dominant language, meaning that evaluation of a potential communication disorder must account for cultural differences.

Standardized tests of speech, language, and hearing disorders that are normed on a group of children or adults from one culture are not likely to be valid as assessment tools for children or adults from another culture.

In the evaluation of a possible communication disorder, a cultural difference must not be confused with a communication disorder.

Accent refers to the “way people sound” when they talk; accent includes speech sounds and prosody, the latter including variations in the melody of speech (intonation), loudness of speech, and rhythm of speech.

Accent may refer to regional accent (varying accent among native-born speakers of one language) or to foreign accent (accented speech of a speaker having one native language who speaks a second language).

Dialect includes accent, but also word and phrase choices, the order of words in sentences, and the use of minimal units of meaning in language (morphemes).

Regional accent, dialect, and foreign accent may affect speech, language, and hearing testing and management, depending on the similarities between the therapist or tester accent/dialect and the accent/dialect of the person receiving management services or being tested.

References

Backus, E. (1999). Mixed native language: A challenge to the monolithic view of language. Topics in Language Disorders, 19, 11–22.
Benedict, R. F. (1934). Patterns of culture. New York, NY: Houghton Mifflin.
Cheng, L-R. L. (2000). Children of yesterday, today, and tomorrow: Global implications for child language. Folia Phoniatrica et Logopaedica, 52, 39–47.
Cheng, L-R. L. (2001). Educating speech-language pathologists for a multicultural world. Folia Phoniatrica et Logopaedica, 53, 121–127.
Clopper, C. G., Levi, S. V., & Pisoni, D. B. (2006). Perceptual similarity of regional dialects of American English. Journal of the Acoustical Society of America, 119, 566–574.
Deacon, T. W. (1997). The symbolic species. New York, NY: W. W. Norton.
Fitch, J. (2000). Accent reduction: A corporate enterprise. Advances in Speech-Language Pathology, 2, 135–137.
Floccia, C., Butler, J., Goslin, J., & Ellis, L. (2009). Regional and foreign accent processing in English: Can listeners adapt? Journal of Psycholinguistic Research, 38, 379–412.
Fridland, V. (2008). Regional differences in perceiving vowel tokens on Southerness, education, and pleasantness ratings. Language Variation and Change, 20, 67–83.
Giri, V. N. (2006). Culture and communication style. Review of Communication, 6, 124–130.
Goral, M., & Conner, P. S. (2013). Language disorders in multilingual and multicultural populations. Annual Review of Applied Linguistics, 33, 128–161.
Harte, J., Oliveira, A., Frizelle, P., & Gibbon, F. (2016). Children’s comprehension of an unfamiliar speaker accent: A review. International Journal of Language and Communication Disorders, 51, 221–235.
Huntington, S. P. (1996). The clash of civilizations and the remaking of world order. New York, NY: Simon and Schuster.
Kretzschmar, W. A., Jr. (2008). Public and academic understandings about language: The intellectual history of Ebonics. English World Wide, 29, 70–95.

Labov, W., Ash, S., & Boberg, C. (2006). Atlas of North American English. Berlin, Germany: Mouton de Gruyter.
Larroudé, B. (2004). Multicultural-multilingual group sessions: Development of functional communication. Topics in Language Disorders, 24, 137–140.
McWhorter, J. (2001). The power of Babel: A natural history of language. New York, NY: Times Books.
Müller, N., Ball, M. J., & Guendouzi, J. (2000). Accent reduction programmes: Not a role for speech-language pathologists? Advances in Speech-Language Pathology, 2, 119–129.
Powell, T. W. (2000). The turn of the scrooge: One Yank’s perspective on accent reduction. Advances in Speech-Language Pathology, 2, 145–149.
Pullum, G. K. (1999). African American Vernacular English is not standard English with mistakes. In R. S. Wheeler (Ed.), The workings of language (pp. 39–58). Westport, CT: Praeger.
Rickford, J. R. (1997, December 1). Suite for ebony and phonics. Discover Magazine. Retrieved from http://discovermagazine.com/1997/dec/suiteforebonyand1292
Shi, L-F. (2014). Speech audiometry and Spanish-English bilinguals: Challenges in clinical practice. American Journal of Audiology, 23, 243–259.
Winkworth, A. (2000). Promoting intelligibility not terminology: The role of speech-language pathologists in accent reduction programmes. Advances in Speech-Language Pathology, 2, 139–143.
5  Preverbal Foundations of Speech and Language Development

Introduction

“Preverbal speech and language development” refers to the set of communication skills developed by an infant roughly between birth and the production of the first word, typically around 1 year of age. This straightforward description does not do justice to the many controversies surrounding exactly how and why children move from producing no words and understanding little at birth to uttering their first word around 12 months of age and at the same time understanding many more words.

An understanding of preverbal speech and language development requires knowledge of emerging production, perception, and comprehension skills. Issues of motor maturity (production), auditory perceptual skill (perception), and the ability to represent auditory percepts for the purposes of linguistic categories, including those for meaning (e.g., words), are all relevant to preverbal language development.

The issues in preverbal language development are controversial. In the Chomsky view, speech and language are not “learned” in the traditional sense but rather triggered biologically at some point in development. The alternative viewpoint regards speech and language skills strictly as things to be learned, and the first year of life as an intensive, immersion crash course in language learning. Of course, biological maturity plays a role in the efficiency and success of this learning, but the learning perspective typically rejects the idea of a special speech-language device in the brain that “turns on” the ability.

There is a substantial body of facts concerning the development of speech and language skills during the first year of life. Scientists have made many observations concerning preverbal speech and language skills of babies, and written pages and pages of theory to explain their results. In this chapter, the facts of preverbal skills — those specifically relevant to speech and language development — are presented chronologically, as they evolve through the first year of life. This chronology is supplemented by some ideas about how speech and language development following the preverbal year — after 1 year of age — emerges from this preverbal skill development. This is very important, because the way in which the skills underlying speech and language develop during the first year of life is not strictly logical — the way it does happen is not the only way it could happen.


Preparatory Notes on Developmental Chronologies

Typical speech and language development in the first year of life is chronicled in this chapter according to three broad time periods: 0 to 3 months, 3 to 8 months, and 8 to 12 months. These time periods are “loose”; they represent average developmental sequences and are by no means applicable to every typically developing baby. Among children who are developing without disease or obvious delay resulting from an undiagnosed problem, there is substantial variability in the chronology of development. The notion of “typical” development recognizes this variability by understanding the emerging skills to fall within a fairly large range. An understanding of this variability also explains the use of rather broad time intervals for stages in the chronology (for example, 3 to 8 months). In addition, the discussion within each of the three time periods often refers to developmental processes in one of the other time periods. For example, preverbal skills in the 0- to 3-month period are presented along with the implications for preverbal skills in the 3- to 8-month and 8- to 12-month periods.

Throughout this chapter, keep in mind the variability in preverbal language skills across typically developing children, as well as the links between earlier and later preverbal skills. The chronologies are first separated into production and comprehension skills, followed by additional information on interactions between the two.

Children in the first year of life must learn to use their lungs, larynx, tongue, lips, and jaw (as well as other structures of the head and neck) to perform the motor skills required to make speech sounds; or they reach an age where speech motor ability becomes available to them for the purpose of producing speech sounds (for this subtle distinction, see the section, “3 to 8 Months: Production,” on babbling). This is the production part. Likewise, children must learn to make perceptual distinctions relevant to their native language, to associate specific sequences of sound distinctions (e.g., words) with meaning as a linguistic representation, and to use their memory skills to access the link between the acoustic signals and their meanings. Or, from the perspective of the Chomsky view of language development, they must reach an age where the ability to perform the linguistic interactions between perception, mental representation, and memory is “turned on” by maturation of brain mechanisms dedicated to language. This is the comprehension part. During the first year of life, language comprehension skills are typically more advanced than expressive (production) skills.

0 to 3 Months:  Expression (Production)

Babies cry a lot in the first few months of life; this is hardly a surprise. Crying in very young infants is a reflexive vocalization in response to hunger and other forms of discomfort (e.g., being too cold or too hot, being in pain because of gas, and so forth). Most experts do not regard crying among very young babies as having propositional value, in the sense of the vocalization having meaning. Infant cries and the variation of their quality clearly affect parent perception of a baby’s comfort level and needs (Lagasse, Neal, & Lester, 2005).

At approximately 2 months of age, babies may produce what Oller (1980) called quasi-resonant nuclei. These are clearly not reflexive expressions of discomfort and may occur as apparent “happy responses” when a parent talks to the baby. Oller called them quasi-resonant nuclei because they give the impression of slightly muffled, nasalized vowels that often seem to be produced with the lips closed. If you have held a baby at this age and heard these kinds of sounds, you will recognize them from this description, and may remember thinking, “How is the baby making a sound that seems vowel-like even though her mouth is closed?”

Toward the end of this period, the baby may produce vocalizations called “coos” and “goos.” The range of speech sounds in these early, nonreflexive vocalizations is limited and often includes the vowels “ah” (as in “hot”) and “oo” (as in “boot”), sometimes with a consonant-like sound resembling a “k” or “g.” Most likely, coos and goos are not intentional; the baby does not intend to communicate some meaning with these vocalizations.

An issue in the initial months of preverbal sound development concerns the interaction between the baby’s anatomical structures and sensorimotor capabilities,1 and the sounds the baby produces. When a gesture is produced, such as hand waving for “bye-bye” or shaping the oral and throat cavities with a narrow oral constriction between the tongue and the front of the hard palate (bony roof of the mouth) and a wide throat passageway (as in the vowel “ee”), specific patterns of muscle contraction must be produced. In addition, sensations from these contractions (such as the feel of the sides of the tongue against the teeth when the front constriction is made) are part of the package of information used to verify the “correctness” of the gesture. These sensorimotor skills mature in the first year of life, but their relative immaturity in the first several months is one limiting factor on the kinds of sounds produced by a baby.

1 “Sensorimotor,” as used here, denotes the brain mechanisms used to control movement. The inclusion of both “sensory” and “motor” in the term reflects the role of sensory and motor capabilities, and the integration of the two, in movement control.

Of equal interest is the effect of vocal tract growth on sound production during the first year of life. The vocal tract is the air passageway between the lips and the vocal folds (often called the vocal cords). The shaping of this air passageway by movements of the lips, tongue, jaw, and pharynx (the throat) determines which speech sound is produced. Figure 5–1 shows an artist’s rendition of two vocal tracts from a side view. The drawing on the left shows a vocal tract for a newborn and on the right for a young adult. The vocal tracts are shaded light blue in these drawings, making it easy to see not only the age-related difference in length but also in shape. The shaded area is an air-filled, flexible tube. Note the shortness of the newborn’s pharynx (the distance from the posterior tip of the soft palate to the vocal folds) in comparison to the adult’s. This can be best appreciated by looking at the near-contact of the posterior tip of the velum and upper edge of the epiglottis in the newborn as compared to the clear separation between these structures in the adult. The shape differences are further highlighted by showing the bend of the two vocal tracts, from the oral to throat cavities, with simple straight lines. In the adult, the pharynx (throat) cavity is oriented roughly at a right angle to the mouth cavity. In the newborn, the two cavities form a more open angle, and thus have a gentler transition between them.

[Figure 5–1.  An artist’s rendition of a newborn (left image) and adult (right image) vocal tract, as seen from the side with one side of the head removed. The blue lines on both vocal tracts show the angle of the mouth (oral) and throat (pharyngeal) cavities.]

The close approximation of the epiglottis and soft palate in the very young infant most certainly contributes to the sound of the “quasi-resonant nuclei,” as previously described. This is because the airway is continuous from the vocal folds through the nasal cavities in the newborn — in the adult a clear airway path from the vocal folds through the mouth is more available. Throughout the first year of life, a major growth pattern of the infant vocal tract is a lengthening and descent in the neck of the pharynx. As the pharynx lengthens and the vocal folds at the bottom of this tube move down and away from the velum, the pharynx rotates relative to the mouth, creating the 90° bend seen in adults (see Vorperian, Kent, Lindstrom, Kalina, Gentry, & Yandell, 2005, for measurements of patterns of vocal tract growth from birth to nearly 7 years of age, and in the adult years).

Why are the shape differences between the newborn/early infancy and adult vocal tracts interesting with respect to sound production? The vocal tract is an air-filled tube with resonant frequencies that vary by length and shape. “Resonant frequency” is an acoustic term that denotes a frequency (rate of vibration) at which the amplitude of vibration is maximum. This description may gain some clarity by considering pipes in concert organs. The pipes of the instrument vary in length. The longer the pipe, the lower is its resonant frequency. In the human vocal tract, the tube not only can be of a different length (e.g., the difference between the short length of a baby and the long length of an adult male), but because it is flexible, it can also change shape. Changes in positions of the articulators create different vocal tract shapes, which result in changes in resonant frequencies of the vocal tract. The different resonant frequencies for different vocal tract shapes are recognized as different vowels.
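The organ-pipe comparison can be made concrete with a little arithmetic. The sketch below is a simplification rather than a model of a real vocal tract: it assumes a uniform tube closed at the vocal folds and open at the lips, whose resonant frequencies are given by Fn = (2n − 1)c/4L, and it uses round, approximate tract lengths.

```python
# Resonant frequencies of a uniform tube, closed at one end (the vocal folds)
# and open at the other (the lips): F_n = (2n - 1) * c / (4 * L).
# A deliberately crude sketch; a real vocal tract is not a uniform tube,
# which is exactly why different shapes produce different vowels.

SPEED_OF_SOUND = 35000  # approximate speed of sound in warm, moist air (cm/s)

def resonant_frequencies(tube_length_cm, n_resonances=3):
    """First few resonant frequencies (Hz) of a uniform closed-open tube."""
    return [(2 * n - 1) * SPEED_OF_SOUND / (4 * tube_length_cm)
            for n in range(1, n_resonances + 1)]

# Approximate tract lengths: a newborn's is roughly half an adult male's.
for label, length_cm in [("newborn, ~8 cm", 8.0), ("adult male, ~17.5 cm", 17.5)]:
    values = ", ".join(f"{f:.0f} Hz" for f in resonant_frequencies(length_cm))
    print(f"{label}: {values}")
```

Running the sketch prints roughly 1094, 3281, and 5469 Hz for the short newborn tube and 500, 1500, and 2500 Hz for the adult tube: the same "longer pipe, lower resonant frequency" relationship as the organ pipes, and one reason an infant's vowel-like sounds sit much higher in frequency than an adult's.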

Chapter 11 provides more information on how changes in vocal tract shape result in changes in resonant frequencies of the vocal tract.

For many years, the gently curved vocal tract shape of newborns was thought to limit the kinds of vowels produced in cooing and gooing, and even in babbling behaviors occurring later in the first year of life. The frequent occurrence of vowels such as “ah” and “oo” in coos, therefore, was thought to be as much, if not primarily, a result of the baby’s anatomy as of the baby’s limited sensorimotor control. Even if the baby’s sensorimotor control was adult-like, the story went, the infant’s vocal tract size and shape prevented the creation of vocal tract shapes required for certain vowels.

Based on recent research, it seems this interpretation is only partly true. The infant vocal tract does not actually prevent the occurrence of certain vowels, but the nonadult anatomy may promote the production of a limited type of vowel such as the “eh” in “bed” and the “ah” in “hot” (Menard, Schwartz, & Boë, 2004). Other vowels can, in theory, be produced by the baby vocal tract, but during the first three months it is easier to produce a limited set.

The ability to make certain speech sounds during the first year of life, and especially the first three months of life, is therefore related to a host of factors. These include the child’s anatomical, sensorimotor, and cognitive maturity. The development of speech sounds also depends on the child’s ability to hear speech sounds and distinguish them from each other.

0 to 3 Months:  Perception and Comprehension

In this chapter, the term “perception” refers to the ability to detect an auditory feature in an acoustic signal or to discriminate one acoustic signal from another. The focus is on acoustic signals and the auditory abilities required to hear and process them, because these are most relevant to preverbal speech and language skills. One can just as easily imagine cases in which perceptual skills for visual and tactile signals are important for communication.

The combination of all perceptual skills plus cognitive processes (such as memory) must be considered in the meaning of “comprehension,” which is the ability to understand communicative intent and meaning. As an example, it is entirely possible for a baby to have reasonable perceptual skills but poor comprehension, or to be able to comprehend well even with nonoptimal perceptual skills.

This distinction between perception and comprehension is illustrated by the baby’s skills in the 0- to 3-month period. It has been known for many years that babies as young as 1 month of age can discriminate between very similar sounds (such as “p” and “b,” or “s” and “sh”) in much the same way as adults (Eimas, Siqueland, Jusczyk, & Vigorito, 1971).2 Scientists agree, however, that the ability of infants to comprehend speech, to extract meaning from communicative situations, is extremely limited.

Babies’ abilities to discriminate subtle phonetic differences in much the same way as adults may reflect general auditory skills, rather than skills specific to speech perception. In other words, the baby’s detection of “p” and “b” as different auditory events is not relevant to “p” and “b” as linguistic events — that is, as phonemes. Some scientists believe the newborn auditory system3 is equipped for auditory distinctions just like the adult auditory system. In fact, very young babies can discriminate virtually all phonetic distinctions that are used in languages of the world (Vihman, 2017). The auditory capability for any distinction is available early in the first year of development. How do babies begin the process of learning the distinctions that are relevant to their native language?

There are theories that address this issue. A useful theory must explain an additional phenomenon that unfolds during the first year of life: the almost universal set of phonetic distinctions in the infant’s auditory repertoire gets “pared down” to only the ones used in the native language. Vihman (2017) said that progress in phonetic perception is best defined as loss of the ability to discriminate contrasts that are not relevant in the native language (Kuhl et al., 2006).

As the baby develops, auditory distinctions relevant to phoneme distinctions in the baby’s language are “tuned up” by exposure, whereas distinctions not relevant to the native language weaken and then disappear. The baby hears a huge number of native contrasts, creating a special sensitivity for them. Over time, the nonnative contrasts cannot compete for the baby’s attention with native contrasts; the ability to discriminate the nonnative contrasts disappears.

2 The adult data on discrimination of phoneme contrasts were based on volitional responses (writing down the phoneme heard, or pushing a button labeled with the phoneme heard). Infants obviously cannot make the same kinds of responses, so Peter Eimas and his research group exploited a well-known baby skill — sucking on a nipple — to demonstrate the baby’s ability to discriminate between two sounds having just slightly different acoustic characteristics.

3 The auditory system includes all auditory structures from the external ear (the part attached to the side of the head) to the cortex of the cerebral hemispheres; see Chapter 22.

We are a little bit ahead of the chronology that organizes this chapter. The paring down to perceptual sensitivity for acoustic contrasts used in the native language and the disappearance of sensitivity to other acoustic contrasts is complete by 10 to 12 months of age. (The research by Segal, Hejli-Assi, and Kishon-Rabin [2016] is an example of this kind of work.) The process begins, however, in the first three months, with a goal at the end of the first year of life of maximal sensitivity to relevant phonetic contrasts — the establishment of phonemic categories. As outlined later, these categories are essential to word learning and word production as well.

One apparent skill possessed by infants as young as 4 days of age that cannot be explained as part of general auditory mechanisms is the ability to distinguish utterances spoken in their native language from utterances spoken in a foreign language. Jacques Mehler and his colleagues (Mehler et al., 1988) demonstrated this ability in 4-day-old French infants who appeared to be sensitive to the difference between French and Russian utterances, and in 2-month-old American infants who gave evidence of hearing the difference between Italian and English utterances. Mehler and his colleagues believed the infants’ remarkable ability to distinguish the languages was based on knowledge of the prosodic characteristics of their own language, as compared to other languages. This knowledge may have been gained by native language exposure both in the womb and after birth. Nonnative prosodic patterns that do not match the native language may be detected as a “new” event. No matter how recent the exposure, babies appear to attend to and retain the melodic and rhythmic characteristics of the language used in their home. Scientists have argued that this ability to recognize the rhythmic and melodic characteristics of their native language is a foundation for infants’ developing ability to recognize words (Werker, 2012). The ability to detect the unique melody and rhythm of the native language “bootstraps” the extraction of words from the speech signal. Babies typically comprehend their first words around 6 months of age, using their knowledge of the rhythmic aspects of speech to isolate a word (Werker, 2012).

Although babies between 0 and 3 months of age almost certainly have very limited comprehension of speech and language, other behaviors may lay the groundwork for future communication skills. For example, caregivers may interpret baby sounds and mutual eye gaze as having communicative intent. Based on this assumed communicative intent, the caregiver may engage in turn-taking behavior, exchanging vocalizations with the baby and using eye contact according to the “typical” rules of conversation. Even in the absence of true comprehension, turn-taking may provide a model for the baby’s learning about communicative interaction.

3 to 8 Months:  Production

Most babies do not produce true babbling until 6 or 7 months of age. What does “babbling” mean? The term is reserved for those vocalizations in which consonants and vowels are clearly recognized but do not form words. Early in the 3- to 8-month period, the coos and goos have a few vowels, but as noted above, the consonant-like sounds may have only a vague resemblance to the real thing. Prior to the onset of babbling, there is an expansion of the baby’s vocal repertoire that may be partially supported by an increasing ability to mimic vocal behavior. As the baby moves toward the first half-year of life, she is likely to produce squeals, growls, yells, and Bronx cheers (“raspberries”), all of which seem like vocal play, practice, and exploration. When the baby starts producing consonants, they are likely to be labials (p, b, m) and those made with the front of the tongue (t, d, n). “Back” consonants such as “g” and “k” typically come later as a regular feature of babbling.

Babbling typically begins between 6 and 8 months of age but in many cases of typical development may not begin until 10 months or a bit later. Babbling has a specific form. Its basic unit is a consonant-vowel (hereafter, CV) syllable, where the consonant is likely to be a “b,” “m,” or “w,” and the vowel is an “eh” as in the word “bet,” an “uh” as in “but,” or an “ih” as in “bit.” The syllable may be produced once or repeated in a sequence (“buh-buh-buh-buh”). When the same syllable is repeated in sequence, there is little variation in syllable-to-syllable duration, pitch, and loudness. These syllable sequences are called reduplicative babbling. Although the CV syllable is most frequently observed in early babbling, other forms (such as vowel-consonant [VC] syllables) may also be heard.

Why do most early babble syllables have a CV form favoring bilabial consonants like “b” and only a few of the possible vowels? Several proposals have been set forth to account for this fact, but here we briefly describe one perspective, chosen for its carefully developed background and theoretical simplicity. Peter MacNeilage and Barbara Davis, in work done at the University of Texas at Austin, view babbling as the evolutionary product of the discovery by nonhuman primates, and ultimately early humans, of the sound-producing capabilities of the moving articulators (MacNeilage & Davis, 2000). Chewing is characterized by rhythmic up-and-down motions of the

mandible as the tongue and teeth position and grind vocal tract. The motorically simple “frame” of rhyth-
food. At some point in early history, primates acciden- mic mandibular movements can, as motor capabili-
tally or purposely phonated (created sound by vibrat- ties of the articulators develop and mature, be “filled”
ing their vocal folds) during this rhythmic mandible with increasingly complex content. This content will
movement and took notice of the modulation of the include the motions, positions, and configurations of
sound (try it: generate a steady voice and move your the tongue, lips, jaw, and other parts of the vocal tract
mandible up and down, see how the motion generates required for the production of different consonants and
a repeating “syllable-like” effect). According to Mac- vowels. Third, the theory has universal implications: If
Neilage (1998), this is how the basic syllable was born. early babbling is indeed derived from simple motions
Presumably speech evolved in humans from this basic of the mandible, at least the early sound content of bab-
syllabic “frame.” In fact, MacNeilage and Davis refer bling should be the same in all languages because all
to the mandible opening-closing movement as a frame, babies are using the same mechanism.
a vocal tract movement capable of holding the content This last point is important and invites the ques-
of a syllable. The closing of the vocal tract is conducive tion, “Do babies from different language environments
to forming the kind of tight constriction that is charac- produce a uniform set of babbling sounds, or do they
teristic of consonants, and the opening permits the full also produce sounds showing the unique influence of
acoustic resonance typical of vowels. Content of the their native language?” For babies just beginning to
basic syllable is the specific identity of the consonant babble, a firm answer to this question is unavailable
and vowel making up the syllable. From the simple because there are not enough relevant data from vari-
“buh-buh-buh” or “bih-bih-bih” resulting from rhyth-
mic motions of the mandible with the tongue resting
passively within the mouth, the tongue hitching a ride Phonetic Practice or
on the jaw, so to speak, humans learned that different Random Sound Play?
content (different speech sounds) could be inserted
into the basic frame by changing the motion and posi- Before scientists began careful, detailed studies
tion of the tongue, lips, and jaw. This gave evolving of babbling, it was thought to be no more than
humans a wide range of sound combinations for the sound play. Babies discovered the sound-making
labeling of different objects, actions, and people. When capabilities of their little speech mechanisms:
the basic CV was expanded to include different syllable adjustments to the respiratory system (e.g., loud-
shapes (e.g., VC, CCV) and different sequences of syl- ness change), larynx (e.g., pitch change), and
lable shapes (e.g., “baby,” an elaboration of “buh-buh”; articulators (speech sound change) resulted in a
“ice cream” as an elaboration of “ay-ay”), the ability to variety of speech sounds, some of them resem-
label objects, actions, and feelings became immensely bling those produced by adults and therefore
flexible. Together with advances in brain structure and very amenable to phonetic transcription. The
function, a vast collection of words was assembled, and sound play perspective on babbling implied that
phrases to sequence those words allowed even wider complete phonetic inventories for babies from
communication of meaning. The creation and evolu- all over the world would be the same — even in
tion of sophisticated human language skills may pos- babbling from babies close to 1 year of age. In
sibly be traced back to the basic babble syllable. these inventories, most if not all of the possible
The “Frames-then-Content” theory of MacNeilage phonetic events capable of being produced by
and Davis (2000) is appealing for several reasons. First, the human speech mechanism would be found.
it is a theory not only about the evolutionary basis of There would be no language-sensitive influences
speech sound development, but also about the develop- on babbling because the onset of true language
mental course of speech sound development in the first was initiated by the special language acquisition
year of life. Prebabbling babies can often be observed described in Chapter 3. We now know babies
to wag their mandibles without sound, and it is not start this way, and as they progress to the end
unreasonable to imagine them adding sound to the of the first year of language learning, sharpen
wags and discovering the syllable-like results, like their and mold their phonetic inventories under the
early human predecessors. A “buh” or “bih” is a com- influence of their native language phonetics.
mon first syllable in babbling, just as one might expect Babbling is a well-practiced dress rehearsal for
from a simple mandible wag with sound added and a the opening night of language performance — the
relaxed tongue. Second, the theory derives the sound- first words — rather than a cacophony of random
producing abilities of humans from very basic move- sounds from performers without a script.
ments of the lips, jaw, and tongue within the human
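The frame/content distinction can also be sketched in a few lines of code. This is purely illustrative (the consonant and vowel lists are simply the early-babbling sounds mentioned in this section): the frame is the repeated CV alternation, and the content is whatever fills each cycle.

```python
# A toy "frames-then-content" sketch: the frame is a rhythmic string of CV
# syllables (the mandible's open-close cycle); the content is whichever
# consonant and vowel fill each cycle. Illustrative only.
import random

random.seed(7)
EARLY_CONSONANTS = ["b", "m", "w"]   # labials favored in early babbling
EARLY_VOWELS = ["eh", "uh", "ih"]    # vowels noted for early CV syllables

def reduplicative_babble(n_cycles=4):
    """One CV chosen, then repeated in every frame cycle: 'buh-buh-buh-buh'."""
    syllable = random.choice(EARLY_CONSONANTS) + random.choice(EARLY_VOWELS)
    return "-".join([syllable] * n_cycles)

def varied_babble(n_cycles=4):
    """Same frame, but each cycle is filled with freshly chosen content."""
    return "-".join(random.choice(EARLY_CONSONANTS) + random.choice(EARLY_VOWELS)
                    for _ in range(n_cycles))

print(reduplicative_babble())  # e.g., muh-muh-muh-muh
print(varied_babble())         # e.g., weh-bih-muh-wih
```

The only point of the sketch is that the rhythmic structure and the segmental content are separable, which is exactly the separation the theory proposes.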

Some studies of early babbling suggest a phonetic inventory that is essentially uniform across several languages, with little evidence of the native-language phonetic inventory. More data are available on the preword, phonetic inventories of later babblers whose native language is Swedish, Japanese, French, or English. These data show an influence of the native language on the phonetic inventory of babbling (Boysson-Bardies, Hallé, Sagart, & Durand, 1989; Boysson-Bardies & Vihman, 1991; Lee, Davis, & MacNeilage, 2010). A conclusion from this work is that babbling and its development are a kind of practice for the use of native language speech sounds in first words.

Babbling with a “true” CV form in which the syllable is repeated is called canonical babbling. D. Kimbrough Oller of Memphis State University, Tennessee, has been studying babbling for many years and believes the CV form is a basic structure of human sound systems (Lee, Jhang, Relyea, Chen, & Oller, 2018). A CV form, for example, is the most common syllable in many languages, and its frequent appearance in early (canonical) babbling is revealing of a basic structural characteristic of speech sound systems. A baby is said to produce canonical babbling when her previously difficult-to-transcribe sounds become recognized as “real” consonants and vowels in a CV form.

Later babbling clearly shows influences from the phonetics of the parent language (as in a comparison between Mandarin Chinese and American English; see Lee et al., 2018). Other factors, such as parental interaction style, may also affect babbling onset and its content.

Canonical babbling is not just a cute characteristic of infants taking their speech mechanism for a test drive. There is evidence that the age at onset of babbling predicts the age of onset of first words, and that the specific phonetic content heard in babbling predicts the phonetic content of first words (McGillion et al., 2017). In other words, the practice of specific speech sounds during babbling predicts early words having the same speech sounds (“the consonants used in babble are typically the ones used in first words” [McGillion et al., 2017, pp. 157–158]). The age of onset of babbling also seems to predict the development of vocabulary at later childhood stages. Finally, canonical babbling is delayed or absent in several developmental speech disorders and may contribute to the diagnosis of certain conditions (such as in autism, or in children with intellectual disabilities [Lohmander, Holm, Eriksson, & Liberman, 2017]).

3 to 8 Months:  Perception and Comprehension

During the 3- to 8-month period, babies lose the ability to discriminate between selected sound pairs, as described previously. Phoneme contrasts that are not used in the native language “drop out” of the infant’s perceptual repertoire, and the auditory system is gradually tuned to those contrasts used in the native language. A good example of this is the difficulty with the “r”-“l” distinction experienced by native speakers of Japanese when they are listening to (and producing) English. The loss of the ability to discriminate /r/ from /l/ does not affect the Japanese baby’s ability to master her language, because the “r” and “l” sounds do not create minimal pairs in Japanese.4 In English they do, as evidenced by word pairs such as long-wrong, light-right, hail-hair.

4 Recall from Chapter 3 that a minimal pair contrast is when one sound substitutes for another in the same position in a word (such as initial consonants) and results in a change in word meaning.
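Footnote 4’s definition can be stated operationally: two words form a minimal pair when their phoneme sequences are the same length and differ in exactly one position. The sketch below is illustrative only; the phoneme spellings are informal stand-ins for a real transcription.

```python
# Operational version of the minimal-pair definition (illustrative only):
# same number of phonemes, exactly one position where the phonemes differ.
def is_minimal_pair(phonemes_a, phonemes_b):
    if len(phonemes_a) != len(phonemes_b):
        return False
    mismatches = sum(a != b for a, b in zip(phonemes_a, phonemes_b))
    return mismatches == 1

# "light" vs. "right": the English /l/-/r/ contrast in initial position.
print(is_minimal_pair(["l", "ai", "t"], ["r", "ai", "t"]))  # True
# Two differing positions (initial and final) means no minimal pair.
print(is_minimal_pair(["l", "ai", "t"], ["r", "ai", "d"]))  # False
```

Because no pair of Japanese words is distinguished in this way by an "r"/"l" swap, the phonetic difference carries no meaning for the Japanese-learning baby, and sensitivity to it fades.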
many languages, and its frequent appearance in early observed among infants. Around 6 to 7 months of age,
(canonical) babbling is revealing of a basic structural infants show an ability to identify specific words within
characteristic of speech sound systems. A baby is said the ongoing stream of speech. The speech acoustic
to produce canonical babbling when her previously signal from connected speech does not show obvious
difficult-to-transcribe sounds become recognized as boundaries between words, and indeed the continuous
“real” consonants and vowels in a CV form. stream of sounds often makes the process of word iden-
Later babbling clearly shows influences from the tification difficult for digital speech recognition pro-
phonetics of the parent language (as in a comparison grams. Adults have little difficulty with this skill, but
between Mandarin Chinese and American English: at some point infants must (and do) learn to perform
see Lee et al., 2018). Other factors, such as parental segmentation of the speech signal to extract words from
interaction style, may also affect babbling onset and the continuous sequence of auditory speech events.
its content. An accumulation of evidence over the past 20
Canonical babbling is not just a cute characteris- years suggests that even though infants may not com-
tic of infants taking their speech mechanism for a test prehend the language being spoken around them,
drive. There is evidence that the age at onset of bab- they are paying attention to its structural properties.
bling predicts the age of onset of first words, and that Language exposure in the first half-dozen months of
the specific phonetic content heard in babbling predicts life and beyond allows infants to build up a cognitive,
the phonetic content of first words (McGillion et al., statistical model of these structural properties. Presum-
2017). In other words, the practice of specific speech ably, a product of this cognitive model around 7 months
sounds during babbling predicts early words hav- of age is knowledge of the likely forms of words. This
ing the same speech sounds (“the consonants used knowledge allows the baby to begin extracting words
in babble are typically the ones used in first words” from the stream of speech.
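The flavor of this segmentation problem can be captured in a small computational sketch. The code below is only a toy illustration, not a model of infant cognition; the word list and the sound stream are invented for the example. It shows how stored knowledge of likely word forms makes it possible to carve an unbroken stream into words.

```python
# A toy illustration (not a model of infant cognition): if the
# listener already "knows" a small set of likely word forms, an
# unbroken stream of speech can be carved into words by matching.
# The word list and the stream are invented for this example.

known_words = {"where", "is", "the", "doggie"}

def segment(stream):
    """Greedily peel known words off the front of a continuous stream."""
    words = []
    while stream:
        for end in range(len(stream), 0, -1):  # prefer the longest match
            if stream[:end] in known_words:
                words.append(stream[:end])
                stream = stream[end:]
                break
        else:
            return None  # no known word fits, so segmentation fails
    return words

print(segment("whereisthedoggie"))  # ['where', 'is', 'the', 'doggie']
```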
The learned skill of identifying words in the continuous stream of speech is probably the basis for an increase in comprehension of language toward the end of the 3- to 8-month period. Infants begin to show comprehension of words as they approach 8 months of age, but only within rich context. This means the word is accompanied by supporting gestures and intonation, and perhaps is spoken at a particular time of day

(e.g., waking up, or at meals), all of which contribute to the infant's comprehension. For example, a Mom who leaves the house every morning with her baby may say the word "out?" while pointing to the door; the baby understands the word because of the rich context in which it is spoken. The same word spoken at a different time of day and without the supporting gesture may not be understood by the 8-month-old baby.

Werker (2012) describes a more specific example, relevant to the early comprehension of words, of how the baby's environment may shape the phonetic distinctions that come to have phonemic status in the native language. In her own words, "the cooccurrence of two phones [she means, phonetic events] with two different objects could help pull them apart, whereas the co-occurrence of two phones with a single object could help collapse the distinction" (Werker, 2012, p. 55). What does Werker mean? Let's imagine that a baby who is learning American English hears the words "lap" and "rap" in her environment. ("Come sit in Mommy's lap" as Mom lifts the baby into her lap, and, "Mommy loves rap" as she turns up a tune in the car; Mommy is old school.) Baby knows, of course, that Mommy's lap is a place she loves to be, and understands that Mommy loves music with heavy beats and a steady stream of human speech. The baby hears many versions of both words, because Mommy takes her in her lap a lot, asking her if she wants to sit there, and listens to a lot of rap in the car and is always telling baby the name of the music style. Baby develops the idea of the sounds, "l" and "r," being associated with different meanings because clearly, Mommy's lap and the music she calls "rap" are different things. This linkage between sound differences on the one hand, and meaning differences on the other hand, "hardens" the "l" versus "r" sound contrast as a categorical difference — that is, the sound difference is phonemic, functioning to distinguish words. A Japanese baby may have a stuffed bunny rabbit, which her parents call "rini" (Japanese word for "little bunny"). The Japanese "r" is sometimes described as being articulated between an English "r" and "l", and Mom and Dad may vary the way they say the "r" at the beginning of "rini," but the word is always spoken in the context of the stuffed bunny. Baby hears the variations in the Japanese "r" sound and in so doing learns to dismiss the phonetic differences because the variation between "r"-like and "l"-like phonetic sounds is not tied to meaning differences (in this case, object differences). This is what Werker means by a phonetic distinction being "collapsed" — the variants of the sound are not critical to signaling differences in meaning. In Japanese, they are phonetic variations of the same phoneme category.

Werker's (2012) idea about how phonetic differences may or may not become phonemic differences, and how a child's learning of the sound system of his or her language is a pathway to word learning, is interpreted in Figure 5–2. The process of learning sound distinctions and their relationship to word learning is imagined in this figure as a circular, interactive process. The left side of Figure 5–2 depicts the process of learning the "l"-"r" distinction for babies whose native language is English. Different speech sounds are heard and paired with different objects or actions. The linkage of the phonetic variation with different objects suggests "l" and "r" as different categories, contributing

to the learning of unique vocabulary items beginning with these different sounds ("Words Established" in Figure 5–2). The circular nature of the process represents the establishment of word items beginning with "l" and "r" and identifies these sounds as good candidates for the learning of other word items beginning with the same sounds. The newly learned "r"-"l" contrast primes the baby to be on the lookout for new words with these initial sounds. The process is interactive because it is not simply the accumulation of massive amounts of speech acoustic data, as happens when a baby is exposed to so much human speech, but rather the organization for meaning of these sound data by environmental data — people, objects, and actions.

The right side of Figure 5–2 shows the same process for the baby learning Japanese. As in the case of the English-learning baby, the Japanese baby is exposed to phonetic variations that on close inspection by a trained phonetician seem to be sometimes "r"-like, sometimes "l"-like. In Japanese, these phonetic variations are applied to the same object or action, as if the single object/thing can be represented by the "r"-like or "l"-like variant. Unlike the case of English, the phonetic variants are not paired with different objects but are "mapped" on to the same object. When the child is exposed to multiple instances of the sound variations applied to a single object, and many other objects, names, and actions, she "collapses" the phonetic variations into a single phonemic category. New words can be established within this category, but the phonetic variants do not prime the child for "r"-like versus "l"-like words.

Figure 5–2.  The learning of sounds as different phonemes (left circle) or variants of the same phoneme (right circle). The learning of phonemes "primes" the baby to be sensitive to words with the same sounds, and to develop early comprehension of words beginning with these sounds. Based on Werker, J. (2012). Perceptual foundations of bilingual acquisition in infancy. Annals of the New York Academy of Sciences, 1251, 50–61.
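Werker's distributional idea can also be expressed as a toy sketch. The code below illustrates only the co-occurrence logic of Figure 5–2, not a claim about infant learning mechanisms; the phone labels and objects are invented for the example.

```python
from collections import defaultdict

def contrast_is_phonemic(observations):
    """observations: (phone, object) pairs for exactly two phones.
    Returns True when the two phones occur with different objects
    (the distinction is "pulled apart"), False when they label the
    same object(s) (the distinction is "collapsed")."""
    objects_for_phone = defaultdict(set)
    for phone, obj in observations:
        objects_for_phone[phone].add(obj)
    first, second = objects_for_phone.values()
    return first.isdisjoint(second)

# English-learning baby: "l" and "r" co-occur with different things.
english = [("l", "lap"), ("r", "rap music"), ("l", "lap")]
# Japanese-learning baby: "r"-like and "l"-like variants label one toy.
japanese = [("r-like", "bunny"), ("l-like", "bunny")]

print(contrast_is_phonemic(english))   # True  -> two phonemes
print(contrast_is_phonemic(japanese))  # False -> one phoneme category
```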
8 to 12 Months:  Production

Canonical babbling may continue for several months after its onset, certainly into the last part of the first year of life. In some children, this most simple type of babbling is followed by continued use of the repeated frame, but with varied content across the syllables. Now, along with (or instead of) "buh-buh-buh-buh" the baby may produce "bah-dee-goo-gae," or any number of other CV syllable combinations. This is called variegated babbling in recognition of its varied content.

Variegated babbling retains certain features of canonical babbling, namely, a "metered" syllable sequence, as if every syllable has the same duration, and a "flat" melody, as if produced on a single pitch. If you listen to children or adults engaged in conversation, you hear their voice pitch change frequently, rising across certain groups of syllables and falling across others. Pitch changes in speech are called intonation, and most people can recognize "normal intonation" even if they cannot define it explicitly. Unusual intonation, whether the monotone speech of certain individuals who are sad or depressed or the wild pitch changes of a game-show host after too many cups of coffee, is also easily recognized.

Adults are sensitive to small changes in intonation — they recognize something momentous when a baby begins to produce variegated babbling with intonation. It is as if the baby is trying to mimic the sound of normal conversation but does not have the words to convey meaning. This stage of babbling is called "jargon" because the child's utterances sound like "real" speech even if consisting of strings of nonsense syllables with varied sound content.

When children produce jargon, usually close to a year of age, parents can expect a first, "real" word at any time. Transitional forms between babbling and real words, called protowords (also called phonetically consistent forms), may also be heard. These are syllable-sized utterances (usually more complex than canonical syllables) produced by the child in a consistent way, with a recognizable referent, apparently meant to convey meaning. Protowords, however, are not part of the adult lexicon — they are not real words. For example, one of the author's nephews consistently said "bahmp" to refer to bread, either to identify it for the delight of his easily delighted audience or to request a piece for eating or shredding. The consistent use of "bahmp" to refer to bread qualifies it as a protoword because "bahmp" is not part of the English lexicon.5 Protowords are often mixed in with jargon, and their use is another signal that "real" words are just around the corner.

5 It could be part of the English lexicon, if users of the language would agree on its meaning. But at the current time, if you approached 50 people on the street and asked them what "bahmp" means, most would probably shrug their shoulders and say they do not know. All of those people, however, would recognize that the form of this protoword — a consonant-vowel-nasal-consonant (CVNC) form — is an allowable sequence of sounds for the formation of English words (see discussion of phonotactics in Chapter 3).

8 to 12 Months:  Perception and Comprehension

In the 8- to 12-month period, infants continue to tune their ability to discriminate speech sounds of their native language. The corollary to this is the loss of discrimination ability for nonnative sound contrasts. Scientists believe this "sculpting" of speech perceptual skills is a result of continued exposure to the language. The infant's cognitive model of the structural
properties of the language is of prosodic form not only for words (e.g., the relationship between stressed and unstressed syllables) but also for the important segmental (sound-level) properties.

As infants approach 1 year of age, they begin to comprehend more language, including an increasing number of words and, apparently, more complex sentences. Before and after the first birthday, children typically comprehend more words than they produce. Around one year of age children also begin to comprehend more complex sentences. Comprehension of complex sentences in this age period requires lots of supporting context, as described previously for word comprehension by children around 8 months of age.

As the first year of life is completed, infants begin to show comprehension of some paralinguistic aspects of communication. The term "paralinguistic" is used here to denote nonsegmental (not associated with speech sounds and their sequencing to form words) aspects of voice and speech that nevertheless convey meaning. Often, this meaning concerns the mood, intent, or state of mind of the speaker. For example, rising pitch across an utterance signals a question, as in the difference between "He's here?" versus "He's here" (for wh- questions, the rising pitch is not necessary).6 Another example is loudness of the voice, which can signal a range of speaker moods and intents. "Bed time" means something very different spoken loudly as compared to a soft, gentle version of the utterance. Paralinguistic aspects of language are complex (ask anyone who has aspired to be an actor), but by the end of the first year, infants are beginning to comprehend their meaning.

6 The distinction between the intonation of statements (declarative utterances) and questions may be disappearing to some (or a large) degree. In young people, there is a growing tendency to produce statements with a rising pitch. This is called "Uptalk" and seems to be more common in young women compared with young men. Some writers trace this trend to the early 1990s. The author believes a catalyst for the widespread use of this conversation style is the dialogue in the film "Clueless" (1995). For more on Uptalk, see Tyler (2015).

Gesture and Preverbal Language Development

People gesture when they speak; this is hardly a surprise. Perhaps a less obvious aspect of gesture is its integral role in communication. Adults, and children who are past their first words and on their way to two-word utterances, coordinate gesture with oral speech. Gestures play more than a supporting role in conveying meaning. Indeed, gesture may accomplish communicative goals not easily conveyed by oral speech such as (for example) representation of shape, size, and orientation of an object that is being discussed (Brentari & Goldin-Meadow, 2017; Goldin-Meadow, 2017).

Gesture plays an important role in preverbal language skills as well. Many gestures are initiated by parents in their interactions with the baby. The gestures accompany recurring actions ("where is it?"; "more"; "all gone!"), emotional states (smiling, surprise, sad face), and representation of object properties, as noted just previously. As the baby enters the fourth month and is a more active communication partner, pointing becomes an important gestural component of communication interactions. Parents connect spoken words to people, objects, pictures, and actions by pointing at them. The baby hears and sees a multitude of this coordinated communication act, even for the same person, object, and so forth, and learns the idea of a word (the spoken label) as well as the potential utility of their own pointing gestures to request a spoken label. Pointing therefore serves as a builder of comprehension vocabulary in early babyhood and throughout the first year of life and months after as well.

Pointing also serves to establish what is called joint attention. Around 6 months of age, babies will follow a point and look at the object or person to which (or whom) the point is directed (Rohlfing, Grimminger, & Lüke, 2017). The point joins the attention of both parent and child to the object, or to the person, or in more advanced cases to an action such as "running" ("Look, the boy is running"). The emergence of pointing is a landmark stage in preverbal language skills. Early pointing may even predict the speed and sophistication of language development in the first few years of life. Conversely, some scientists believe delayed pointing or its absence can be predictive of delay or disorder in language development (Lüke, Ritterfeld, Grimminger, Liszkowski, & Rohlfing, 2017).

Chapter Summary

Throughout the first year of life, babies learn a range of linguistic and general knowledge skills that serve as the foundation for language skills.

The skills include preparation for producing and understanding language.

Three chronological age periods, 0 to 3 months, 3 to 8 months, and 8 to 12 months, are presented as age ranges during which preverbal skills are developed; the age ranges are somewhat arbitrary due to the large chronological variability among children's mastery of these skills.
In the 0- to 3-month period, babies are able to distinguish closely related speech sounds but do not truly comprehend language; during this period, their sound production is dominated by vocalizations that do not convey intentional meaning.

In the 3- to 8-month period, babies begin to comprehend simple aspects of language, but a rich context is needed to support comprehension.

Comprehension skills improve throughout the 3- to 8-month period, but a rich context remains an important component of language understanding.

Early in the 3- to 8-month period, sound production is characterized by coos and goos, and toward the end of the period babbling emerges.

Canonical babbling, consisting of consonant-vowel, repeated syllables, is the first syllable type produced by babies, followed by variegated and then jargon babbling.

In the 8- to 12-month period, children begin to understand more complex utterances and have especially good comprehension skills with rich context.

Production skills in the 8- to 12-month period may include protowords, also called phonetically consistent forms, which are not "real" words but are used consistently to refer to a specific toy, pet, parent, and other objects/people.

First words are usually produced around 1 year of age.

A good deal of theoretical controversy surrounds how and why babies develop language skills throughout the first year of life.

References

Boysson-Bardies, B., Hallé, P., Sagart, L., & Durand, C. (1989). A cross-linguistic investigation of vowel formants in babbling. Journal of Child Language, 16, 1–17.
Boysson-Bardies, B., & Vihman, M. M. (1991). Adaptation to language: Evidence from babbling and first words in four languages. Language, 67, 297–319.
Brentari, D., & Goldin-Meadow, S. (2017). Language emergence. Annual Review of Linguistics, 3, 363–388.
Eimas, P. D., Siqueland, E. R., Jusczyk, P., & Vigorito, J. (1971). Speech perception in infants. Science, 171, 303–306.
Goldin-Meadow, S. (2017). What the hands can tell us about language emergence. Psychonomic Bulletin and Review, 24, 213–218.
Kuhl, P. K., Stevens, E., Hayashi, A., Deguchi, T., Kiritani, S., & Iverson, P. (2006). Infants show a facilitation effect for native language phonetic perception between 6 and 12 months. Developmental Science, 9, F13–F21.
LaGasse, L. L., Neal, A. R., & Lester, B. M. (2005). Assessment of infant cry: Acoustic cry analysis and parental perception. Mental Retardation and Developmental Disabilities Research Reviews, 11, 83–93.
Lee, C-C., Jhang, Y., Relyea, G., Chen, L-m., & Oller, D. K. (2018). Babbling development as seen in canonical babbling ratios: A naturalistic evaluation of all-day recordings. Infant Behavior and Development, 50, 140–153.
Lee, S. A. S., Davis, B., & MacNeilage, P. (2010). Universal production patterns and ambient language influences in babbling: A cross-linguistic study of Korean- and English-learning infants. Journal of Child Language, 37, 293–318.
Lohmander, A., Holm, K., Eriksson, S., & Liberman, M. (2017). Observation method identifies that a lack of canonical babbling can indicate future speech and language problems. Acta Paediatrica, 106, 935–943.
Lüke, C., Ritterfeld, U., Grimminger, A., Liszkowski, U., & Rohlfing, K. J. (2017). Development of pointing gestures in children with typical and delayed language acquisition. Journal of Speech, Language, and Hearing Research, 60, 3185–3197.
MacNeilage, P. F. (1998). The frame-content theory of evolution of speech production. Behavioral and Brain Sciences, 21, 499–511.
MacNeilage, P. F., & Davis, B. L. (2000). On the origin of internal structure of word forms. Science, 288, 527–531.
McGillion, M., Herbert, J. S., Pine, J., Vihman, M., dePaolis, R., Keren-Portnoy, T., & Matthews, D. (2017). What paves the way to conventional language? The predictive value of babble, pointing, and socioeconomic status. Child Development, 88, 156–166.
Mehler, J., Jusczyk, P., Lambertz, G., Halsted, N., Bertoncini, J., & Amiel-Tison, C. (1988). A precursor of language acquisition in young infants. Cognition, 29, 143–178.
Ménard, L., Schwartz, J-L., & Boë, L-J. (2004). Role of vocal tract morphology in speech development: Perceptual targets and sensorimotor maps for synthesized French vowels from birth to adulthood. Journal of Speech, Language, and Hearing Research, 47, 1059–1080.
Oller, D. K. (1980). The emergence of the sounds of speech in infancy. Child Phonology, 1, 93–112.
Rohlfing, K. J., Grimminger, A., & Lüke, C. (2017). An interactive view on the development of deictic pointing in infancy. Frontiers in Psychology, 8, 1319. doi:10.3389/fpsyg.2017.01319
Segal, O., Hejli-Assi, S., & Kishon-Rabin, L. (2016). The effect of listening experience on the discrimination of /ba/ and /pa/ in Hebrew-learning and Arabic-learning infants. Infant Behavior and Development, 42, 86–99.
Tyler, J. C. (2015). Expanding and mapping the indexical field: Rising pitch, the Uptalk stereotype, and perceptual variation. Journal of English Linguistics, 43, 284–310.
Vihman, M. M. (2017). Learning words and learning sounds: Advances in language development. British Journal of Psychology, 108, 1–27.
Vorperian, H. K., Kent, R. D., Lindstrom, M. J., Kalina, C. M., Gentry, L. R., & Yandell, B. S. (2005). Development of vocal tract length during early childhood: A magnetic resonance imaging study. Journal of the Acoustical Society of America, 117, 338–350.
Werker, J. (2012). Perceptual foundations of bilingual acquisition in infancy. Annals of the New York Academy of Sciences, 1251, 50–61.
6
Typical Language Development

Introduction

Typical (normal) language development is variable in children. Some children develop language skills early, some late, but most typically developing children attain approximately equivalent levels of language skill by the age of 5 or 6 years. Like preverbal language skills, the sequence and milestones of normal language development described in this chapter are appropriate for the "average" child.

Some may argue that an "average" child does not exist, and that variability in language development among typically developing children can be tied to environmental and cultural variables. For example, an infant who has been spoken to a great deal by primary caretakers may emerge as a child who develops language very quickly. Substantial language stimulation, however, may be culture specific; in some cultures, adults do not direct much speech to children. A child reared in such an environment, or in any environment with relatively infrequent language stimulation, may have fewer words at 18 months compared with an 18-month-old child who is spoken to a lot, but the relatively small spoken vocabulary must be evaluated relative to the prevailing environmental/cultural influences. This child's small vocabulary may reflect a substantial influence of his culture. A good summary of potential environmental/cultural influences on language development is given by Roseberry-McKibben (2007, pp. 47–49).

The social setting in which a child matures, her perceptual, cognitive, and conceptual skills, plus linguistic factors, all have the potential to influence language development (Johnston, 2006). These factors probably interact in different ways for different children, accounting in part for the wide variability in language development milestones observed across children. The reader should keep in mind these potential influences on a child's language development.

Table 6–1 presents a summary of language development sequences and milestones for the typically developing child. This table includes information covered in Chapter 5, summarizing how infants move through coos, goos, and babbling stages as they approach the end of the first year and the production of their first word. Between 12 and 18 months, toddlers add 7 to 11 words per month to their vocabulary, for a total of about 50 production words at age one and a half years. These words are most often spoken as single-word utterances, as if the child does not have the concept of a sentence. At this point in development, the comprehension vocabulary is typically larger than the production vocabulary.
Table 6–1.  Summary of Stages (by Age Range) of Typical Language Development

Newborn:  May be able to discriminate certain phonemes; may attend to specific voices
1–3 months:  Coos, goos, probably not used intentionally for communication purposes
3–6 months:  Onset of canonical babbling toward later end of age period
6–8 months:  Variegated babbling toward end of period, comprehension of words in rich contexts
8–12 months:  Jargon, consistent phonetic forms (protowords), first word toward end of first year, comprehension of sentences in context
12–18 months:  New words gained until a total of about 50 at the end of this period; more words comprehended than produced
18–24 months:  Vocabulary "spurt" as a result of naming insight; two-word utterances toward the end of the period, vocabulary of about 200–300 words, many more words comprehended and sentences understood in rich context
2–3 years:  Usage of grammatical morphemes, longer utterances (three-word, possibly four-word utterances), typical mean length of utterance (MLU) around 1.5–2.5; comprehension excellent for simple sentences, more complex sentences understood with context and in familiar settings
3–4 years:  Vocabulary growth, continued mastery of grammatical morphology, MLU ~3–4
4–5 years:  More complex sentences, developing pragmatic skills for conversation and understanding of more complex sentences; MLU ~3.5–4.7; relative clauses, coordination, passive forms; metalinguistic skills begin to develop
5–6 years:  Mastery of grammatical morphemes complete, continued development of conversational and narrative skills
7 years and beyond:  Expanding vocabulary, production and comprehension of complex sentence forms, various metalinguistic skills improve into college-age years

Around 18 to 24 months, toddlers experience a vocabulary "spurt," adding a huge number of words to their vocabulary and beginning to combine them into simple two-word utterances. Between 2 and 3 years, vocabulary continues to grow, utterances become longer, and children begin to use bound morphemes in an appropriate way. During this period, children begin to learn conversational and narrative skills as well, developing the pragmatic skills for effective communication in real-life situations.

During the fourth year, vocabulary continues to increase along with longer and more complex sentences. By age 5 years, most typically developing children are beginning to recognize that language has a structure built from individual components (e.g., children begin to recognize that words consist of individual sounds). Past age 5 years, language gains include expanding vocabulary, comprehension and production of increasingly complex sentences, and language subtleties such as understanding and telling jokes, and social use of language ("pragmatics," such as turn-taking in conversation, and understanding nonverbal cues to communication such as facial expression and gesture). The development of reading skills is likely to affect oral language, both comprehension and expression (production), and the development of oral language skills is likely to affect reading skills. Mastery of high-level vocabulary skills and sentence-level utterances continues into college-age years and beyond. Throughout these developmental stages, language comprehension including morphology, syntax, vocabulary, and pragmatics is likely to be more sophisticated compared with language expression (production).
12 to 18 Months

Between the ages of 12 and 18 months, a typical child learns to produce about 50 words. These 50 words are mostly nouns, including specific names for people ("Mama," "Dada," "Muffy"), body part names, food names, and names of other familiar objects (e.g., "book," "cup"). The nouns included among the first 50 words tend to be very concrete — abstraction is not prominent in the toddler vocabulary — and are used frequently by both caretakers and the child. A smaller number of the first 50 words are verbs such as "run," "walk," and "play." There may also be a few adjectives such as "allgone," "dirty," and "cold," and a few greeting words ("bye-bye") and function words (e.g., "where," "for"). Many of these observations about the first fifty words are described by Nelson (1973, 1981).

As in preverbal language development, comprehension outpaces production in the 12- to 18-month period. If the relationship between comprehension and production is measured in number of words recognized versus spoken, an 18-month-old toddler understands many more words than she produces. Children in this age range may also appear to understand complex sentences, but they are probably using context and familiarity to comprehend these utterances. With context eliminated, as in a laboratory experiment, sentence understanding is probably limited to fairly simple utterances.

Much has been made of the phenomena of overextension and underextension in the child's first fifty words. When a toddler overextends the meaning of a word, the semantic category is too broad. For example, the toddler who refers to all four-legged animals as "doggie" is generalizing from experience with the family dog. Similarly, if Daddy has a beard, all men with beards become worthy of the name. When overextension is viewed within the framework of linguistic category formation, the establishment of different and distinctive semantic categories is still at a very early stage, as are the mental representations for different four-legged animals. There is an ongoing process of refinement of these categories as the child is exposed to more linguistic data provided by parents, other adults, and video.

The opposite phenomenon is underextension, in which a single example of an object-word link becomes sufficient to define an entire semantic category. Muffy, the family dog, becomes the only dog in the world. When Mom points out Spike the slobbering bulldog to her 16-month-old and asks, "What's that?," the baby does not respond because Spike, or any dog other than Muffy, has not yet contributed a valid representative to the doggie category. At this stage of language learning only Muffy qualifies as a doggie. For the child who underextends a word such as this, the link between object and meaning is entirely specific to the context in which the object is named — the familiar four-legged animal in the baby's house, whom she has known forever as a doggie.

18 to 24 Months

As the end of the second year is approached, something interesting happens to change the rate at which toddlers add new words to their spoken vocabulary. Up to this point, children have added roughly 7 to 11 new words per month, largely by constant repetition by caregivers of object-word links. Mom or Dad has pointed to the dog many times and said the word "doggie" before the toddler produces the word. Sometime late in the second year, toddlers have a "naming insight," which allows them to do fast mapping of spoken words to objects, actions, descriptions, and so forth. It is as if the child suddenly realizes, "I get it, when something is pointed to and Mom uses a word at the same time, that must be the name of the object!" Now the child appears to need only a single instance of object-name or action-name pairing, and the word enters the spoken vocabulary. When the child has this naming insight, he or she experiences a large spurt in spoken vocabulary. The child's newly found skill of fast mapping is likely to result in 20 to 40 new words per month. By 24 months, this vocabulary spurt results in a vocabulary size of 200 to 300 words.

Toward the end of the second year, language skills gain in sophistication, possibly as a result of the rapid increase in vocabulary. To this point, children communicate by producing single words, but now two-word utterances are heard for the first time. Some scientists believe the vocabulary must reach a relatively large size (for a toddler, at least) before the two-word utterance stage is entered. In this view, a critical vocabulary size "bootstraps" the grammatical step to two-word utterances. Additional detail on multiword utterances is provided later in the section entitled "Multiword Utterances, Grammatical Morphology."

At 2 years of age, language comprehension skills continue to improve, sharing a lot of the characteristics of earlier stages of language development. Parents often overestimate their toddler's comprehension abilities, when comprehension is defined as the ability to understand utterances in the absence of context. Children use their growing fund of world knowledge, however, to comprehend the meaning of relatively
complex utterances, and of course, comprehension always includes context.1

1 There are two points to be made here. First, an experimenter can estimate a child's "pure" comprehension abilities by presenting utterances isolated from any context to obtain responses that show the presence or absence of understanding. Results may not be particularly revealing of "typical" language usage but can contribute to models and theories of language competence across development. It is important to understand that a child's difficulty in comprehending a complex utterance in a laboratory study does not mean the child cannot understand the same utterance when spoken in a familiar situation. Second, certain developmental language delays may involve an inability or deficit in using world knowledge to comprehend utterances that are beyond current "pure" comprehension abilities. Scientists may want to separate the two sources of comprehension ability ("pure" comprehension of language from use of world knowledge to assist comprehension) to better estimate the relative effects in certain disorders.

Three Years (36 Months)

By 3 years of age, an average, typically developing child has a vocabulary size between 600 and 2,000 words. The range of words is so large because "typical" vocabulary size depends on so many factors, including the toddler's history and primary environment (socioeconomic status, caregiver style of communication, and so forth), current toys (and the range of words associated with them), exposure to television, day care setting, and even the way in which it is decided that the child "has" words (such as parent report, or from specific experimental procedures). At 3 years of age, the largest proportion of a toddler's words is still nouns.

There is another reason to be cautious about identifying a "typical" number of words in a toddler's vocabulary (lexicon) at a given age (the same caution applies to "typical" aspects of almost any aspect of language development). Most of the information we have concerning language development has come from studies of English-speaking children. A recent but growing body of knowledge is demonstrating how the facts of language development differ across languages and even across dialects within a language. Elin Thordardottir, a multilingual language scientist from McGill University in Quebec, Canada, showed that 3-year-old children learning Quebec French have significantly smaller vocabulary sizes than children learning Canadian English (Thordardottir, 2005). Clearly, the French-speaking children should not be regarded as having delayed vocabulary development relative to English-speaking children. Rather, the difference in vocabulary size reflects differences between the structure of the two languages, cultural differences, or some complex combination of the two factors. Other examples of different vocabulary development in different languages can also be found in the research literature. Throughout the remainder of this chapter, the reader should keep in mind the potential cross-linguistic differences in language development.

Multiword Utterances, Grammatical Morphology

Beginning around 2 years of age, two notable and related aspects of language development are the development of grammatical morphology and the production of multiword utterances.

Between the ages of 18 and 24 months, typically developing children make first attempts at combining two words to enhance their communication skills. Early attempts may not be two "real" words, but rather an approximation to a multiword utterance supported by combining words with gestures. For example, Bates and her colleagues showed that toddlers often paired gestures with words to create early, "two-word" propositions (Bates, 1980; Bates, Thal, Whitesell, & Fenson, 1989). A child may say "Daddy" and point to a chair, meaning "Daddy sit." In this case, the communicative act is equivalent to two spoken words. Similarly, some children may mix jargon (see Chapter 5) with real words, creating a two-"word" utterance in which only one of the words is recognizable to an adult.

In these examples, the child's use of two units of meaning suggests an awareness of the potential to combine words for communicative purposes. Both examples include one "unit" that is not a "real" word — a gesture in one case, a spoken nonword in the other. The transition from one- to two-word utterances may therefore not be clear-cut. Children may produce something in between, indicating they have the idea of two-word utterances even if there are not actually two well-formed words.

When children begin to produce two-word utterances in which both words are "real," the utterances have certain "baby-ish" characteristics. The utterances have only content words and lack articles, prepositions, and grammatical morphemes. For example, the child who wants to indicate to Daddy that the dog has possession of a toy says "Muffy toy," not "Muffy's toy"; the grammatical morpheme for possession is omitted (see later in chapter).

A good deal of effort has been devoted to identifying the regularities and rules of toddlers' two-word utterances. There are several different ways to review this information; we have chosen to present Brown's
(1973) perspective on this stage of utterance development. Brown interpreted two-word utterances as expressions of broader semantic relations. In other words, toddlers organize these simple utterances in terms of larger, relational categories that serve as "frames" for many different possible utterances. Table 6–2 presents some of the semantic relations proposed by Brown, with examples of how different utterances "fit" the frames. Also included are possible adult versions of the toddler utterances.

Table 6–2.  Brown's Semantic Relations as "Frames" for Two-Word Utterances

Agent + Action:  "Muffy eat" (Muffy is eating her food); "Jenny cry" (Jenny is crying); "Daddy yell" (Daddy is yelling at Bobby)
Agent + Object:  "Man hat" (The man is wearing a hat); "Jenny dress" (Jenny has a dress); "Mommy treat" (Mommy has a treat)
Action + Object:  "Kick ball" (I just kicked the ball); "Eat popsicle" (I want to eat this popsicle); "Drive car" (Mommy is driving the car)
Action + Locative:  "Go park" (We're going to the park); "Fly up" (The bird flew into the tree)
Entity + Locative:  "Muffy down" (Muffy is downstairs); "Car there" (The car is over there)
Demonstrative + Entity:  "That car!" (That other car, not this one)
Possessor + Possession:  "Mommy sock" (This sock belongs to Mommy); "Daddy TV" (Daddy has a big-screen TV)

Note. Each toddler utterance is followed in parentheses by a possible adult version.

"Agents" are people, animals, action figures, and so forth — any individual who can cause something to happen or stand in relation to an object. "Actions" are typically verbs, indicating action (e.g., eat, cry), occurrences (e.g., shine, rain), or states of being (e.g., happy, sad). "Objects" have an obvious definition. "Locatives" are words that indicate locations and directions, as in "Muffy down." When children use locatives, it suggests they know about objects or agents in different locations (e.g., "Muffy bed" versus "Muffy out"). Note how Muffy falls into different semantic relations depending on her status. When she is doing something, such as eating, she is an agent of the action; when she is located on the bed ("Muffy bed"), she is an entity in a specific location. Other semantic categories are also used as frames for multiword utterances. For example, "demonstratives" is a semantic category for specifying something that may be ambiguous ("That one," not the one you're pointing to, Dad, wake up!). "Possessor and possession" are important categories for the toddler who views connections between agents or objects and their owners as critical to an orderly world ("Mommy sock" is not so critical, of course, as "Daddy TV"; a sock is a sock, but a big-screen TV . . . ).

The combinations of semantic categories, some of which are listed as semantic relations in Table 6–2, give the toddler a great deal of flexibility in creating a variety of utterances. Words can be "slotted" into these frames, providing the toddler a way to communicate about important agency, actions, ownership, and location in her environment. The semantic relations listed in Table 6–2 also provide the beginning of a grammar for the toddler, amounting to a set of simple rules for how words can be combined to produce meaningful, effective utterances.
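The frame idea lends itself to a small sketch as well. The code below simply treats two of Brown's semantic relations as templates into which words are slotted; it illustrates the combinatorial flexibility just described and is not a model of how toddlers actually produce speech. The word lists are invented.

```python
# Two of Brown's semantic relations treated as "frames" (Table 6-2).
# Slotting words into a frame yields many different two-word
# utterances; the word lists here are invented for illustration.

agents = ["Muffy", "Jenny", "Daddy"]
actions = ["eat", "cry", "yell"]
entities = ["Muffy", "car"]
locatives = ["down", "there"]

agent_plus_action = [f"{a} {act}" for a in agents for act in actions]
entity_plus_locative = [f"{e} {loc}" for e in entities for loc in locatives]

print(agent_plus_action)
# ['Muffy eat', 'Muffy cry', 'Muffy yell', 'Jenny eat', ..., 'Daddy yell']
print(entity_plus_locative)
# ['Muffy down', 'Muffy there', 'car down', 'car there']
```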
The semantic categories proposed by Brown can be combined to create longer but still incomplete utterances. Between 2 and 3 years of age, for example, the toddler may generate utterances such as "Daddy watch TV" and "Muffy eat shoes," using the agent+action+object framework, or "Daddy watch TV den" or "Muffy eat shoes den" (ruining Daddy's expensive loafers and therefore his big-screen TV experience) using the agent+action+object+locative framework. None of these utterances are adult-like, but they show increasingly sophisticated language skills as more complex relations are expressed by fitting words into the semantic relations categories shown in Table 6–2.

Expanding Utterance Length:  A Measure of Linguistic Sophistication

After typically developing children master two-word utterances, they extend utterance lengths in a systematic way. Utterance length has been a prime focus of scientists interested in language development. As we will see, consideration of how utterance length changes as a child develops must include an account of grammatical morphology and how it is mastered.

In his 1973 book, Roger Brown (1925–1997) reported a detailed analysis of the early language development of three children (Brown christened them Adam, Sarah, and Eve). The observations were longitudinal, meaning that for each of the three children, language samples were collected and analyzed over time, as the children developed their skills. In addition, the language samples analyzed for Adam, Sarah, and Eve were taken from spontaneous utterances produced in conversation with their mothers (and occasionally other caretakers). Brown was interested in an account of genuine, functional language development, rather than the kind of data collected in a highly structured experiment. Eve's language was first sampled when she was 18 months old, and Adam's and Sarah's when they were 27 months old. Importantly, all three children had roughly the same average length of utterance when the longitudinal observations began. In this sense, at the beginning of the longitudinal observations, the three children were more or less at equivalent stages for the complexity of their multiword utterances.

Mean Length of Utterance (MLU)

The idea of using the average (mean) length of a child's utterances as a measure of language sophistication was one of the conclusions of Brown's study. In fact, the measure he called mean length of utterance (MLU) has become an "industry standard" as an index of language sophistication. The computation of MLU is straightforward, even though the process of collecting and analyzing usable data demands great care and patience.

First, an adequate sample of spontaneous speech must be obtained from a child, perhaps while the child is playing with a parent. The utterances produced during this interaction are recorded, along with Mom's part of the conversation. The number of morphemes in each utterance is counted, and the sum of all morphemes across the sample is divided by the number of utterances the child produced.

Table 6–3 shows a simple example of the computation of MLU. Ten utterances from a 3-year-old child are shown along with Mom's part of the conversation; the number of morphemes for each child utterance is given in parentheses. Some utterances are short (e.g., "Throw there," "In tree," both with two morphemes), and two are longer ("Daddy's big TV broked," six morphemes; "Daddy don't got football," five morphemes).2 This kind of variability in utterance length for a given child at a given point in language development is not unusual.

2 These morpheme counts demonstrate that there are cases in which the number of morphemes is not crystal clear: Is "Daddy's" two or three morphemes? (Dad + y [morpheme for diminutive] + possession morpheme, 's). In the two examples, we count "Daddy" as one morpheme.

When utterance lengths are averaged across the 10 child utterances in Table 6–3, the computed MLU is 3.3, which is representative for a typically developing child of this age. Note especially utterances 6, 7, and 8, where the number of morphemes exceeds the number of words. The child is using grammatical morphology (see later in this chapter), and the free and bound morphemes are counted as separate "units" in the utterances, even when the grammatical morphology is applied incorrectly (utterance 6, "broked" = two morphemes).

Brown's MLU data for Adam, Sarah, and Eve are shown in Figure 6–1. Age in months is shown on the x-axis and MLU on the y-axis. Two general characteristics of these data are clear: (a) MLU increases for each child as he or she gets older, and (b) there are differences between the children. Most notably, Eve (blue line) increases her MLU from about 1.5 to just under 4.5 over a younger age range (18 to 27 months) as compared with Adam and Sarah (27 to 43 months). All three children increase their MLU in the same way, but Eve does so at a younger age.
Table 6–3.  Example of Computation of Mean Length of Utterance for Ten Utterances, with "Units" Being Morphemes (Shown Only for Child Utterances)

Mom:  What do you have?
1.  Child:  See I got ball. (4 morphemes)
Mom:  What are you going to do with the ball?
2.  Child:  Throw there. (2)
Mom:  Be careful, you don't want to break anything.
3.  Child:  This, break! (2)
Mom:  Throw it over there (points), that's safe.
4.  Child:  Break our house! (3)
Mom:  Then where would we live?
5.  Child:  In tree! (2)
Mom:  That wouldn't be much fun.
6.  Child:  Daddy's big TV broked. (6)
Mom:  With the ball?
7.  Child:  I'm laughing! (4)
Mom:  What's funny?
8.  Child:  Daddy don't got football. (5)
Mom:  Not a bad idea!
9.  Child:  Throw at TV? (3)
Mom:  Um, not a good idea . . .
10.  Child:  Not good. (2)

Total Utterances: 10; Total Morphemes: 33
MLU = 33/10 = 3.3
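Because MLU is a simple average, the arithmetic in Table 6–3 is easy to express in a few lines of code. The sketch below reproduces the table's computation; the morpheme segmentation of each utterance follows the counts given in the table, and segmenting real transcripts is, of course, the hard part that requires a trained analyst.

```python
# MLU for the ten utterances in Table 6-3. Each utterance is entered
# as a list of morphemes; bound morphemes (e.g., -'s, -ed, -ing) are
# counted as separate units, matching the counts in the table.
utterances = [
    ["see", "I", "got", "ball"],                    # 4 morphemes
    ["throw", "there"],                             # 2
    ["this", "break"],                              # 2
    ["break", "our", "house"],                      # 3
    ["in", "tree"],                                 # 2
    ["Daddy", "-'s", "big", "TV", "broke", "-ed"],  # 6
    ["I", "-'m", "laugh", "-ing"],                  # 4
    ["Daddy", "do", "-n't", "got", "football"],     # 5
    ["throw", "at", "TV"],                          # 3
    ["not", "good"],                                # 2
]

total_morphemes = sum(len(u) for u in utterances)  # 33
mlu = total_morphemes / len(utterances)            # 33/10
print(f"MLU = {total_morphemes}/{len(utterances)} = {mlu:.1f}")  # 3.3
```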

Figure 6–1.  Plot of MLU data (y-axis) as a function of age in months (x-axis) for the three children studied by Brown (1973). Adapted and modified from Brown, R. (1973). A first language: The early stages. Cambridge, MA: Harvard University Press.
Like vocabulary development, data on MLU may vary across languages and cultures. In Elin Thordardottir's study of children's language learning of Canadian English and Quebec French, at comparable ages French-speaking children had greater MLUs than English-speaking children. Does this mean the French-speaking children have more sophisticated language development skills than the English-speaking children? The answer is "no." Rather, the difference between the two groups of children can be explained by differences between the languages. French has a much more extensive system of grammatical (bound) morphemes as compared with English, which tends to make MLU greater for French-speaking children. In other words, French words are more likely than English words to be inflected — that is, to have a bound morpheme (or multiple morphemes) attached. Interestingly, when Thordardottir expressed length of utterance using words as the unit within utterances, French- and English-speaking children did not differ.

The lesson from this cross-linguistic comparison is that a single measure such as MLU may reflect different things in different languages. Because MLU is typically computed by counting morphemes, the meaning of the measure varies across languages with different types of grammatical morphology. "Heavily inflected" languages — that is, languages with lots of frequently used grammatical morphemes — may appear to have higher MLUs at a given age as compared with languages with fewer grammatical morphemes. What may be true for English may not be true for other languages.

Grammatical Morphology

The child who combines the semantic categories previously described for two-, three-, and four-word utterances still sounds "baby-ish" when the utterances do not include grammatical morphemes. When children enter the two-word stage, they begin to use some grammatical morphemes that make utterances sound "complete." English grammatical morphemes are listed in Table 6–4, in the order proposed by Brown for their acquisition. Early grammatical morphemes such as the present progressive (-ing, as in "She running"), prepositions, and plurals are mastered by many children sometime around the third birthday or a little after. Other grammatical morphemes may not be used accurately even in the fourth year. What is clear, however, is the degree to which the continued refinement of the grammatical morpheme system — think of it as the elegant window dressing of language — adds to the adult-like sound of a child's speech.

Table 6–4.  Grammatical Morphemes of English

Present progressive (-ing):  She running
Preposition "in":  Muffy in bed
Preposition "on":  Spoon on floor
Plural inflections (e.g., "s", "es"):  dogs, dresses
Past inflections on irregular verbs:  I went (go) home; I ate (eat) candy
Possessive inflections:  Muffy's ball
Uncontractible copula (is, am, and are):  Here it is! They were naughty!
Articles (the, a, an):  The dog; A man
Past inflections on regular verbs (e.g., "ed"):  He walked fast; The baby cried
Regular third person forms (-s):  She walks fast
Irregular third person forms (has, does):  He has some; She does
Uncontractible auxiliary forms:  Is Daddy home? You were there
Contractible copula (e.g., 's and 're):  Muffy's there; They're gone
Contractible auxiliary forms (e.g., 'd):  He'd play every day

Note. The order in which the morphemes are listed is roughly the order in which they are acquired, starting at about age 2 years. Based on Brown, 1973.

Grammatical Morphology and Rule Learning

Children's learning of grammatical morphology reflects rule learning. For example, the "-ed" grammatical morpheme for past tense is applied as a rule to verbs when the child wants to express something that has already happened. The child does not need to learn the past tense morpheme for every verb; rather, he learns the following generalization (that is, rule): when expressing an action that has occurred, attach "-ed" to the verb. Ironically, a proof of rule learning is that children attach the morpheme to a verb that has an irregular past tense form. In English, verbs such as "go," "hit," and "run" all have irregular past tense forms ("went," "hit," "ran"). Typically developing children often overgeneralize the grammatical morpheme for past tense by saying "goed," "hitted," and "runned." This demonstrates knowledge of the rule and the ability to combine free and bound morphemes. Part of language development is the elimination of these overgeneralized inflections and the learning of irregular forms.
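The rule-plus-exceptions character of this learning can be sketched in a few lines of code. The sketch below only illustrates the generalization described above (apply "-ed" unless an irregular form has been learned) and is not a model of child morphology; the verb lists are illustrative.

```python
def past_tense(verb, known_irregulars):
    """Apply the child's general rule unless an exception is known."""
    if verb in known_irregulars:
        return known_irregulars[verb]
    return verb + "ed"  # the generalization: attach "-ed" to the verb

# Early stage: the rule is known, but the exceptions are not yet learned.
print(past_tense("walk", {}))  # "walked" (correct)
print(past_tense("go", {}))    # "goed"   (overgeneralized)

# Later stage: irregular forms are stored as exceptions to the rule.
irregulars = {"go": "went", "hit": "hit", "run": "ran"}
print(past_tense("go", irregulars))   # "went"
print(past_tense("run", irregulars))  # "ran"
```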
As with the other aspects of language development, children vary in their order of development and the rate at which they learn grammatical morphemes. The order of mastery proposed by Brown is not fixed, and in fact, varied somewhat for Adam, Eve, and Sarah. The rate of mastery of grammatical morphemes also varies among typically developing children. "Typical" development may include the child who masters correct use of all grammatical morphemes by age 3 years as well as the child who continues to have difficulty with some grammatical morphemes at age 4.5 years.

Typical Language Development in School Years

By age 5 or 6 years, about the time a child enters first grade, typically developing children have a relatively large and expanding vocabulary as well as mastery of grammatical morphology. What else is there to learn apart from new words, grammatical morphemes, and multiword utterances? As it turns out, there is still plenty to learn. Here we describe three aspects of advanced language skills and their development throughout the school-age years. These include metalinguistic, pragmatic (specifically discourse), and complex grammar skills.

Metalinguistic Skills

Metalinguistic skills include the ability to reflect on language and its components, and to use language in a way that demonstrates knowledge of these components and the arbitrary way in which they are combined. Selected examples of metalinguistic skills include the ability to decompose words into their speech sound components, to use (or understand) the same word with very different meanings, to engage in linguistic wordplay, to be able to judge the correctness (or incorrectness) of word order in sentences, and to recognize ambiguity in the meaning of a single sentence.

Words and Their Speech Sound Components

An adult who is asked the question, "What happens to the word 'about' when the first sound 'uh' is taken away?" will answer, "You have the word 'bout.'" This answer demonstrates the ability to break a word apart into its speech sound components. This metalinguistic skill is generally not part of the language capability of toddlers. Ask a 4-year-old the same question, and he or she will likely be baffled by it. This metalinguistic skill appears around the age of 5 or 6 years. Before children develop this skill, they seem to treat words as "unbreakable" units: break one part (take away the "uh" from "about"), and you break the whole word. Perhaps it is not a coincidence that the ability to recognize words as being made up of individual speech sounds appears around the same time as early reading skills.

Same Word, Different Meanings

As children figure out that words can be broken down into sound components, they begin to understand and possibly use the same word with different meanings. Preschool children tend to be rigid in their understanding of words and may have difficulty separating a particular word from its referent. "Cold" for a 4-year-old is specific to temperature and does not make sense as a description of someone's personality ("She is cold") or as the effect of a particularly vicious right hook in a prize fight ("Knocked him out cold"). School-age children begin to understand how one word may have multiple meanings and, in fact, how certain words may function in metaphors and idioms. This metalinguistic skill is developed throughout the school-age years and into and through adolescence. Some words can be used in very subtle, distant ways from their most obvious meanings (as in the previous "cold" example, or "clam," "clam up," "clammy"). This language skill takes a good deal of time to reach full maturity.3

3 Just how often words are used by native speakers in these multiple ways, with no sense of unusual language usage (i.e., recognition of frequent use of idiomatic linguistic forms), becomes obvious when you communicate with people who are in a second language environment and who have what appear to be pretty good skills in that language. The author has had a number of doctoral students from Taiwan and Korea, and when meeting with them and speaking casually, he finds himself using phrases and expressions loaded with words having a variety of metaphorical and idiomatic meanings. It is only after these expressions have produced a puzzled expression on the students' faces that he realizes how much our language is like an express train, avoiding all the local stops and getting to the final destination as fast as possible.

Linguistic Wordplay

Preschool children may experiment with words by changing sounds to make them funny, but real wordplay does not emerge until the school years. There are many different types of wordplay; much of it is designed to be humorous. Some forms involve simple changes in the sounds of words ("Captain Brunch" to describe someone who enjoys those Sunday trough feedings at the local hotel chain) or mimic a foreign accent ("zomewhere" for French-accented "somewhere"). These linguistic abilities depend on the earlier-acquired skill of decomposing words into their component sounds. Other types of wordplay require the skill of understanding multiple meanings of the same word, or of words that sound the same, as in the following silly exchange: "Question: How can you tell when a bucket gets sick? Answer: It looks a little pale." Still others require much more sophisticated knowledge of cultural slang, possibly combined with very low-level humor: "You can tell if your doctor is a quack when you see his large bill."4

Pragmatic Skill:  Discourse

Pragmatics covers a wide range of behaviors that are thought to be important to language use in social contexts (see Chapter 3). The question of pragmatic development as part of language development can be stated simply as, "How do children learn the rules and customs of social communication?" These rules and customs range from something as straightforward as politeness to the more complex consideration of how children figure out what someone means when what they say is not what they really mean.

People who study this latter skill often refer to it as the "presupposition" part of pragmatics, which means placing a conversational experience within the speaker's and listener's world knowledge, and the child's ability to place a current communicative exchange within the proper context.

For example, assume you have been attending each lecture of a class, and it is the 13th week of a long, hard, 15-week semester. In each lecture, the instructor has droned on without mercy, rarely making eye contact with students or changing the melody of his speech. He reads information from his slides, never making impromptu remarks or elaborating on a point in a natural, spontaneous way. The exams have been as uninteresting and obtuse as the instructor. On this day of the never-ending semester, you have brought your 7-year-old brother to class; he is visiting and wants to experience a real college class. As the instructor begins yet another long monologue, you turn to a friend sitting on the other side of your brother and say, "This class is awesome" without a surface trace of sarcasm or irony — nothing in your facial expression, your intonation, or your rhythm. Your statement suggests nothing other than a straightforward declaration. Hearing this, your friend knows you mean exactly the opposite, but your little brother thinks, if this class is awesome, maybe I want to reconsider college and instead learn a useful trade.

This is an important part of pragmatics, and of pragmatic development. The friend, hearing "This is awesome" within the history and context of the class, having had experiences with other, less immobile instructors, knowing the big brother, and so forth, used his presuppositions about the comment to interpret it correctly: the class is clearly not awesome. The little brother, however, does not have some of the presuppositions available to his brother's friend (he had not been sitting in this mind-melting lecture hall for 13 weeks), and even in the immediate context of an obviously boring professor, with other students in various stages of sleepiness, does not have the pragmatic skill to understand the true meaning of his brother's comment.

Here we summarize the development of discourse skills as an important part of pragmatic development. Discourse is the broad term that includes conversation skills and narrative (storytelling). According to Brinton and Fujiki (1989), conversation skills include (a) turn-taking, (b) topic management, and (c) conversational repair.

Turn-Taking

Turn-taking during conversation is a skill guided by learned rules. In a conversation, you speak, and when you are finished with a thought, your conversation partner replies, and so on. People taking part in a conversation typically know when to stop speaking (if they have the floor), how to anticipate that their turn to speak is near, and when to begin speaking as the current speaker finishes. Turn-taking behaviors seem obvious because we learn them relatively early and so many people do take turns in a socially acceptable way. But all of us have encountered people who begin to speak long before their conversational partner has finished speaking, who seem to miss those cues that say, "I'm done with my turn in a second or two."

4 The word "quack" as a description of a doctor with questionable (or even fraudulent) skills is apparently derived from the 16th-century Dutch word "quacksalver," meaning someone who boasts while applying a salve to a wound, or who sells useless medicines on the street.

The foundation for turn-taking skills is apparently developed in infancy, when caregivers respond to infant vocalizations as if they are participating in a true communication exchange. Children as young as 1 year of age seem to get the general idea of waiting until someone speaking to them is finished before beginning their own turn. By the preschool years — roughly between the ages of 3 and 6 — children's turn-taking skills are fairly sophisticated.

Turn-taking in conversation becomes more sophisticated throughout adolescence and into adulthood. Speaking turns between members of a conversational pair become more related, with the content of consecutive utterances containing increasingly more agreement in theme and factual basis (Nippold, 1998). As turn-taking becomes more sophisticated, there are fewer off-topic turns, and people engaged in communication begin to value the perspectives of their communication partners.

Topic Management

Topic management is what conversational partners do to maintain a sensible spoken interaction. Conversation about a shared topic makes for sensible communication. If this sounds slightly ridiculous as an aspect of language development, consider this: In a study of conversations between toddlers aged 19 to 38 months and their mothers, the youngest children shared topics with their moms about 56% of the time, the oldest about 76% of the time (Bloom, Rocissano, & Hood, 1976). Sometime between the ages of 2 and 3 years, toddlers learn something about how to stay on topic during a conversation.

The mastery of topic management as a component of language development is difficult to pin down because the definition of "topic" is loose and open to different interpretations. Even with disagreements concerning the fine points of what does and does not constitute a conversational topic, the following two examples make the point nicely, and easily, as 5-year-old Bart tries to sample opinion from two of his friends on his favorite football team:

Example 1

Bart:  What's wrong with the Packers?
Milhouse:  Brett Favre is my favorite.
Bart:  He . . . he's too old.
Milhouse:  He's probably cold up there, he's from way down.
Bart:  Sacked again!

Example 2

Bart:  What's wrong with the Packers?
Ralphie:  My doggie doesn't know where to go to the bathroom.
Bart:  Maybe they need a new quarterback . . .
Ralphie:  Watch my nose open and close all by itself!
Bart:  Do you care about the Packers?
Ralphie:  Okay, I'll come back and play later.

These examples suggest that even if the nature of the "topic" is hard to define, in many cases, it is easy to tell when one member of a communication "dyad" (two people in conversation) is off topic. Anyone who has interacted with a small child knows that Example 2 is certainly possible. Throughout adolescence and into adulthood, topic management is likely to change, with fewer topic shifts per conversation and longer times spent on a given topic (Nippold, 1998).

Conversational Repairs

One aspect of joint conversational efforts is the way in which miscommunications are handled. In conversations between two adults, an adult and child, or two children, there are instances of uncertain meanings, unintelligible words, and ambiguous referents (e.g., when a speaker says "he," who does he mean if the person has not been identified previously?). The process by which speakers or listeners "flag" a miscommunication and fix it is called conversational repair. Here are two examples of conversational repair:

Example 3

Bart:  The Packers are huge!
Milhouse:  But I thought you said they're no good . . .
Bart:  No, they're big guys, look at those linemen!

Example 4

Bart:  What is up with those CB's?
Milhouse:  Huh, CB's?
Bart:  You know, cornerbacks, the guys who can intercept passes.

In the first case, Milhouse misinterprets Bart's use of the word "huge" to mean "good." Milhouse is confused because Bart had expressed his concern with the Packers' poor performance (Example 1).

Bart repairs the conversational breakdown by specifying his use of the word in literal terms. The second example, referred to as a clarification request, is a little more straightforward, as Milhouse simply does not know what "CB's" means. Bart repairs the conversation by defining the term.

Conversational repair skills develop throughout childhood, although it is difficult to specify ages at which particular kinds of repair (and there are many more than the two examples given previously) are learned. Clarification requests like the one in Example 4, or simpler versions, as when a listener says "Huh," "What," or responds to a statement with a quizzical look, may be observed in children as young as 24 to 26 months. Other types of conversational repair, such as a sequence of clarification requests that converge on the intended meaning of the speaker (several requests for clarification that become increasingly more specific), probably develop at a later age, perhaps around 36 months. More advanced conversational repair skills continue to develop throughout the school-age years and into adolescence.

Narrative

The term narrative means, in the simplest sense, storytelling. A narrative can be an account of an event, a persuasive speech, or any extended sequence of utterances (or paragraphs, in writing). A narrative can be elicited from a child by asking a question such as, "Tell me about your summer vacation" or "Tell me what happens in the movie Moana." The child responds by producing a series of utterances which are meant to "hang together," so that the result is a sensible narrative.

Utterances within the narrative can hang together in a number of ways. They can have a consistent theme, can be sequenced appropriately, can have explicit referents (can include all the information required by the listener to understand the narrative), can use cohesive devices such as pronouns and conjunctions to link individual sentences with each other, and can be told with a minimum of repetitions and alternate formulations of the same statement. Here are two narratives in response to the same question, produced by male children around the age of 6 years; one hangs together, and the other does not.

"Tell me what you did on your summer vacation."

Well, we went to a mountain camp, which was at a lake. They had guides who helped us and fed us lunch every day, you know, hot dogs, chips, and pop and stuff. They showed us how to, like, hike through the forest around the lake, and sometimes there would be animals, like bighorn sheep and these funny little guys that looked like beavers. One time, one of those beaver guys ran right across the trail and he like scared us, even the guide. So he said we should keep our eyes open for other animals on the trail, and then we were really wide awake, man. Really. At night we would sit around a, um, campfire and eat dinner and sing and tell scary stories. Those were fun, lots of fun. Sometimes we wouldn't go to bed until midnight, or maybe even later. I think I'll go back next summer.

"Tell me what you did on your summer vacation."

This place had water and animals right in front of us. He told us it was dangerous. The sheep and everything. Those sheep, um, something else, too. I had a hot dog and chips. We had some food, hot dogs. A bunch of us went hiking. One time the guy told us to watch it. The beaver animal. He said some scary stories; I covered my ears. The campfire was really hot; they gave us paper plates for our food. I liked staying up late. I wish the pop was a different flavor, I don't like cola.

Some of you reading these examples are probably familiar with a slightly less obvious version of the disorganized, second narrative — the ones you hear in some lecture halls. The point is made by comparing the two narratives: relating something in a coherent way requires a consistent theme, a sensible sequence, and grammatical devices to tie the different sentences together. These cohesive devices (e.g., "which," "they," "so he said," "and then") are frequent in the first narrative and nearly absent in the second one. Notice especially the pronouns "he" in the fourth and fifth sentences of the first narrative; these refer to different entities (the first to the beaver-animal, the second to the guide), but the listener has no problem knowing who or what is being referred to. This is because the narrative is well formed and therefore removes any ambiguity that may arise as a result of the consecutive use of the same pronoun. In contrast, the referents of the two "he" pronouns in the second narrative (second and tenth sentences) are not at all clear, because the overall theme is unfocused, and the sentences are oddly sequenced.

Narrative skills develop continuously, probably well into adulthood, but it is not easy to attach ages to the development of specific skills. Nippold (1998) has said that from about age 5 years onward, narrative skills are improved by increasing the length of stories, including more details within them, incorporating subthemes within the major theme, and providing smooth transitions between the various episodes of the story

(that is, not just jumping from episode to episode, as might be the case for young children). The development of narrative skills is important not only because it serves an important social function, but because adults — especially teachers — may make judgments concerning a child's intelligence and potential based on the child's ability to tell a coherent story.

Complex Sentences

When children begin school, their utterances are likely to be relatively short and simple in grammatical form. From about age 5 years, children begin to use more compound and complex sentences, which increase the length of their utterances (Nippold, 1998). A compound sentence contains two independent clauses joined by a conjunction (e.g., "I have a test today so I am going to study"), and a complex sentence contains at least one independent and one dependent clause ("I have a test today if the instructor did his job"). These two examples are a simplification of the many ways in which sentences can have complex syntax. The important point is that the use and understanding of these complex forms develop through the school-age years and adolescence.

Syntactically complex sentences — both their production and their comprehension — are mastered throughout the school years and beyond. Some constructions, such as passives with a conjunction ("The dance number was turned down by the ballerina because it was too difficult") or the use of gerunds plus past perfect voice ("Running through the woods was to be her lifelong choice of exercise") are clearly not heard from children in the first few grades and perhaps even through middle school. As a general rule, the greater the syntactic complexity of an utterance, the later the age at which it will be comprehended and mastered in spoken language.

Sample Transcript

Table 6–5 contains a sequence of transcripts of conversations between an adult and a typically developing child at three ages — 30 months, 42 months, and 54 months. These longitudinal transcripts for a single child allow direct comparison across age to demonstrate developmental changes in language skills. The transcripts are brief, but the child utterances show clear developmental trends. Study the transcripts for the changing language structures and usage, and compare the changes to the discussion of language development previously presented.

Table 6–5.  Transcripts of a Single Child's Language Performance During Conversation With an Adult

Sample From 30 Months Old

C:  THIS BABY IS CRY/ING.
A:  WHY?
A:  WHY IS HE CRY/ING?
C:  MOMMY AND DADDY GET HIM.
A:  OH, GOOD.
C:  THEY (ARE) GO(ING) UPSTAIRS.
C:  MOMMA GOT HIM.
C:  MOMMA GOT HIM OUT (OF) HIS CRIB.
A:  OH, GOOD.
C:  YEAH.
A:  WHAT WAS THE MATTER?
C:  I DO/N'T KNOW.
C:  HE/'S SLEEPING WITH HIS DIAPER.
A:  SLEEPING WITH HIS DIAPER.
A:  THAT/'S PRETTY SILLY.
C:  DADDY (IS) TAKE/ING (A) SHOWER.
A:  HE/'S TAKE/ING
C:  {SHOWER NOISES}.
C:  (DO) (YOU) HEAR WATER COME/ING OUT (OF) HERE?
A:  I HEAR IT.
A:  I HEAR THE WATER.
C:  IT/'S IT/'S IT/'S IT/'S WATER RIGHT HERE.
C:  ME CLOSE/(ED) (THE) DOOR/S.
A:  YOU DID.
A:  THAT/'S A GOOD IDEA.
A:  THEN THE WATER STAY/3S INSIDE.
C:  {SHOWER NOISES}.
C:  HE/'S DONE TAKE/ING (A) SHOWER.
A:  HE/'S DONE [G]?
A:  OKAY.

Sample From 42 Months Old

A:  WHAT HAPPEN/ED?
C:  SHE FELL RIGHT UP.
A:  SHE FELL UP HUH?
C:  {SCREECH}.
A:  WELL WHO/'S GONNA WATCH THE BABY/S CHRISTOPHER?
C:  THEY DON'T WANT TO HAVE GUY/S TO WATCH THEM.
C:  THEY DON'T WANT ANYONE WATCH/ING THEM.
A:  HEY, (CAN YOU) WHILE YOU/'RE OUT THERE, CAN YOU CHECK THE MAILBOX AND SEE IF THERE/'S ANY MAIL?
C:  THERE/'S NOTHING.
A:  NO MAIL TODAY?
C:  NO.
A:  OH THAT/'S TOO BAD.
C:  HERE/'S THEIR DOGGY.
A:  OH WHAT/'S HIS NAME?
C:  HIS NAME IS GOODO.
A:  GOODO [G]?
C:  HEY I WONDER IF THIS MIGHT BE SUPERMAN/Z DOG/S?
A:  OH.
C:  (SO) SO HE X HIM TO BE KRYPTO.
A:  WHAT?
C:  BECAUSE HIS NAME (COULD) COULD BE KRYPTO BECAUSE HIS HIS HIS NAME IS (THAT/'S THAT/'S) THAT/'S (WHAT) SUPERMAN/Z DOG IS NAMED.
A:  SUPERMAN/Z DOG IS NAMED KRYPTO?
C:  MHM.
A:  OH.
A:  I THINK, IS/N'T THERE SOMETHING ABOUT SUPERMAN AND KRYPTONITE, IS/N'T THERE?
C:  MHM.
A:  WHAT HAPPEN/3S IF SUPERMAN GO/3S BY KRYPTONITE?
C:  HE HE GET/3S HITTED BY IT.
A:  YEAH.
A:  IT/'S NOT GOOD FOR HIM, IS IT?
C:  UHUH.

Sample From 54 Months Old

C:  WAIT.
C:  I THINK I KNOW WHAT THIS IS FOR.
A:  OH.
A:  WHAT IS IT FOR?
C:  I THINK THIS IS LIKE IF SOMEBODY IS TOO FAR AND THEY (ARE) TOO TIRED FROM DRIVE/ING THEY HOOK THIS HOOK UP AND THEN THEY DRIVE BY.
C:  AND THERE/'S A FIRE ON THE HOOK AND (THEY) BRING IT UP HERE SO THEY CAN SPRAY IT.
A:  OH YEAH.
A:  THAT SOUND/3S LIKE A GOOD IDEA.
A:  I WONDER IF THEY NEED SOME HOSE/S.
C:  THIS IS A HOSE.
A:  IT IS A HOSE.
A:  YOU/'RE RIGHT.
A:  THAT/'S A BIG ONE.
C:  LOOKIT.
C:  (THEY/'RE LAY/ING) THAT ONE FIT INTO THE BACK.
C:  SEE?
A:  THEY ARE SO LUCKY.
C:  AND THEN WATCH.
C:  THE PEOPLE WITH SPECIAL LITTLE BED/S SO THEY ALL LAY DOWN ON THE BED [EU].
A:  MHM.
C:  SEE?
C:  (THEN) AND THEN (IT) BRING/3S THE LITTLE BABY UP AND THEN IT BRING/S THE SOCCER PERSON UP.
C:  AND THEN IT BRING/3S THIS GUY UP.
C:  (AND THEN IT CLOSE/3S) AND IT CLOSE/3S (AND THEN DRIVE/ING) AND THEN EVERY SINGLE PEOPLE [EW:PERSON] GET/(S) HURT IN THE FIRETRUCK.
C:  SO THEY HAVE TO GO TO THE AMBULANCE.
A:  <OH MY GOODNESS>.
C:  <YOU WANNA SEE>?
A:  YEAH.
A:  I WANNA SEE THAT.
C:  THEY TAKE THE THING OUT.
C:  THEN THEY TAKE OUT THE BED.
C:  THEN THE BABY GET/3S ON (AND THEN) AND THEN THE OTHER PERSON GET/3S ON.
C:  AND THE OTHER PERSON GET/3S ON AND THEN>
C:  (AH) THAT PERSON (IS/N'T) IS PRETTY SAFE.
C:  SO THIS GUY ISN/'T GO/ING.

Note.  The samples were taken at 30, 42, and 54 months. Child utterances are indicated by "C," adult utterances by "A." Grammatical morphemes that were produced are preceded by a forward slash; grammatical morphemes and other words that were omitted are enclosed in parentheses, so the complete utterances can be inferred from reading all the words. "3S" indicates third-person singular (as in GET/3S = "gets").
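Transcripts coded in this style can be used to compute MLU (mean length of utterance in morphemes), the utterance-length measure summarized at the end of this chapter. The Python sketch below is a minimal illustration of how such a computation might work for the coding conventions in the table note; the token-handling rules here are deliberately simplified (a real language sample analysis, such as the guide by Leadholm and Miller, 1992, cited in the references, uses much more detailed counting conventions), and the function names and sample utterances are chosen for the example, with the utterances drawn from the 30-month sample in Table 6–5.

```python
import re

def morpheme_count(token):
    """Count the morphemes the child produced in one coded token:
    the root plus each '/'-marked grammatical morpheme
    (CRY/ING = 2, GET/3S = 2). A token fully enclosed in
    parentheses was omitted by the child and counts as zero."""
    if token.startswith("(") and token.endswith(")"):
        return 0
    cleaned = re.sub(r"[(){}\[\]<>\\.,?!]", "", token)  # strip codes/punctuation
    if not cleaned:
        return 0
    return 1 + cleaned.count("/")

def mlu(utterances):
    """Mean length of utterance, in morphemes, over a set of child utterances."""
    totals = [sum(morpheme_count(t) for t in u.split()) for u in utterances]
    return sum(totals) / len(totals)

sample_30_months = [
    "THIS BABY IS CRY/ING.",
    "MOMMY AND DADDY GET HIM.",
    "MOMMA GOT HIM OUT (OF) HIS CRIB.",
]
print(round(mlu(sample_30_months), 2))  # (5 + 5 + 6) / 3 = 5.33
```

Even this toy version makes the developmental point of the table concrete: applying the same count to the 42- and 54-month samples would yield progressively larger values as the child's utterances lengthen and add grammatical morphemes.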


Chapter Summary

Typical language development occurs over a broad range of ages at which milestones are reached. Age benchmarks for major language-development accomplishments, such as 50 words at 18 months, initial use of grammatical morphology around age 2 years, and the metalinguistic skill of decomposing words into component sounds around age 5 years, show substantial variability across children for age of mastery.

Children's lexicons — their vocabularies — begin to grow rapidly when they have a naming insight and, with a single trial of having an object labeled, "get" the connection between words and objects and actions and properties (such as colors or size).

The vocabulary spurt that begins with the naming insight occurs around the same time as the appearance of the two-word sentence.

Multiword utterances are typically combinations of semantic categories, as originally discussed by Roger Brown. The semantic categories are "frames" into which children insert different words; the frames are blueprints for acceptable sentence structures.

Two-word utterances are gradually expanded to three- and four-word utterances. The mastery of grammatical morphology plays an important role in creating longer sentences.

MLU, a measure of utterance length, is a good index of the linguistic sophistication of a child; the comparison of MLUs across different languages must account for the different grammatical morphology across languages.

Language development from the beginning of grade school through adolescence and into young adulthood involves expansion of the lexicon, increased use and comprehension of sentences with complex syntax, and development of pragmatic skills in conversation and narratives. Although the focus of language development is typically on the dramatic changes occurring in the first several years of life, these later-life language changes have a great deal of importance for social and academic skills.

References

Bates, E. (1980). Vocal and gesture symbols at 13 months. Merrill-Palmer Quarterly, 26, 407–423.

Bates, E., Thal, D., Whitesell, K., & Fenson, L. (1989). Integrating language and gesture in infancy. Developmental Psychology, 25, 1004–1019.

Bloom, L., Rocissano, L., & Hood, L. (1976). Adult-child discourse: Developmental interaction between information processing and linguistic knowledge. Cognitive Psychology, 8, 521–552.

Brinton, B., & Fujiki, M. (1989). Conversational management with language-impaired children. Rockville, MD: Aspen.

Brown, R. (1973). A first language: The early stages. Cambridge, MA: Harvard University Press.

Johnston, J. (2006). Thinking about child language: Research to practice. Eau Claire, WI: Thinking Publications.

Leadholm, B. J., & Miller, J. F. (1992). Language sample analysis: The Wisconsin guide. Madison, WI: Wisconsin Department of Public Instruction.

Loban, W. (1976). Language development: Kindergarten through grade twelve (Research Report No. 18). Urbana, IL: National Council of Teachers of English.

Nelson, K. (1973). Structure and strategy in learning to talk. Monographs of the Society for Research in Child Development, 38, 1–135.

Nelson, K. (1981). Acquisition of words by first-language learners. Annals of the New York Academy of Sciences, 379, 148–159.

Nippold, M. A. (1998). Later language development: The school age and adolescent years (2nd ed.). Austin, TX: Pro-Ed.

Roseberry-McKibbin, C. (2007). Language disorders in children: A multicultural and case perspective. Boston, MA: Pearson Education.

Thordardottir, E. T. (2005). Early lexical and syntactic development in Quebec French and English: Implications for cross-linguistic and bilingual assessment. International Journal of Language and Communication Disorders, 40, 243–278.
7
Pediatric Language Disorders I

Introduction

There are many reasons why a child may have significant difficulties in acquiring language. Hearing impairment, intellectual disabilities associated with a variety of conditions, acquired brain injury (resulting from a traumatic brain injury, childhood stroke, seizure disorder, or tumor), or psychiatric problems can result in a clinically significant language delay. The current chapter covers language disorder of unknown cause, language disorder associated with autism spectrum disorder (ASD), and language disorder associated with hearing loss. As discussed later, language disorder without an obvious cause — that is, not due to a related condition — is referred to as either specific language impairment (SLI) or developmental language disorder (DLD). The debate over diagnostic terminology, SLI versus DLD, is ongoing. In this chapter, the label "SLI/DLD" is used to reflect the usage of both terms, even though clinicians and researchers in different countries (including the U.S.) use one or the other (see summary in Volkers, 2018).

These types of language disorder affect a large number of children and continue into adulthood. These disorders are not specific to race, ethnic group, or country; they are observed and diagnosed around the world.

Specific Language Impairment/Developmental Language Disorder

SLI is largely defined using exclusionary criteria — no intellectual disability, hearing impairment, motor deficits, autism, or other conditions that might account for language difficulties (Leonard, 2014). Traditionally, SLI has been defined in terms of normal-range nonverbal IQ (a score of 85 or better on a standardized measure of IQ, which is within 1 SD of the mean of 100). It should be noted that, at least in the United States, the term SLI has mostly been used by researchers, rather than by clinicians.

A proposed alternative for the diagnostic category of SLI is developmental language disorder (DLD). Children diagnosed with DLD, like children with SLI, have no obvious condition or disease that explains their language delay. DLD may be preferred because it is a more general term than SLI and does not exclude children with nonverbal IQs of 70 to 85. In addition, the diagnostic term DLD may be more widely understood by parents, educators, and administrators who approve speech-language therapy for children evaluated for language delay (Bishop, Snowling, Thompson, Greenhalgh, & CATALISE consortium, 2016, 2017).


SLI/DLD is the most frequent disability among children. According to several estimates, SLI/DLD occurs in around 7% of the population at school entry (5 years of age) (Norbury et al., 2016; Tomblin, Records, Buckwalter, Zhang, Smith, & O'Brien, 1997). Children diagnosed with SLI/DLD are at increased risk for reading problems as well as other academic problems. The early language impairment and later reading problems in school-aged children accumulate and lead to poor school performance and a poor career outlook. Society clearly benefits by an understanding of SLI/DLD, its causes, characteristics, and treatment.

Even with the large variability in age milestones and rates of learning in typical language development (see Chapter 6), there are a surprisingly large number of otherwise typically developing children whose language skills are judged to be significantly impaired. In these cases, deficits in language performance may be primarily in the area of language production or in both language comprehension and production.

Standardized Tests

A brief review of standardized tests (see Chapter 4) is as follows. A large number of typically developing children are tested on a particular skill (such as vocabulary size), at a particular age, resulting in a range of scores. These scores are converted to standardized units (z-scores), which are then converted to a scale with a mean score of 100 and a standard deviation of 15 points on either side of the mean. Sixty-eight percent of the tested children have scores between 85 and 115 on the standardized test, and 95.5% of the children have scores between 70 and 130.
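As a concrete illustration of the arithmetic in the box, the short Python sketch below converts a raw test score to a z-score and then to the 100/15 standard-score scale, and uses the normal distribution to recover the 68% and 95.5% figures. The raw-score mean of 60 and standard deviation of 8 are hypothetical values invented for the example, not norms from any actual test.

```python
from statistics import NormalDist

# Hypothetical norms for one test at one age (invented for illustration).
RAW_MEAN, RAW_SD = 60, 8

def standard_score(raw):
    """Convert a raw score to a z-score, then to the mean-100/SD-15 scale."""
    z = (raw - RAW_MEAN) / RAW_SD
    return 100 + 15 * z

print(standard_score(68))  # one SD above the raw mean -> 115.0

# Expected proportion of the norming sample within 1 and 2 SDs of the mean
norm = NormalDist(mu=100, sigma=15)
print(round(norm.cdf(115) - norm.cdf(85), 3))  # 0.683 (about 68%)
print(round(norm.cdf(130) - norm.cdf(70), 3))  # 0.954 (about 95.5%)
```

The same arithmetic explains why a cutoff of 85 (1 SD below the mean) is often used in research definitions of language impairment: roughly 16% of the norming sample falls below that score by construction.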
There is a lot of variation in the language characteristics of children diagnosed with SLI/DLD. A typical description of a child diagnosed with this disorder is: (a) the child did not produce his or her first word until 20 months of age, and at 3 years of age has an unusually small vocabulary (e.g., less than 200 words) and is just starting to combine words into short phrases; (b) the child shows only minimal mastery of grammatical morphemes; and (c) the child's comprehension of language is better than his or her production of language, but both are delayed relative to typical development. In addition, the child does not produce complex sentences, may be hesitant to talk, and has problems with social interactions.

When adults listen to a child diagnosed with SLI/DLD, the language sounds "babyish," as if the language performance is mismatched with the child's age. Formal tests of hearing (Chapter 23) place these children in the normal range for auditory sensitivity. There are no obvious signs of neurological disease or evidence of autism or psychiatric disturbance. There is no evidence that the child has been raised in a language-impoverished environment. Nonverbal tests of intelligence (IQ tests), in which children do not have to make responses requiring the use of language, reveal IQ scores within the normal range.

SLI/DLD is a disorder with "fuzzy" boundaries. The diagnosis is not always made with a high degree of confidence because the delayed language may resemble delayed language characteristics not only in other disorders, but in language development in typically developing children who are learning language more slowly than most of their peers. Interested readers are referred to the reviews published by Conti-Ramsden and Durkin (2012), Kamhi and Clark (2013), Ellis Weismer (2013), and Laasonen et al. (2018).

Language Characteristics of Children with SLI/DLD

Children with SLI/DLD are like typically developing children in at least one important sense: they have variable language skills at a specific age. Children with SLI/DLD may have impairments in vocabulary development, in phonology, in the use of grammatical morphemes, in syntactic constructions, and in pragmatics. Some children diagnosed with SLI/DLD seem to have near-normal comprehension of language but significant problems with the production (expression) of language; others have marked problems in both comprehension and production. A common thread in children diagnosed with SLI/DLD is a marked impairment in grammatical morphology, especially in productive language.

Phonology

Children with SLI/DLD have more difficulty learning the sound structure of their language than typically developing children. At a given age, children with SLI/DLD are likely to have more speech sound errors than typically developing children. The problems with speech sound development may be related to a deficit in phonological memory, a kind of short-term memory important for the encoding of speech sound characteristics (Gathercole & Baddeley, 1990; Gathercole, 2006). Children with SLI/DLD make many more errors than typically developing children when asked to repeat nonwords ("nonword repetition tasks").

able. Children with SLI/DLD may have problems with


Phonology various grammatical morphemes (English grammatical
morphemes are listed in order of acquisition in Chapter
This chapter and the next summarize 3). Some researchers propose that children with SLI/
the language disorders of several groups of chil- DLD have particular difficulty with verb tense markers
dren according to the components of language (Rice, 2014).
summarized in Chapter 3. These components
are phonology, morphology, syntax, content,
and pragmatics. The first four are referred to as Syntax
“structural” components of language, the last The delayed mastery of grammatical morphemes in
the social (usage) component. In this textbook, children with SLI/DLD may reflect a more general
pediatric phonological disorders are presented problem of sentence comprehension. Children with
in separate chapters. Why, then, are phonologi- SLI/DLD who are diagnosed with a primarily expres-
cal disorders presented as a potential component sive language disorder still show comprehension skills
of pediatric language disorders? Phonology, the that lag those of their typically developing peers.
structure of speech sound systems and the rules It is possible that the incorrect use of grammati-
governing the use of those sounds, is not the cal morphemes is related to comprehension of sentence
same as articulation. Articulation (phonetics) is structure. The difficulty with sentence processing (i.e.,
the physical act of producing a speech sounds; comprehension) is a poor model for the expression of
it is a motor behavior that is not necessarily grammatical morphemes. This view is supported by
unique to a specific language or dialect. Phonol- the finding of a relationship between sentence com-
ogy can be thought of as the bridge between prehension and grammatical morpheme expression
phonetics and language. Much as language is in children with SLI/DLD: better comprehension is
organized and produced according to rules, associated with better expression of grammatical mor-
phonology is the organization of the phonetic phemes (Bishop, Adams, & Norbury, 2006).
inventory into a rule-based sound system. This is
why we include phonology and its disorders as
a potential (and often-observed) component of Vocabulary
pediatric language disorders. When children with SLI/DLD are given formal tests
for vocabulary size, their scores tend to be lower than
those of typically developing children of the same age
repetition errors are thought to reflect impairment of (Laws & Bishop, 2003). Vocabulary tests can be given
phonological working memory. This is important separately for expressive (productive) and receptive
because impaired phonological memory may create (comprehension) abilities, and SLI/DLD children
difficulty in learning new words. The smaller vocabu- often show a greater difference from typically devel-
lary size in children with SLI/DLD may therefore be oping children on the expressive test. This is consistent
due, in part, to impairments in phonological work- with the idea that children with SLI/DLD tend to have
ing memory (Cody & Evans, 2008; Jackson, Leitao, & greater delays in expressive, as compared to recep-
Claessen, 2016). tive, language skills. Besides early delays in acquiring
vocabulary during the preschool period, older chil-
dren with SLI/DLD display deficits in their breadth
Grammatical Morphemes
(i.e., number of words they can define) and depth (i.e.,
Problems with grammatical morphemes are generally amount of information for each word) of vocabulary
viewed as the “signature” of SLI/DLD. Children diag- knowledge (McGregor, Oleson, Bahnsen, & Duff, 2013).
nosed with SLI/DLD may have minimally delayed Older children and adolescents also have difficulties
vocabularies and even age-appropriate phonology but with more abstract or less frequent meanings of words
significantly impaired production and/or comprehen- (e.g., “cold” as a personality trait rather than a tem-
sion of grammatical morphemes. The “babyish” char- perature) and with nonliteral (figurative) expressions
acteristic of expressive language in children with SLI/ (e.g., “feeling blue”).
DLD, mentioned earlier, is associated with this type
of difficulty. When a 4-year-old child produces sen-
Pragmatics
tences such as, “That my doggie,” “Yesterday I play
with Bobby,” “Man in street,” and “My brother run to Helland and Helland (2017) review the evidence for
Mommy,” the mismatch of age and grammar is notice- pragmatic difficulties among children diagnosed with

These children may have problems initiating and maintaining conversations and may have poor turn-taking skills. In addition, topic management during conversations and narratives tends to be poor, much like the examples given in Chapter 6. As pointed out by Roseberry-McKibbin (2007) and Helland and Helland, pragmatic problems lead to social and academic difficulties; children with SLI/DLD are often regarded as "different" by teachers and fellow students. The perception by others as "different" may be prompted by pragmatic difficulties such as problems with normal greetings ("Hi, what's going on?"), with being attentive to the content and context of a conversation to become an effective communication partner, and with understanding nonliteral aspects of language ("Let's go" said as "Let's roll").

Summary of the Language Disorder in SLI/DLD

In the preceding profile of how language may be delayed in children with SLI/DLD, each component of language was presented separately. This approach has some merit for instructional purposes, but in many cases too much focus on the individual trees blinds us to a proper view of the forest. Delays in particular components of language almost certainly have consequences for mastery of other components. As noted previously, a child who has significantly impaired mastery of grammatical morphemes and who receives negative social cues because of the "babyish" sound of his or her speech is likely to initiate conversations less frequently. This, in turn, may result in substantially less practice of language skills, possibly resulting in lost opportunities to develop vocabulary. A similar situation may occur for children who have a substantial problem comprehending language. These children may have problems following conversations; when they try to make contributions to conversations, their utterances may not be consistent with the topic being discussed because they have not understood it. Children with SLI/DLD may also lose language practice opportunities when other children or even adults do not choose them as conversation partners because of their problems with topic management.

Individual language components may, in fact, be selectively disordered in children. This lends some real-world validity to identification of the status of separate language components in a child diagnosed with SLI/DLD (or any pediatric language disorder). The analysis of separate components of language may also have use in guiding an intervention plan for the child diagnosed with SLI/DLD. For example, separate analyses of grammatical morphemes and vocabulary may identify which component is most delayed and, therefore, most in need of therapy. Whether the child's language disorder is viewed through the lens of separate components of language, or as a disorder of multiple components of language development, the goal is to understand how language is affected for use in everyday life.

What Is the Cause of SLI/DLD?

The cause or causes of SLI/DLD are not well understood. There are differing viewpoints among researchers about possible factors underlying SLI/DLD. Processing views suggest deficits in lower-level auditory processing skills, in higher-level memory abilities such as working and phonological memory, and in executive function — the ability to guide language-learning behavior by focusing attention on important stimuli (and, by implication, to know how to ignore other stimuli for the best learning outcomes).

An alternate view is that SLI/DLD can be explained by language-learning problems that are specifically related to grammatical morphology or sentence-level syntactic constructions. More broadly, a genetic basis for disordered learning of these specific language components has been proposed to have a causal connection with SLI/DLD (Dale, Rice, Rimfeld, & Hayiou-Thomas, 2018).

The Role of Genetics in SLI/DLD

There is strong evidence of a genetic component in SLI/DLD. SLI/DLD appears to be heritable, most likely as a result of the interaction of several genes whose structures predispose an individual to have a developmental language disorder. In the case of SLI/DLD, these several genes are likely to interact with environmental conditions (Peterson, McGrath, Smith, & Pennington, 2007). The genetic predisposition to SLI/DLD means that a language delay is not an inevitable outcome of a child's genetic profile. It is the predisposition that is heritable; environmental influences such as extensive versus minimal language input to the developing child may work against or in favor of the predisposition.

Evidence for the heritability of SLI/DLD comes from a range of studies, including twin and broader familial studies. Twin studies reveal that SLI/DLD is more likely to occur in both members of monozygotic (identical) twins compared with both members of

dizygotic (fraternal) twins. Stated in a different way, if one member of a twin pair has SLI/DLD, the probability of the other member having SLI/DLD is significantly higher in the identical versus fraternal twin pair. This supports a genetic component in SLI/DLD because identical twins have the same genetic profile, whereas fraternal twins do not.
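The logic of the twin comparison can be made concrete with a small calculation. In the Python sketch below, the pair counts are hypothetical numbers invented purely to illustrate how a concordance rate (the probability that the co-twin is affected, given that one member of the pair is affected) might be computed and compared across twin types; they are not data from any actual study, and real heritability analyses are considerably more involved.

```python
# Hypothetical twin-pair counts, invented for illustration only.
# Each entry: (pairs with at least one affected member,
#              pairs in which both members are affected)
pair_counts = {
    "monozygotic (identical)": (40, 28),
    "dizygotic (fraternal)": (40, 12),
}

for twin_type, (affected_pairs, both_affected) in pair_counts.items():
    # Concordance: P(co-twin affected | at least one member affected)
    concordance = both_affected / affected_pairs
    print(f"{twin_type}: concordance = {concordance:.2f}")

# Prints 0.70 for the identical pairs versus 0.30 for the fraternal
# pairs: the kind of gap that points to a genetic component.
```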
Other familial studies have uncovered evidence of greater probability of language delay, or a history of language delay, in relatives of a child with SLI/DLD compared with relatives of typically developing children (that is, a control group). The heritability of a genetic predisposition for SLI/DLD is supported by these findings (Rice, 2013; Tomblin, 2009).

Language Delay and Autism Spectrum Disorder

As defined by the Diagnostic and Statistical Manual of Mental Disorders, fifth edition1 (DSM-5) (http://www.aappublications.org/content/early/2013/06/04/aapnews.20130604-1), a diagnosis of Autism Spectrum Disorder (ASD) is made when a child has problems with social communication/interactions and demonstrates repetitive and restricted behaviors. Within these two general categories of behavior, there are more specific criteria for the diagnosis. Our focus is on the speech and language characteristics of children diagnosed with ASD according to the DSM-5 criteria.

Like SLI/DLD, the causes of ASD are unknown. The Centers for Disease Control and Prevention (CDC) estimated in 2018 that 1 in 59 children at 8 years of age in the United States were diagnosed with autism (https://www.cdc.gov/mmwr/volumes/67/ss/ss6706a1.htm?s_cid=ss6706a1_w). Between 2000 and 2014, there was a dramatic increase in the prevalence of autism (see review in Graf, Miller, Epstein, & Rapin, 2017). Many children diagnosed with autism have some form of language disorder in addition to the core deficit of social communication (pragmatics), which is part of the diagnostic criteria for this condition.

Language Characteristics in ASD

Chapter 3 described three major components of language — form, content, and use. Form includes phonetics, phonology, and morphology; content is meaning (the lexicon); and use refers to pragmatics (social communication).

At kindergarten age, 70% to 75% of children with ASD are verbal. This group includes children with well-developed language skills and varying degrees of language disorders. Currently, there are not good estimates of the proportion of children with ASD who have language disorders in addition to difficulties with social communication. Preschool children with ASD who have limited language skills are referred to as "preverbal," and those without functional spoken language after 5 years of age are referred to as "minimally verbal." Twenty-five to 30% of children diagnosed with ASD are estimated to be minimally verbal at kindergarten age. Some nonverbal children may have a few words or phrases but do not routinely use spoken language to communicate.

Although the absence of multiple-word utterances at 5 years of age represents a significant language disorder (Chapter 6), these children are expected to make gains in their language skills as they progress through the school years (Tager-Flusberg & Kasari, 2013). According to Boucher (2012), even older children and adults with autism who appear to have typically developing language skills may have subtle language differences from neurotypical children and adults (see Box, "Neurotypical and Neurodiverse").

Neurotypical and Neurodiverse

"Neurotypical" is a term used primarily in the autism community (autistic persons and their allies) to indicate typical skills and behaviors; the term is descriptive of what used to be classified as "normal." Many research publications on autism use the term "neurotypical" to describe control groups of typically developing (or developed) individuals. The term is not used as the standard, but rather as a point on a continuum of brain types. A different point on that continuum is "neurodiverse," suggesting (in the case of ASD) the brain type of persons with autism. These two terms do not stand opposed as "normal" versus "abnormal." Rather, neurodiverse is regarded as different from neurotypical, not disordered.

1 The Diagnostic and Statistical Manual of Mental Disorders, version 5, is published by the American Psychiatric Association to specify diagnostic criteria for disorders it classifies as "mental disorders." ASD is included in this classification. The criteria for a diagnosis of ASD have been updated over the last 20 years and in some respects are controversial.

Phonology

Phonological delays (speech sound delays/disorders; see Chapters 13 and 15) have not been studied extensively in children with autism. The lack of data may be due partly to the clinical impression of normal or near-normal phonetic/phonological skills among children with autism.

Speech sound delays have been noted for a small percentage of children with autism who have near-typical speech and language development. The nature of the speech sound errors is rarely reported, and the resolution of the errors as the children mature has not been addressed. This is important because typically developing children have speech sound errors that resolve during the course of phonetic/phonological development. The speech sound errors reported for children with autism may be similar to the errors made by typically developing children and may resolve without intervention (see Boucher, 2012, p. 224). A review of speech sound disorders in children with autism is available in Broome, McCabe, Docking, and Doble (2017).

Morphology

Some children with ASD have deficits in morphological development. The delay may include morphological markers such as tense (walk-walked) and possessives (Bob-Bob's) as well as other morphemes. In children with speech and language skills that are not severely impaired, the development of morphology is often within age-level expectations or mildly delayed. In fact, delayed morphological development in some children with ASD has been described as similar to the profile of morphological errors observed in SLI/DLD.

Syntax

The profile of syntactic development in ASD highlights the difference between expressive and receptive (comprehension) language development. Syntactic development in verbal children with ASD seems to be near typical, especially for children with more advanced language skills, even when those skills are delayed. Like morphology, expressive syntax may seem near typical, but comprehension of syntax is not. This split between expressive and comprehension development becomes more obvious with more complicated syntactic forms. An example of a difference between a simple and a more complicated syntactic form is the sentence, "The boy came home" (simple) versus "The boy who owns the bike came home" (more complicated). It certainly is possible that children with ASD may be less likely than typically developing children to use (express) sentences with more complicated syntax, but their comprehension of syntactic complexity is clearly impaired.

Vocabulary

As with other language components, vocabulary skills vary to a large degree among children with ASD. In general, vocabulary skills in children with ASD are delayed relative to those of typically developing children. This is similar to the vocabulary profile of children with SLI/DLD, as well as children with intellectual disabilities (Chapter 8).

The relationship between receptive and expressive vocabulary in ASD seems to be different compared with typically developing children and groups of children with delayed or disordered language development. Larger receptive compared with expressive vocabularies are the rule for these children — they understand more words than they say. In contrast, a significant number of children with ASD have receptive vocabularies that are more impaired than their expressive vocabularies, based on age expectations (Kover, McDuffie, Hagerman, & Abbeduto, 2013; Kover & Ellis Weismer, 2014).

Children with ASD may have delayed expressive vocabularies because the new words they add tend to be limited to word possibilities that sound alike (e.g., "house," "mouse," "ball," "fall"; see Kover & Ellis Weismer, 2014; McDaniel, Yoder, & Watson, 2017). Typically developing children add words from a broader range of possibilities. It is not clear why this may be the case, but if true, it points to a therapeutic strategy to build expressive vocabulary in children with ASD: extend vocabulary training items to words that do not share sounds with words already in the expressive vocabulary.

Why is receptive vocabulary affected to such a significant degree in children with ASD? The answer is not clear. One possibility is that nonverbal cognitive abilities have a disproportionate effect on receptive, as compared with expressive, vocabulary (Kover et al., 2013).

An analysis of longitudinal expressive and receptive vocabulary skill in McDaniel et al. (2017) was done to evaluate the influence of receptive vocabulary on expressive vocabulary. A large group of children with ASD was studied over a 16-month period; the investigators reasoned that change in receptive vocabulary skill over this period would predict change in expressive vocabulary skill.

This is a very reasonable expectation from the idea that the size of the receptive vocabulary "drives" the size of the expressive vocabulary. Surprisingly, the analysis did not support this expectation (McDaniel et al., 2017).

The results of McDaniel et al. (2017) may have important clinical implications. If the receptive vocabulary does drive the expressive vocabulary, therapy directed at improving the receptive vocabulary is a sound idea for expanding the expressive vocabulary. McDaniel et al.'s results, however, do not support this approach. Perhaps therapeutic efforts to expand the expressive vocabulary are best directed at expressive tasks, or expressive plus receptive tasks.

Pragmatic Language

Recall that one of the two core deficits that must be observed for a diagnosis of ASD is "problems with social communication/interactions." (The other core deficit is "repetitive and restricted behaviors.") In this section, we focus on the core deficit of social communication/interactions.

A more precise understanding of the core impairment of pragmatic language in ASD is its presence across individuals despite a range of structural language abilities. Structural language abilities include phonological, morphological, syntactic, and semantic components, which may range across individuals diagnosed with ASD from typically developing to severely impaired skills. Whatever the range of structural language skills may be, pragmatic language impairments are always present in children diagnosed with ASD.

Many individual behaviors contribute to appropriate pragmatic language. These behaviors are grouped into five categories for observation of children 5 years and older, listed in Table 7–1 (Cordier, Munro, Wilkes-Gillan, Speyer, & Pearce, 2014). Several of the items include behaviors that are not verbal but describe a more abstract level of social communication (Baird & Norbury, 2016). Observed impairments of the items in Table 7–1 are not part of a formal checklist for diagnosis of ASD but serve as one way to appreciate the wide range of pragmatic language behaviors and their potential to contribute to a diagnosis of pragmatic language impairment. Symptoms of pragmatic language impairment in ASD may or may not appear until children are old enough to engage in social situations in which the impairments can be reliably observed (Baird & Norbury, 2016).

Table 7–1.  Five General Categories, Each of Which Includes Several Behaviors That May Be Observed in Pragmatic Language Impairment in Autism Spectrum Disorder

Introduction and responsiveness
  Selects and introduces a range of conversational topics
  Initiates verbal information appropriate to the context

Nonverbal communication
  Uses and responds to identifiable, clear, intentional body actions and movements
  Uses and responds to a variety of facial expressions to express consistent meanings

Social-emotional attunement
  Considers/integrates another's viewpoints/emotions
  Appropriate use of social language within context

Executive function
  Attends to communicative content; plans and initiates appropriate responses
  Versatile ways to interpret/connect/express ideas

Negotiation
  Uses appropriate methods for resolving disagreement
  Expresses feelings appropriate to the context

Note.  Adapted from "Reliability and Validity of the Pragmatics Observational Measure (POM): A New Observational Measure of Pragmatic Language for Children," by R. Cordier, N. Munro, S. Wilkes-Gillan, R. Speyer, and W. M. Pearce, 2014, Research in Developmental Disabilities, 35, pp. 1588–1598.

Social Communication Disorder

The DSM-5 includes specific criteria for the diagnosis of Social (Pragmatic) Communication Disorder (SCD). This is a disorder of social communication that does not include the repetitive and rigid behaviors seen in ASD but shares with it features of pragmatic disorder, including nonverbal and verbal impairment. In fact, some experts contend that SCD was previously diagnosed as "Pragmatic Language Impairment," and others think these children were labeled "Autism Spectrum Disorder Not Otherwise Specified" — a category in the previous version of the DSM, meaning that a child had autistic-like characteristics but did not fully meet the criteria for a diagnosis of autism. SCD, diagnosed in childhood, is likely to persist into adulthood. SCD is not explained by low cognitive ability but interferes substantially with social relationships and academic and career performance. Children diagnosed with SCD may also have impairments of structural language (phonology, morphology, syntax, and content). SCD, a controversial diagnosis, is discussed comprehensively by Norbury (2014) and Baird and Norbury (2016).

Language Delay and Hearing Impairment

Children born with hearing impairment have hearing losses ranging from mild to profound. Degree of hearing loss is defined on the basis of audiometric findings. Audiometry, discussed fully in Chapter 23, includes the quantification of sound energy (measured in decibels) required for a listener to detect tones at a series of different frequencies. According to the American Speech-Language-Hearing Association (ASHA), mild hearing loss requires 21 to 40 decibels (dB) to reach this “just detectable” criterion, moderate hearing loss 41 to 55 dB, severe loss 71 to 90 dB, and profound hearing loss 91 dB and greater (https://www.asha.org/public/hearing/degree-of-hearing-loss/). The “profound” category includes people who are legally deaf, some of whom may be able to respond to very high levels of sound energy with amplification (e.g., a hearing aid). Some individuals do not respond to any level of sound energy, even with amplification.
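Because these categories are defined by simple decibel cutoffs, the classification amounts to a threshold lookup. Below is a minimal sketch in Python of that lookup, using the boundaries given in the preceding paragraph; the 56 to 70 dB range, which the paragraph does not name, is labeled here under the assumption that it corresponds to ASHA’s “moderately severe” category, and all labels should be checked against ASHA’s current chart.

    def degree_of_hearing_loss(threshold_db: float) -> str:
        """Map a detection threshold (in dB) to a degree-of-hearing-loss
        label, using the ASHA cutoffs described in the text."""
        if threshold_db <= 20:
            return "within normal limits"
        if threshold_db <= 40:
            return "mild"               # 21-40 dB
        if threshold_db <= 55:
            return "moderate"           # 41-55 dB
        if threshold_db <= 70:
            return "moderately severe"  # 56-70 dB (assumed label; not named in the text)
        if threshold_db <= 90:
            return "severe"             # 71-90 dB
        return "profound"               # 91 dB and greater

    print(degree_of_hearing_loss(35))   # mild
    print(degree_of_hearing_loss(95))   # profound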
Epidemiology of Hearing Loss

The incidence of hearing loss is one to three per thousand births (approximately 0.1% to 0.3%). Between the ages of 5 and 9 years, the prevalence of hearing loss increases to 2.7 to 3.5 per 1,000 children. Why is the prevalence in school-age children higher than the incidence in newborns? Possible reasons include the inclusion in prevalence estimates of children who were not diagnosed with hearing loss at birth, the influence of certain medications on hearing, and the inclusion of children who have acquired hearing loss due to various conditions/diseases or accidents (Kremer, 2019).
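Rates per 1,000 convert to percentages by dividing by 10, which is how the figures above relate to each other. A minimal sketch of the conversion:

    def per_thousand_to_percent(rate_per_1000: float) -> float:
        """Convert a rate per 1,000 individuals to a percentage."""
        return rate_per_1000 / 10

    # Newborn incidence: 1 to 3 per 1,000 births
    print(per_thousand_to_percent(1), per_thousand_to_percent(3))      # 0.1 0.3 (%)
    # School-age prevalence: 2.7 to 3.5 per 1,000 children
    print(per_thousand_to_percent(2.7), per_thousand_to_percent(3.5))  # 0.27 0.35 (%)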
It has been estimated that 54% to 68% of hearing loss at age 4 has a genetic basis (Morton & Nance, 2006). Although there are many genes implicated in congenital hearing loss, and especially hearing loss in the severe and profound categories, a more limited number of genes have been identified with direct influence on development of the cochlea, the sense organ for hearing. These genes are often referred to as “deafness genes.”

Language Characteristics in Hearing Impairment

All individuals with hearing loss are at risk for developmental speech and language impairment. The degree of hearing loss predicts the risk of speech and language delays in a general sense, but not absolutely. There are children with severe hearing loss with speech and language skills equivalent to those of children with moderate hearing loss, and children with greater hearing loss having better speech and language skills than children with lesser loss (see Fitzpatrick, Crawford, Ni, & Durieux-Smith, 2011, for examples of this). Lederberg, Schick, and Spencer (2013) have published an excellent review of factors that may or may not contribute to speech and language development in children with hearing loss.

Oral speech and language impairments in hearing-impaired children make sense, because the input that typically drives the development of speech and language is degraded to varying degrees (mild, moderate, severe) or unavailable (deaf).
The input from parents, siblings, and other people who speak to a baby, to a toddler, and to older children, plays a huge role in the development of both receptive and expressive skills for oral language.

We use the term “oral language” to make clear that other modalities for communication exist, and in fact are critical to understanding language skills in persons with congenital hearing loss, and especially children who are born deaf. For example, deaf individuals who are native users of American Sign Language (ASL) learn language via a visual, rather than auditory, medium. Information is communicated as efficiently by ASL as it is by speech, and the language has a structure of linguistic rules, just like oral languages. In short, ASL is a language like any other (Lederberg et al., 2013). ASL is the native language of the Deaf community, the capital “D” signifying the community whose members do not view deafness as a disability. Roughly 10% of deaf children are born to Deaf parents (Mitchell & Karchmer, 2004).

The discussion in the current section focuses primarily on hearing impaired and deaf people who choose to communicate orally, or whose parents choose oral communication for their children. This choice of oral communication may involve some support from a manual language such as ASL or other manual communication systems.

Speech and Language Development and Hearing Impairment

As stated by Fitzpatrick et al. (2011, p. 605), “children with hearing loss of all degrees of severity continue to lag behind expectations for children with normal hearing . . . in multiple communication domains.” Typically, the issue is not so much whether there is a delay, but the extent of the delay. This is critical because delays in any or all aspects of speech and language have significant potential to affect academic and social skills.

The following discussion of developmental speech and language characteristics in children with hearing loss is broad. The summaries do not necessarily apply to every child with a hearing loss; just as in the typically developing population, different children have different paths to speech and language learning. In particular, and as previously noted, the pattern of speech and language development varies broadly with degree of hearing loss. Keep in mind, however, that the influence of the degree of hearing loss on speech and language development is often offset to some degree by amplification (e.g., hearing aids) and/or cochlear implants. Almost all of the research papers cited have excellent reviews of language development in children with hearing loss, including deafness. A comprehensive review of language development in deaf children with cochlear implants is provided by Ganek, Robbins, and Niparko (2012).

Phonology

Expressive phonological skills develop more slowly in hearing impaired children compared with typically developing children. In 4- and 5-year-old children with moderate-to-severe hearing loss and either hearing aids or cochlear implants, Fitzpatrick et al. (2011) found dramatically lower scores on the Goldman-Fristoe test of articulation compared with scores of typically developing children of the same ages. The Goldman-Fristoe is a standardized test that counts the number of correctly articulated sounds and relates the score to expected articulatory skills at a given age.

A significant deficit in speech intelligibility results from the frequent occurrence of speech sound errors among children with hearing loss. Listeners have difficulty understanding children with hearing loss partly (and probably largely) due to speech sound errors; in general, the greater the number of errors, the greater is the speech intelligibility deficit.

Speech sound development in children with hearing impairment depends on a number of receptive-language factors. Hearing loss makes it difficult to learn the acoustic properties required to develop a cognitive representation of speech sounds. Some consonants such as fricatives (e.g., “s,” “sh”) have less sound energy than other consonants and are more susceptible to the effects of hearing loss on forming phonological representations. Phonological memory also seems to be impaired in hearing impaired children (Halliday, Tuomainen, & Rosen, 2017a). This is a short-term memory specialized for speech sound information. Impairments in phonological memory work against the establishment of good cognitive representations of speech sound categories.

Data on French-speaking children with mild-to-moderate hearing loss and no other disabilities suggest that phonological gains are made throughout childhood but do not “normalize” in adolescence. Halliday et al. (2017b) and Nittrouer, Muir, Tietgens, Moberly, and Lowenstein (2018) point to phonological skill as the most impaired aspect of language skill in children at all levels of hearing impairment. Phonological delays that extend from grade school and into high school and beyond have the potential to affect academic performance, especially in the areas of reading and writing (Delage & Tuller, 2007).
Morphology and Syntax

Morphological and syntactic skills are delayed in children with hearing impairment. The degree and nature of these delays depend on many factors, including severity of hearing loss, cognitive skills, and history of amplification (hearing aids) and/or cochlear implantation.

Delay in the mastery of grammatical morphemes has been reported for children with mild, moderate, and severe hearing loss. Syntax is also delayed, sometimes into the adolescent years (Delage & Tuller, 2007). The delays in mastery of grammatical morphemes and syntax are observed for both receptive and expressive language (reviewed in Halliday et al., 2017b).

Cochlear implants have a significant influence on language learning among prelingually deaf children (deaf before the age of 5 years). Cochlear implants provide deaf individuals with auditory stimulation that contributes to skills in all areas of language development. In a study of deaf children aged 5 to 13 years who had received a cochlear implant (or implants — both ears) prior to age 5 years, and had used them for an average of approximately 6 years, children with implants developed significant language skills, but their expressive morphological and syntactic skills continued to be delayed relative to a typically developing control group (Boons, De Raeve, Langereis, Peeraer, Wouters, & van Wieringen, 2013). Within the group of children with cochlear implants, some had morphological and syntactic skills equal to those of typically developing children.

In general, even children who are implanted at early ages (around 1 year old, or even younger) may struggle with morphological and syntactic skills as they grow older. This may be the case even when other aspects of their language (e.g., vocabulary) are at typically developing levels (Ganek, Robbins, & Niparko, 2012). Chapter 24 presents additional information on cochlear implants.

Vocabulary

Vocabulary is often regarded as a strength among children with hearing impairment. Children between the ages of 8 and 16 years with mild-to-moderate hearing loss may have receptive and expressive vocabulary skills only slightly lower than age expectations (Halliday et al., 2017b). When tested at age 5 years, children who had cochlear implants before the age of 2 years and at least 2 years of experience with the implants had significantly lower receptive vocabulary skills than typically developing children. However, when the same children were tested over two consecutive years, their rate of vocabulary growth was greater than that of typically developing children (Hayes, Geers, Treiman, & Moog, 2009). In general, better language outcomes can be expected with earlier age at implantation (Ganek et al., 2012).

Pragmatic Language

As pointed out by Goberis, Beams, Dalpes, Abrisch, Baca, and Yoshinaga-Itano (2012), very little research has been done on social language use in hard of hearing and deaf individuals. Goberis et al. developed a questionnaire in which social language use items were rated by parents of hard of hearing and deaf children; these ratings were compared to ratings by parents of children with typical (normal) hearing. The primary findings were as follows: (a) children with hearing loss had slower development of pragmatic language skills compared with typically developing children, and (b) the rate of social language learning from age 3 to 7 years depended on hearing loss category. Rate of learning was relatively high in the mild hearing loss group, and relatively low in the profoundly hearing impaired group. Observation of children with hearing loss in social language situations is needed to specify characteristics of pragmatic language use suggested by the parent questionnaire data. One reason that there has not been much research on social language use in this population is that it is not a primary area of difficulty as it is for children with ASD.

Chapter Summary

Specific language impairment (SLI) and developmental language disorder (DLD) are two terms used to designate essentially the same disorder in children, that of delayed language development in children who are typically developing in every other way; in this chapter, the disorder is labeled “SLI/DLD.”

SLI/DLD is the most frequent disability among children between 3 and 5 years of age, occurring in about 3% to 7% of the population.

SLI/DLD is typically diagnosed around the age of 3 years.

SLI/DLD is a diagnosis of language disorder in the absence of known causes; hearing loss, autism and other developmental disabilities, intellectual disability, craniofacial anomalies, psychiatric disturbance, and neurological disease must be ruled out as potential causes for the language delay.
Children diagnosed with SLI/DLD may have difficulties with all aspects of language development, both expressive and receptive; expressive delay can exist in the absence of receptive delay, but children with a receptive delay are likely to have expressive delays.

All components of language are likely to be delayed in SLI/DLD, with especially prominent delays in vocabulary and grammatical morphology.

SLI/DLD is a significant problem because many of the preschool children who receive this diagnosis are likely to experience substantial academic, social, and career difficulties.

Strong evidence exists for a genetic component in SLI/DLD, based on familial patterns of the disorder, but the precise combination of genes that make a child susceptible to SLI/DLD has not been identified.

Autism spectrum disorder (ASD) is diagnosed when a child has a well-identified social communication disorder and repetitive and restricted behaviors.

Some children diagnosed with ASD are nonverbal, but the great majority have language skills ranging from significantly delayed to typical or even advanced; these skills are likely to improve as the child matures.

Children with ASD have variable severity (and in some cases, no delay) of structural language skills.

Like SLI/DLD, familial patterns of ASD have been firmly established but the specific genes underlying the disorder have not been identified.

All children with hearing loss, no matter the severity, are at risk for language impairment; the degree of loss predicts the severity of the language impairment to a substantial but not perfect degree.

A genetic, hereditable basis has been identified for persons with deafness, although the precise combination of deafness genes has not been identified.

Expressive phonological skills develop more slowly in hearing impaired children compared with typically developing children; some researchers believe that the most impaired language component in children with hearing loss is phonological.

Impairments in speech intelligibility, resulting from the phonological impairments, are a significant problem in children with hearing loss.

Receptive and expressive morphological and syntactical skills are delayed in hearing impaired children.

Vocabulary is regarded as a relative strength in children with hearing impairment.

Pragmatic language has not been fully studied in children with hearing loss.

Language development in children with profound hearing impairment seems to be optimal when a child receives cochlear implants before the age of 2 years.

References

Baird, G., & Norbury, C. F. (2016). Social (pragmatic) communication disorders and autism spectrum disorder. Archives of Disease in Childhood, 101, 745–751.

Bishop, D. V. M., Adams, C. V., & Norbury, C. F. (2006). Distinct genetic influences on grammar and phonological memory deficits: Evidence from 6-year-old twins. Genes, Brain and Behavior, 5(2), 158–169.

Bishop, D. V. M., Snowling, M., Thompson, P., Greenhalgh, T., & CATALISE consortium. (2016). CATALISE: A multinational and multidisciplinary Delphi consensus study. Identifying language impairments in children. PLoS ONE, 11(7), e0158753. https://doi.org/10.1371/journal.pone.0158753

Bishop, D. V. M., Snowling, M., Thompson, P., Greenhalgh, T., & CATALISE consortium. (2017). Phase 2 of CATALISE: A multinational and multidisciplinary Delphi consensus study of problems with language development: Terminology. Journal of Child Psychology and Psychiatry, 58, 1068–1080.

Boons, T., De Raeve, L., Langereis, M., Peeraer, L., Wouters, J., & van Wieringen, A. (2013). Expressive vocabulary, morphology, syntax and narrative skills in profoundly deaf children after early cochlear implantation. Research in Developmental Disabilities, 34, 2008–2022.

Boucher, J. (2012). Research review: Structural language in autism spectrum disorder — Characteristics and causes. Journal of Child Psychology and Psychiatry, 53, 219–233.

Broome, K., McCabe, P., Docking, K., & Doble, M. (2017). A systematic review of speech assessments for children with autism spectrum disorder: Recommendations for best practice. American Journal of Speech-Language Pathology, 26, 1011–1029.

Coady, J. A., & Evans, J. L. (2008). Uses and interpretations of non-word repetition tasks in children with and without specific language impairments (SLI). International Journal of Language and Communication Disorders, 43, 1–40.

Conti-Ramsden, G., & Durkin, K. (2012). Language development and assessment in the preschool period. Neuropsychology Review, 22, 384–401.

Cordier, R., Munro, N., Wilkes-Gillan, S., Speyer, R., & Pearce, W. M. (2014). Reliability and validity of the Pragmatics Observational Measure (POM): A new observational measure of pragmatic language for children. Research in Developmental Disabilities, 35, 1588–1598.

Dale, P. S., Rice, M. L., Rimfeld, K., & Hayiou-Thomas, M. E. (2018). Grammar clinical marker yields substantial heritability for language impairments in 16-year-old twins. Journal of Speech, Language, and Hearing Research, 61, 66–78.

Delage, H., & Tuller, L. (2007). Language development and mild-to-moderate hearing loss: Does language normalize with age? Journal of Speech, Language, and Hearing Research, 50, 1300–1313.

Ellis Weismer, S. (2013). Specific language impairment. In L. Cummings (Ed.), Cambridge handbook of communication disorders (pp. 73–87). Cambridge, UK: Cambridge University Press.
Fitzpatrick, E. M., Crawford, L., Ni, A., & Durieux-Smith, A. (2011). A descriptive analysis of language and speech skills in 4- to 5-yr-old children with hearing loss. Ear and Hearing, 32, 605–616.

Ganek, H., Robbins, A. M., & Niparko, J. K. (2012). Language outcomes after cochlear implantation. Otolaryngologic Clinics of North America, 45, 173–185.

Gathercole, S. E. (2006). Nonword repetition and word learning: The nature of the relationship. Applied Psycholinguistics, 27, 513–543.

Gathercole, S., & Baddeley, A. (1990). Phonological memory deficits in language disordered children: Is there a causal connection? Journal of Memory and Language, 29, 336–360.

Goberis, D., Beams, D., Dalpes, M., Abrisch, A., Baca, R., & Yoshinaga-Itano, K. (2012). The missing link in the language development of the deaf and hard of hearing: Pragmatic language development. Seminars in Speech and Language, 33, 297–309.

Graf, W. D., Miller, G., Epstein, L. G., & Rapin, I. (2017). The autism “epidemic”: Ethical, legal, and social issues in a developmental spectrum disorder. Neurology, 88, 1371–1380.

Halliday, L. F., Tuomainen, O., & Rosen, S. (2017a). Auditory processing deficits are sometimes necessary and sometimes sufficient for language difficulties in children: Evidence from mild to moderate sensorineural hearing loss. Cognition, 166, 139–151.

Halliday, L. F., Tuomainen, O., & Rosen, S. (2017b). Language development and impairment in children with mild to moderate sensorineural hearing loss. Journal of Speech, Language, and Hearing Research, 60, 1551–1567.

Hayes, H., Geers, A. E., Treiman, R., & Moog, J. S. (2009). Receptive vocabulary development in deaf children with cochlear implants: Achievement in an intensive auditory-oral educational setting. Ear and Hearing, 30, 128–135.

Helland, W. A., & Helland, T. (2017). Emotional and behavioural needs in children with specific language impairment and in children with autism spectrum disorder: The importance of pragmatic language impairment. Research in Developmental Disabilities, 70, 33–39.

Jackson, E., Leitao, S., & Claessen, M. (2016). The relationship between phonological short-term memory, receptive vocabulary, and fast mapping in children with specific language impairment. International Journal of Language and Communication Disorders, 51, 61–73.

Kamhi, A. G., & Clark, M. K. (2013). Specific language impairment. In O. Dulac, M. Lassonde, & H. B. Sarnat (Eds.), Handbook of clinical neurology, Vol. III (3rd series), Pediatric Neurology Part I (pp. 219–227). Amsterdam, the Netherlands: Elsevier.

Kover, S. T., & Ellis Weismer, S. (2014). Lexical characteristics of expressive vocabulary in toddlers with autism spectrum disorder. Journal of Speech, Language, and Hearing Research, 57, 1428–1441.

Kover, S. T., McDuffie, A. S., Hagerman, R. J., & Abbeduto, L. (2013). Receptive vocabulary in boys with autism spectrum disorder: Cross-sectional developmental trajectories. Journal of Autism and Developmental Disorders, 43, 2696–2709.

Kremer, H. (2019). Hereditary hearing loss: About the known and the unknown. Hearing Research. https://doi.org/10.1016/j.heares.2019.01.003

Laasonen, M., Smolander, S., Lahti-Nuuttila, P., Leminen, M., Lajunen, H.-R., Heinonen, K., . . . Arkkila, E. (2018). Understanding developmental language disorder — The Helsinki longitudinal SLI study (HelSLI): A study protocol. BMC Psychology, 6(24). https://doi.org/10.1186/s40359-018-0222-7

Laws, G., & Bishop, D. V. M. (2003). A comparison of language abilities in adolescents with Down syndrome and children with specific language impairment. Journal of Speech, Language, and Hearing Research, 46, 1324–1339.

Lederberg, A. R., Schick, B., & Spencer, P. E. (2013). Language and literacy development of deaf and hard-of-hearing children: Successes and challenges. Developmental Psychology, 49, 15–30.

Leonard, L. B. (2014). Children with specific language impairment (2nd ed.). Cambridge, MA: MIT Press.

McDaniel, J., Yoder, P., & Watson, L. R. (2017). A path model of expressive vocabulary skills in initially preverbal preschool children with autism spectrum disorder. Journal of Autism and Developmental Disorders, 47, 947–960.

McGregor, K., Oleson, J., Bahnsen, A., & Duff, D. (2013). Children with developmental language impairment have vocabulary deficits characterized by limited breadth and depth. International Journal of Language and Communication Disorders, 48, 307–319.

Mitchell, R. E., & Karchmer, M. A. (2004). Chasing the mythical ten percent: Parental hearing status of deaf and hard of hearing students in the United States. Sign Language Studies, 4, 138–163.

Morton, C. C., & Nance, W. E. (2006). Newborn hearing screening — A silent revolution. New England Journal of Medicine, 354, 2151–2164.

Nittrouer, S., Muir, M., Tietgens, K., Moberly, A. C., & Lowenstein, J. H. (2018). Development of phonological, lexical, and syntactic abilities in children with cochlear implants across the elementary grades. Journal of Speech, Language, and Hearing Research, 61, 2561–2577.

Norbury, C. F. (2014). Practitioner review: Social (pragmatic) communication disorder conceptualization, evidence and clinical implications. Journal of Child Psychology and Psychiatry, 55, 204–216.

Norbury, C. F., Gooch, D., Wray, C., Baird, G., Charman, T., Simonoff, E., . . . Pickles, A. (2016). The impact of nonverbal ability on the prevalence and clinical presentation of language disorder: Evidence from a population study. Journal of Child Psychology and Psychiatry, 57, 1247–1257.

Peterson, R. L., McGrath, L. M., Smith, S. D., & Pennington, B. F. (2007). Neuropsychology and genetics of speech, language, and literacy disorders. Pediatric Clinics of North America, 54, 543–561.

Rice, M. L. (2013). Language growth and genetics of specific language impairment. International Journal of Speech-Language Pathology, 15, 223–233.

Rice, M. L. (2014). Grammatical symptoms of specific language impairment. In D. V. M. Bishop & L. B. Leonard (Eds.), Speech and language impairments in children: Causes, characteristics, intervention and outcomes (pp. 17–34). London, UK: Psychology Press.
Roseberry-McKibbin, C. (2007). Language disorders in children: A multicultural and case perspective. Boston, MA: Pearson Education.

Tager-Flusberg, H., & Kasari, C. (2013). Minimally verbal school-aged children with autism spectrum disorder: The neglected end of the spectrum. Autism Research, 6, 468–478.

Tomblin, J. B. (2009). Genetics of child language disorders. In R. G. Schwartz (Ed.), Handbook of child language disorders (pp. 232–256). New York, NY: Psychology Press.

Tomblin, J. B., Records, N. L., Buckwalter, P., Zhang, X. Y., Smith, E., & O’Brien, M. (1997). Prevalence of specific language impairment in kindergarten children. Journal of Speech, Language, and Hearing Research, 40, 1245–1260.

Volkers, N. (2018). Diverging views on language disorders: Researchers debate whether the label “developmental language disorder” should replace “specific language impairment.” ASHA Leader, 23, 44–53.
8
Pediatric Language Disorders II

Introduction

This chapter presents information on language characteristics of children (and to some extent, adults) with intellectual disability (ID). The focus is on ID and its effect on speech and language in children with Down syndrome (DS) and Fragile X syndrome (FXS), but the information applies to other disorders in which ID is a prominent characteristic.

The effects of hearing impairment on speech and language development, discussed in Chapter 7, must be factored into the effects of ID on speech and language development. This is because children with intellectual disabilities are much more likely to have hearing impairment than children in the general population (Carvill, 2001). The combination of an ID with a hearing impairment makes the challenge of language development more difficult than the effect of either disability alone. Although the combined effects of ID and hearing impairment are not discussed in detail here, the reader should keep in mind the potential effect of the combination of ID and hearing impairment on speech and language development in DS and FXS.

DS and FXS account for the great majority of cases of ID. Speech and language development are affected by the intellectual disability, although not necessarily in the same way for DS and FXS.

The term “syndrome,” common to DS and FXS, deserves a formal definition: a syndrome is a group of symptoms that occur together, is seen in a series of children (not just one child), and represents a disease process. In DS and FXS, the grouping of symptoms is an important part of the diagnosis; the diagnosis in both syndromes can also be confirmed by genetic analysis, as discussed below.

Criteria for a Diagnosis of ID

According to the Diagnostic and Statistical Manual of Mental Disorders, Fifth edition (DSM-5) (American Psychiatric Association, 2013), ID is diagnosed when chronic impairments of general mental abilities have an impact on adaptive functioning in three areas, including (a) conceptual skills: language, reading, writing, math, reasoning, knowledge, and memory; (b) social behaviors: empathy, judgment, interpersonal communication skills, making and maintaining friendships; and (c) practical behaviors: personal care, job responsibilities, management of money, recreation, and organization of tasks.1 These impairments must be observed during childhood. Adults who acquire these impairments, due to (for example) stroke, head trauma, or dementia (Chapter 9), are not diagnosed with ID.

1 The wording of these requirements for a diagnosis of ID is a very close paraphrase of the wording in the DSM-5.


A 2011 analysis estimated the worldwide prevalence of ID to be roughly 1% of the population (Maulik, Mascarenhas, Mathers, Dua, & Saxena, 2011). A follow-up analysis of a large number of international studies, extending the work of Maulik et al. (2011), found the intellectual disability prevalence in children/adolescents and adults who as children met the criteria for ID to range from 0.05% to 1.55% (McKenzie, Milton, Smith, & Ouellette-Kuntz, 2016). The vast majority of individuals with ID (roughly 85%) have mild, rather than more severe, impairments.

Formal diagnosis of ID is based on standardized scores obtained from an intelligence quotient (IQ) measure, as well as clinical observation of social and practical skills. A child diagnosed with ID typically has an IQ score below 70. An IQ score of 70 is two standard deviations below the average IQ score of 100 in the general population. When “raw” IQ scores are standardized, meaning they are transformed to form a normal distribution with an average of 100, one standard deviation equals 15 points; thus, an IQ score of 70 or below is at least two standard deviations below the average of 100.
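The relationship between an IQ score and “standard deviations below the mean” is a one-line calculation. A minimal sketch, using the mean of 100 and standard deviation of 15 just described:

    MEAN_IQ = 100
    SD_IQ = 15

    def iq_z_score(iq: float) -> float:
        """Distance of an IQ score from the population mean,
        in standard deviation units."""
        return (iq - MEAN_IQ) / SD_IQ

    print(iq_z_score(70))   # -2.0: the typical ID cutoff is two SDs below the mean
    print(iq_z_score(115))  #  1.0: one SD above the mean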
IQ scores typically reflect language skills, which may underestimate the child’s intelligence. For example, a child diagnosed with a language disorder may have a relatively low IQ score due to the language-based nature of the test. There are nonverbal IQ tests that provide an estimate of intelligence that is independent of language skills. It is not uncommon for children to have nonverbal IQ scores that are higher than IQ scores that are based in part on language skills. The nonverbal IQ scores reflect the child’s cognitive ability, which includes skills such as reasoning, memory, and processing speed.

ID is diagnosed based on IQ and deficits in adaptive behavior/functioning. Adaptive functioning refers to the social and practical skills needed to get along in the world. Clinical evaluation of adaptive functioning, as well as scores on standardized tests of these skills, are equally important as an IQ score in making a diagnosis of ID. IQ scores and scores on standardized tests of social and practical behaviors are usually well below the normal range for children who are evaluated for a diagnosis of ID, and together support the diagnosis. In some cases (probably very few), a child may have an IQ score below 70 but have scores within the normal range for both social and practical skills. Such a child is not diagnosed with ID.

Down Syndrome (DS):  General Characteristics

The 23 pairs of chromosomes in the human genome can be shown in an image called a karyogram, which shows an individual’s genotype (see Box, “Genetic Terminology,” for definitions of genetic terms). Figure 8–1 shows karyograms for a typical human male (left) and female (right). The chromosome pairs are numbered from 1 to 22; the 23rd pair is shown within the red circles. The 23rd chromosome pair in the male genotype has an X and a Y chromosome, compared with the two X chromosomes of the 23rd pair in the female genotype. An individual’s sex is coded by the 23rd chromosome pair.


Figure 8–1.  Karyograms showing the karyotype for typically-developing (and developed) human males (left) and females (right).
The genetic basis of DS is a de novo mutation (see Box, “Genetic Terminology”) of the 21st chromosome. A de novo mutation is not inherited, but rather occurs when the sperm joins the egg. The mutation is an additional chromosome — hence, the term “trisomy 21” to designate a third chromosome added to the typical pair at chromosome 21. The DS karyotype is shown in Figure 8–2, where the added chromosome is indicated by an arrow. There are other genotypes (genetic profiles) associated with Down syndrome, but Trisomy 21 is the most common and is the focus of the presentation in this chapter. DS is the most common genetic cause of ID. A photograph of a child with DS is shown in Figure 8–3.

Figure 8–2.  Karyogram of Down syndrome, showing Trisomy 21. Arrow indicates the third chromosome at pair 21.

Epidemiology and the DS Phenotype

Based on data from the years 2004 to 2006, the prevalence of DS is estimated to be 1 in 691 births, or 6,037 new cases annually (Kirby, 2017, adapting data from Parker et al., 2010).
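The rate and the annual case count are mutually consistent: about 6,037 cases per year at 1 case per 691 births implies roughly 4.17 million births per year (a back-calculated figure, not one given in the text). A minimal sketch of the arithmetic:

    # 1 case of DS per 691 births (Kirby, 2017, adapting Parker et al., 2010)
    cases_per_year = 6037

    # Implied annual number of births (an assumption, back-calculated
    # from the two published figures rather than given in the text)
    implied_births = cases_per_year * 691
    print(implied_births)               # 4171567, i.e., about 4.17 million births
    print(round(implied_births / 691))  # 6037 new cases annually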

Genetic Terminology
Chromosomes:  Twenty-three pairs of strands of DNA and proteins, each strand carrying genes; the 23rd pair is often called the sex chromosome because it is different for males and females. The nuclei of most cells in the body contain the genetic information carried by the 23 chromosomes.

Karyotype:  A description of a person’s chromosomes, such as, “The karyotype of a person with Down syndrome is Trisomy 21 — a third chromosome added to the 21st pair of chromosomes”; or, “The karyotype of a person with Fragile X syndrome is damage to the X chromosome of the 23rd pair.” A karyogram is an image of the chromosomes arranged in the way shown in Figure 8–1. The karyogram shows the karyotype (https://www.quora.com/What-is-revealed-by-a-karyotype).

Gene:  A unit of DNA, located on a chromosome, that controls the development of anatomical structures and traits. Genes are passed from one generation to the next.

Genotype:  The complete set of genes in an organism, which varies across species; the human genome is estimated to contain about 20,000 genes.

Phenotype:  The observable characteristics of an individual, including traits, anatomical features, biochemical characteristics, and so forth. A phenotype reflects the interaction of a genotype with environmental factors. For a group of individuals with the same genotype (e.g., a group of individuals with FXS, all of whom have the same genotype), the phenotype is typically variable.

Mutation:  A gene mutation is a change in its structure that differs from that gene’s structure in most of the population. The change is not reversible (that is, it is permanent) and may be inherited (passed from generation to generation); a de novo mutation is not hereditary but occurs when the sperm and egg are joined and an error occurs during cell division, resulting in a gene (or genes) that differs (differ) from the corresponding genes found in most of the population.

Monogenic:  A trait or condition, possibly with many phenotype characteristics, associated with a single gene (as in FXS).

Polygenic:  A trait or condition associated with multiple genes (as in autism spectrum disorder [ASD]), typically with many phenotype characteristics.
DS is a complex condition. Like FXS, there are many possible problems in DS, some or all of which may be present in a given individual. In other words, the genotype (trisomy 21) has a wide range of phenotypes. Typically, individuals with DS have some degree of intellectual disability and accompanying speech and language impairment. Additional characteristics of a person with DS may include hearing loss (often associated with middle ear disease), visual impairment, congenital heart defects, sleep apnea due to an obstructed airway, mental illness, and dementia. Furthermore, the person with DS may be of small stature and have poor muscle tone and loose joints. The presence or absence of these characteristics in a person with DS may change over time. For example, dementia, a chronic and usually progressive brain disease that causes memory loss, impaired reasoning, and behavioral changes, is likely to be observed in older individuals with DS. The phenotype variability includes severity: any of the characteristics may be present in mild, moderate, or severe form.

Figure 8–3.  Photo of a boy with Down syndrome. Reproduced from https://en.wikipedia.org/wiki/Down_syndrome. This file is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license.

An individual with DS is also likely to have facial differences, which may include a flattened face, eyes that slant upward and have an almond shape, small ears, a tongue that often sticks out of the mouth, and a short neck.

Language Characteristics in DS

Speech and language impairments are common in DS. The impairment may include deficits in phonology, morphology, syntax, content, and social use of language (pragmatics). Many of these deficits can be attributed to ID, which is characteristic of DS (Wester Oxelgren, Myrelid, Annerén, Westerlund, Gustafsson, & Fernell, 2018). In addition, hearing loss may contribute to speech and language impairments in DS (Martin, Klusek, Estigarribia, & Roberts, 2009).

Phonology

Acquisition of the sound system of the language is delayed in most children with DS, with some errors identified as disordered (i.e., not seen in typical speech sound development; Kent & Vorperian, 2013). The pattern of speech sound development often follows the pattern seen in typically developing children, with a variable age-of-acquisition delay and a slower rate of speech sound mastery (Chapter 13). Many adults with DS also have speech sound errors that may be lifetime impairments.

Some authors (e.g., Kaderavek, 2014) have stated that the delay in speech sound acquisition in DS is a result of anatomical differences in the speech mechanism. These differences include a protruding tongue, a small oral cavity (Xue, Caine, & Ng, 2010), and atypical laryngeal structures, as well as weakness of respiratory muscles. Precisely how these anatomical differences affect speech sound development is unknown. Another factor likely to contribute to speech sound errors is dysarthria, or poor control of speech mechanism structures (e.g., the articulators and larynx) due to neurologic deficits in speech motor control regions of the central nervous system.

Disordered speech sound development in children and adults with DS may involve errors for sounds that are mastered early by typically developing children. For example, sounds such as /t/, /d/, and /n/, typically mastered no later than age 3 years, may be produced incorrectly by children with DS, throughout childhood and into adulthood.
In addition, children with DS make vowel errors as they learn their sound system; as discussed in Chapter 13, vowel errors in typically developing children are unusual past the age of 3 years (see review in Kent & Vorperian, 2013).

Speech intelligibility is affected significantly by the speech sound errors in DS. Wild, Vorperian, Kent, Bolt, and Austin (2018) administered a single-word test of speech intelligibility to typically developing individuals and individuals with DS. In this kind of speech intelligibility test, speakers record single words (such as sheep, boot, bath, and hot), and listeners respond to each of the recorded words by entering them via keyboard. Speech intelligibility is expressed as the percentage of total words presented that were heard correctly. For example, 25 correctly heard words from a presentation of 50 words is expressed as 50% speech intelligibility. In Wild et al., typically developing children between the ages of 4 and 5 years were close to 80% intelligibility, on average; children with DS in this age range had speech intelligibility ranging between 10% and 65%. Even at 20 years of age, individuals with DS had speech intelligibility of only 60% to 70% (Wild et al., 2018), well below adult speech intelligibilities, which are close to 100% in typically developing individuals. For individuals with DS, poor speech intelligibility is a significant problem in social interaction.
significant problem in social interaction. expectations.
Finally, stuttering behaviors and other dysflu- Expressive vocabulary skill is not as much of a
encies have been observed in approximately 30% of strength as receptive skill. Delays in expressive vocab-
children with DS; this compares with a prevalence ulary relative to receptive vocabulary are common,
of stuttering among the typically developing popula- and the expressive vocabulary may be smaller than
tion of around 1% (Eggers & van Eedernbrugh, 2018). expected based on nonverbal cognitive skills. Like typi-
Fluency disorders are covered in Chapter 17. cally developing children, children with DS continue
to add to their expressive vocabulary as they get older,
albeit at a slower rate compared with the former group
Morphology and Syntax
(Martin et al., 2009).
Morphology and syntax are both impaired in children A recent study (Loveall, Moore Channell, Abbe­
with DS. The impairment in these aspects of language duto, & Connors, 2019) demonstrates the potential
seems to be greater than expected from the children’s influence of one language component (verbs) on the
cognitive skills. In contrast, vocabulary in children other (syntax) in language development. These authors
with DS is often consistent with cognitive skill; thus, used a storytelling approach to obtain expressive lan-
the frequent statement in the research and clinical lit- guage samples from children with DS and from typi-
erature that in DS morphology and syntax are dispro- cally developing children. Children with DS produced
portionately impaired, compared to other language as many different verbs as typically developing children
components (Martin et al., 2009). Language compre- but used those verbs less frequently.
hension and expression for morphology and syntax Verbs, which are typically acquired after nouns in
are both affected. both typically developing children and children with
Bound morphemes (e.g., tense markers such as DS (Loveall, Moore Channell, Philips, Abbeduto, &
-ed, third person singular such as he does [I do]) are Connors, 2016), require additional words to make their
examples of morphological problems in DS. Syntac- meaning clear. This is unlike nouns, which can stand

2 In other words, when typically developing children and children with DS are matched for cognitive skill using a nonverbal estimate of IQ, the morphological and syntactic abilities of children with DS are significantly poorer than the abilities of the typically developing children. This means that the poor morphological and syntactic abilities of children with DS are not accounted for by their cognitive level — the language deficit is in excess of what would be predicted by cognition. Of course, when typically developing children and children with DS are matched in this way, the children with DS are older than the typically developing children.
This is unlike nouns, which can stand alone and often do so in the early speech of children. For example, “ball” does not need other words to make its meaning clear, but “throw” is clarified when it is joined to other words (“I throw the ball”). Verbs specify the role of nouns in a sentence — that is, they require syntax for support. Loveall et al. (2016, p. 83) explain this interaction between verbs and syntax well: “verbs are responsible for linking words within a sentence together, and as such, they play a key foundational role in syntax. If disrupted, then syntactic development could also be impacted.”

Loveall et al. (2019) argue that the less frequent use of verbs in the expressive language of children with DS may have an important influence on the weakness noted previously for expressive syntax. Expressive syntax is disproportionately impaired in DS (see earlier) — perhaps the less frequent verb usage of this important “linking” vocabulary affects the development of expressive syntactic skill.

This information, generated by carefully performed research, has potentially important implications not only for diagnosis of language problems in DS but also for therapy strategies. If expressive use of a specific vocabulary category such as verbs stimulates the growth of syntactic skills, a therapeutic focus on increased use of verbs may be more than an exercise in vocabulary building.

Language Use (Pragmatics)

Pragmatic language use is a complex skill with many different aspects, as discussed in Chapters 3 and 7. Examples of pragmatic language use include how much talking is done in social situations, understanding appropriate language use in different communication settings, language redundancy (multiple repetitions of the same sentences), patterns of eye contact during conversations, and the ability (or willingness) to initiate conversations (Klusek, Martin, & Losh, 2014).

Pragmatic language use in DS is widely viewed as a weakness, in younger children, older children, and adults (e.g., Klusek et al., 2014; Lee, Bush, Martin, Barstein, Maltman, Klusek, & Losh, 2017; Smith, Næss, & Jarrold, 2017). Not all aspects of pragmatic language use are affected equally. For example, teenagers with DS may have a deficit in indicating that they have not understood a statement, but may have relatively strong skills when they are asked to clarify something they have said, or when they are narrating stories (see review in Martin et al., 2009). Nonverbal skills in social communication may also be a relative strength among individuals with DS.

Pragmatic language skills in DS are typically delayed relative to those of typically developing children but not as affected as in FXS (especially FXS + ASD) or ASD (see below). As with pragmatic deficits in FXS and ASD, skills in pragmatic language among children with DS are not only delayed but improve at a slower rate when compared to typically developing children.

Fragile X Syndrome:  General Characteristics

FXS is diagnosed by genetic testing. In FXS, there is a mutation of the X chromosome on the 23rd pair. Figure 8–4 shows a karyogram with an arrow pointing to the area of mutation on the X chromosome of the 23rd pair (enclosed within the red oval); there is a slight break or discontinuity of the chromosome. The mutation results in the group of characteristics (or a subgroup of those characteristics) that make up the phenotype of FXS (see Box, “Genetic Terminology”). Both girls and boys are diagnosed with FXS (both have X chromosomes), with boys usually having a more severe version of the syndrome. This is because girls have a second X chromosome that can compensate for the single, mutated X chromosome in boys with FXS. Much of the research literature in FXS is concerned with boys.

The phenotype in FXS includes facial differences, intellectual disabilities, cognitive and language disorders, as well as other characteristics. These may include anxiety, depression, visual, auditory, and psychiatric problems. The facial differences are not necessarily obvious at birth and may not be apparent for a while. Figure 8–5 displays photographs of a male with FXS, as a child (A) and as an adult (B). Note the long face with the high forehead, the large jaw, and large ears characteristic of the facial differences previously described. The facial characteristics often gain greater prominence with age.

Genetic testing may be done when ID is suspected because of increasing developmental delays, perhaps accompanied by facial and behavioral characteristics that are increasingly like those observed in children with FXS. The genetic testing confirms or disconfirms the syndrome. The child may also show behaviors consistent with ASD, such as poor eye contact, rocking, and hand flapping. In fact, up to 90% of boys diagnosed with FXS have autistic-like behaviors, and 30% to 50% meet the diagnostic criteria for ASD (Niu et al., 2017). As reviewed in Chapter 7, ASD is diagnosed with formal and semiformal tests of behavior; there is no genetic test for the condition.

FXS is the leading heritable cause of ID.

Figure 8–4.  Karyotype for a male with FXS. The arrow points to the
region of the chromosome in which there is a mutation. The chromosome
appears to be broken in this location.

Figure 8–5.  A male with fragile X syndrome, as a young child (A) and young adult (B). Photos provided by permission
of Kelly Randels Coleman.


The mutation on the X chromosome of the 23rd pair is passed from parent to child. A child who receives the fragile X mutation and is diagnosed with FXS may have a range of the characteristic anatomical (e.g., the facial differences) and behavioral characteristics, and the severity of those characteristics may range from very mild to very severe. This is an example of a genotype being associated with a wide range of phenotypes.

Epidemiology of FXS

The prevalence of FXS is approximately 1 in 5,000 males, and 1 in 4,000 to 8,000 females. Additional details of prevalence and patterns of hereditary transmission can be found in Niu et al. (2017) and Saldarriaga et al. (2014).

Children diagnosed with FXS are much more likely to have intellectual disabilities compared with children diagnosed with nonsyndromic ASD (e.g., children with ASD who are not diagnosed with FXS). This has implications for the separate and combined language disorders in FXS and ASD.

Language Characteristics in FXS

When relevant, the following description of the language disorder in FXS includes discussion of the language disorder in ASD. This is because of the overlap between the two disorders.

Language in all areas is typically delayed in boys with FXS relative to typically developing agemates because of the intellectual disability; therefore, most investigations compare language profiles to typically-developing children using mental-age matches. For example, the language abilities of a 10-year-old with FXS might be compared to those of a 5-year-old typically developing child because both children score similarly on a nonverbal IQ measure. This allows us to answer the question, “How is the language of children with FXS similar/different from that of children at the same cognitive level?”

A prominent language disorder in children (and later, adults) with FXS is social communication impairment (pragmatic language deficit). Other areas of language deficit exist and contribute to the social communication problems. Following the structure of the previous chapter, language capabilities of children with FXS in the areas of phonology, morphology, syntax, content (lexicon), and pragmatics are presented. Keep in mind variation across individuals in phenotype — children with FXS, even with a common genetic source of the syndrome, do not have the same language (or other) characteristics.

Phonology

Typically developing children learn the sound system of their native language in a systematic way (Chapter 13). Sounds such as stop consonants are mastered before fricatives, /w/ before /r/ and /l/, single consonants before consonant clusters, to name a few well-known examples. Phonological processes also follow a systematic trend. For example, the process of final consonant deletion (“dah” for “dog”) is typically eliminated earlier than the process of cluster reduction (“pay” for “play” or “sop” for “stop”).

Barnes and colleagues (2009) reviewed previous evidence for speech sound development in single-word productions of children with FXS. This review suggests that children with FXS have delayed, not different, speech sound development relative to typically developing children. This means that the children with FXS acquire the sound system of English with the same error patterns as typically developing children but at a slower rate.

The literature reviewed by Barnes et al. (2009) included studies in which single-word productions were analyzed. Barnes et al. wanted to know if the same speech sound learning patterns were found in children with FXS when the utterances were from connected, “natural” speech. The participant groups included children with FXS, children with FXS plus ASD, and typically developing children. When the data were analyzed for speech sound error patterns, there was almost no difference between the two groups of children with FXS (FXS alone and FXS plus ASD). Children in both groups had more speech sound errors compared with typically-developing children. Consistent with these speech sound errors, speech intelligibility was significantly lower for the connected speech utterances of children with FXS, and FXS plus ASD, when compared to speech intelligibility of typically developing children. But when the children with FXS and FXS plus ASD were matched to typically-developing children for the number of their speech sound errors, the children with FXS were significantly less intelligible than the typically-developing children. This is somewhat surprising because speech intelligibility is dependent to a large degree on the “goodness” of consonants and vowels, and the matching across the groups of sound errors might lead to the expectation of equivalent intelligibility for the groups.

In summary, current research indicates that children with FXS have speech delays with speech sound error patterns very much like speech sound errors observed in typically developing children during the course of their speech sound development (Chapter 13).
The pattern of speech sound errors is similar to those of typically-developing children, but the process of speech sound mastery is delayed. This is similar to the description in Chapter 15 of speech delay in otherwise typically developing children. Even when children with FXS are matched to typically developing children for the stage of their speech sound development, children with FXS are less intelligible than typically developing children. The severity of the speech intelligibility problem in FXS is variable, ranging from essentially normal speech intelligibility to severely unintelligible speech. The variability of the phenotype is not well understood. (See Box, “Embrace the Gray.”)

Morphology and Syntax

A fairly large research literature is available on the language characteristics of children with FXS. Studies of language deficits in FXS often report data for boys because their language deficit, as well as the severity of their ID, is typically worse than the deficits observed in girls. The current review is based on information found in several recent reviews and experiments (Finestack, Sterling, & Abbeduto, 2013; Haebig & Sterling, 2017; Martin, Losh, Estigarribia, Sideris, & Roberts, 2013; Oakes, Kover, & Abbeduto, 2013; and Sterling, 2018).

Children with FXS, from toddlers to teenagers, have language deficits that affect their ability to communicate and engage in social experiences. Deficits in morphological and syntactical skills are frequent among children with FXS. Morphological deficits include (for example) tense marking (wait-waited), proper form of the verb “is” (“he is”-“they are”), and third person singular (“I do”-“he does”). Recall that incorrect use of morphemes is characteristic of a brief phase of typical language development; mastery of most if not all morphemes is completed around age 4 years (Brown, 1973). Children with FXS take much longer to master grammatical morphemes and may maintain a deficit in morpheme use into adulthood. Similarly, typically developing children master simple sentence forms (“The boy fed the dog”) earlier than more complex forms such as sentences with embedded clauses or of greater length (“The boy who fed the dog is Joe”). The severity of these deficits in language form varies within the population of children with FXS. These deficits do not necessarily involve the same morphemes or syntactic forms for all children with FXS.
2017; Komesidou, Brady, Fleming, Esplund, & Warren, phemes or syntactic forms for all children with FXS.

Embrace the Gray


An introductory text for Communication significant contribution to speech intelligibility.
Sciences and Disorders must provide information Measures of other variables that are likely to
on the most current, research-generated knowledge contribute to speech intelligibility are also left out
in a broad way. Coverage of the material must of the analysis. Another example is that children
be selective and avoid many of the research and in the FXS, FXS + ASD and typically-developing
clinical details that are relevant to the conclusions groups were matched using a technique that makes
reached by authors. Higher-level courses build on the language skills of children in the three groups
the general information presented in this textbook, “equal.” The use of this matching technique means
largely by adding details left out of an introductory that the children with FXS were much older than
textbook. A good example is the coverage in this the typically developing children when the conso-
chapter of the Barnes et al. study. Their conclusions nant “goodness” was compared across groups.
are that the phonological skills of children with These strategies are controversial. Now, do not get
FXS, FXS + ASD, and typically developing children this author wrong: the study by Barnes et al. is a
are similar when matched for stage of speech- fine study, and the results are relevant to both basic
sound development, but that speech intelligibility science and clinical practice. But there is a lot of
is lower in the first two groups compared with gray area in the findings. This is the attraction of
the typically developing group. This seems to be a science: embrace the gray area of current knowl-
little odd. After all, it seems reasonable to expect a edge and by careful analysis and experiment make
strong relationship between the “goodness” of the it less gray. This is the attraction of clinical practice:
sound system and the goodness of speech intel- embrace the gray area of an individual’s cognitive,
ligibility. But the details of the study are revealing. language, and social skills, apply clinical expertise
For example, Barnes et al.’s analysis is focused on including familiarity with research findings, and
consonants and does not take account of vowels, by a carefully structured therapeutic plan improve
the articulatory “goodness” of which makes a the individual’s language and social skills.
Not all morphemes are subject to incorrect use, and children with FXS do not necessarily have difficulties with the same morpheme(s) (Sterling, 2018).
Language researchers and clinicians who treat developmental language disorders are interested in both comprehension and production (expression) of language forms. Comparison of comprehension with expression skills may provide insight into the most productive approach to language therapy. For example, a child with relatively strong comprehension skills but weaker expression skills may receive the greatest language benefit from therapy focused on expression.
There are several ways to assess morphological and syntactic skills for both comprehension and expression in children with FXS (as well as other children with language disorders). These assessments include formal, standardized tests, as well as analysis of language patterns in more natural conversational settings. A general measure of expressive language sophistication, taken from natural language samples, is mean length of utterance (MLU), discussed in Chapter 6.
In general, children with FXS have impairments in both comprehension and expression of morphology and syntax, with expression more affected than comprehension. Children with FXS have shorter MLUs compared to typically developing children and increase their MLU throughout early language development at a much slower rate than typically developing children. As a general measure of language sophistication, the difference in MLU between children with FXS and typically developing children reflects a significant deficit in the morphological and syntactic forms of expressive language.

Vocabulary

Like other aspects of language development, both receptive and expressive vocabulary of children with FXS lag the vocabulary of typically developing children. Even with this lag, the larger receptive as compared to expressive vocabulary in typically developing children is also found in FXS. The relative deficit of receptive and expressive vocabulary in children with FXS, compared with typically developing children, may be more apparent in younger children (around 8 years of age). Older children (around 12 years of age) may “catch up” to vocabulary skills of typically developing, mental age–matched children. This summary, and the one that follows, is based on studies and reviews published by Finestack, Sterling, and Abbeduto (2013), Haebig and Sterling (2017), Kover and Abbeduto (2010), Lewis et al. (2006), and Martin et al. (2013).
Expressive vocabulary development in children with FXS is slower than vocabulary development in typically developing children. In addition, a greater deficit in expressive vocabulary may be observed in children with FXS only compared with children with FXS plus ASD. The increase in expressive vocabulary over time is not predicted very strongly from a child’s nonverbal, cognitive abilities.
The deficit in receptive vocabulary seems to be different from the deficit in expressive vocabulary. An important difference is that receptive vocabulary appears to increase with development in a manner consistent with increasing nonverbal cognitive ability. In other words, unlike expressive vocabulary, receptive vocabulary can be predicted from nonverbal cognitive abilities.

Language Use (Pragmatics)

Children with FXS have problems with language use for social situations. As described by Klusek, Martin, and Losh (2014), pragmatic language skills include “the selection of conversational topics fitting to the situation, appropriate word choice, and the ability to modify language in order to match the expectations and knowledge base of the communication partner” (p. 1692). Based on research to date, pragmatic language skills are more affected in children (and adults) with FXS who have also been diagnosed with ASD, compared with children diagnosed with FXS “only” (see review in Klusek et al., 2014). As discussed in Chapter 7, social communication deficits (nonverbal and verbal pragmatic abilities) are a hallmark of ASD.
An interesting issue is the possible difference in the pragmatic language deficit in boys with FXS “only” and boys with FXS plus ASD. Is the pragmatic language deficit more severe in FXS plus ASD compared with FXS “alone”? The answer seems to be “yes.” Niu et al. (2017) have argued that research and clinical evidence indicates that pragmatic language deficits in “nonsyndromic” ASD are more severe than pragmatic language deficits in FXS plus ASD. Notice the difference between these group comparisons: FXS “only” compared to FXS plus ASD, and “nonsyndromic ASD” compared to FXS plus ASD. The determination of how ASD adds to or changes FXS, or FXS adds to or changes nonsyndromic ASD, is very complicated and awaits further research.
A specific example of a pragmatic language deficit is difficulty with topic maintenance during a conversation. Conversing is a joint, socially reinforcing pastime, in which the participants discuss a topic until the conversation turns to a related or different topic. It is as if the participants know the rules for supporting productive and socially reinforcing talk. Individuals who struggle with this skill may suddenly change the topic under discussion, in effect not understanding the rules of conversation. And within the same context,
the individual with a pragmatic language deficit may continue to contribute this new, unrelated topic to the discussion, even when the original topic continues to be discussed by the other participants. Perseverative/repetitious language is a prominent feature of language use in individuals with FXS.
A pragmatic language deficit is potentially devastating to an individual’s social development and life. The deficit can partially or largely affect the ability to make friends, share information, participate in sports, and engage in many other aspects of life.

Chapter Summary

Intellectual disabilities, which have a prevalence worldwide of approximately 1% of the population, are defined as chronic impairments of mental abilities that affect adaptive functioning in conceptual skills, social behaviors, and practical behaviors. In most cases of ID, the individual is mildly affected.
Diagnosis of ID is made in childhood and is not used for adults who acquire mental deficits by (for example) strokes, head injury, or degenerative neurological conditions.
Overall IQ scores, which are used as a component of the diagnosis of intellectual disabilities, reflect verbal skills in addition to nonverbal skills; IQ scores can also be obtained by tests that include only nonverbal test items for an estimate of cognitive ability independent of language skills.
Down syndrome (DS) is the most common genetic cause of ID.
DS occurs when there is a third chromosome at pair 21, hence, the term “trisomy 21,” a description of the genotype in DS.
As in other genetic disorders, there is wide variability in the phenotype in DS (i.e., across individuals), which includes variability in language skills.
Phonological skills are primarily delayed in DS, meaning that the learning of the speech sound system follows the same progression as in typically developing children, but at a slower rate; some speech sound errors in DS may be unusual, such as later learning of sounds that are acquired early in typical speech sound development.
An important outcome of the delay in speech sound development in DS is that speech intelligibility is affected, which may limit social interactions.
Morphology and syntax are thought to be particular areas of weakness for language skills in DS; both receptive and expressive language are affected, with a prominent weakness in expressive skills.
Vocabulary in DS is characterized by better receptive skills compared to expressive skills; the expressive vocabulary may be less than expected based on a child’s mental age.
Pragmatic language skills are generally impaired in DS, but some components of pragmatics, such as story narration and ability to respond to questions that ask for clarification, are relative strengths.
FXS, which occurs in approximately 1 in 4,000 boys and 1 in 4,000 to 8,000 girls, is the leading heritable cause of ID, and many children diagnosed with FXS are also diagnosed with autism.
The ID and other characteristics in FXS, including impaired language skills, are more severe in boys, compared with girls.
The phenotype in FXS is variable, both in the physical and behavioral characteristics; the language skills component of the behavioral phenotype varies from mildly to severely impaired.
A prominent characteristic of the language disorder in FXS is in the area of pragmatics.
Children with FXS acquire the sound system of English with the same error patterns as typically developing children but at a slower rate; their phonological development is delayed, not different.
The development of comprehension and production (expression) of morphology and syntax is delayed in children with FXS, although the specific morphemes and syntactic structures that are affected vary across children.
Both receptive and expressive vocabulary of children with FXS lag the vocabulary of typically developing children; like typically developing children, receptive skills for vocabulary are more advanced than expressive skills.
Comparison of the language skills of children with FXS “only” and children with FXS plus ASD does not reveal a consistent difference, but children with FXS plus ASD may have more severe language deficits, especially in the area of pragmatic language.

References

American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Washington, DC: Author.
Barnes, E., Roberts, J., Long, S. H., Martin, G. E., Berni, M. C., Mandulak, K. C., & Sideris, J. (2009). Phonological accuracy and intelligibility in connected speech of boys with fragile X syndrome or Down syndrome. Journal of Speech, Language, and Hearing Research, 52, 1048–1061.
Brown, R. (1973). A first language: The early stages. Cambridge, MA: Harvard University Press.
Carvill, S. (2001). Sensory impairments, intellectual disability and psychiatry. Journal of Intellectual Disability Research, 45, 467–483.
Eggers, K., & van Eerdenbrugh, S. (2018). Speech disfluencies in children with Down syndrome. Journal of Communication Disorders, 71, 72–84.
Finestack, L. H., Sterling, A. M., & Abbeduto, L. (2013). Discriminating Down syndrome and fragile X syndrome based on language ability. Journal of Child Language, 40, 244–265.
Haebig, E., & Sterling, A. (2017). Investigating the receptive-expressive vocabulary profile in children with idiopathic ASD and comorbid ASD and fragile X syndrome. Journal of Autism and Developmental Disorders, 47, 260–274.
Kaderavek, J. N. (2014). Language disorders in children: Fundamental concepts of assessment and intervention (2nd ed.). New York, NY: Pearson.
Kent, R. D., & Vorperian, H. K. (2013). Speech impairment in Down syndrome: A review. Journal of Speech, Language, and Hearing Research, 56, 178–210.
Kirby, R. S. (2017). The prevalence of selected major birth defects in the United States. Seminars in Perinatology, 41, 338–344.
Klusek, J., Martin, G. E., & Losh, M. (2014). A comparison of pragmatic language in boys with autism and fragile X syndrome. Journal of Speech, Language, and Hearing Research, 57, 1692–1707.
Komesidou, R., Brady, N. C., Fleming, K., Esplund, A., & Warren, S. F. (2017). Growth of expressive syntax in children with fragile X syndrome. Journal of Speech, Language, and Hearing Research, 60, 422–434.
Kover, S. T., & Abbeduto, L. (2010). Expressive language in male adolescents with fragile X syndrome with and without comorbid autism. Journal of Intellectual Disability Research, 54, 246–265.
Lee, M., Bush, L., Martin, G. E., Barstein, J., Maltman, N., Klusek, J., & Losh, M. (2017). A multi-method investigation of pragmatic development in individuals with Down syndrome. American Journal on Intellectual and Developmental Disabilities, 122, 289–309.
Lewis, P., Abbeduto, L., Murphy, M., Richmond, E., Giles, N., Bruno, L., . . . Orsmond, G. (2006). Cognitive, language and social-cognitive skills of individuals with fragile X syndrome with and without autism. Journal of Intellectual Disability Research, 50, 532–545.
Loveall, S. J., Moore Channell, M., Abbeduto, L., & Connors, F. A. (2019). Verb production by individuals with Down syndrome during narration. Research in Developmental Disabilities, 85, 82–91.
Loveall, S. J., Moore Channell, M., Phillips, B. A., Abbeduto, L., & Connors, F. A. (2016). Receptive vocabulary analysis in Down syndrome. Research in Developmental Disabilities, 55, 161–172.
Martin, G. E., Klusek, J., Estigarribia, B., & Roberts, J. E. (2009). Language characteristics of individuals with Down syndrome. Topics in Language Disorders, 29, 112–132.
Martin, G. E., Losh, M., Estigarribia, B., Sideris, J., & Roberts, J. (2013). Longitudinal profiles of expressive vocabulary, syntax, and pragmatic language in boys with fragile X syndrome or Down syndrome. International Journal of Language and Communication Disorders, 48, 432–443.
Maulik, P. K., Mascarenhas, M. N., Mathers, C. D., Dua, T., & Saxena, S. (2011). Prevalence of intellectual disability: A meta-analysis of population-based studies. Research in Developmental Disabilities, 32, 419–436.
McKenzie, K., Milton, M., Smith, G., & Ouellette-Kuntz, H. (2016). Systematic review of the prevalence and incidence of intellectual disabilities: Current trends and issues. Current Developmental Disorders Reports, 3, 104–115.
Niu, M., Han, Y., Dy, A. B. C., Du, J., Jin, H., Qin, J., . . . Hagerman, R. J. (2017). Autism symptoms in fragile X syndrome. Journal of Child Neurology, 32, 903–909.
Oakes, A., Kover, S. T., & Abbeduto, L. (2013). Language comprehension profiles of young adolescents with fragile X syndrome. American Journal of Speech-Language Pathology, 22, 615–626.
Parker, S. E., Mai, C. T., Canfield, M. A., Rickard, R., Wang, Y., Meyer, R. E., . . . Correa, A., for the National Birth Defects Prevention Network. (2010). Updated national birth prevalence estimates for selected birth defects in the United States, 2004–2006. Birth Defects Research (Part A): Clinical and Molecular Teratology, 88, 1008–1016.
Saldarriaga, W., Tassone, F., González-Teshima, L. Y., Forero-Forero, J. V., Ayala-Zapata, S., & Hagerman, R. (2014). Fragile X syndrome. Colombia Médica, 45, 190–198.
Smith, E., Næss, K.-A. B., & Jarrold, C. (2017). Assessing pragmatic communication in children with Down syndrome. Journal of Communication Disorders, 68, 10–23.
Sterling, A. (2018). Grammar in boys with idiopathic autism spectrum disorder and boys with fragile X syndrome plus autism spectrum disorder. Journal of Speech, Language, and Hearing Research, 61, 857–869.
Wester Oxelgren, U., Myrelid, Å., Annerén, G., Westerlund, J., Gustafsson, J., & Fernell, E. (2019). More severe intellectual disability found in teenagers compared to younger children with Down syndrome. Acta Paediatrica, 108, 961–966.
Wild, H., Vorperian, H. K., Kent, R. D., Bolt, D. M., & Austin, D. (2018). Single-word speech intelligibility in children and adults with Down syndrome. American Journal of Speech-Language Pathology, 27, 222–236.
Xue, S. A., Caine, L., & Ng, M. L. (2010). Quantification of vocal tract configuration of older children with Down syndrome: A pilot study. International Journal of Pediatric Otorhinolaryngology, 74, 378–383.
9
Language Disorders in Adults

Introduction

Language disorders in adults are usually the result of an acquired condition/disease that disrupts previously normal language skills. An acquired, adult language disorder contrasts with the chronic language impairment that extends from early childhood into adulthood due to conditions such as Down syndrome or Fragile X syndrome. For the purposes of this chapter, a language disorder is considered to be acquired if the condition/disease responsible for the disorder occurs after the mid-teenage years.
Many adult conditions/diseases can result in a language disorder. Examples include stroke, traumatic brain injury (TBI), adult-onset degenerative diseases such as Parkinson’s disease, and dementia-related diseases (such as Alzheimer’s disease).
The current chapter focuses on adult speech and language disorders in stroke, TBI, and dementia-related diseases. Speech impairments are included, when relevant, because the conditions/diseases that result in adult language disorders often include speech disorders as well. We use the term “adult language disorders” throughout the rest of this chapter to include both speech and language.
A brief review of speech, hearing, and language structures of the brain precedes the discussion of adult language disorders. Chapter 2 provides more detailed information on each of the review points.

Review of Concepts for the Role of the Nervous System in Speech, Language, and Hearing

Critical review concepts for the understanding of adult language disorders are (a) the cerebral hemispheres; (b) lateralization of language functions to the left cerebral hemisphere; (c) language expression and comprehension as represented in different regions of the left hemisphere, called Broca’s area (expression) and Wernicke’s area (comprehension), respectively; (d) the significant role of connections between different regions of the brain for language production and perception; and (e) the perisylvian language areas.

Cerebral Hemispheres

The cerebral hemispheres include the left and right hemispheres. Each hemisphere contains a frontal, parietal, temporal, and occipital lobe. The frontal, parietal, and temporal lobes play a major role in language production and comprehension (reception). Figure 9–1

Figure 9–1.  The four lobes of the left hemisphere. Note the sylvian fissure, an important landmark.

shows the four lobes in the left hemisphere as well as the sylvian fissure (see Chapter 2).

Lateralization of Speech and Language Functions

Both the left and right cerebral hemispheres contain brain tissue associated with speech and language functions. Tissue in the left hemisphere is specialized for language production and comprehension. “Lateralization of language functions” means that, even though the two hemispheres have the same lobes, language functions are represented most prominently in one hemisphere — they are lateralized to the left hemisphere.

Language Expression and Comprehension Are Represented in Different Cortical Regions of the Left Hemisphere

Language expression is primarily represented in the frontal lobe of the left hemisphere, and language comprehension is primarily represented in the temporal and parietal lobes of the left hemisphere. This does not mean that language expression is represented exclusively in the frontal lobe, or that language comprehension is represented exclusively in the temporal/parietal lobes. All three lobes play a role in language expression and comprehension. The regions of Broca’s and Wernicke’s area are shown in Figure 9–2. The blood supply to the brain, shown in this figure, is discussed below.

Connections Between Different Regions of the Brain

Different regions of the brain are connected by fiber tracts, so the regions can exchange information. The connecting tracts are not secondary to the cortical regions but are equally important to brain function for speech and language. Disruption of fiber tracts, even when cortical regions are healthy, can result in speech and language impairments.

Perisylvian Speech and Language Areas of the Brain

A side view of the left hemisphere of the brain is shown in Figure 9–3. The front of the hemispheres (i.e., of the head) is to the left. The lobes previously identified are labeled, as is the sylvian fissure. The sylvian fissure is
Figure 9–2.  General regions of Broca’s and Wernicke’s area, shown by the shaded, oval areas. Blood supply to the region of Broca’s and Wernicke’s area is also shown.

Figure 9–3.  View of the left hemisphere showing the region of cortical tissue, as well as the underlying white matter, called the perisylvian speech and language areas.

the deep groove in the cortical tissue that separates the temporal lobe from the frontal and parietal lobes. A red oval encloses cortical tissue and the underlying fiber tracts that are thought to be of critical functional importance for speech and language. These brain regions are called the “perisylvian speech and language areas.”
The name implies that the enclosed regions are active in speech and language functions and when damaged are likely to cause speech and language impairments.

Adult Language Disorders:  Aphasia

Aphasia is an impairment of language expression and/or comprehension, resulting from brain damage. Stroke, the most common cause of aphasia, is a blockage or rupture of the arteries that supply the brain with blood.
The blood supply of the cerebral hemispheres is extensive and detailed. It is extensive because it reaches all parts of the hemispheres, and it is detailed because main arteries into the hemispheres branch into smaller and smaller vessels that supply precise, local regions of the brain.
A loss of blood supply to a region of the brain prevents the affected neurons from sustaining their functions; the neurons die. This compromises the ability of that brain region to contribute to control of a behavior, such as language.
Strokes occur for several reasons. Blood vessels may be blocked completely by a tissue fragment that travels through the bloodstream after breaking off from an artery wall; the fragment may travel a long way through the bloodstream before blocking an artery and depriving blood to regions beyond the blockage. An artery or vessel may have thickened walls due to a buildup of plaque, which narrows the vessel, limiting blood flow to regions beyond the point of narrowing. The neurons beyond the narrowed vessel lose functional ability either partially or completely. A third possibility is a ruptured vessel, which spills blood into the brain and does not allow sufficient blood to reach regions beyond the rupture.
Figure 9–2 shows the blood supply to the surface of the left hemisphere. Note the large artery (called the middle cerebral artery) emerging between the tip of the temporal lobe and the bottom of the frontal lobe. The artery turns toward the back of the hemispheres and gives off a branch to furnish blood to Broca’s area (Figure 9–2, upward-pointing red arrow). As the main artery continues in the direction of Wernicke’s area (arrow pointing in the direction of the back of the hemisphere), blood is supplied to Wernicke’s area.
Blood is supplied to other areas of language-related tissue within the cerebral hemispheres, both in the cortex and in subcortical structures. Figure 9–2 shows the potential for independent strokes in Broca’s area and Wernicke’s area. For example, blockages can occur in the branch to Broca’s area without affecting blood flow to Wernicke’s area. If Broca’s area is the primary brain area for language expression, a stroke like this is expected to affect language expression but not language comprehension. The reverse is also possible: a blockage after the main artery has passed the branch to Broca’s area does not affect the frontal lobe (the location of Broca’s area) but may affect Wernicke’s area. In this case, language comprehension is likely to be affected without any effect on language production. This simplified account of how blood loss affects language function is not the whole story but makes the point of the relationship of blood supply to the brain and potential loss of expressive versus receptive function.
Stroke is not a rare occurrence. There are approximately 800,000 cases of stroke per year in the United States (https://www.cdc.gov/stroke/facts.htm), and many more worldwide. A significant number of strokes have aphasia as a prominent deficit (Ellis & Urban, 2016). Many patients who have aphasia recover most, if not all, of their language abilities in the days, weeks, or months following a stroke. A smaller number of patients have a chronic language impairment.

Classification of Aphasia

The history of aphasia, in both clinical and research work, includes a substantial effort to classify different types of the disorder. Six types of aphasia, linking damage to specific brain structures with language comprehension and expression difficulties, are presented here. They include Broca’s aphasia, Wernicke’s aphasia, conduction aphasia, anomic aphasia, global aphasia, and primary progressive aphasia. Primary progressive aphasia is not the result of a stroke but is discussed here because it is characterized by aphasic deficits (among other deficits). A brief summary of apraxia of speech, which in adults is typically the result of a stroke, is also provided. There are other categories of aphasia as well, but here we discuss the ones most relevant to clinical practice and theories of brain function for language.
Aphasia types can be classified in one of two ways (and possibly other ways, not discussed here). One way is to identify the location of brain damage with an imaging technique such as CT (computed tomography) or MR (magnetic resonance). Type of aphasia is diagnosed based on the expectation of the language disorder from brain-behavior relationships. For example, damage to Broca’s area, confirmed by imaging, is expected to result in Broca’s aphasia (see later in chapter), which has certain characteristics that distinguish it from other types of aphasia. Similarly, damage to
Wernicke’s area is expected to result in Wernicke’s aphasia. This is referred to as a “localization” view of aphasia type: specific areas of the brain are associated with specific behaviors.
What does it mean to confirm a lesion (damage to tissue) with an imaging technique? Figure 9–4 shows two CAT (computerized axial tomography) scans, one of a lesion in Broca’s area (top, left), the other in Wernicke’s area (top, right). These images were taken from slices through the brain in the horizontal plane. The slice level is shown in the bottom photograph of a real brain; note that the slice cuts through Broca’s and Wernicke’s areas. The sides of the images are flipped: the right side of the images is the left side of the brain. The dark regions indicated by the pointers are lesions. As expected, the lesion that includes Broca’s area is toward

Figure 9–4.  Two CAT (computerized axial tomography) scans, one of a lesion in Broca’s area (top, left), and the other in Wernicke’s area (top, right). The labeled pointers show the lesions as darkened areas of the scans. These images were taken from slices through the brain in the horizontal plane, as shown in the lower image. The sides of the images are flipped: the right side of the images is the left side of the brain.
the front of the left hemisphere, and the lesion that includes Wernicke’s area is toward the back.1
A different way to classify aphasia is on the basis of the speech and language impairments, regardless of lesion location. For example, a patient who suffers a stroke and has agrammatical, effortful speech, with hesitations but close-to-normal comprehension skills, is diagnosed with Broca’s aphasia, regardless of the lesion location and its extent (how much of the brain has been damaged). This chapter does not attempt to resolve the classification controversy; the presentation is neutral, with both probable lesion location and speech-language symptoms described.

Broca’s Aphasia

Much of the effort to classify aphasia according to the location of damage within the brain was inspired by the mid- to late-19th century work of Paul Broca (1824–1880). Broca was a French physician who studied a small group of patients who were able to produce only a single syllable, or a few words, but comprehended language with little or no problem. When the patients died, Broca performed autopsies of their brains and found similar locations of brain damage in most of the patients. The damage was in the lower part of the frontal lobe, close to the motor cortex, just above the sylvian fissure (see Figure 9–2). Broca suggested that the ability to articulate syllables and words was localized to this part of the frontal lobe. Damage to this area resulted in a disorder of articulation, but not comprehension. Broca published his results in 1865, and for more than 150 years the region he identified as the articulation center of the brain has been called “Broca’s area.”
Broca’s aphasia is primarily an expressive language problem, presumably associated with damage to Broca’s area. The patients hesitate before speaking, and when they do speak produce single words or utterances of just a few words. Speaking seems to a listener to require a great deal of effort; word finding is a problem. The words that are spoken are usually content words; function words are not used, or are seldom used. Patients with Broca’s aphasia are also said to have agrammatism, because of the lack of function words and the resulting lack of grammatical completeness. Comprehension of language is good (perhaps not quite normal).
Broca’s aphasia is called a nonfluent aphasia because of the hesitations and slow, effortful speech patterns. Here is a transcript of a brief conversation between a speech-language pathologist and a stroke survivor who was diagnosed with Broca’s aphasia.2

SLP:  Can you tell us your name?
Patient:  John Doe (very carefully articulated).
SLP:  And John, when was your stroke?
P:  Seven years ago (each word carefully articulated, with an even, robotic sequence of syllables).
SLP:  Okay . . .
P:  And . . . (nods, because he sees SLP wants to continue a question).
SLP:  And what did you used to do?
P:  Um . . . well, um . . . worked (nodding head) . . . um . . . on a desk (this phrase spoken rapidly, does not sound impaired) . . . um . . . seven . . . seven . . .
Spouse:  Sales.
P:  Sales (nodding, confirming spouse’s assistance and verifying the type of work). Sales . . . and . . . worldwide, and . . . very good, yeah!
SLP:  Okay and who are you looking at over there? When you turned your head over there?
P:  That’s my . . . wife.
SLP:  And why is she helping you . . . to talk?
P:  Um . . . she’s . . . a speech . . . (forms lips for a “p”) . . . um . . . (again, tries to position articulators but no sound comes out).
SLP:  So you have trouble with your speech?
P:  Yeah, yeah.
SLP:  And what’s that called?
P:  Um, phasia.
SLP:  Alright . . . and so why don’t you work now?
P:  Um . . . I . . . I . . . well I do (each word separated by very brief pauses).

1. As used here, the term “includes” means that the lesions are larger than would be expected for lesions confined strictly to the cortical regions suggested by the aphasia type (i.e., Broca’s aphasia, Wernicke’s aphasia). In other words, brain lesions are typically larger than the anatomical tissue identified as a specific area. Also, the other dark areas on the scans, such as the curved structures in the center and toward the front, are not lesions but other structures where you would expect the scan to show dark. The curved structures are the lateral ventricles, through which cerebrospinal fluid flows.
2. This conversation, and all subsequent transcripts, were transcribed from YouTube clips. “John Doe” substitutes for the patient’s name. Comments are enclosed in parentheses, and long pauses are indicated by ellipses ( . . . ).
SLP:  What do you do now?
P:  Um . . . Voices ah home, aphasia (patient names an organization).

In this 1-minute exchange between the SLP and the patient, the frequent hesitations (“um”), pausing ( . . . ), word-finding problems, agrammatism (“Sales . . . and . . . worldwide, and . . . very good, yeah”), and good comprehension (he clearly understands the questions) are clinical markers for a diagnosis of a mild-to-moderate Broca’s aphasia.

Wernicke’s Aphasia

Carl Wernicke (1848–1905), a German physician, was Broca’s equal in his influence on classification of aphasia types. Wernicke studied a series of patients who had great difficulty in comprehending language but gave no clinical evidence of a general (nonlanguage) hearing loss. The comprehension problem seemed to be specific to language. The patients had only slight difficulty with articulation, but their expressive language lacked meaning and often was not responsive to questions even though they replied (but without relevant content).
Autopsies of the patients’ brains showed damage to the upper gyrus of the temporal lobe, toward the back of the sylvian fissure and close to the lower edge of the parietal lobe (see Figure 9–2). These patients had no apparent damage in Broca’s area. Wernicke concluded that the region of damage for these patients represented the language comprehension part of the brain. This region of the brain has since been known as Wernicke’s area.
Here is a transcript of a brief conversation between a speech-language pathologist and a stroke survivor who was diagnosed with Wernicke’s aphasia.

SLP:  Hi John, how are you?
Patient:  I’m happy are you pretty? You look good. (wants to speak more but SLP asks a question).
SLP:  What are you doing today?
P:  We stayed with the water over there at the moment, talked with the people at the dam over there, they’re diving for them at the moment, he had a-water very soon . . . for him . . . with luck . . . for him.
SLP:  So we’re on a cruise and we’re about to — 
P:  We will sort right here and they’ll save their hands right there (pointing).
SLP:  And what were we just doing with the iPad?
P:  Uh . . . right at the moment they don’t show darn thing (laughs).
SLP:  Where’s the iPad that we were doing?
P:  (hesitates) I’d like my change for me and change hands for me, it was happy, I would talk with Jane3 sometimes, we’re out with them, other people are working with them with them, I’m very happy with them — 
SLP:  Good! — 
P:  This girl was really (misarticulated) good. And happy, and I played golf, and hit a trees, we play out with the hands, we save a lot of hands on hold for peoples for us, other hands, I don’t know what you get but I talk with a lot of hand grands . . .

The 1-minute transcript shows the comprehension problem — for the most part, the patient is not responding to the questions with relevant answers. The patient does a lot of talking, with well-formed speech sounds, a rapid speaking rate, and mostly grammatical utterances, which nevertheless are nonsensical. There is repetition (“hands” occurs half a dozen times in the brief transcript) and one or two articulation errors (not transcribed here). Wernicke’s aphasia is called a fluent aphasia because utterances show little hesitation, and articulatory sequences are smoothly produced and even run on with excessive utterance length. The diagnosis of Wernicke’s aphasia is consistent with the transcript.

Conduction Aphasia

Fiber tracts in the cerebral hemisphere (“white matter”; see Chapter 2) connect one region of gray matter to another. As demonstrated by Wernicke, and others after him, a massive fiber tract called the arcuate fasciculus connects Wernicke’s and Broca’s areas (Figure 9–5, blue lines) (Smits, Jiskoot, & Papma, 2017). A tract connecting the same areas but lower (more ventral) in the cerebral hemispheres is called the ventral stream. These tracts run beneath the cortex — like all white matter in the cerebral hemispheres, fiber tracts cannot be seen on the hemisphere surfaces.

3. Jane Doe.
Figure 9–5.  Two fiber tracts that connect receptive and expressive regions of the left hemisphere (Wernicke’s region with Broca’s region). The fiber tract represented by the blue-line bundle is the arcuate fasciculus, and the tract represented by the green-line bundle is the ventral stream. The color of the labels matches the color of the areas (magenta, Broca’s area; blue, Wernicke’s area).

Conduction aphasia is believed to be the result of a stroke that damages the arcuate fasciculus and functionally disconnects Wernicke’s area from Broca’s area. Wernicke’s area is not damaged, so the patient can hear and understand a word or sentence. Broca’s area is not damaged, so words and sentences can be produced in a normal way. Damage to the arcuate fasciculus partially or completely prevents the heard utterance from being transferred to Broca’s area for repetition. It is possible that the behavioral characteristics of conduction aphasia (i.e., expressive and receptive language; see below) are related to damage to other areas of the brain in addition to the arcuate fasciculus (Yourganov, Smith, Fridriksson, & Rorden, 2015).
The signature deficit in conduction aphasia is difficulty repeating others’ utterances, even though language comprehension is good. Expression is often fluent and grammatically well formed. However, expression may also have paraphasias, which are unintended errors on speech sounds, syllables, and words (Ardila, 2010; Pandey & Heilman, 2016).
A transcript of a patient with conduction aphasia illustrates some of these characteristics.

SLP:  The producer asked us to count to 10, do you remember that? Can you do that again?
P:  Yes. Why. (not a question). One, two, three, four, five, (rapid intake of air), seven, nnn uh, boin, too thuh, gehvry, beople, go I can’t oh I — 
SLP:  No, that’s hard, let me get you started — 
P:  I was ston — 
SLP:  Okay, one, two, three (SLP is counting with the patient) — 
P:  One, two, four, five, ss-ff-sixth, better send poined, um . . .
SLP:  Good enough, let’s stop — 
P:  Is that alright?

The patient clearly understands the questions requesting her to count. The patient begins counting accurately but soon produces paraphasias (e.g., “boin” and “gehvry”) and a fluent, short, well-formed sentence (“Is that alright?”).
Anomic Aphasia

“Anomia” is from Greek, meaning “without name.” In anomic aphasia, a patient is fluent but has great difficulty with word-finding, especially for nouns, including names and objects. Often, a patient with anomia can describe the function of an object in great detail, or a person’s appearance, but not be able to name the object or person.
Anomia occurs in most types of aphasia — almost all patients, regardless of the particular kind of aphasia, have some anomia. Aphasia in which anomia is the dominant symptom is usually mild. Comprehension is typically good in anomic aphasia.
The following transcript is an example of anomia in a patient with aphasia. The patient is looking at pictures of four tools, including a saw, a hammer, an axe, and a screwdriver. The SLP is pointing to the saw and asking the patient to name it.

P:  You knew what it is, I can’t tell you, maybe I can . . . If I was to carry the wood and cut it in half with that . . . you know . . . if I had to cut the wood down to bring it in . . .
SLP (pointing to the saw):  Then you’d use one of these . . .
P:  It’s called a . . . I have them in the garage . . . they are yee-ar . . . you cut the wood with them . . . it . . . sssaw!
SLP:  How is it? (asking patient to repeat the name of the saw).
P:  Unnn . . . (sigh) . . . sss . . . sah (cuts off the end of the vowel quickly) . . . sah (cuts off again) . . . I can’t state it. I know what it is and I could cut the word with it and it’s in my garage . . . and it’s . . . a . . . and also when you go out and you want to stay out in the woods with people you always take those with you and you . . . um . . . it is a . . . it’s not a knife, it’s not a . . . sssah (vowel quickly cut off), I know what it is, I think.
SLP:  Well finish saying what you started to say.
P:  I’m not sure I’m stating I right.
SLP:  Try again.
P:  Ssssss. No, I’m incorrect.

The patient knows what the object is, what it does, and why you use it. The patient recognizes his inability to retrieve the name (“I can’t state it,” “I’m not sure I’m stating it right,” “No, I’m incorrect”). Several times he almost retrieves the word (“sssah” [cut off], “ssssss”).

Global Aphasia

Global aphasia is a nonfluent aphasia in which both expressive and receptive language are impaired. The patient may be able to express short, automatic-type phrases such as “Hello,” and “How are you?” but longer, less automatic utterances are not produced or are very rare. The impairment of receptive language is similar, with occasional understanding of short, familiar utterances (e.g., “how are you?”) but poor understanding of more complex language.
In global aphasia due to stroke, the blood supply to both Broca’s area and Wernicke’s area is blocked. Other areas of the brain, such as parts of the temporal lobe that are important for word and sentence processing, are also affected. In fact, the entire perisylvian language region (see Figure 9–3) is affected, with widespread effects on language expression and comprehension.
Global aphasia has been reported as the most common (Flamand-Roze, Flowers, Roze, & Denier, 2013) or third most common (Hoffman & Chen, 2013) type of aphasia. Global aphasia is often diagnosed shortly after a stroke.
As a patient recovers from overall stroke symptoms, language recovery may also occur. Global aphasia may “evolve” to different types of aphasia as the patient recovers (Klebic, Salihovic, Softic, & Salihoic, 2011). Global aphasia may continue to change to other aphasia types along the pathway to language recovery. A patient whose global aphasia evolves to functional expressive and receptive skills is likely to have residual anomia ranging in severity from mild to severe. Patients with global aphasia who do not show improvement within a month following a stroke have a poor prognosis for future improvement of language skills (Alexander & Loverso, 1992).
This transcript illustrates global aphasia in a patient who had recently suffered a stroke.

{R1 = Relative 1} {R2 = Relative 2}
R1:  Alright, say hi!
P:  (Smiling) Hi!
R1:  What’s your name?
P:  What’s what?
R1:  What’s your name?
P:  Uh . . .
R1:  Do you know?
P:  . . . John?
R1:  John.
P:  Yeah, John.
R1:  What’s my name?
P:  Wha- what’s whose name? (weakly).
R1:  My name (pointing to herself).
P:  Uh . . . hard to get ’em all (weakly).
R1:  Yeah. Jane.
P:  Yeah.
R1:  Yeah.
R2:  Can you say Jane?
P:  Yeah (weakly).
R2:  Say it.
P:  Nndeer (unintelligible, weakly).
R2:  Jaaane.
P:  I can’t say it . . . (weakly).
R1:  That’s okay.
P:  Takes me a little longer to say . . .
R1:  Do you know where you are?
P:  Do I know who it is?
R1:  Do you know where we are?
P:  Uh . . . no you live in uh . . . in the . . . park.
R1:  Close! We’re at the hospital.
P:  Yeah.
R1:  Yeah.

The patient’s anomia is apparent in his struggle to retrieve his own name and his daughter’s name. The content of this transcript suggests a mild-to-moderate expressive component (R2: “Can you say Jane?” P: “Yeah” (weakly). R2: “Say it.” P: “Nndeer” (unintelligible, weakly)) as well as a mild-to-moderate comprehension impairment (e.g., R1: “What’s my name?” P: “Wha- what’s whose name?”).

Primary Progressive Aphasia

Primary progressive aphasia (PPA) is a rare adult language disorder in which a patient has an isolated language problem that increases in severity over time (Mesulam, 2018). PPA does not appear to be the result of a stroke, but rather a separate disease in which deterioration of brain tissue is specific to language areas and networks. In the initial stages, the language impairment is not accompanied by significant memory or psychiatric problems and does not appear to be a result of dementia (see section, “Language Disorders in Dementia”). Throughout the course of PPA, dementia-like symptoms appear, but the language impairment remains a significant characteristic and barrier to daily function (Montembeault, Brambati, Gorno-Tempini, & Migliaccio, 2018).
A person with PPA may have speech and language characteristics similar to those of Broca’s aphasia, including dysarthria and apraxia of speech (Grossman, 2012; Chapter 14); or may have specific difficulties with word retrieval and word comprehension, specifically for low-frequency (unusual) words; or may have sentence repetition and comprehension impairments together with reading and written-language problems (Henry & Grasso, 2018). The specific nature of the language impairment may suggest one of three different types of PPA (Montembeault et al., 2018).
Two different types of PPA are transcribed here.

Transcript 1

SLP:  We can speak about your concerns . . . hi Jane!
P:  How are you?
SLP:  I’m fine! So, tell me something . . . are you, um, are you working on your speaking?
P:  Yes.
SLP:  And what are you working on, specifically, do you know?
P:  Uuh-tho (shakes head, means to say “I don’t know”).
SLP:  Okay, so you don’t know, can you say “I,” “don’t,” “know”? (slowly spoken, separating each word with a short pause).
P:  I tho.
SLP:  Okay, now let’s go a little bit slower, because you’ve got a little bit of a dysarthria.
P:  Yeah.
SLP:  And a little bit of an apraxia (P shakes head “yes”), for your speech to come out really nice, we’ve got to slow it down and to put a nice space between each word. Now watch me, see if you can do this: “I.”
P:  “I.”
SLP:  “Don’t.”
P:  Own’t.
SLP:  Know.
P:  (Makes face as if saying this word is going to be difficult) “Owe.”
SLP:  Good. Can you say that again?
P:  I tho doh.
SLP:  Good. Very good. And where do you live?
P:  Um . . . Iss? (answered as if uncertain, with rising pitch).
SLP:  Do you know the name of the state where you live?
P:  Yeah.
SLP:  What state is it?
P:  Biss? (again as if uncertain).
SLP:  Biss (very precisely, as if asking P if that’s what he meant).
P:  No (shaking head, meaning that’s not what he meant).
SLP:  No, Miss!
P:  Yes.
SLP:  Oh, so you shorten it, don’t you, because it’s got a lot of syllables in it.
P:  Yes!

Transcript 2

R1:  Okay. Hey Dad, can you look at me?
P:  (unintelligible, does not turn to look).
R1:  Can you tell me your name?
P:  John Doe (spoken rapidly, with good articulation of sounds).
R1:  And how old are you?
P:  Sixteen.
R1:  When were you born?
P:  . . . . . . 19xx.
R1:  Good! And where were you born? (P looking around the room as if he doesn’t understand the question; long pause). Do you know where you were born?
R2:  In the hospital (laughter from other family members).
R1:  Where-where did you grow up?
P:  (No response).
R1:  Do you remember where you went to school?
P:  Yep.
R1:  Rrrrrrrr.
P:  Rickford.
R1:  Good. And, where did you meet your wife?
P:  My wife.
R1:  Who’s your wife?
P:  (Looks at his spouse).
R1:  Is it that lady over there, is that your wife?
P:  Mmm-hmm.
R1:  And what’s her name?
P:  Jane.
R1:  Good! And then you guys moved to Arkansas, where did you live in Arkansas?
P:  (Looks around, doesn’t answer, looks to spouse).
R2:  We lived in . . .
R1:  In Little Rockstar?
P:  Little Rockstar.

The patient in Transcript 1 has poor articulatory skills (the sound errors in the transcript, e.g., “doh” for “know”) and possibly poor ability to transform sounds as represented in the brain to sounds produced by the speech mechanism. This latter problem is often referred to as an impairment of phonological coding, which can be independent of the ability to articulate sounds. It is best illustrated by the patient’s labored attempts to position the articulators to initiate an utterance, as if searching for the correct position. The patient’s comprehension skills are good — she responds to questions with no apparent problem.
The patient in Transcript 2 has no problems with articulatory skills — his speech is fluent and easy to understand. But there are word-finding problems (the several examples of not responding and searching for a response, particularly of proper nouns such as the name of towns or of his spouse), which are solved when he is given a prompt (e.g., “In Little Rockstar?” and his immediate response, “Little Rockstar”). He may also have a comprehension impairment.

Apraxia of Speech

Apraxia of speech is discussed in Chapter 15 as a childhood speech sound disorder. As noted there, the diagnosis of childhood apraxia of speech (CAS) is based in large part on speech characteristics in adults with brain
damage due to stroke and other brain injuries. These characteristics include difficulty initiating speech, articulatory “groping,” in which the patient has difficulty positioning articulators, such as the lips and tongue, to produce a speech sound correctly, and speech sound errors, especially for phonetically complex words. The patients recognize their own speech sound errors and make attempts to correct them. An example of this was noted above in the transcript for the second patient with PPA.
Apraxia of speech in adults (“AAS” to distinguish it from “CAS”) is often observed in aphasic patients and is considered by some to be a type of aphasia. Others, perhaps most clinicians and scientists, consider AAS to be strictly a motor speech disorder related to an impairment of programming (planning) sound sequences. In this view, correct speech sounds are represented in the brain but are subject to errors in planning their articulation and sequencing. For example, phonemes for the word “strategy,” /s/, /t/, /r/, /ae/ (as in “hat”), /d/, /ə/ (a very short “uh”), /dg/ (as in “fudge”), and /i/, are represented in the brains of people with AAS in the same way as they are represented in the brains of people without brain damage. The speech impairment in AAS is an incorrect plan for how these sounds are sequenced, the muscle contraction patterns to produce the sequence, and other variables such as the timing of different sounds. A word like “strategy,” which is phonetically complex with its word-initial, three-consonant cluster /str/ and another complex sound, /dg/, is particularly challenging for people with AAS. Clinicians and scientists who regard AAS as a speech motor disorder argue that phonetically complex words are more susceptible than phonetically simple words (such as “sag”) to speech motor planning disorders.
The view of AAS as an aphasia is not that phoneme sequencing is impaired, but rather that the phoneme representation and/or the ability to select the correct phonemes is impaired. This view does not regard AAS as a motor speech disorder. For example, the /s/ in “strategy” may be incorrectly represented by a /t/, resulting in a word production like “trategy.” The patient could produce the /s/ well if the representation of the sound in the brain was intact. Also, the process of selecting the sounds — a stage of word production preceding the programming of sound sequences — may be impaired. Eling (2016) provides a review of the debate concerning the nature of AAS.
The following transcript of a patient with AAS following a stroke is taken from a monologue in which she describes her speech and other impairments:

P:  The two areas that I have problems with, um . . . is in my left hand (holds up her right hand), I have that I’m gonna remember, I’m gonna . . . have this meh, ber, my (syllable cut off quickly), the word wrong. It’s struggity, wait my studiservist . . . ssstext derity, and . . . that’s my hand, and . . . that’s in my left hand (again, holding up right hand) and I’m right-handed. So . . . if you notice I really don’t have a lot of . . . uh . . . makeup, I tried it fir-fir, th, the-fir first time today (syllable repetitions in preceding phrase spoken very rapidly), a lot of that I couldn’t I couldn’t write, du-couldn’t write, I couldn’t write, I even tried and my hair spr- my . . . my . . . umm . . . my . . . bandana . . . umm . . . but I had trouble with it, so, . . . um . . . but that’s, that’s the little pit of the problem, uh, my biggest problem is . . . with my speech. Sooo, I started with, therapy, just last week — this week. So, Monday I had . . . just, um, an evaluation, and then, yesterday, Wednesday, . . . and then, hockey therapist, OT, I can’t really pronounce it, opcheetherah . . . um, ah, oxeetelabis, and then there’s, um, filicul, filucul, therapis.

The individual recognizes her speech sound errors (attempts to correct “struggity” and “studiservist,” and the several attempts to produce “occupational”). Simple words are well articulated, but multisyllabic, phonetically complex words such as “struggle,” “dexterity,” and “occupational” are not. For the most part, when the individual is not making speech sound errors, her speech does not sound abnormal.

Aphasia Due to Stroke:  A Summary

Language disorders in adults are a common impairment following stroke. The aphasia types summarized in this chapter are in common use among SLPs and aphasia researchers. However, aphasia types are not always clearly defined — speech and language characteristics of two or more types may be observed in a single patient.
Indeed, “pure” aphasia types are unusual. For example, patients diagnosed with Wernicke’s aphasia are likely to have some language impairment expected from other aphasia types; the same is true for other diagnoses (e.g., Broca’s aphasics having some comprehension deficits). Perhaps a better way to think about aphasia types is that any individual type reflects the primary deficit, with recognition that other language deficits are likely as well. Global aphasia may be the endpoint of these multiple language deficits — all aspects of language expression and comprehension are affected by a stroke that damages most perisyl-
9  Language Disorders in Adults 123

With this in mind, Table 9–1 provides a summary of speech and language characteristics in the "classic" aphasia types, as previously discussed.

Figure 9–6 presents this idea in schematic (simple) form. One arrowhead is in Broca's area, the other arrowhead in Wernicke's area. Localized lesions in the far anterior region of the perisylvian language area are likely to result in an aphasia with a primary impairment in expression. Localized lesions in the far posterior region of the perisylvian language area are likely to result in an aphasia with a primary impairment in comprehension. Lesions between these two endpoints, within the perisylvian language area, are likely to produce a "mixed" aphasia, in which expressive and receptive impairments are observed to varying degrees. Aphasiologists often use the terms "anterior lesion" and "posterior lesion" to indicate the general location of damage because these terms are less restrictive than "Broca's area" and "Wernicke's area."

It just so happens that the brains of two of Broca's patients have been preserved. These brains, located in a Paris museum until 2016 and now still in Paris but at a different location, have been scanned by researchers using modern imaging technology. When Dronkers, Plaisant, Iba-Zizen, and Cabanis (2007) scanned these brains, the lesions were not restricted to the classical, relatively small cortical region named Broca's area. In fact, the lesions were much larger, extending into middle structures including white matter and specifically the arcuate fasciculus, the tract that connects Wernicke's and Broca's areas. Research findings like this suggest caution in linking expressive or receptive speech and language deficits to damage in the classical, cortical brain areas associated with the work of Drs. Broca and Wernicke.

Table 9–1.  Summary of Speech and Language Characteristics in Aphasia Types

Broca's aphasia
    Expressive aphasia — nonfluent
    Slow, effortful speech
    Agrammatism
    Good comprehension
    Anomia

Wernicke's aphasia
    Receptive aphasia — fluent
    Poor comprehension of language in the absence of hearing loss
    Articulation close to normal
    Fluent, expressive speech lacks meaning
    Not specifically responsive to questions

Conduction aphasia
    Good receptive and expressive skills
    Impaired repetition
    Paraphasias for speech sounds, syllables, and words
    Anomia

Global aphasia
    Expressive and receptive aphasia — nonfluent
    Short, automatic phrases may be preserved both in expression and comprehension
    Poor comprehension of complex language
    Anomia

Primary progressive aphasia
    Like Broca's aphasia (one type)
    Poor word retrieval and comprehension (one type)
    Sentence repetition and comprehension impairments
    Problems with reading and written-language skills

Figure 9–6.  The concept of anterior and posterior lesions, shown on a side view of the brain with Broca's area, Wernicke's area, and the frontal, parietal, temporal, and occipital lobes labeled. See text for details.
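The anterior-versus-posterior scheme lends itself to a compact summary. The short Python sketch below is our illustration only (the function and category names are ours, and, as noted above, real lesions rarely sort this cleanly); it simply encodes the lesion-location-to-primary-impairment mapping described in the text:

```python
# Toy sketch (ours) of the Figure 9-6 scheme: the PRIMARY impairment expected
# from a lesion's location along the perisylvian speech and language area.
def primary_impairment(lesion_location: str) -> str:
    mapping = {
        "anterior": "primarily expressive impairment (Broca-type aphasia)",
        "posterior": "primarily receptive impairment (Wernicke-type aphasia)",
        "between": "mixed aphasia, expressive and receptive to varying degrees",
    }
    return mapping.get(lesion_location, "unknown lesion location")

for site in ("anterior", "between", "posterior"):
    print(f"{site:9s} lesion -> {primary_impairment(site)}")
```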

Traumatic Brain Injury and Aphasia

The definition of a traumatic brain injury (TBI) is "a bump, blow or jolt to the head, or penetrating head injury, that results in disruption of the normal function of the brain" (Marr & Coronado, 2004). TBI is a major health concern. According to a fact sheet published by the Centers for Disease Control and Prevention (CDC), in 2013 there were approximately 2.8 million emergency room visits for head injuries in the United States (https://www.cdc.gov/traumaticbraininjury/get_the_facts.html). Both adults and children are included in this estimate. Around 80% of people who are seen in an emergency room for a TBI are diagnosed with a mild injury (Douglas et al., 2019). In this section, the focus is on closed head injuries, in which there is no penetration of the skull and brain by a flying object such as a bullet.

The evolution of TBI symptoms varies widely across individuals and depends on factors such as age, severity of the injury, and the consciousness status of the patient immediately after the trauma. A TBI often evolves as a sequence of events from the time of injury to near or full recovery of function.

If the individual is unconscious after the head injury, the time from the trauma to regaining consciousness may vary from very short (e.g., a few minutes) to very long (e.g., a week or more, or, in some cases, a permanent comatose state). When consciousness is regained, a period of confusion is typical. During this time, the individual may be confused about where he is, how or even if an injury occurred, and personal information. The individual's language may reflect this confusion. The person may not be able to speak for a short period, may speak in disconnected sentences with little content, and may have comprehension problems. After the period of confusion, most individuals improve steadily; this includes significant improvement in language abilities.

Nature of Brain Injury in TBI

Closed-head injuries are associated with brain damage that is different from that observed in stroke. The "bump or jolt" to the head causes the cerebral hemispheres to slam into the bony casing of the skull. In the case of a powerful blow to the head, the cerebrospinal fluid in which the brain floats is not sufficient to protect the soft tissues of the brain. The damage to brain tissue is often in the frontal lobes, where the impact may be greatest due to the forward, high acceleration of the cerebral hemispheres in response to the blow to the head. Structures of the temporal lobe are also likely to be damaged in a closed head injury.

The blow to the head may cause the cerebral hemispheres to twist inside the skull at high accelerations. This results in shearing injuries to neural tissue.

Shearing injuries occur in axons, which are stretched and rotated so violently that they are torn. The injuries to axons affect the integrity of white matter. Gray matter areas, such as in the cortex or in subcortical nuclei, are partially disconnected by these shearing injuries, which have profound consequences for the brain networks serving both cognitive and high-level language functions (Douglas et al., 2019).

Language Impairment in TBI

Speech and language impairments are common in TBI. The impairments are typically most severe immediately after the trauma and improve over time. The improvement often brings the patient back to "normal" language skills.

Language skills in TBI are often assessed by formal tests used to evaluate individuals who have suffered a stroke and have aphasia. These tests evaluate language skills such as word recall, naming of objects, and production and comprehension of syntax. According to a different perspective, as the patient recovers from the injury, formal tests of aphasia may suggest a return to normal language skills but miss a continuing impairment of social language use. Language skills in the areas of phonology, morphology, syntax, and, to a large degree, content are evaluated by these aphasia tests as normal or near-normal (Steel, Ferguson, Spencer, & Togher, 2015; Vas, Chapman, & Cook, 2015). The remaining language impairment, potentially having substantial functional consequences, is of higher-level skills such as expressive discourse, comprehension of the overall message of discourse, and social language use (language pragmatics).

Structural Components of Language

Structural components of language include phonology, morphology, syntax, and content (reviewed in Chapter 3). Dysarthria, a motor speech disorder caused by damage to the speech motor control centers of the brain, is not considered a phonological disorder but may affect speech sound production in persons with TBI. Although dysarthria is not a language disorder, it is important to mention because about 33% of individuals with TBI have dysarthria (Beukelman, Nordness, & Yorkston, 2011). AAS following a head injury appears to be rare (Cannito, 2014).

Morphology and syntax are often affected in the initial phases of recovery from TBI. For all but the most severe cases, morphological and syntactical skills improve over time and are likely to return to normal use. Anomia, an impairment of the content component of language, is a prominent characteristic of the early phases of language recovery in TBI. Like morphology and syntax, anomia improves over time but may persist past the time when other structural language skills have returned to normal.

Pragmatics (Social Use of Language)

Language use for social communication is a primary area of concern in adults with TBI. Higher-level, discourse-type language skills may be impaired even when structural components of language are intact. Comprehension of metaphors ("A veil of secrecy surrounded the committee's work") and figurative language ("He inhaled his lunch") may be impaired. Persons with TBI may not be able to narrate a story in a coherent way. Individual utterances of the story may not be sequenced correctly, and information that is not relevant to the story may be included in the narrative; story details may be repeated several times.

As summarized by Vas et al. (2015), there are many other aspects of social language use: "we engage in conversational discourse during speaker–listener interactions and social exchanges, we use descriptive discourse to explain attributes and features of an object, we use narrative discourse to describe an event, we use procedural discourse to explain a task procedure, and we use expository discourse to inform a listener of a topic through facts or interpretations" (p. 499). Narrative language impairments have been demonstrated in individuals with moderate TBIs whose structural language skills are not impaired (Marini, Zettin, & Galleto, 2014). When words, individual sentences, and grammar appear to be normal, and people are not aware that the person they are interacting with has a TBI, it is easy to understand how a social language impairment can be a major challenge to everyday life.

Social language use requires a synthesis of cognitive and linguistic skills (Steel, Ferguson, Spencer, & Togher, 2017). Cognitive skills such as memory, processing speed, flexibility in processing, attention, and inhibition are joined with linguistic skills to make social language use as effective as possible (Vas, Chapman, & Cook, 2015). An example of the interaction and mutual dependence of cognitive and linguistic skills is turn taking during conversation. The linguistic skills of following content and knowing when an individual is about to finish a contribution to a discussion are joined to attention to the structure of the conversation, memory of what has been said, and inhibition of interrupting the speaker before he or she is finished.

Damage to the frontal lobes, and to their connections with almost all other areas of the brain, is almost certainly the basis of impaired social communication in individuals with TBI.

A healthy frontal lobe is critical to everyday functioning because of its central role in the executive function of the brain. Executive function is important to many aspects of behavior, among which are several of the cognitive skills mentioned previously, such as memory, processing speed, and attention. Executive function directs the brain to pay attention to certain stimuli and not others, directs the brain to adjust its sensitivity to stimuli, controls impulsive behavior, and is critical to organizational and emotional skills (Wood & Worthington, 2017). Impaired executive function in TBI, due to damage to the frontal lobes, affects an individual's ability to use language effectively in social situations.

Dementia

The CDC estimates that 5.7 million people in the United States are living with Alzheimer's disease (AD), the leading cause of dementia (https://www.cdc.gov/chronicdisease/resources/publications/aag/alzheimers.htm). AD is most often diagnosed in people above the age of 65 years. The growing number of elderly people in the United States suggests a future with ever-increasing diagnoses of AD.

In a 2011 to 2013 review of Medicare data, Alzheimer's disease accounted for roughly 45% of all diagnosed dementias. The other subtypes of dementia, such as vascular dementia, Lewy body dementia (as well as Parkinson's disease dementia), and frontotemporal dementia, each had substantially lower prevalence compared with AD (Goodman et al., 2017) but together accounted for slightly more than half of all diagnosed dementias. Thus, the total number of people in the United States living with dementia is probably in excess of 10 million. Worldwide, the prevalence of dementia and its impact on families is staggering.

Dementia has many different behavioral characteristics, which depend to some degree on the specific type of the disease. Problems are observed in memory, general cognition, executive function, psychiatric disturbance, depression, and agitation. These all lead to profound effects on everyday life functions. In all cases of dementia, the underlying brain disease is progressive and irreversible. Death of neurons, either in specific regions of the brain or throughout the brain, is the cause of the dementia symptoms and their worsening over time.

For each of the dementias described later in this chapter, a general statement of the underlying brain pathology is provided. More detailed accounts of brain pathologies in dementia are available in the scientific literature.

Speech and language disorders are a prominent characteristic of dementia. To some degree, the specific characteristics of a speech and language impairment, and the way they change throughout the course of the disease, are different for the different types of dementia.

Brain Pathology in Dementia

The brain pathology in AD includes an extracellular (on the outside of a brain cell) accumulation of a substance called amyloid, and an intracellular (inside the cell) accumulation of neurofibrillary tangles, which are accumulations of proteins that prevent normal function of a neuron (Kumar, Kumar, Keegam, & Deshmuk, 2018). These abnormal accumulations increase over time and cause dysfunction and eventual death of neurons.

In vascular dementia, localized or general regions of the brain are destroyed when blood flow to these regions is limited or blocked completely (hence, the term "vascular" dementia). The lesions created by the loss of blood flow are the cause of many of the dementia problems mentioned previously. Memory problems are typically absent in this type of dementia, at least at the beginning of the disease. O'Brien and Thomas (2015) have published an excellent review of vascular dementia and the debate surrounding the diagnosis.

Lewy body dementia is a broad category of different dementias. Lewy bodies are pathological clusters of proteins within nerve cells. Symptoms of Lewy body dementia include cognitive impairment, visual difficulties including hallucination, thinking and reasoning impairments, and confusion that varies within a day or from day to day (Walker, Possin, Boeve, & Aarsland, 2015). At the outset of Lewy body dementia, very few patients have speech and/or language impairments, but with progression of the disease, language impairments are likely.

Frontotemporal dementia (FTD) is a complex brain disease that takes one of several forms (Olney, Spina, & Miller, 2017; Mesulam, 2018). Primary progressive aphasia, discussed previously, is considered to be a form of FTD, or at least to progress over time to FTD. Regardless of the form, the brain pathology common to a diagnosis of FTD is atrophy of the frontal and temporal lobes (Bang, Spina, & Miller, 2015). The initial signs of an FTD that does not begin with language deficits are primarily behavioral, including personality changes, loss of inhibition, and impairment of executive function.

Language Disorders in Dementia

Speech-language characteristics in several types of dementia are listed in Table 9–2. Most of these characteristics have been discussed in the earlier section "Classification of Aphasia." Like the nonlanguage symptoms of dementia, the speech-language symptoms evolve during the progression of the disease and may not be the same at two points in time. Speech and language impairments may be observed in the early stages of dementia or become apparent with disease progression.

Table 9–2.  Selected Speech and Language Characteristics of Several Types of Dementia

Alzheimer's disease
    Anomia
    Semantic paraphasias (e.g., "toes" for "hand")
    Poor word comprehension
    Reduced word fluency (poor ability to name as many animals as possible)
    Loss of narrative cohesion
    Relatively preserved phonological and syntactical skills
    Formulaic speech preserved ("Hi, how are you?"; "Excuse me")

PPA (FTD):  Nonfluent aphasia
    Nonfluent speech (effortful, slow)
    Agrammatism
    Phonemic paraphasias
    Impaired repetition
    Late-stage mutism

PPA (FTD):  Semantic dementia
    Fluent, empty speech
    Semantic paraphasias (e.g., "toes" for "hand")
    Severe anomia
    Intact automatic speech ("Hi, how are you?")
    Loss of meaning in both expression and comprehension
    Preservation of syntax

PPA (FTD):  Logopenic
    Poor word retrieval
    Frequent pausing (presumably searching for words)
    Impaired repetition

FTD (Behavioral)
    Loss of narrative cohesion

Vascular
    Loss of phonemic fluency (poor ability to name things when required to confine the names to a single beginning sound)
    Reduction of expressive grammatical complexity

Lewy body dementia
    Early preservation of language skills
    Later narrative incoherence
    Anomia
    Reductions in word fluency

Note.  Not all characteristics are seen in a specific type of dementia. FTD = frontotemporal dementia; PPA = primary progressive aphasia. Based on information in "Cognition, Language, and Clinical Pathological Features of Non-Alzheimer's Dementias: An Overview," by J. Reilly, A. Rodriguez, M. Lamy, and J. Neils-Strunjus, 2010, Journal of Communication Disorders, 43, pp. 438–452; "Connected Speech in Neurodegenerative Language Disorders: A Review," by V. Boschi, E. Catricalà, M. Consonni, C. Chesi, A. Moro, and S. F. Cappa, 2017, Frontiers in Psychology, 8, p. 269; Steel, Ferguson.

The possibility that type-specific impairments of speech and language may contribute to a differential diagnosis of dementia type is intriguing (Reilly, Rodriguez, Lamy, & Neils-Strunjus, 2010). The differential diagnosis of type of dementia is not always agreed upon, even by experienced medical professionals. For example, the diagnostic distinction between Alzheimer's disease and vascular dementia may be particularly challenging. Speech-language pathologists can have an important role in sharpening diagnostic distinctions in the dementias.

Chapter Summary

Stroke, TBI, and dementia are common causes of acquired, adult language disorders.

Language disorders in adults may occur because of damage to cortical and subcortical structures associated with speech and language skills, or because of damage to the connections between these structures, or both; these structures are located (for most people) in the perisylvian speech and language area, which includes parts of the frontal, temporal, and parietal lobes.

Aphasia is an impairment of language expression and/or comprehension, resulting from brain damage; aphasia can be the result of a stroke, TBI, or dementia.

Stroke, which results from a loss of blood flow to regions of the brain, may result in an expressive, receptive, or mixed (both expressive and receptive) aphasia.

There are several types of aphasia, including Broca's aphasia, Wernicke's aphasia, conduction aphasia, anomic aphasia, global aphasia, and primary progressive aphasia.

Broca's aphasia is a nonfluent, mostly expressive aphasia likely due to anterior damage in the perisylvian speech and language areas.

Wernicke's aphasia is a fluent, mostly receptive aphasia, likely due to posterior damage in the perisylvian speech and language areas.

Conduction aphasia is a fluent aphasia in which comprehension and expression are more or less intact (Broca's and Wernicke's areas are unaffected by the stroke), but repetition is impaired because of damage to the arcuate fasciculus, the fiber tract that connects Broca's and Wernicke's areas.

Global aphasia is a nonfluent aphasia in which both expressive and receptive language is impaired; it is usually the result of both anterior and posterior damage in the perisylvian speech and language area.

Primary progressive aphasia is not due to stroke but often begins as an isolated language problem that increases in severity over time; the brain disease selectively affects the frontal and temporal lobes, including speech and language tissue in those lobes.

Apraxia of speech is thought by some to be a motor speech disorder and by others to be an aphasia (or both); it is most commonly regarded as a motor speech disorder in which the problem is poor speech motor planning.

Aphasia is a frequent result of TBI, but the expressive and receptive language deficits often do not easily fit the aphasia types discussed in the chapter; as people recover from TBI, their structural language deficits resolve, but in many cases, a social language use disorder remains.

Dementia takes several different forms, including Alzheimer's disease, vascular dementia, Lewy body disease, and frontotemporal dementia.

Language deficits in dementia are common and vary depending on the type of dementia.

References

Alexander, M. P., & Loverso, F. (1992). A specific treatment for global aphasia. Clinical Aphasiology, 21, 277–289.

Ardila, A. (2010). A review of conduction aphasia. Current Neurology and Neuroscience Reports, 10, 499–503.

Bang, J., Spina, S., & Miller, B. L. (2015). Non-Alzheimer's dementia 1: Frontotemporal dementia. The Lancet, 386, 1672–1682.

Beukelman, D. R., Nordness, A., & Yorkston, K. M. (2011). Dysarthria associated with traumatic brain injury. In K. Hux (Ed.), Assisting survivors of traumatic brain injury: The role of speech-language pathologists (2nd ed., pp. 185–226). Austin, TX: Pro-Ed.

Boschi, V., Catricalà, E., Consonni, M., Chesi, C., Moro, A., & Cappa, S. F. (2017). Connected speech in neurodegenerative language disorders: A review. Frontiers in Psychology, 8, 269. https://doi.org/10.3389/fpsyg.2017.00269

Cannito, M. P. (2014). Clinical assessment of motor speech disorders in adults with concussion. Seminars in Speech and Language, 35, 221–233.

Douglas, D. B., Ro, T., Toffoli, T., Krawchuk, B., Muldermans, J., Gullo, J., . . . Wintermark, M. (2019). Neuroimaging of traumatic brain injury. Medical Sciences, 7, 2.

Dronkers, N. F., Plaisant, O., Iba-Zizen, M. T., & Cabanis, E. A. (2007). Paul Broca's historic cases: High resolution MR imaging of the brains of Leborgne and Lelong. Brain, 130, 1432–1441.

Eling, P. (2016). Broca's faculté du langage articulé: Language or praxis? Journal of the History of the Neurosciences, 25, 169–187.

Ellis, C., & Urban, S. (2016). Age and aphasia: A review of presence, type, recovery, and clinical outcomes. Topics in Stroke Rehabilitation, 23, 430–439.

Flamand-Roze, C., Flowers, H., Roze, E., & Denier, C. (2013). Diagnosis and management of language impairment in acute stroke. In E. Holmgren & E. S. Rudkilde (Eds.), Aphasia: Classification, management practices, and prognosis (pp. 91–114). New York, NY: Nova Science.

Goodman, R. A., Lochner, K. A., Thambisetty, M., Wingo, T., Posner, S. F., & Ling, S. M. (2017). Prevalence of dementia subtypes in U.S. Medicare fee-for-service beneficiaries, 2011–2013. Alzheimer's and Dementia, 13, 28–37.

Grossman, M. (2012). The non-fluent/agrammatic variant of primary progressive aphasia. Lancet Neurology, 11, 545–555.

Henry, M. L., & Grasso, S. M. (2018). Assessment of individuals with primary progressive aphasia. Seminars in Speech and Language, 39, 231–241.

Hoffmann, M., & Chen, R. (2013). The spectrum of aphasia subtypes and etiology in subacute stroke. Journal of Stroke and Cerebrovascular Diseases, 22, 1385–1392.

Klebic, J., Salihovic, N., Softic, R., & Salihovic, D. (2011). Aphasia disorders outcome after stroke. Medical Archives, 65, 283–286.

Kumar, K., Kumar, K., Keegam, R. M., & Deshmuk, R. (2018). Recent advances in the neurobiology and neuropharmacology of Alzheimer's disease. Biomedicine and Pharmacotherapy, 98, 297–307.

Marini, A., Zettin, M., & Galleto, V. (2014). Cognitive correlates of narrative impairment in moderate traumatic brain injury. Neuropsychologia, 64, 282–288.

Marr, A. L., & Coronado, V. G. (2004). Central nervous system injury surveillance data submission standards — 2002. Atlanta, GA: Centers for Disease Control and Prevention, National Center for Injury Prevention and Control.

Mesulam, M. M. (2018). Slowly progressive aphasia without generalized dementia. Annals of Neurology, 11, 592–598.

Montembeault, M., Brambati, S. M., Gorno-Tempini, M. L., & Migliaccio, R. (2018). Clinical, anatomical, and pathological features in the three variants of primary progressive aphasia: A review. Frontiers in Neuroscience, 9, 692. https://doi.org/10.3389/fneur.2018.00692

O'Brien, J. T., & Thomas, A. (2015). Vascular dementia. The Lancet, 386, 1698–1706.

Olney, N. T., Spina, S., & Miller, B. L. (2017). Frontotemporal dementia. Neurologic Clinics, 35, 339–374.

Pandey, A. K., & Heilman, K. M. (2016). Conduction aphasia with intact visual object naming. Cognitive and Behavioral Neurology, 27, 96–101.

Reilly, J., Rodriguez, A., Lamy, M., & Neils-Strunjus, J. (2010). Cognition, language, and clinical pathological features of non-Alzheimer's dementias: An overview. Journal of Communication Disorders, 43, 438–452.

Smits, M., Jiskoot, L. C., & Papma, J. M. (2017). White matter tracts of speech and language. Seminars in Ultrasound CT and MRI, 35, 504–516.

Steel, J., Ferguson, A., Spencer, E., & Togher, L. (2015). Language and cognitive communication during post-traumatic amnesia: A critical synthesis. NeuroRehabilitation, 37, 221–234.

Steel, J., Ferguson, A., Spencer, E., & Togher, L. (2017). Language and cognitive communication disorder during post-traumatic amnesia: Profiles of recovery after TBI from three cases. Brain Injury, 31, 1889–1902.

Vas, A. K., Chapman, S. B., & Cook, L. G. (2015). Language impairments in traumatic brain injury: A window into complex cognitive performance. Handbook of Clinical Neurology, 128, 497–510.

Walker, Z., Possin, K. L., Boeve, B. F., & Aarsland, D. (2015). Non-Alzheimer's dementia 2: Lewy body dementia. The Lancet, 386, 1683–1697.

Wood, R. L., & Worthington, A. (2017). Neurobehavioral abnormalities associated with executive dysfunction after traumatic brain injury. Frontiers in Behavioral Neuroscience, 11, 195. https://doi.org/10.3389/fnbeh.2017.00195

Yourganov, G., Smith, K. G., Fridriksson, J., & Rorden, C. (2015). Predicting aphasia type from brain damage measured with structural MRI. Cortex, 73, 203–215.
10
Speech Science I

Introduction

The term "speech science" is used to designate the discipline in which normal (typical) processes of speech production, acoustics, and perception are studied. An underlying body of knowledge for these areas of study includes the anatomy of the speech and hearing mechanism and an appreciation of the basic principles of acoustics. The value of this knowledge is similar to the value of a full understanding of anatomy, biology, chemistry, and physics for individuals preparing for a career in clinical medicine, or the value of anatomy and movement science in the training of physical therapists. The current chapter presents information on the anatomy and physiology of the speech mechanism. The acoustics of speech are presented in Chapter 11, and the anatomy and physiology of the hearing mechanism are presented in Chapter 22.

The Speech Mechanism:  A Three-Component Description

The speech mechanism is composed of three major divisions: the respiratory system, the larynx, and the upper and nasal airways (Figure 10–1). Each division has many anatomical components, including bones, ligaments, cartilages, membranes, and muscles. Each division can also be assigned a global function in the production of speech. These global functions simplify the actual workings of each part, but they provide a useful organizing perspective. The anatomical structures and functions are as follows: (a) the respiratory system, which is the power supply for speech; (b) the larynx, which, by using airflow from the respiratory system to vibrate the vocal folds, is the primary sound source for speech; and (c) the upper and nasal airways, which, by movements of structures, modify the sound source to form different speech sounds. Detailed presentations of speech anatomy and physiology are available in Zemlin (1997) and Hixon, Weismer, and Hoit (2020). Brain anatomy and physiology for speech production and hearing, which is often included as part of speech and hearing science, is covered in Chapter 2.

Figure 10–1.  View of the three-part speech mechanism, including the respiratory system, larynx, and upper airways/nasal passages.

"Functional anatomy" is discussed for each component of the speech mechanism. Functional anatomy presents the anatomy of a component at a level that is sufficient to understand the broad physiology (function) of the component. The functional anatomy presented in these sections is a much-simplified view of the anatomy presented in courses such as the "Speech Anatomy and Physiology" course listed in Table 1–3.

Respiratory System Component (Power Supply for Speech)

It is useful to think of the respiratory system as consisting of two major parts, one being the lungs and the other the chest wall. The lungs are composed of

spongy, elastic tissues that inflate and deflate as air passes into and out of them. Air in the atmosphere is taken into the lungs via a series of tubes (called bronchi and bronchioles) that terminate in the alveoli, where oxygen and carbon dioxide are exchanged between the bloodstream and air. As air is expelled from the lungs, it travels through the increasingly large tubes and passes through the largest tube, the trachea, before passing through the vocal folds and entering the air spaces of the throat (pharynx), mouth, and nose.

The chest wall includes all structures of the respiratory system that are outside the lungs but are capable of compressing or expanding the air within the lungs. The anatomical structures that can compress or expand the lungs include a large set of muscles in the thorax (often called the chest), the abdomen, and the diaphragm (a large muscle that separates the thorax and abdomen). Elastic properties of structures such as the ribs and the sac-like membranes that enclose the lungs also contribute to compression or expansion of the lungs.

Compression or expansion of the lungs exerts a force on the air within the lungs. When the lungs are compressed, air molecules within them are compressed (forced together more closely), which increases lung pressure; when the lungs are expanded, air molecules within them are expanded (pulled apart from each other), which decreases lung pressure.

The Effect of Lung Pressure on Airflow

Changes in lung pressure are critical to moving air in the expiratory (breathing out) and inspiratory (breathing in) directions. When the lungs are open to the atmosphere, which means there are no blockages along the pathway from lungs to lips, or from lungs to the nares (outlets of the nostrils), the pressures inside the lungs and in the atmosphere are the same. In this circumstance, air does not flow between the lungs and atmosphere in either direction. Air flows from one point to another only when there is a pressure difference between the points. Lung pressure is raised when the lungs are compressed by muscular contraction and other forces (such as elastic forces). When lung pressure is greater than the pressure outside the mouth, which we refer to as atmospheric pressure, air flows from the lungs to the atmosphere (as in exhalation). When lung pressure is lowered relative to atmospheric pressure, air flows from the atmosphere to the lungs (as in inhalation). Speech is produced on exhaled air; inhaled air is used to fill the lungs and make the air supply ready for the next utterance.
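The pressure-flow rule just described can be stated very compactly. The following is a minimal sketch of ours, not the authors' (the function name and the convention of expressing pressures relative to atmosphere are assumptions for illustration): the direction of airflow follows the sign of the lung-to-atmosphere pressure difference.

```python
# Minimal sketch of the pressure-flow rule: air flows only when a pressure
# difference exists, and the direction of flow follows its sign.

ATMOSPHERIC = 0.0  # pressures expressed relative to atmosphere, in cm H2O

def flow_direction(lung_pressure_cmh2o: float) -> str:
    """Return the direction of airflow for a given lung pressure."""
    difference = lung_pressure_cmh2o - ATMOSPHERIC
    if difference > 0:
        return "expiratory flow (lungs -> atmosphere), as in exhalation"
    if difference < 0:
        return "inspiratory flow (atmosphere -> lungs), as in inhalation"
    return "no flow (lung and atmospheric pressures are equal)"

print(flow_direction(6.5))   # a typical positive pressure for speech
print(flow_direction(-2.0))  # lungs expanded below atmospheric pressure
print(flow_direction(0.0))   # open airway, pressures equalized
```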
The Respiratory System and Vegetative Breathing

Vegetative breathing is the exchange of air between the lungs and atmosphere that is required to sustain life. Vegetative breathing, also called rest breathing, is what you do when sitting at your desk and studying, sleeping, watching television, or engaging in the many other activities that involve quiet and seemingly effortless inhalation and exhalation. The purpose of vegetative breathing is simple. Air is inhaled to bring oxygen (O2) to millions of alveoli, the tiny, sac-like structures that are the terminal chambers within the lungs. The O2 is stored in the alveoli and distributed to the bloodstream in a slow, continuous process. As the O2 travels in the bloodstream, it is picked up and used in the normal functioning of every cell within the body. The functioning of these cells produces byproducts, one of which is carbon dioxide (CO2). CO2 is a toxin that must be continuously eliminated from the body. This is done by carrying CO2 through the bloodstream to the alveoli — the same organ that stores O2 — where it is "leaked out" to the airways and exhaled to the atmosphere. The alveoli function as a two-way valve, storing O2 collected from inhaled air for delivery to the bloodstream, and storing CO2 collected from the bloodstream's uptake of by-products of cellular activity for expulsion from the body via exhalation.

Figure 10–2 is a breathing record called a spirogram, generated with an instrument called a spirometer or respirometer. The respirometer measures the volume of air inhaled into or exhaled from the lungs. A participant breathes into the instrument while wearing a face mask; the device records the air volumes that have been displaced into or out of the lungs. The spirogram shows inhalation going up on the graph and exhalation going down. The x-axis of the graph is time, and there are two y-axes: the one on the left is labeled "Percent Vital Capacity" and the one on the right is labeled "Volume in Liters." These y-axes are two different ways to express lung volume.

Figure 10–2.  Spirogram showing lung volume events as a function of time. Three rest breaths are followed by a vital capacity maneuver, and then by two more rest breaths. See text for additional detail.

The spirogram shows three cycles of inhalation-exhalation that begin and end at the same height on the y-axis, indicated by the lower, horizontal dashed line. These are rest breathing cycles, which have three noteworthy characteristics. First, they repeat over time; second, the inhalation and exhalation phases are symmetrical, both in time (it takes the same amount of time to inhale as it does to exhale) and volume (the same

amount of volume is inhaled and exhaled); and third, the volume exchanged is relatively small. In healthy individuals, the volume exchanged during rest breathing is no more than half a liter (500 milliliters), which is about 1/8th of the volume that can be exhaled after taking a maximally deep breath. The volume inhaled and exhaled during rest breathing is called tidal volume.
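A quick arithmetic check of the "about 1/8th" statement, using the typical adult VC values given in footnote 1 below (this sketch is ours, not the authors'):

```python
# Tidal volume as a fraction of typical adult vital capacities.
tidal_volume = 0.5               # liters, a typical rest breath
for vc in (4.0, 4.6):            # typical adult female and male VCs (liters)
    print(f"VC = {vc} L -> tidal volume is {tidal_volume / vc:.2f} of the VC")
# Prints 0.12 and 0.11, i.e., roughly one eighth (0.125) of the VC.
```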
Now imagine that the person who has generated these rest breathing cycles is asked to inhale as deeply as possible and then exhale as much air as he can from this maximum inhalation. The maximum inhalation is shown on the spirogram as a large, upward trace following the rest breathing cycles. The subsequent long, downward trace ending well below the volume of the rest breathing cycles is the total volume of air that can be exhaled from the maximum inhalation. The act of inhaling maximally and then exhaling maximally is called a vital capacity maneuver. Vital capacity (VC) is defined as the volume of air that can be exhaled from the lungs following a maximal inhalation. VC is marked on Figure 10–2.1 The value of VC varies among individuals, depending on sex, body size, age, health history, and other factors.

The y-axis on the left of Figure 10–2 expresses the volume of air within the lungs in a way that allows comparisons across individuals, even though their actual VC volumes may be quite different. This axis expresses all lung volumes as percentages of an individual's VC.
The lung pressure during the entire utterance,
which has a duration slightly less than 2 s, is more or
Speech Breathing less constant at a value of around +6.5 cm/H2O. There
is a small, temporary increase in pressure to about
The term “speech breathing” is used to differenti- 7 cm/H2O (upward-pointing arrow) for the empha-
ate breathing for speech from vegetative breathing, sized word “best.” Note the small, negative pressures
breathing during exercise, breathing to sing, and other immediately before and after the utterance. These
behaviors in which the respiratory system plays an negative pressures are the result of expansion of the
important role. Speech breathing depends on increas- lungs to allow inhalations before and after the utter-
ing or decreasing lung pressure to cause air to flow out ance. The momentary pressure increase for the word
of or into the lungs. Flow from the lungs to the atmo- “best” makes it louder, consistent with the emphasis
sphere is used to produce words, sentences, and para- on this specific word in the utterance.
graphs; flow from the atmosphere is used to refill the Figure 10–4 shows percentage VC on the y-axis
lungs between utterances. and time on the x-axis for the first three utterances
What range of lung volumes do speakers use to of a famous reading passage in speech-language pa-
generate the airflows and pressures required to pro- thology, the Grandfather Passage (Darley, Aronson, &
duce audible speech? In theory, speaking can take Brown, 1975). This lung-volume-by-time graph illus-
place throughout the vital capacity — over the entire trates the small range of lung volumes used for each
lung volume range. As it turns out, most speakers of these three utterances. Two cycles of rest breathing
use only the lung volumes in the middle of the VC. are shown before a rapid inhalation to about 60% VC.

1 Typical values of VC are 4.6 liters (L) for adult males and 4.0 L for adult females; values are lower for children. VC values vary with body size, sex, and age, so the values given here are averages only. When a participant has exhaled all the air he or she possibly can, there is still a volume of air remaining in the lungs, shown in Figure 10–2 as the shaded region at the bottom of the graph, but this is typically not measured and is not included in the VC measures.
Figure 10–3.  Graph showing lung pressure over time for the utterance, "This is the best textbook on communication disorders," with emphasis on the word "best." Note how the positive pressure developed in the lungs is at a nearly constant value for the entire utterance, except for a brief and small increase for the emphasized word "best."

Figure 10–4.  Graph showing lung volume, expressed as percent vital capacity (VC), over time for the three utterances, "You wish to know all about my grandfather/well he is nearly ninety-three years old/and he still thinks as swiftly as ever." For each utterance, the talker inhales to about 60% VC and talks on exhalation down to a lung volume of about 35% VC. Speech is produced within a limited range of lung volumes, even across three consecutive utterances.


The inhalation is followed by a slow, long decrease of lung volume for the first phrase ("You wish to know all about my grandfather"). When the lung volume decreases to roughly 35% VC, there is another rapid inhalation to approximately 60% VC, followed by a similar slow, decreasing lung volume for the second phrase ("well, he is nearly ninety-three years old"). This sequence is repeated for the third phrase ("and he still thinks as swiftly as ever").

When Less Is More

That speech is produced using a small range of lung volumes may seem uneconomical: use of the entire VC for speech seems to suggest the possibility of more message per unit time. But, in fact, studies of speech breathing (Hixon, Goldman, & Mead, 1973) have shown that speech produced between about 60% and 35% VC (about ¼ of all the air you can exhale following a maximal inhalation) requires less muscular activity in the thorax and abdomen, compared with speech produced at very high lung volumes (near 100%) or very low lung volumes (near the point in lung volume where you cannot exhale any more air). In the midrange of lung volume — between 60% and 35% VC — the least muscular activity is required to maintain a constant lung pressure (see Figure 10–3). What we have here is biological efficiency — minimal exertion with more payoff.

The breathing cycles for speech use much less air volume than the VC. The breathing cycles for speech also differ from the rest breathing cycles before the first phrase in Figure 10–4. The lung volumes used for rest breathing cycles are less than the volumes used for speech. In addition, the relative durations of the inhalation and exhalation phases of the two types of breathing — rest breathing and speech breathing — are different. The inhalation and exhalation phases are equivalent for rest breathing, but in speech breathing, the inhalation phase occurs very quickly, and the exhalatory phase is much longer. The long exhalatory phase in speech breathing is largely a result of interruptions of the outgoing airflow at the vocal folds and at locations in the airway between the vocal folds and the lips. These interruptions are like valves opening and closing as air flows through a tube.
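The cycle just described (a fast inhalation to about 60% VC followed by a long, slow exhalation to about 35% VC while a phrase is spoken) can be sketched as a simple loop. This toy Python example is ours, not the authors'; the function name and default values are for illustration only:

```python
def speech_breathing_cycles(phrases, inhale_to=60.0, speak_down_to=35.0):
    """Yield one (phrase, top %VC, bottom %VC) event per spoken phrase."""
    for phrase in phrases:
        yield phrase, inhale_to, speak_down_to

grandfather_passage = [
    "You wish to know all about my grandfather",
    "well he is nearly ninety-three years old",
    "and he still thinks as swiftly as ever",
]

for phrase, top, bottom in speech_breathing_cycles(grandfather_passage):
    print(f"inhale quickly to {top:.0f}% VC, speak down to {bottom:.0f}% VC: '{phrase}'")
```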
Speech Breathing and Abdominal Muscles

Muscles of the thorax and abdomen contribute to lung compression and therefore to the raised lung pressures that cause air to flow from the lungs to the atmosphere. Contraction of the abdominal muscles is especially important for efficient speech breathing. A balloon model of these muscular events is shown in Figure 10–5.

Figure 10–5.  Balloon model of how muscular effort in the respiratory system can be applied under inefficient (middle) and efficient (right) conditions to maintain the constant, positive lung pressures required for speech production. Panel labels: hands relaxed (left); thoracic wall expiration (middle); thoracic wall and abdominal wall expiration (right).

The balloon is inflated with the open end pinched closed. The hands around the balloon represent the compressive muscular effects of the thorax (top hand) and abdomen (bottom hand). In the left-hand image, the hands are relaxed, as if the muscles of the thorax and abdomen are relaxed, exerting no force on the lungs. The middle image shows a squeeze by the upper hand, as if the muscles of the thorax compress the lungs; the bottom hand is relaxed. The upper squeeze momentarily raises the pressure in the balloon, but the pressurized air in the closed balloon pushes outward on the unsupported lower half, which bulges. The pressure increase of the air inside the balloon is "lost" in the expansion of the bottom half — this is not what you want when the goal is to maintain a constant, positive pressure during a speech utterance, as shown in Figure 10–3.

How can the pressure be maintained at a constant level when the expiratory muscles of the thorax are contracted? The answer is in the contraction of the abdominal muscles. A constant squeeze of the balloon by the bottom hand (right image, Figure 10–5) allows squeezes at the top of the balloon to maintain the constant pressure required for speech utterances with relative ease. The thoracic squeezes do not have to be excessive to maintain the pressure. In addition, small squeezes of the top hand during speech can raise the pressure with relative ease (e.g., for the emphasis on "best" in Figure 10–3).

The Balloon Model Provides a Clinical Hint

The balloon model of muscle activity of the thorax and abdomen, and its effect on pressure in the lungs, is more than a simple way to explain speech breathing. Individuals who have weak or paralyzed abdominal muscles and normal or near-normal functioning of the thoracic muscles can compress the lungs with their thoracic muscles, but the increased lung pressure is partially or completely reduced or lost when the weak or paralyzed muscles of the abdomen cannot hold in the abdominal wall. These speakers have to work much harder to produce acceptable levels of lung pressure for speech, which can have significant effects on their ability to communicate. A low-tech clinical approach to the speech breathing problem of ineffective abdominal muscles is to use an adaptive device (like a thick belt) to compress the abdomen during speech. The belt takes over the role of the abdominal muscles, holding in the abdomen to prevent it from being pushed outward.

The muscular actions in the respiratory system for the generation of utterance pressure are as shown in the balloon model. The expiratory muscles of the thorax do the primary job of lung compression, while the abdominal muscles do the primary job of holding in the abdominal wall, preventing it from bulging out and causing the lungs to lose the pressure raised by contraction of the thoracic muscles. The constant contraction of the abdominal muscles during speech is an efficient solution to generating a constant lung pressure for speech utterances.

Speech Breathing and Voice Loudness

The lung pressure value of 6.5 cm H2O in Figure 10–3 does not mean much to someone who has not worked with air pressures in the speech mechanism, so a real-world reference is offered here. If you are speaking with someone in a quiet room and the two of you are standing about 1 meter apart, speech with a lung pressure of 6.5 cm H2O sounds comfortably loud — not too loud, not too soft. It is a more or less typical value of lung pressure used by people speaking to each other at fairly close range. A lung pressure of 9 cm H2O makes speech seem loud at this distance, and lung pressures approaching 12 cm H2O produce a speech loudness that seems like shouting.

Clinical Applications:  An Example

Lung pressure is the primary determinant of speech loudness. Greater lung pressure typically results in louder speech (assuming a constant distance between a speaker and listener). Lung pressure is therefore an important factor in the intelligibility of speech; softer speech is more likely to suffer from intelligibility problems than louder speech. Speech-language clinics see patients whose primary complaint is an inability to be understood because they cannot generate an adequate amount of lung pressure.

Speech intelligibility may not be the only problem associated with a speech breathing problem. A person who has speech breathing problems and difficulty producing changes in lung pressure, and therefore speech loudness, may experience a reduced ability to convey emotional states. We use the loudness of our voices to express a variety of emotions, and a patient who does not have good control over lung pressures may suffer in this area. This paralinguistic function of speech breathing ("paralinguistic" meaning the use of nonverbal cues such as loudness and pitch to convey mood, emotion, and so forth) is an important aspect of communication, especially in social situations (see Chapter 3).
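The reference values above can be turned into a rough lookup. In the Python sketch below (ours, not the authors'; the category boundaries between the anchor values of 6.5, 9, and 12 cm H2O are our assumption, since the text gives only the anchor points), lung pressure is mapped to an approximate loudness impression at conversational distance:

```python
def loudness_impression(lung_pressure_cmh2o: float) -> str:
    """Rough loudness category at ~1 meter, from the chapter's anchor values."""
    if lung_pressure_cmh2o < 6.5:
        return "softer than comfortable; intelligibility may suffer"
    if lung_pressure_cmh2o < 9.0:
        return "comfortably loud conversational speech"
    if lung_pressure_cmh2o < 12.0:
        return "loud speech"
    return "shouting"

for pressure in (5.0, 6.5, 9.0, 12.0):
    print(f"{pressure:4.1f} cm H2O -> {loudness_impression(pressure)}")
```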

The Larynx (Sound Source for Speech)

As shown in Figure 10–6, the larynx is a structure that is composed of cartilage, muscle, ligaments, and membranes. The vocal folds are the component of the larynx that generates the sound source for all vocalic sounds (vowels, semivowels, diphthongs, and nasals) as well as a subset of consonants. This sound source comes from the vibration of the vocal folds, as discussed below.

Laryngeal Cartilages

The laryngeal cartilages form a strong but flexible framework to support a collection of soft tissues (muscles, ligaments, and membranes). The major cartilages of the larynx, as well as the hyoid bone, which is attached to the larynx, are shown in Figure 10–6.

Figure 10–6 shows the position of the larynx within the neck. Immediately above the larynx is the bottom of the throat, and immediately below the larynx is the upper edge of the windpipe (trachea). The two red bands between the arytenoid cartilages and the front of the thyroid cartilage represent the vocal folds — the tissue whose vibrations create the sound source. The larynx is very small — from top to bottom about 45 mm (1.75 inches) in men, 36 mm (1.40 inches) in women, and much smaller in children. The length of the muscular part of the human vocal folds is equally tiny — about 14 mm (0.55 inches) in men, 11.1 mm (0.44 inches) in women, and smaller in children (for adults, see Su et al., 2002).
Figure 10–6.  The position of the larynx in the neck, with four cartilages (cricoid, arytenoid [paired], thyroid, and epiglottis) and one bone (hyoid bone) labeled. Note the top of the trachea immediately below the cricoid cartilage.

Laryngeal Muscles and Membranes

Laryngeal muscles are classified broadly into one of two categories according to their points of attachment. Anatomists refer to muscle attachment points as origins and insertions.2 Intrinsic muscles of the larynx have both points of attachment within the larynx (e.g., from the arytenoid cartilages to the thyroid cartilage). Extrinsic muscles have one point of attachment within the larynx and one on a structure outside the larynx (e.g., from the breastbone to the thyroid cartilage). Extrinsic muscles are primarily responsible for positioning the larynx within the neck; they are the ones that cause the Adam's apple to bob up and down during speech.

The intrinsic muscles of the larynx open and close the vocal folds, stretch them, and adjust muscular tension to create different types of vocal fold vibration and, therefore, different voice qualities.

Among the five intrinsic muscles of the larynx, three can close the vocal folds, one can open the vocal folds, and one can stretch the vocal folds. Several (if not all) of these muscles serve double duty. For example, one of the muscles can both close and tense the vocal folds, and another "closer" can also change the configuration of the vocal folds, which affects the quality of the voice.

The Vocal Folds

The vocal folds are bands of muscular and nonmuscular (e.g., membranes and ligaments) tissue that run from their forward point of attachment on the inside of the thyroid cartilage to the posterior point of attachment on the vocal process of the arytenoid cartilage (red lines in Figure 10–6). The muscular part of the vocal folds is one of the five intrinsic muscles of the larynx.

Figure 10–7 illustrates how an examiner obtains a "live" view of the vocal folds. A laryngeal mirror is inserted into the mouth and positioned close to the back of the throat, just under the flap of tissue called the soft palate. The mirror is tilted at an angle relative to the stem of the instrument. The tongue is pulled gently forward to move structures such as the tongue and epiglottis forward and out of the way for a clear view of the vocal folds. The vocal folds are reflected in the mirror, which is illuminated by a bright light fastened to a band around the examiner's head, much like a miner's light. In Figure 10–7, a black oval surrounds the vocal folds, the target of the examiner's view. A photograph can be taken of the image in the mirror. Devices are also available for recording successive images of the vocal folds over time, during speech.

Figure 10–7.  Insertion of a laryngeal mirror into the throat to view the vocal folds (enclosed in the black oval). The tongue is gently pulled away from the mouth to move forward structures that are likely to prevent a clear view of the vocal folds.

Two images of the vocal folds are shown in Figure 10–8. The image on the left shows open vocal folds. The bands of vocal fold tissue, pearly gray with a touch of pale pink, extend from the back to the front of the larynx. The bands of vocal fold tissue form the point of a "V" where the two vocal folds come together at the front of the larynx, on the inner surface of the thyroid cartilage. The arms of the "V" diverge as they move to the back of the larynx. Follow the bands from the front (at the point of the "V") to the back along the edge of each vocal fold, and you see a change of color from pearly gray/pinkish white to pale white. The pale white is the posterior attachment of the vocal folds. These posterior points of attachment are to a part of the arytenoid cartilage (called the vocal process) on the same side as the vocal fold (see Figure 10–7).

2 Traditionally in anatomical descriptions, when a muscle contracts, it pulls from the point of insertion toward its origin; hence the distinction between the two points of attachment. Usually, the origin is thought of as the more "fixed" point of attachment, whereas the point of insertion moves the structure to which it is attached. This is convenient anatomical terminology but, in reality (and especially for many muscles of the speech mechanism), it is not quite so simple.

Figure 10–8.  Two views of the vocal folds from above (left: vocal folds open; right: vocal folds closed). In both images, the front (anterior) of the neck is at the bottom of the image, where the vocal folds come together on the inner surface of the thyroid cartilage. The left image shows the vocal folds open; the glottis is wide. The vocal folds form the point of a "V" where they meet at the thyroid cartilage. The right image shows the vocal folds closed. Technically, when the vocal folds are closed there is no glottis. However, the term "glottis" is widely used to refer to the vocal folds, whether closed or not.

The opening between the vocal folds is called the glottis (see Figure 10–8). With the vocal folds open, air can flow from the lungs and through the larynx to the airways of the throat, nose, and oral cavities. Think of the vocal folds shown in the left image of Figure 10–8 as an open valve between the lungs on the one hand and the upper airways on the other hand.

The open vocal folds allow a view of rings of tissue beneath the glottis. The structure containing these rings is the tube-like windpipe (trachea). This is the large air tube that gives off increasingly smaller tubes that reach deep into the lungs, where air can be absorbed by alveoli to carry oxygen to cells of the body, and that allows CO2 to flow from the alveoli to the atmosphere.

The right side of Figure 10–8 shows a photograph of closed vocal folds. Vocal fold closure in the absence of phonation, the term used for the sound made by the vibrating vocal folds, occurs during exertion (such as lifting a heavy object, or going to the bathroom) and, importantly, as part of the swallowing process. Vocal fold closure during swallowing is critical to protecting the airway during the passage of fluids and solid food from the mouth and pharynx into the esophagus (Chapter 20).

The structure of the vocal folds is complex and includes both muscular and nonmuscular tissues. The nonmuscular parts include a membranous casing that covers the main bulk, or muscular body, of the vocal fold. Very importantly, even though the membranous cover and muscular body are part of the same structure — the vocal fold — the two parts can move somewhat independently of each other. Moreover, the degree of their independent movement can be adjusted in fine increments, depending on the contraction pattern of laryngeal muscles.

The fine structure of vocal fold tissues, as viewed under a microscope, has been studied in a fair amount of detail, with some surprising results. The eminent Japanese physician Minoru Hirano (1932–2017) devoted much energy to histological study of vocal folds in humans and various animals ("histology" is the study of the microscopic characteristics of biological tissue).

Hirano discovered that the adult human vocal fold has a complicated tissue structure unlike that of any other species. This complicated tissue structure is also not seen in human infants but develops as children mature into adulthood. The difference between human and animal vocal folds was surprising because the primary function of the vocal folds is often regarded as protection of the airway, as noted earlier. Because protection of the airway is so important for health and even life in humans and animals, the vibratory function of the vocal folds — their sound-producing capabilities — had sometimes been regarded as a secondary capability of the mammalian larynx, as if the vocal folds were adapted to a secondary purpose (phonation) while maintaining their primary role of protecting the lungs.

Hirano's research suggested otherwise. His discovery of the elaborate tissue structure of the human vocal folds suggested strongly that they were specialized for human voice. This specialization explained the remarkable range of voice qualities produced by humans and the role played by phonation in the fine nuances of human communication.

Phonation

Phonation is the production of sound by the nearly periodic (repeating over time) vibration of the vocal folds. The repetitive opening and closing of the vocal folds during phonation is controlled by air pressures, air flows, and elastic characteristics of vocal fold tissue. While it is true that laryngeal muscles can open and close the vocal folds, as described earlier, the opening and closing movements of vocal fold vibration are not produced by repetitive muscle contractions.

To initiate phonation, a speaker brings her vocal folds together through the action of laryngeal muscles. Voice scientists refer to this action as adduction of the vocal folds. At the same time, the speaker develops a positive lung pressure, using the muscles and elastic properties of the respiratory system. Because the trachea contains air continuous with that of the lungs, pressure in the lungs and in the trachea, immediately below the closed vocal folds, is essentially the same. The positive tracheal (lung) pressure acts as a force against the closed vocal folds. This pressure overcomes the force of the muscles that adducted them and blows the vocal folds apart. Through a complex interaction of mechanical (e.g., elasticity of vocal fold tissue) and aeromechanical (e.g., air pressures and flows) forces, the vocal folds are displaced outward and then return to the midline, closing rapidly and firmly. With the vocal folds again in the closed position, this cycle of events is repeated, because the tracheal (lung) pressure is still positive and blows the vocal folds apart each time they close. Recall from the first section of the chapter that the goal of speech breathing is to maintain a constant lung pressure during speech. In the present case, we can say the constant lung pressure is maintained as long as phonation is produced.

Figure 10–9 shows a series of images from a high-speed digital recording of one cycle of vocal fold vibration. The instrument used to record these images is a sophisticated digital camera that records sequences of images in very rapid succession. The images are photographed from the reflection of the vocal folds in a laryngeal mirror, as previously described (see Figure 10–7). In each image, the front of the vocal folds is at the bottom, and the back of the vocal folds is at the top. The successive images are recorded so rapidly because a cycle of vocal fold vibration lasts only a fraction of a second (e.g., in women, about .005 seconds). One cycle of vocal fold vibration is defined as movement from the closed position (frame 1 in Figure 10–9) to the most open position (frame 5), and back to the closed position (frames 9 and 10).

When the series of still images in Figure 10–9 is watched as a video, the highly complex motions of the vocal fold tissues are revealed (Google "vocal fold motion," select "video," and many options for viewing vocal fold motion are available; see Box, "Waving in the Wind"). The membrane that encases the muscular bulk of the vocal fold vibrates somewhat independently of the muscle, giving the motion of the vocal folds during phonation a rippling appearance across their top surface. In other words, during typical phonation the vocal folds do not vibrate like rigid pistons moving back and forth. Rather, the motions appear complex, almost wavy. This complex motion is what gives the human voice its unique sound. The pattern of vocal fold movement varies depending on factors such as the pitch of the voice and the smoothness or roughness of the voice quality.

Waving in the Wind

The opening and closing motions of the vocal folds during vocal fold vibration are controlled by aeromechanical forces. The opening and closing motions are not produced by rapid muscular "pulls and pushes"; in fact, the muscles cannot respond quickly enough to produce the extremely rapid motions of the vocal folds. The fact that aeromechanical forces control vocal fold vibration raises an interesting question about the motions (or lack thereof) of a paralyzed vocal fold. Vocal fold paralysis usually occurs only on one side (one paralyzed vocal fold, the other healthy). The paralyzed vocal fold still moves during phonation, because it is moved to the middle of the glottis and away from the middle by aeromechanical forces. However, the vibration is "floppy" and ineffective due to the loss of muscular control for vocal fold tension. Vocal fold paralysis is discussed at greater length in Chapter 18.

Figure 10–9.  Sequential photographs of one cycle of vocal fold vibration as imaged via a laryngeal mirror. The cycle begins in the upper left-hand frame (frame 1), continues along the top row of images from left to right (frames 1–5), and then along the bottom row of images, from left to right (frames 6–10). The cycle is complete at the right-hand image in the bottom row. Frame 6 is labeled to show the upper and lower edges of the vocal fold; this illustrates that the vocal folds do not vibrate like a "solid" piston but have complex, wave-like motion. Adapted from Hixon, T. J., Weismer, G., and Hoit, J. D. (2020). Preclinical Speech Science: Anatomy, Physiology, Acoustics, Perception (3rd ed.). San Diego, CA: Plural Publishing.

Characteristics of Phonation

Three variables are useful in a description of phonation: fundamental frequency (F0), voice intensity, and voice quality. The variables are not necessarily independent from each other and are not the only ways to describe phonation. They are convenient for the purposes of this introductory discussion.

Fundamental Frequency (F0)

Fundamental frequency, abbreviated as F0 (spoken as "F subzero" or "F-oh"), is the rate of vibration of the vocal folds, expressed in cycles per second. One complete cycle is defined as the motion of the vocal folds from the closed position to the open position and back to the closed position (see Figure 10–9). When 100 of these cycles occur in 1 s, then F0 = 100 cycles per second.

The typical F0 for young adult females is roughly 190 to 200 Hz (Hz, short for hertz, is the abbreviation for cycles per second, named after Heinrich Rudolf Hertz, a 19th-century German physicist). Each of the 190 to 200 cycles per second is similar to every other cycle, but not every cycle is of exactly the same duration. The description of vocal fold vibration as nearly periodic recognizes these slight differences (less than 1/1,000th of a second) in the durations of successive cycles.
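The arithmetic behind these values is worth making explicit: F0 is simply the reciprocal of the period (T), the duration of one cycle. As a worked equation, using the durations just mentioned:

```latex
F_0 = \frac{1}{T}, \qquad
T = 0.01\,\text{s} \;\Rightarrow\; F_0 = 100\ \text{Hz}, \qquad
T = 0.005\,\text{s} \;\Rightarrow\; F_0 = 200\ \text{Hz}
```

The second case corresponds to the roughly .005-s cycles of a typical young adult female voice described above.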
F0 varies with age and sex (as well as some other factors) and is the primary determinant of the pitch of a talker's voice. Pitch is the perceptual correlate of the physical measurement, F0. The higher the rate of vibration of the vocal folds, the higher is the perceived pitch of the voice. This can be illustrated by comparing typical F0s and pitches of children's, women's, and men's voices. Five-year-old children typically have F0s around 350 Hz (for either sex), young adult women have F0s of 190 to 200 Hz, and young adult men have F0s around 125 Hz. Thus, children at age 5 years typically have higher-pitched voices than adult women, who have higher-pitched voices than adult men.

Why does the F0 differ so much across these three speaker groups? What causes the vocal folds of a 5-year-old child to vibrate at a faster rate than those of adult women, and the vocal folds of women to vibrate at a faster rate than those of men? A simplified answer to this complicated question is that, across speakers, the length of the vocal folds is a primary factor in the typical F0 of the voice (Titze, 2011). For example, children have shorter vocal folds than adult women, who in turn have shorter vocal folds than adult men.³ Figure 10–10 illustrates the difference in length of the vocal folds for adult males, adult females, and 5-year-old children (Rogers, Setlur, Raol, Maurer, & Hartnick, 2014).

Figure 10–10.  Vocal folds viewed from above, showing the different lengths of the adult male (about 13 mm), adult female (about 11 mm), and 5-year-old child (about 7.5 mm) vocal folds, as well as the sizes of the laryngeal cartilages.

³ These age- and sex-related differences in the length of the vocal folds follow differences in overall size of the larynx. Children have smaller larynges (plural of larynx) than female adults, who have smaller larynges than male adults.

Notice that we have said that across speakers the length of the vocal folds is a primary determinant of F0. Within a given speaker, however, F0 is increased by contraction of a paired muscle that stretches the vocal folds. This is not a contradiction of the statement of vocal fold length as the primary anatomical reason for different F0s in children, women, and men. When a given speaker stretches his or her vocal folds, the tissue is not only longer but also thinner and more tense. These factors cause the vocal folds to vibrate at a faster rate (i.e., with a higher F0). The comparison across children, women, and men assumes that the vocal folds in the respective age and sex groups are in the unstretched state.
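A simplified way to see why length (across speakers) and stretching (within a speaker) can both govern F0 is the ideal-string approximation sometimes used in voice science. It is an illustration, not a complete model of vocal fold vibration:

```latex
F_0 = \frac{1}{2L}\sqrt{\frac{\sigma}{\rho}}
```

Here L is the length of the vibrating tissue, σ is the tensile stress in the tissue, and ρ is the tissue density. Taken alone, a longer L gives a lower F0, which fits the child-woman-man comparison of unstretched vocal folds. When a speaker stretches his or her vocal folds, L increases, but the stress σ increases proportionally much more, so the net within-speaker effect of stretching is a higher F0.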
F0 is more than a marker of a person's age and sex. A single individual uses a wide range of F0s for normal speech communication. The variation of F0 during normal speech is an important component of prosody, a term that refers to the melody and rhythm of speech. When you listen to a speaker who seems particularly expressive, you are probably reacting to (among other factors) relatively large variations in F0 throughout the speaker's utterances. When you listen to the voice of someone who is exceptionally sad, her voice may sound as if it is produced on a single pitch, or nearly so. Speakers use F0 to convey emotion, to subtly change the meaning of the same words, to be sarcastic, and to be playful.

Intensity

Intensity is a term used to describe the amount of energy in a sound wave. Intensity is a physical measure that reflects the amount of acoustic energy generated by the vibrating vocal folds. Unlike F0, it is difficult to offer "typical" values of voice intensity because they depend so much on characteristics of the communication situation, such as how much noise there is when speaking and the distance between the speaker and listener.

The distance between a speaker and listener has an effect on the voice intensity reaching the listener's ear. The sound waves produced by a speaker saying a sustained "ahhhh" have a sound intensity that can be measured at the lips. As shown in the top cartoon of Figure 10–11, the sound waves travel to the listener's ear, where they also can be measured for intensity. These measurements reveal that the intensity measured at the listener's ear is less than the intensity measured at the speaker's lips. The decreasing intensities of the sound waves are shown as the decreasing sizes of the arcs from the speaker to the listener. The fundamental principle of acoustics that applies to the measurements is that sound intensity decreases over distance, from the source of the sound (in this case the speaker's lips) to the receiver of the sound energy (in this case the listener's ear).

Figure 10–11.  Cartoon illustrating the effect of distance on voice intensity. The person producing the vowel "ahhh" does so with the same speech intensity measured at his lips, in both the top and bottom cartoons. The intensity decreases steadily as the sound waves move away from him. At the listener's ear, the intensity is less when she is at a greater distance from the speaker.

The bottom cartoon of Figure 10–11 shows what happens when the distance between speaker and listener is doubled compared to the distance shown in the top cartoon. The speaker in the lower cartoon produces the same sound intensity as in the top cartoon. At this doubled distance, the sound intensity reaching the listener's ear is even less than in the top cartoon. The greater the distance between speaker and listener, the greater is the loss of sound energy from source to receiver. Loudness is proportional to sound intensity, so the listener hears the "ahhh" much less well in the bottom cartoon than in the top cartoon, even though the speaker's sound intensity at the lips is the same in both cases.
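The cartoon can be given approximate numbers. Under an idealized free-field assumption (sound spreading spherically from a small source, with no reflecting walls), intensity falls with the square of distance. The short sketch below, with hypothetical distances, shows the textbook consequence: each doubling of the speaker-listener distance costs about 6 dB. Real rooms, with their reflections and absorption, behave less tidily.

```python
import math

def level_drop_db(d_near_m: float, d_far_m: float) -> float:
    """Drop in sound intensity level (dB) between two distances from a
    source, assuming idealized spherical (inverse-square) spreading."""
    # I is proportional to 1/d**2, so the level difference is
    # 10*log10(I_near/I_far) = 20*log10(d_far/d_near).
    return 20.0 * math.log10(d_far_m / d_near_m)

# Hypothetical listener positions, echoing the top and bottom cartoons:
print(level_drop_db(1.0, 2.0))  # doubling the distance: about 6.0 dB
print(level_drop_db(1.0, 4.0))  # doubling it again: about 12.0 dB
```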
Commonsense Speech and Listener Therapy

Parkinson's disease (PD) is a degenerative disease of the nervous system in which movement functions, including articulatory and laryngeal movements, are affected. A prominent characteristic of the speech and voice disorder in PD is reduced speech intensity. Persons with PD have very stiff respiratory muscles and therefore cannot generate the lung pressure to produce speech with normal intensity. To compound the problem, vocal fold vibration is inefficient. This results in air leaking through the glottis, further affecting the buildup of pressure below the vocal folds that is so essential to generating adequate speech intensity. Speech-language pathologists often focus on training the person with PD to produce greater intensity so that listeners will hear them more easily. Especially with family members, SLPs can also suggest reducing the distance between the relative with PD and the listener. It sounds obvious, but listeners do not always get it unless they are told about distance and intensity. Plus, this kind of speech therapy is simple to implement, very effective, and best of all, free.

Quality

Quality is a perceptual term, like pitch and loudness. The terms breathy voice, harsh voice, rough voice, and metallic voice are some of the perceptual descriptions of voice quality. The physical basis of differences in voice quality includes variations in frequency, intensity, and the presence of "noise" in the sound wave. "Noise" is an acoustic term indicating energy that is not periodic but has random frequency and intensity variations.

To demonstrate the difference between the tonal quality of nearly periodic vibrations and the noisy quality of nonperiodic vibrations, say "ah" for a few seconds in a normal voice, followed by a long whispered "ah" sound. The "ah" said with a normal voice consists primarily of periodic energy — in fact, it is the periodic nature of the energy that makes the "ah" sound tonal. The whispered "ah," conversely, does not sound tonal but rather has a hissing quality of uncertain pitch. The whispered "ah" sound is an example of noise energy.
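The "ah"-versus-whisper demonstration can also be mimicked numerically. The sketch below is a toy illustration (not a model of real phonation): it builds a perfectly periodic, harmonically rich signal and a random-noise signal, then checks whether each signal repeats after one F0 period.

```python
import numpy as np

sr = 16000                      # sample rate in Hz (illustrative value)
t = np.arange(sr) / sr          # one second of time points

# "Voiced ah"-like signal: energy only at F0 and its harmonics
f0 = 100.0
tonal = sum(np.sin(2 * np.pi * f0 * k * t) / k for k in range(1, 6))

# "Whispered ah"-like signal: random variation, no repeating cycle
noise = np.random.default_rng(0).normal(size=sr)

# Crude periodicity check: correlate each signal with itself shifted
# by one F0 period (160 samples at 16 kHz for a 100-Hz F0)
lag = int(sr / f0)
for name, x in [("tonal", tonal), ("noise", noise)]:
    r = np.corrcoef(x[:-lag], x[lag:])[0, 1]
    print(name, round(r, 3))    # tonal: about 1.0; noise: about 0.0
```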
Clinical Applications:  An Example

When the vocal folds vibrate for phonation, they open and close at a very fast rate, as discussed earlier. Look again at Figure 10–9 and notice that at the beginning (frame 1) and end (frame 10) of this single cycle, the vocal folds are tightly closed. This is characteristic of healthy phonation, although there are variations in the population in which the closure is not so tight but the phonation remains perceptually "normal." There are clinical cases, however, in which the voice is too breathy on the one hand, or too "tight" on the other hand. The underlying cause of these voices is often a failure to close the vocal folds adequately on each cycle of vibration (breathy), or a closure that is too fast and forceful ("tight").

An SLP who treats either one of these voice problems must know the structure and function of the vocal folds and determine a way to modify vocal fold closure. In the case of a chronically breathy voice, the SLP may teach the client to exert more effort when phonating, which often has the effect of a better approximation of the vocal folds during vibration. The person who closes the vocal folds too forcefully and has the "tight" voice may be shown how to relax neck muscles, or even be a candidate for an external (on the neck) laryngeal massage to reduce overall muscle tension that may be contributing to the excessive phonatory effort. Knowledge of the anatomy and physiology of the larynx is essential to considering the best options for voice therapy.

Upper Airway (Consonants and Vowels)

Figure 10–12 shows an artist's rendition of upper airway and nasal cavity structures. The view is in the sagittal plane, as if you are looking toward the side of the head.

Figure 10–12.  Sagittal view of the upper and nasal airways, with important landmarks labeled (nasal cavities, hard palate, soft palate, tongue, pharynx, vocal folds, thyroid cartilage).

The upper airway can be thought of as two columns of air, one with variable shape, the other with relatively fixed shape. The air tube with variable shape is called the vocal tract and extends from the vocal folds to the lips. In Figure 10–12, this airway is filled with a dark blue color. The boundaries of the vocal tract tube include the walls of the throat, the soft and hard palates (the latter typically called the roof of the mouth), the lips, and the tongue. The length and shape of the vocal tract can be changed by the action of most of these movable structures (such as the throat, soft palate, lips, and tongue; the hard palate is not movable), plus the movements of the lower jaw (mandible), which is connected anatomically to the lower lip and tongue. Thus, movements of the jaw can affect movements of the lips and tongue.

The column of air with relatively fixed shape is called the nasal tract (see Figure 10–12, light blue). The shape of the nasal tract does not change much during speech because the structures forming its boundaries do not move, at least not to a significant degree. As described in greater detail in the section "Velopharyngeal Mechanism," during speech the nasal tract is intermittently connected to and disconnected from the vocal tract. Figure 10–12 shows the nasal tract connected to the vocal tract airways by the lowered soft palate (lighter red) and disconnected from the vocal tract airways (darker red). The mechanism of this connection/disconnection is of the utmost importance for understanding normal speech production and certain speech pathologies.

Muscles of the Vocal Tract

Approximately 30 muscles contribute to movements of the structures that change the length and shape of the vocal tract (Hixon, Weismer, & Hoit, 2020). Muscles of the vocal tract include about a half-dozen in the throat, three or four in the soft palate, about eight that can move and shape the tongue, and a dozen or so that can open and close the lips and shape their configuration. Combinations of these muscles, working at the same time and with intricate timing, create many different shapes of the vocal tract.

Several muscles control the outlets of the nasal passages. These outlets, called the nares, can be flared and constricted during speech. The size of the nares may also be adjusted during breathing for such behaviors as singing and exercise.

Vocal Tract Shape and Vocalic Production

The vocal tract can be considered the "shaper of speech sounds" because the shape of the column of air between the vocal folds and lips determines the acoustic characteristics of a speech sound.

Different vocal tract shapes result in different acoustic sound waves emerging from the lips, which in turn result in the different sounds we call speech sounds. In some cases, a connection between the vocal tract and nasal cavities is made for a class of sounds called nasals (such as /m/ and /n/ in English).

Recall that the source (phonation) for voiced speech is the sound produced by the vibrating vocal folds. This sound source is basically the same for all voiced, vocalic speech sounds such as vowels, diphthongs, nasals, liquids, rhotics, and glides; the phonated sound source is also used for voiced stops and fricatives. Different shapes of the vocal tract, and connections to the nasal cavities, modify the acoustic characteristics of the vibrating vocal folds to form different speech sounds. More specific information on individual speech sounds is provided in Chapter 12.

Figure 10–13 shows vocal tract shapes for five vowels. These images were obtained from magnetic resonance (MR) scans published by Zhou, Woo, Stone, Prince, and Espy-Wilson (2013). The black parts in each image are airways. The vocal tract airway is seen extending from the level of the vocal folds (shown in the left-most image) to the lips, and the nasal airways are shown above the soft and hard palates. The airway below the vocal folds is the windpipe (trachea).

Figure 10–13.  Magnetic resonance images of an individual producing five different vowels: /a/ "ah" as in "hot," /e/ "ay" as in "hay," /i/ "ee" as in "heat," /o/ "oh" as in "hope," and /u/ "oo" as in "hoop." The black areas above the vocal folds are airways. The varying shapes of the airways from vocal folds to lips — excluding the airways in the nasal cavities — show how the vocal tract is shaped by the articulators for the different vowels. The openness of the velopharyngeal port also varies across the vowels. Adapted from Zhou, X., Woo, J., Stone, M., Prince, J. L., and Espy-Wilson, C. Y. (2013). Improved vocal tract reconstruction and modeling using an image super-resolution technique. Journal of the Acoustical Society of America, 133, EL439–EL445.

Note the different shapes of the vocal tract among these five vowels. For example, compare the vocal tract shapes of /i/ and /u/. In the case of /i/, the front part of the vocal tract, roughly from the middle of the hard palate to the teeth, is tightly constricted by the tongue; the slim black channel between the tongue and palate at the constriction shows the narrow, open airway. Behind the constriction, in the back of the oral cavity and in the pharynx, the large black area shows that the vocal tract opens wide into a large airway.

The vowel /u/ has a tight constriction further back in the vocal tract, in the region of the soft palate and the upper pharynx. In front of this constriction, the vocal tract opens until a point at the lips where the vocal tract is again narrowed down.

The difference between vocal tract shapes for /i/ and /u/ is easy to see, and consistent with the phonetic description of /i/ as a high-front vowel and /u/ as a high-back vowel (Chapter 12). Vocal tract shape differences between other vowels may be more subtle (compare /ɑ/ to /u/ in Figure 10–13) but are sufficient to result in different speech sounds.

Velopharyngeal Mechanism

The velopharyngeal mechanism includes the soft palate and surrounding pharyngeal walls. The soft palate is a soft-tissue, movable structure of the speech mechanism with critical importance to speech production. Figure 10–12 shows the soft palate in two positions: one hanging down as it does at rest (lightly shaded, solid line following the contour of the structure) and one raised and pushed against the back of the pharynx (colored red).

The hanging-down position allows air to flow from the lower pharynx into the nasal cavity. The lifted position, in which the velum is pushed against the back of the pharynx, seals off the air in the pharynx from the air in the nasal cavities. The region around the soft palate and posterior pharynx is called the velopharyngeal port. The velopharyngeal port is a valve that can be opened or shut by the action of a group of muscles that control the movement of the soft palate and pharynx.

The velopharyngeal port is a critical component of the speech mechanism because some sounds require the port to be open (e.g., nasal sounds such as /m/ and /n/), some sounds require it to be completely closed (e.g., stop and fricative consonants), and some sounds are produced with the port partially open (e.g., certain vowels; see later in the chapter). (Movements of the soft palate, which are frequent and rapid during speech production, can be seen at https://www.youtube.com/watch?v=T4KRbENmFDk; if that video is not available, search on "MRI speech" for additional relevant videos.) The velopharyngeal port also plays a role in rest breathing and swallowing. The velopharyngeal port is typically open in rest breathing, during which most people exchange air through the nasal passageways. The velopharyngeal port must be closed during swallowing to prevent movement of food and liquid into the nasal passageways (see Chapter 20). And, when you blow out your birthday candles, the velopharyngeal port is closed to prevent the escape of air through the nasal cavities, which might reduce the flow of air through your lips and therefore reduce the likelihood of extinguishing all the candles (and having your wish come true).

The closure of the velopharyngeal port for stops, fricatives, and affricates (collectively called obstruents, because in varying degrees they obstruct airflow) is necessary to build up air pressure in the vocal tract. The positive air pressure creates the conditions that are unique to the sound of these consonants (see Chapter 12). Stop consonants are produced when there is a complete blockage of the airstream flowing in the vocal tract. During this blockage, air pressure will build up, but only if the vocal tract is sealed at the velopharyngeal port to prevent leaks of air through the nasal cavity. This is illustrated in the left image of Figure 10–14, which shows closed lips (red pointer) and a closed velopharyngeal port (green pointer). This bilabial stop consonant (/p/ or /b/ in English) has a closed volume of air behind the blockage: one seal at the lips and the other at the velopharyngeal port. Pressure can build up behind the labial blockage for the unique sound characteristics of a stop consonant — a "popping" noise when the block is broken (in this case, when the lips are separated).

Figure 10–14.  Left, closure of the vocal tract at the lips (red pointer) and velopharyngeal port (green pointer) for the stop consonant /b/. Right, tight constriction of the vocal tract in the palatal-alveolar region (red pointer) and closure of the velopharyngeal port for the fricative /ʃ/. Note the tight velopharyngeal port closure for both sounds, necessary for the buildup of air pressure behind the constrictions.

The right image of Figure 10–14 shows a constriction for the fricative /ʃ/ (as in the first sound of "shoe"). Fricative constrictions, or blockages, are not complete as in the case of stops but are sufficiently narrow to allow pressure to build up behind them. The /ʃ/ constriction is shown at one location by the red pointer to one part of the tongue, but as the image shows, the constriction is long and narrow. Pressure built up behind this constriction pushes air through it and creates a hissing noise (try producing a long "sh" and listen to the hissing noise). As in the case of stop consonants, a closed velopharyngeal port is required for efficient pressure buildup. The closed velopharyngeal port is shown clearly in the upper right of the /ʃ/ image. The lips, however, are open to allow the passage of air that emerges from the tight but not complete constriction.

Valving in the Vocal Tract and the Formation of Speech Sounds

"Articulatory behavior for speech" refers to the positions and motions of the jaw, tongue, lips, pharynx, and velum, as well as the contacts between these movable structures and the rigid boundaries of the vocal tract (such as the hard palate). As might be imagined, obtaining information on positions and motions of certain articulators, especially the tongue, velum, and pharynx, is exceedingly difficult because they are largely hidden within the oral and pharyngeal cavities. Speech scientists have developed several different techniques over the years to study motions of these hidden articulators. Many of these techniques involve x-ray and magnetic field technologies. In recent years, magnetic resonance imaging (MRI) techniques have been used to study speech movements. Regardless of the technique, one fact remains startlingly obvious: During speech, each of the articulators is in constant motion, making it nearly impossible to connect specific movements with specific sounds. A speech scientist cannot look at the motion of a particular articulator and identify where it begins its contribution to a sound (such as a vowel) and where it ends its contribution. For example, when a speaker says the word electricity, we can represent its 11 component sounds — ee, l, eh, k, t, r, ih, s, ih, t, ee — as a series of discrete symbols (/əlɛktrɪsɪɾi/, in phonetic transcription; Chapter 12), but the articulatory motions used to produce the word appear as a rapid, sometimes jerky blur of undifferentiated gestures. Good examples of speech movements are available at https://www.youtube.com/watch?v=Nvvn-ZVdeqQ and https://www.youtube.com/watch?v=ezOwCf835YA (enter "speech movement" as a key phrase for more examples).

Knowledge of how articulatory movements relate to speech sounds is likely to be beneficial when attempting to treat a speech disorder in which the underlying problem is one of motor control (as in, for example, Parkinson's disease or cerebral palsy; see Chapter 14). For the time being, our knowledge of the relationship of articulatory movements to specific speech sounds is very limited. It is a complex issue.

Coarticulation

Coarticulation is the term used to describe the influence of one speech sound on another speech sound. For example, even though the words "sad" (/sæd/) and "sag" (/sæg/) share the vowel "ae" as in "hat," tongue movements throughout the vowel (i.e., across time) differ because of the difference in the final stop consonant. The "d" sound affects the "ae" in one way, the "g" sound in another way. Now consider the word pair "sheik" (/ʃik/) and "shock" (/ʃɑk/), in which the final consonants are the same and the vowel is either "ee" or "ah." Here the different vowels affect the articulation of the /k/; the movements and location of the complete constriction for /k/ differ depending on the vowel. In this case, the articulation of the consonant is affected by the preceding vowel; in the first case, the articulation of the vowel is affected by the following consonant.

Coarticulation is everywhere in speech production. The articulation of any speech sound is simultaneously affected by the sounds preceding it and the sounds following it. Notice the word "sounds" in the preceding sentence. The influence of coarticulation is present not only for adjacent sounds, as in the examples of "sad" versus "sag" and "sheik" versus "shock," but it can extend across multiple sounds. A good example of this is the word combination "least soon," in which lip rounding for /u/ may be observed on the "st" consonant sequence in "least." When a single sound — what is often called a sound segment — is affected by so many variables, no wonder it is nearly impossible to connect specific articulatory movements with specific sound segments.

The Weedy Garden of Speech

So . . . articulatory movements for any specific speech sound, and their acoustic results, depend on the sounds preceding and following that speech sound. This is variability of a specific speech sound due to its phonetic context — the identity of the surrounding speech sounds. But wait: phonetic context is hardly the only variable that affects the movements and acoustic characteristics of a speech sound. The list of other variables that can cause variability of the articulatory and acoustic characteristics of a speech sound includes speaking rate (how fast a person talks), dialect, speaking style (e.g., casual versus formal), the level of emphasis placed on the speech sound, and the age and sex of the speaker. Speech scientists have spent years researching coarticulation and the variability that results from it, due to all of these variables. This research has produced many answers, but along the scientific way, new questions have popped up like weeds in a well-fertilized garden.

Clinical Applications:  An Example

Knowledge of the location and function of valves in the speech mechanism is essential to accurately diagnose and properly treat speech disorders. A good example is the case of a client who is perceived to speak with excessive nasality. In many cases, excessive nasality can be traced to a problem with control of the velopharyngeal port. Structural problems, as in the case of a repaired cleft palate, or muscle weakness in the absence of an obvious structural issue, can prevent the velopharyngeal port from closing sufficiently during the production of vowels. This may result in hypernasality. Specialized techniques for diagnosis and therapy can be used by the person who understands both the anatomical structure of the speech mechanism valves — in this example, the velopharyngeal port — and the physiology (function) of those valves.

Chapter Summary

The speech mechanism can be thought of as consisting of three major components.

The respiratory system is the power supply, generating pressures and flows. These pressures and flows initiate and maintain vibration of the vocal folds, which generates the sound source for speech.

The acoustic signal generated by the vibrating vocal folds is shaped into different speech sounds by the moving structures and fixed boundaries of the upper airway.

The respiratory system, larynx, and upper airways are composed of many muscles, membranes, ligaments, bones, and cartilages, all of which are covered in detail in speech anatomy and physiology coursework taken by students majoring in Communication Sciences and Disorders.

Knowledge of this basic anatomy and physiology is important to understanding the many disorders and diseases that affect the speech mechanism.

The respiratory system functions to support life, exchanging O2 and CO2 at the alveoli; air is brought into the lungs by expanding them (inhalation) and transported out of the lungs by compressing them (exhalation).

The respiratory system is composed of muscles in the thorax and abdomen, plus a large muscle (the diaphragm) that separates the thoracic and abdominal cavities; many nonmuscular tissues (membranes, bone, cartilage) play an important role in the respiratory system.

Rest (vegetative) breathing involves inhalations and exhalations of small volumes and roughly equal durations.

Speech breathing involves very rapid inhalations in preparation for an utterance and relatively long exhalations during which speech is produced.

The main goal of the respiratory system is to maintain a constant, positive pressure in the lungs and trachea during speech utterances; this pressure is critical to vibrating the vocal folds for phonation.

The pressure generated in the lungs (and trachea) for speech is related to the loudness of speech: the higher the pressure, the greater is the loudness.

During speech, the lungs are compressed by the muscular actions of both the thorax and abdomen; abdominal muscles are critical to maintaining an increased lung pressure for speech at a constant level.

The larynx is composed of a framework of cartilage, membranes, ligaments, and muscle; the bottom part of the larynx sits on top of the trachea, and its top part can be considered the lower end of the throat.

The vocal folds are two bands of tissue that run from the front to the back of the larynx, between the thyroid cartilage (front point of attachment) and the arytenoid cartilages (back point of attachment).

There are five intrinsic muscles of the larynx. Three of these muscles can close the vocal folds, one can open the vocal folds, and one can stretch them. One of the muscles that closes the vocal folds also tenses the main muscular bulk of the vocal folds.

The vocal folds open and close. When they open, the space between them is called the glottis, which is the opening into the trachea. The vocal folds open to allow air into and out of the lungs, via the trachea, for breathing purposes.

The protective function of the larynx is to shut down the airway (to close the vocal folds rapidly and forcefully) when food, liquid, or other material enters the top part of the larynx.

The vocal folds also open and close to vibrate for the purposes of phonation; phonation provides the sound source for speech, sometimes referred to more specifically as voice production.

The opening and closing motions of the vocal folds for phonation are controlled by aerodynamic (pressures and flows) forces, rather than directly by muscle contraction.

The rate at which the vocal folds vibrate depends primarily on the length of the vocal folds.

The tissue of the human vocal fold is specialized for purposes of phonation.

Important characteristics of vocal fold vibration include the fundamental frequency (F0 = rate of vibration), the intensity of the sound generated by the vibration, and the quality of the voice resulting from the vibration.

The upper airways include the vocal tract, which is the tube of air between the vocal folds and the lips, and the nasal tract, the airway between the velopharyngeal port and the nostrils.

The articulators shape the vocal tract for vocalic sounds, in which the vocal tract is open and air can flow freely to the atmosphere, and for a class of consonants called obstruents, in which there is a constriction that completely or partially blocks the airflow for a short time interval.

An important valve between the vocal and nasal tracts is the velopharyngeal port, where air coming through the throat can be blocked from entering the nasal cavities. The velopharyngeal port opens and closes during speech production, according to the requirements of the speech sound being produced.

References

Darley, F. L., Aronson, A. E., & Brown, J. R. (1975). Motor speech disorders. Philadelphia, PA: Saunders.

Hixon, T. J., Goldman, M. D., & Mead, J. (1973). Kinematics of the chest wall during speech production: Volume displacements of the rib cage, abdomen, and lung. Journal of Speech and Hearing Research, 16, 78–115.

Hixon, T. J., Weismer, G., & Hoit, J. D. (2020). Preclinical speech science: Anatomy, physiology, acoustics, perception (3rd ed.). San Diego, CA: Plural Publishing.

Rogers, D. J., Setlur, J., Raol, N., Maurer, R., & Hartnick, C. J. (2014). Evaluation of true vocal fold growth as a function of age. Otolaryngology–Head and Neck Surgery, 10, 681–686.

Su, M. C., Yeh, T. H., Tan, C. T., Lin, C. D., Linne, O. C., & Lee, S. Y. (2002). Measurement of adult vocal fold length. Journal of Laryngology and Otology, 116, 447–449.

Titze, I. R. (2011). Vocal fold mass is not a useful quantity for describing F0 in vocalization. Journal of Speech, Language, and Hearing Research, 54, 520–522.

Zemlin, W. R. (1997). Speech and hearing science: Anatomy and physiology (4th ed.). Boston, MA: Pearson.

Zhou, X., Woo, J., Stone, M., Prince, J. L., & Espy-Wilson, C. Y. (2013). Improved vocal tract reconstruction and modeling using an image super-resolution technique. Journal of the Acoustical Society of America, 133, EL439–EL445.
11
Speech Science II

Introduction

In the previous chapter, reference was made to the acoustic signal emerging from the vocal tract (or nasal tract, or both at the same time). For the remainder of this chapter, we refer to this as the speech acoustic signal. The speech acoustic signal is the product of the respiratory, laryngeal, and upper airway behaviors described in the previous chapter.

The current chapter presents a brief introduction to the speech acoustic signal. Speech production can be thought of as the concerted action of all structures of the speech mechanism to produce an acoustic signal that can be recognized by a listener as an intelligible message. The connection between the speech acoustic signal and speech intelligibility suggests the importance of knowing about speech perception as well. In the absence of a listener who performs perceptual analysis of the speech acoustic signal, the signal is a little like the proverbial tree that falls in a forest empty of hearing organisms. The human forest is full of hearing organisms, lots of them human, so speech perception is also considered in the current chapter.

To avoid confusion between orthographic and phonetic representations of sounds, phonetic symbols (discussed in Chapter 12) are used here, usually following the orthographic representation: "beat" /bit/, "hat" /hæt/. Orthographic representations are enclosed by quotation marks; phonetic symbols are enclosed in forward slashes. For quick reference, the phonetic symbols used in this chapter are listed in Table 11–1.

Table 11–1.  Phonetic Symbols Used in This Chapter

/ɑ/  "ah" in "hot"
/i/  "ee" in "beat"
/u/  "oo" in "boot"
/æ/  "ae" in "hat"
/ɝ/  "urr" in "bird"
/ɚ/  short "rr" in the last syllable of "longer"
/ɾ/  tapped "d" in "butter"
/dʒ/  "j" as in "judge"
/z/  "zee" as in "zebra"
/ð/  "th" sound in "the"
/ɔ/  in "bought"*

*The vowel /ɔ/ is pronounced as /ɑ/ in many dialects of American English, such as the dialect heard in many parts of California. Words such as "caught" and "cot" are pronounced with the same vowel ("caught" /kɑt/, "cot" /kɑt/). This is in contrast to the "caught" /kɔt/ and "cot" /kɑt/ heard in eastern Pennsylvania and eastern Maryland, among other regions of the United States.


The Theory of Speech Acoustics

Theories are common in all academic disciplines. It is relatively uncommon, however, for the majority of scientists in one discipline to agree on a particular theory. An exception to this is the acoustic theory of speech production, developed in the 1940s and 1950s by Dr. Gunnar Fant (1919–2009), an eminent speech scientist who for many years was the director of a well-known speech research laboratory in Stockholm, Sweden (the Speech Transmission Laboratory).

In 1960, Fant published a textbook called (not surprisingly) Acoustic Theory of Speech Production. The book provided a detailed account of how the speech mechanism generated the speech acoustic signal. In most respects, this theory is accepted as correct by speech scientists. The theory has been refined and elaborated by other speech scientists, most notably James Flanagan (1972) and Kenneth Stevens (1998).

Fant's theory of speech acoustics can be summarized in a single sentence: The output of the vocal tract (that is, the speech acoustic signal) is the product of the acoustic characteristics of a sound source combined with a sound filter. Fant's theory was developed using elegant mathematics that were most precise for the case of vowel production. Fortunately, explanation of the theory does not require expertise in mathematics.

The Sound Source

The sound source in the acoustic theory of vowel production is the signal generated by the vibrating vocal folds. Vibration of the vocal folds generates an acoustic signal that has a fundamental frequency (F0) and a series of harmonics (sometimes called "overtones"). F0 is defined in Chapter 10 as the number of full cycles of vocal fold vibration per second. The speech acoustic signal for vowels includes higher-frequency components as well, called harmonics. The harmonic frequencies are located at whole-number multiples of the F0, a fact best explained with a simple example. If an adult male phonates the vowel "ah" with his vocal folds vibrating at an F0 of 100 Hz, acoustic energy is generated at the F0 as well as at frequencies that are whole-number multiples (2, 3, 4, 5, 6, . . . n) of the F0. This acoustic signal has energy at 100 (the F0), 200, 300, 400, 500, 600, . . . n × 100 Hz. A periodic signal with multiple frequency components is called a complex periodic signal. Vocal fold vibration produces a complex periodic signal, and it is this signal that serves as "input" to the vocal tract — it is the source for the speech acoustic signal. It is also the signal that people refer to as "voice."

The sound source is the same for all vowels; it is not adjusted for different vowels. This is an important aspect of the theory and is returned to following a description of the sound filter.
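The harmonic-series arithmetic in this example is easy to sketch in a few lines of code; the F0 value is the one used in the text.

```python
# Harmonics of a voice source with F0 = 100 Hz sit at whole-number
# multiples of the fundamental: 100, 200, 300, ... Hz.
f0 = 100.0
harmonics = [n * f0 for n in range(1, 11)]  # first ten components
print(harmonics)  # [100.0, 200.0, 300.0, ..., 1000.0]
```

Running the same two lines with f0 = 200.0 describes a typical young adult female source: the harmonics are spaced twice as far apart.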
The Sound Filter

In the acoustic theory of vowel production, the sound filter is the vocal tract. Fant showed that the vocal tract was like a resonating tube. Tube resonators have more than one resonant frequency, or frequencies at which they vibrate with maximal amplitudes. The air within a tube is an example of a multiple-frequency resonator. When the air within a tube is set into vibration, the acoustic result is a signal with multiple resonant frequencies. A good analogy to the acoustics of the vocal tract filter is the acoustics of organ pipes.

Fant showed that the precise frequencies at which these resonances occur depend on the shape of the tube. Different shapes of the vocal tract tube, created by different positions of the jaw, tongue, lips, and throat (pharynx), result in different resonant frequencies. This was illustrated in Chapter 10 by the MRI images in Figure 10–13. Selected vocal tract shapes for three different vowels are shown in Figure 11–1 in a more schematic way. The shape of the vocal tract — the air column between the vocal folds and the lips — is shown in dark blue for the English vowels /ɑ/ "ah" (as in "hot"), /i/ "ee" (as in "beat"), and /u/ "oo" (as in "boot"). The shapes are different for the three vowels. The different shapes change the resonant frequencies of the vocal tract, even though the source signal remains the same for all three vowels. The different shapes of the vocal tract produce different sounds — they create different vowels.

The acoustic result of different vocal tract shapes is shown in Figure 11–1 by the three graphs in the right column. These are vowel spectra, showing energy peaks (the y-axis, labeled amplitude) as a function of frequency (the x-axis, with frequency ranging from 0.0 to 5.0 kHz, that is, 0 to 5000 Hz). The vowel spectra are ordered from top to bottom just like the vocal tract shapes: /ɑ/ at the top, /i/ in the middle, and /u/ at the bottom. The peaks in each spectrum show the first three resonant frequencies of each vowel; these peaks are indicated by pointers from the labels "F1," "F2," and "F3." Notice how the peaks are at different frequency locations for the three vowels. The frequency of these peaks can be estimated by dropping a vertical line from a peak to the x-axis and noting where the line intersects the frequency scale. The use of "F" to identify the peaks is explained below.

Figure 11–1.  Vocal tract shapes (left), shown in dark blue, for three vowels (/ɑ/, /i/, /u/). On the right side are spectra associated with the vowels, plotted as amplitude (dB) against frequency (kHz). F1, F2, and F3 indicate peaks in the spectrum, called "formants." Note how the peaks in these spectra occur at different frequencies. Those differences are related directly to the differences in vocal tract shapes for the three vowels.

Vowel Sounds Result From the Combination of Source and Filter Acoustics

Now that the acoustic characteristics of the source and filter have been identified, a more specific statement of Fant's theory can be made. The speech acoustic signal results from a source whose frequencies and amplitudes are "shaped" by the resonant frequencies of the vocal tract filter (the vocal tract tube). The source has energy at a series of frequencies, and the different shapes of the vocal tract "pick out" different frequencies to emphasize or reject. The signal coming from the vocal tract consists of frequencies in the source that are emphasized by the resonant frequencies of the vocal tract tube. This brings us back to the single-sentence statement of Fant's theory given previously: The output of the vocal tract (that is, the speech acoustic signal) is the product of the acoustic characteristics of a sound source combined with a sound filter.
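Fant's single-sentence summary translates almost directly into a few lines of signal-processing code. The sketch below is a deliberately minimal illustration of the source-filter idea, not the analysis method used in this chapter and not a faithful vowel synthesizer. The source is an impulse train at F0; the filter is a cascade of simple resonators placed at assumed, roughly /ɑ/-like formant frequencies.

```python
import numpy as np
from scipy.signal import lfilter

sr = 16000                         # sample rate (Hz)
f0 = 100.0                         # source fundamental frequency (Hz)

# Source: a complex periodic signal (one pulse per vocal fold cycle)
source = np.zeros(sr)              # one second of samples
source[:: int(sr / f0)] = 1.0

# Filter: a second-order resonator for each formant
def resonator(x, freq_hz, bw_hz, sr):
    r = np.exp(-np.pi * bw_hz / sr)            # pole radius sets bandwidth
    theta = 2 * np.pi * freq_hz / sr           # pole angle sets frequency
    return lfilter([1.0], [1.0, -2 * r * np.cos(theta), r * r], x)

vowel = source
for f, bw in [(700, 80), (1200, 90), (2600, 120)]:  # assumed F1, F2, F3
    vowel = resonator(vowel, f, bw, sr)

# `vowel` now has energy at the source harmonics, with spectral peaks
# (formants) near 700, 1200, and 2600 Hz: source times filter.
```

Changing only the three formant values (the filter) while leaving the impulse train (the source) untouched yields a different vowel-like sound, which is exactly the point of the theory.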
Form and Function

Students typically do not encounter the word "formant" until they have lectures on speech acoustics. The word is used to describe speech spectra, and specifically, the peaks in the spectrum (frequencies at which sound intensity is the greatest). Resonators such as organ pipes have spectra with multiple peaks too, but these are not called "formants"; they are called — well — peaks. "Formant" has a Latin root that means "forming." The idea behind the word "formant" is something being formed, in this case, a resonant frequency resulting from the formation of a vocal tract shape. The word was coined to designate a resonance that was formed by the vocal tract — it is a speech-specific term.

Resonant Frequencies of Vowels Are Called Formants:  Spectrograms

The difference in vowels is primarily a result of their different resonant frequencies. This can be demonstrated in a simple way. Figure 11–2 shows a spectrogram, a type of visual record of the speech signal. The sentence "The dude that dotted the deed was Dad" was spoken by the author into a microphone connected to a computer. A speech analysis program called TF32 (Milenkovic, 2001) generated the spectrogram and was used to edit and analyze this speech signal. Spectrograms, which are related (but not identical) to the voiceprints used in legal cases of speaker identification, show time on the x-axis and frequency on the y-axis. In Figure 11–2, frequency is marked in steps of 1.0 kilohertz (1000 Hz) from 0 to 4.0 kHz (4000 Hz). On the x-axis, time is marked by a series of hash marks, each separated from the next by 1/10th of a second (100 ms).
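A spectrogram like the one in Figure 11–2 can be generated with widely available tools rather than TF32. The sketch below assumes a mono recording in a hypothetical file named speech.wav; the analysis settings are conventional wideband values, not those of TF32.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
from scipy.signal import spectrogram

sr, x = wavfile.read("speech.wav")   # hypothetical mono recording
x = x.astype(float)

# Short (about 5-ms) analysis windows give the "wideband" look in
# which formant bands are easy to see.
f, t, Sxx = spectrogram(x, fs=sr, nperseg=int(0.005 * sr))

plt.pcolormesh(t, f, 10 * np.log10(Sxx + 1e-12))  # intensity in dB
plt.ylim(0, 4000)                    # 0 to 4 kHz, as in Figure 11-2
plt.xlabel("Time (s)")
plt.ylabel("Frequency (Hz)")
plt.show()
```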

Figure 11–2.  Spectrogram showing the first three formant frequencies for the corner vowels of American English (/u/, /ɑ/, /i/, /æ/) in the sentence, "The dude that dotted the deed was Dad." The approximate formant frequencies for /æ/ are marked by a short red horizontal bar halfway between the beginning and end of the vowel. Formant frequencies can be estimated for the other vowels in the same way. Starting from the bottom of the frequency (y-axis) scale, the first dark band is F1, the next one F2, and the next one F3 (see the labeled formants for "oo" /u/). The red contours for F1, F2, and F3 of /ɑ/ show formant movement (change of formant frequencies over time) throughout the vowel.

Speech Scientists Vow to Study Vowels

Many speech scientists have studied the acoustic characteristics of vowels. Vowels are fascinating for several reasons. Some languages use just a few vowels, others use relatively many. For example, Greek and Spanish are languages with five vowels, in contrast to the 12-vowel system of English. Differences between the vowel systems of two languages have a significant influence on the ability of a native speaker of one language (say, Spanish) to produce “native-sounding” vowels of a second language (English, for example). Spanish has the vowel /i/ as in the English word “beat,” but does not have the English vowel “ih” /ɪ/ as in the word “bit.” These two English vowels have very similar vocal tract shapes and, as expected, similar formant frequencies. The vowel /ɪ/, however, is quite rare in languages other than English. So, when a native speaker of Spanish (or of Greek, or Korean, or Japanese, as well as many other languages) is learning English and attempts to say a word such as “bit,” he or she is likely to produce the vowel in an /i/-like way (i.e., to say something closer to “beat” than “bit”). It is as if the second-language learner assimilates an unknown vowel type like /ɪ/ to a vowel type in her native language (in this case, /i/). Another interesting example of an English vowel that is unusual in other languages is the /æ/ “ae” sound in words such as “cat,” “bat,” and “and.” This English vowel may not be as challenging for second-language learners of English, possibly because their native languages do not have vowels close to /æ/, eliminating the competition, so to speak. But /æ/ is also interesting in American English because it is produced in so many different ways depending on a talker’s dialect group. Some talkers — such as in Wisconsin — produce this vowel in a very distinctive way, almost as if they are saying “kyat” for “cat.” Other talkers, especially young adults from the east and west coasts (and in parts of the country’s interior), produce this vowel as something between the vowel of “cat” and of “cot.” Vowel changes such as this are common in languages of the world, adding yet another reason why speech scientists tend to go ga-ga over vowels and their acoustics.

The vowels “oo” /u/, “ah” /ɑ/, “ee” /i/, and “ae” (as in “hat”) /æ/, marked in Figure 11–2, represent the extremes of English vowel articulation — they define the “corners” of the space within which all vowels are articulated. The highest and most back vowel is /u/, the lowest and most back vowel /ɑ/, the highest and most front vowel /i/, and the lowest and most front vowel /æ/. Formant frequencies distinguish different vowels, so the distinctions should be obvious in an acoustic record of these vowels.

In Figure 11–2, parts of the spectrographic display appear as dark bars. An example of these bars is found above the vowel /u/ in the word “dude.” Arrows labeled “F1,” “F2,” and “F3” point to the first, second, and third dark bars above the baseline of the spectrogram (bottom of the spectrogram). These bars indicate the regions where there is maximum energy in the signal — that is, they are the formant frequencies of the vowel.

The first three formant frequencies for each vowel can be estimated by making an eyeball measurement halfway between the beginning and end of each vowel, as marked for the formant frequencies of the vowel /æ/ in “Dad.” The upward-pointing arrows show the beginning and end of the vowel. The red line is placed in the middle of the vowel’s duration: halfway between its beginning and end. To measure a formant frequency, a single point in time must be chosen for the measurement, because formant frequencies change throughout a vowel (see later in text). The short horizontal lines intersecting the vertical line show where the formant measurements are made at the halfway point of the vowel.

For example, the red cross for the F1 frequency of /æ/ is below 1000 Hz (1.0 kHz) by a little less than half of the difference between 1000 Hz (1.0 kHz) and the baseline. Referring to the frequency lines for the eyeball estimate, this places the F1 around 600 Hz. The F2 frequency is about one third of the way down from 2000 Hz to 1000 Hz; its frequency appears to be about 1700 Hz. The F3 cross is halfway between 3000 Hz and 2000 Hz; its frequency can be estimated by eye as 2500 Hz.

How accurate are the formant frequency estimates for the vowel /æ/? Comparison data from Hillenbrand, Getty, Clark, and Wheeler (1995) for a large group of adult males who produced the vowel /æ/ show an average F1 = 588 Hz, F2 = 1952 Hz, and F3 = 2601 Hz — not bad for our eyeball estimates.

The first three formant frequencies for /æ/ can be used as reference points for comparison with the first three formant frequencies of the other corner vowels.
For example, compared to /æ/, /u/ clearly has a lower F1 frequency, a lower F2, and a lower F3. The vowel /i/ is even more different from /æ/, with a very low-frequency F1 and the highest F2 of all the corner vowels.

The precise values of the formant frequencies are not of concern to the present discussion. Rather, Figure 11–2 demonstrates that vowels are distinguished from each other by the frequency values of their formants, even by crude (eyeball) estimates of the frequencies.

Table 11–2.  Average Values of the First Three Formant Frequencies (F1, F2, F3, in Hz) for the Vowel /ɛ/ “eh” (as in “head”) as Spoken by 45 Adult Males, 48 Adult Females, and 46 Ten- to Twelve-Year-Olds

              Decreasing Vocal Tract Length →
        Adult Men     Adult Women     Children
F1         580            731            749
F2        1799           2058           2267
F3        2605           2979           3310

Note. The data show how the formant frequencies for the same vowel vary depending on who is producing the vowel. Values reported in “Acoustic characteristics of American English vowels,” by J. M. Hillenbrand, L. A. Getty, M. J. Clark, and K. Wheeler, 1995, Journal of the Acoustical Society of America, 97, pp. 3099–3111.

The Tube Model of the Human Vocal Tract Makes Interesting Predictions and Suggests Interesting Problems

The part of Fant’s theory that considers the vocal tract a resonating tube, with multiple resonant frequencies, makes an interesting prediction. Tubes that are exactly the same in every way except for their length differ in their resonant frequencies. Casual familiarity with the construction of a pipe organ, with pipes (tubes) of many different lengths, suggests that the shorter pipes produce higher-pitched tones, the longer pipes lower-pitched tones. The higher-pitched tones of short as compared with long organ pipes result from the higher resonant frequencies of shorter pipes.

The vocal tract tube is like a resonating pipe and is subject to the same acoustical principles. A shorter vocal tract has higher resonant frequencies than a longer vocal tract. Five-year-old children have shorter vocal tracts than adult women, who in turn have shorter vocal tracts than adult men. Fant’s theory predicts that when the same vowel is spoken by children, women, and men, the formant frequencies are highest for children, next highest for women, and lowest for men. This prediction has been confirmed many times in the research literature (Hillenbrand et al., 1995; Peterson & Barney, 1952).
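
The arithmetic behind this prediction can be sketched directly. A uniform tube that is closed at one end (the glottis) and open at the other (the lips) has resonant frequencies Fn = (2n − 1)c/4L, where c is the speed of sound and L is tube length. In the Python sketch below, the 17.5-cm length is a common textbook value for an adult male vocal tract; the lengths for the woman and child are round illustrative numbers, not measurements from this chapter.

    # Resonances of a uniform closed-open tube: Fn = (2n - 1) * c / (4 * L)
    C = 35000.0  # approximate speed of sound in warm, moist air (cm/s)

    def tube_formants(length_cm, n=3):
        """First n resonant frequencies (Hz) of the tube."""
        return [(2 * k - 1) * C / (4 * length_cm) for k in range(1, n + 1)]

    for label, length in [("adult male, 17.5 cm", 17.5),
                          ("adult female, 15 cm", 15.0),
                          ("five-year-old, 12 cm", 12.0)]:
        f1, f2, f3 = tube_formants(length)
        print(f"{label}: F1 = {f1:.0f} Hz, F2 = {f2:.0f} Hz, F3 = {f3:.0f} Hz")

    # adult male, 17.5 cm: F1 = 500 Hz, F2 = 1500 Hz, F3 = 2500 Hz
    # Shorter tubes yield higher values for every formant, the pattern
    # predicted by the theory and confirmed in the studies cited above.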
Earlier it was noted that perceptual differences between vowel sounds can be explained by their different formant frequencies. If vowel formant frequencies for a given vowel depend on who is speaking, there is the interesting problem of how listeners perceive the same vowel spoken with such different acoustic characteristics. For example, Table 11–2 shows the average F1, F2, and F3 frequencies reported by Hillenbrand, Getty, Clark, and Wheeler (1995) for the vowel “eh” (as in “head”) spoken by relatively large groups of adult males, adult females, and children aged 10 to 12 years. The values of the first three resonances, or formants, for this vowel are clearly different depending on the speaker group. More specifically, as the vocal tract shortens, formant frequencies increase, just as predicted from Fant’s theory. No one has quite figured out how these differences are heard as the same vowel, even though it is a central problem in speech perception research (see later in this chapter).

A Spectrogram Shows Formant Frequencies and Much More

The continuous red lines in Figure 11–2 show contours of the first three formant frequencies for the vowel /ɑ/ from its beginning to end. The red contours show that the formants move up and down the frequency scale throughout the vowel. For example, F1 for /ɑ/ starts at a relatively low frequency, rises to its highest frequency close to the end of the vowel, and then decreases. Similarly, the F2 contour falls in frequency from its beginning and then rises a small amount. As noted in Chapter 10, during speech the articulators are in constant motion, which creates vocal tract shapes — and therefore formant frequencies — that are constantly changing over time. Formant transitions are the acoustic result of the constantly changing vocal tract shapes.

The speech acoustic signal reflects everything taking place in the vocal tract as a speaker produces speech. Courses in acoustic phonetics cover information on the acoustic characteristics of different speech sounds, voice qualities, and prosody (melody of voice). This information is critical to understanding how listeners perceive speech and its various meanings. It is also critical to designing computer programs for the synthesis and recognition of speech.

Letters and Sounds Are Not the Same

Phonetic transcription symbols underscore the difference between the speech sounds that are actually produced compared with letters in an orthographic representation of a word. The spelling of “dotted” has a double “t” at the end of the first syllable, which may lead to the expectation of the speech sound /t/. Most speakers of American English (as compared with English spoken in the United Kingdom) do not say /dɑtəd/, however, or even /dɑdəd/. Instead, they produce the middle consonant as a flap (Chapter 12), which is a very short (~25 ms or 0.025 sec) /d/-like sound, much like the one in “butter” or “sitter.” This sound, whose phonetic symbol is /ɾ/, is a good example of the difference between a letter (orthographic) and phonetic representation of words.

Speech Synthesis

Scientists have been trying to make machines talk for many years, perhaps even centuries. Speech synthesis is the term used to describe the production of speech-like sounds, words, and sentences by machines. In the early, modern days of speech synthesis, these machines were large, awkward electronic devices. Computers now generate high-quality synthetic speech using sophisticated software. If you are interested in the history of speech synthesis, a good place to start is the historical account published by Dennis Klatt (1987), a pioneer in the field. Enter the phrase, “history of speech synthesis,” in a search engine to get a listing of many sites devoted to the topic, some with audio examples of synthesized speech from 1939 to the current time. Story (2019) has published a more recent account of the history of speech synthesis.

The quality of speech synthesis improved dramatically in the 1960s and 1970s. Much of this improvement can be attributed to studies of the acoustic characteristics of speech sounds, as produced by real talkers. These studies were often completed using spectrographic analysis, as in Figure 11–2, or with more sophisticated acoustic analysis tools. As knowledge of the speech acoustic characteristics of human speech became more detailed, programs for the synthesis of speech were improved, and a better “product” emerged from these talking machines. Increases in computing power and capacity allowed more complicated speech synthesis programs to synthesize higher-quality speech. Modern speech synthesis is so good as to be nearly indistinguishable from genuine human speech. Acoustic phonetics research made this possible.

Relatively cheap, high-quality speech synthesis is more than an academic exercise for geeks who like speech research. Speech synthesizers are an important communication option for persons who cannot communicate orally because of neurological disease. Speech synthesis by computer offers a wonderful option for persons who, despite a neurological status that prevents speech production, have sufficient control of their hands or another part of the body to issue commands for the rapid synthesis of a message. For example, some children with cerebral palsy cannot speak intelligibly but can control a joystick or a head pointer, which in turn can be used to control a speech synthesizer. Speech synthesis is truly a case where basic research has been translated to clinical application.
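
The source-filter logic of formant synthesis can be illustrated in a few lines of code. The sketch below is a bare-bones demonstration, not Klatt's synthesizer: an impulse train standing in for vocal fold vibration is passed through three digital resonators. The formant frequencies are classic adult-male averages for /ɑ/ reported by Peterson and Barney (1952); the bandwidths and other settings are illustrative choices.

    # Minimal source-filter synthesis of an "ah"-like vowel: a 120-Hz
    # pulse train (source) filtered by three resonators (vocal tract).
    import numpy as np
    from scipy import signal
    from scipy.io import wavfile

    fs, f0, dur = 16000, 120, 0.5
    n = int(fs * dur)
    source = np.zeros(n)
    source[:: fs // f0] = 1.0          # one glottal pulse per period

    out = source
    for fc, bw in [(730, 80), (1090, 90), (2440, 120)]:   # F1-F3 of /ɑ/
        r = np.exp(-np.pi * bw / fs)                      # pole radius
        a = [1.0, -2.0 * r * np.cos(2 * np.pi * fc / fs), r * r]
        out = signal.lfilter([1.0 - r], a, out)           # resonator at fc

    out = out / np.abs(out).max()
    wavfile.write("ah.wav", fs, (out * 32767).astype(np.int16))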
Speech Recognition

The flip side of speech synthesis is called automatic speech recognition. This is the process of converting human speech into text, or into an action (e.g., operating a door) based on speech commands. The use of automatic speech recognition for simple decision-making is familiar to everyone who has called a business (such as an airline) and been led through a maze of speech-guided options to achieve a goal (like talking to a human, for example). These speech recognizers operate in a relatively simple way, storing a limited number of acoustic phonetic “templates” for words such as “yes” and “no,” numbers from one to ten, or the letters of the alphabet (“em,” “dee,” “ex”). Not surprisingly, the greater the number of options to be distinguished, the more complex the speech recognizer must be for recognition success.

One component of modern speech recognition devices is an acoustic analysis program that determines the sounds that make up words. This capability is based on the speech acoustic work described above, in which the acoustic characteristics are determined for each speech sound. Although modern speech recognizers use much more information than just the acoustic characteristics of speech sounds to “hear” speech accurately, all recognizers must incorporate information on these characteristics for successful performance.
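
A toy version of this template scheme is sketched below. Real recognizers compare rich acoustic analyses that unfold over time; here each stored template is reduced to a short list of invented numbers standing in for measured acoustic properties.

    # Recognize an unknown token by finding the stored template closest
    # to its feature vector. All numbers are invented for illustration.
    import math

    WORD_TEMPLATES = {
        "yes": [0.2, 0.9, 0.1],
        "no":  [0.8, 0.3, 0.7],
    }

    def recognize(features):
        """Return the word whose template is nearest in Euclidean distance."""
        return min(WORD_TEMPLATES,
                   key=lambda w: math.dist(features, WORD_TEMPLATES[w]))

    print(recognize([0.25, 0.80, 0.15]))   # -> "yes"

Note that every added word means another stored template and another comparison, which is one reason recognizers with many options must be more complex.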

Speech Acoustics and Assistive Listening Devices

Speech acoustic characteristics are important because they play a critical role in the design and programming of assistive listening devices such as hearing aids and cochlear implants. These devices, described in Chapter 24, function as amplifiers and filters. They contain a microphone to sense the speech signal, which is “processed” by electronic components contained within the device. The processing may include selection of certain frequencies to be amplified, and suppression of other frequencies. What is the rationale for amplification of certain frequencies and suppression of others? The most obvious answer is that the device is configured to amplify frequencies most crucial to the understanding of speech. Much of the acoustic phonetics research has been devoted to determining the important frequency characteristics of human speech sounds. Acoustic phonetics research informs the engineer who designs hearing aids or cochlear implants about optimal processing characteristics to achieve the best speech perception performance.

Speech Perception

Many speech scientists, experimental psychologists, and linguists throughout the world devote their research careers to speech perception. As often happens when many scientists work on a single problem, different camps have developed to defend one of several theories that seek to explain how humans perceive speech.

All theories of speech perception require that the acoustic characteristics of speech sounds and perhaps of larger “units” (such as syllables, or perhaps even whole words) be processed by the human auditory system. Children presumably learn a great deal about the acoustic characteristics of speech as they are developing language skills. At some point, children learn to use their knowledge of the acoustic characteristics of speech sounds and prosody (the melody and rhythm of speech) to recognize words, emotional states of talkers, and more subtle aspects of communication such as anger or joking. Knowledge of speech acoustic information is also likely to play a role in the development of speech production skills. The best evidence for this is the connection between “prelingual,” severe hearing loss (loss suffered before the age of about 5 years) and speech production that is significantly unintelligible. Persons with severe hearing loss are likely to produce speech that is difficult to understand. Clearly, the speech acoustic signal plays an important role in normal and disordered communication.

Many issues have been investigated to understand the perception of speech. The question may seem trivial: after all, as infants we are exposed to speech and language, and, like many other skills we master, speech perception is learned by connecting heard speech with the people, characteristics, and actions to which the speech refers. But this is a statement of what happens, not an explanation; it is not an answer to “how.” A deeper question, the question of how speech perception takes place, is what the listener’s brain does with the acoustic signal to hear words and determine the meaning of a speaker’s utterances.

A brief consideration of some major issues in speech perception research demonstrates that understanding the “how” of the process is challenging. In the discussion that follows, the term “objects of speech perception” is used. This refers to the possible goal of the brain in perceiving speech. What is the final product of the brain’s processing of the incoming speech acoustic signal?

The Perception of Speech:  Special Mechanisms?

A very old but enduring theory of speech perception is that part of the brain is devoted exclusively to speech perception (Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967). This mechanism is thought to be speech- and species-specific. The term speech-specific means that the special brain mechanism is “turned on” when a speech acoustic signal enters the ear. The special mechanism is not used for general auditory perception, such as music, the sound of breaking glass, or the rustling of leaves that suggests the approach of a summer storm. The term species-specific means that only humans are endowed with this special mechanism, and that the special part of the brain for speech perception evolved over thousands of years to create a communication link with speech production.

The theory is called the motor theory of speech perception. In its simplest form, the theory states that speech is perceived by reference to articulation. The idea is that speakers produce a speech acoustic signal that encodes (like encryption) the articulator movements that produced the signal. When the speech acoustic signal enters the ear and the brain of the listener, the code must be broken to perceive the sounds produced by the speaker. The special mechanism in the brain performs this decoding. In the motor theory, the role of the speech acoustic signal is to carry the code that allows the brain to unlock the underlying articulation that produced the signal.

Why did the motor theorists reject the speech acoustic signal as a reliable source of the perceptual identification of speech sounds? In their view, the speech acoustic signal for any given sound was too variable and therefore not reliable as a cue to the identity of the sound. The acoustic characteristics of /b/, for example, vary depending on such factors as the phonetic context in which the consonant is produced (e.g., /bit/ “beat” versus /bæt/ “bat”), who is producing the sound (e.g., men, women, children), how quickly the sound was produced (e.g., the speaking rate), as well as other variables (see the earlier section, “The Theory of Speech Acoustics”).

The variable of the speaker is especially instructive. As discussed earlier, the vocal tract lengths of men, women, and children are different, so the acoustic characteristics of all sounds depend to a significant degree on who is doing the speaking. The motor theorists asked, “How does a human store all the many acoustic characteristics of a single sound?” Their answer was, “The speech acoustic signal for a speech sound may be highly variable, but the articulatory movements are not. All speakers — men, women, children — produce (for example) a /b/ by closing the lips and the velopharyngeal port so pressure can be built up in the vocal tract.”

Hence, the idea was formed of focusing on the articulatory characteristics of speech sounds as the “objects” of speech perception.

The coupling between speech production and speech perception is critical to this theory. Over the course of evolution, the special encoding/decoding process served the needs of human communication, which depends equally on the performances of a speaker and a listener. The specialized coupling is relevant to the ability to produce and hear speech sounds. In other words, the theory did not have much to say about how meaning is obtained from speech perception.

The implications of the motor theory are interesting. First, because the special brain mechanism for speech perception is an outcome of human evolution, it must be present in the brains of human infants. Of course, even specialized mechanisms that are part of the human brain endowment are likely to develop throughout childhood, but nonetheless, the neural hardware is there and presumably ready for operation. In fact, there is a good deal of research (as reviewed in Galle & McMurray, 2014) that shows infant speech perception for sound category distinctions such as /p/ versus /b/ to be similar to the same distinctions perceived by adults. This finding seems to be consistent with the idea that the speech perception mechanism is a special part of human brains, whether infant or adult.

Another implication is that animals should not be able to make speech sound category distinctions similar to those observed in humans. Animals do not talk, or at least they do not articulate sequences of speech sounds in a human-like way. According to the motor theory of speech perception, animals should not hear phonetic distinctions because they do not produce phonetic distinctions. This makes sense from the evolutionary perspective as previously discussed. Because (according to the motor theory) speech production and speech perception co-evolved in humans, the absence of speech production skills in animals predicts the absence of human-like phonetic perception.

Alex’s Magic Trick

The hypothesized match of the brain mechanisms for speech perception and speech production is complicated by talking parrots (as well as other talking birds like mynahs and parakeets [budgerigars]). Some of you may be familiar with Alex (1977–2007), the African grey parrot written about extensively by the MIT scientist Irene Pepperberg and her colleagues (Patterson & Pepperberg, 1998). Alex talked a lot, had a huge vocabulary, and had a remarkable ability to imitate a wide range of sounds, including speech. Alex was not the only African grey to produce words and sentences; in fact, they have the reputation of being chatterboxes and of continuously learning utterances over the course of their long lifetimes. African greys and other parrots may “articulate” speech, partially by moving their tongues, but the evidence for this is scant or based on rather artificial experiments (Beckers, Nelson, & Suthers, 2004; Patterson & Pepperberg, 1994). Parrots do not have a sound source (like the vibrating vocal folds) modified by moving articulators to create different acoustic signals. Watching an African grey’s beak and tongue as he says a word or sentence, you will not see much movement, certainly not the kind required to produce consonants. You cannot see their lips move because, well, they do not have lips. Where, then, does the magic happen? Like many birds, African greys have a syrinx, a complex anatomical structure deep in their chests (Figure 11–3). This is where the magic happens, but exactly how it occurs is not clear. The anatomy of the syrinx is understood well (Habib, 2019). It is located in the chest and composed of muscularly controlled valves that are “powered” by an air sac, somewhat like the human lungs. African greys present a problem to the motor theory of speech perception because they do not have the speech production mechanism that, over the course of evolution, would evoke a special speech perception mechanism. Hmm. Learn more about Alex at https://en.wikipedia.org/wiki/Alex_(parrot)

Figure 11–3.  Photo of African grey parrot with an artist’s image of the syrinx superimposed on the parrot’s chest. The syrinx is located deep in the parrot’s chest. It consists of muscles and membranes, as well as cartilage, that form valves that can be opened and closed in complex ways to produce human-sounding speech as well as a wide variety of other sounds. This 29-year-old’s name is Friday, and he lives with the author’s brother and his spouse. Friday says a lot, even producing utterances that are not mimicked but are contextually “correct.” These utterances include saying “goodnight” when the lights go out, calling out the dog’s name when the dog is present, and saying “hello” when the phone rings. Friday also has lots of other sound productions — he does a microwave beep, he whistles, he barks, he does bodily noises really well, and, of course, does bird whistles and bird chatters. All this with an open beak and no lips.

If only it were that simple. Experiments have shown that animals such as Japanese quail (Kluender, Diehl, & Killeen, 1987) and chinchillas (Kuhl & Miller, 1975, 1978) perceive phonetic distinctions in much the same way as humans. This is a difficulty for the motor theory of speech perception, for the reasons stated previously. Why should animals perceive these distinctions in the same way as humans if they, animals, do not have the special speech perception mechanism in the brain — how could they have this mechanism when they lack speech production capabilities?

These kinds of concern about the motor theory of speech perception (and there are other concerns; see Hixon, Weismer, & Hoit, 2020) convinced a group of researchers to develop theory and data that support the use of general auditory mechanisms in the perception of speech. Auditory theories of speech perception are built around the idea that the very sophisticated auditory and general computing capabilities of the human brain are well suited to make phonetic distinctions: no special, species-specific mechanism is required.

The Perception of Speech:  Auditory Theories

Auditory theories of speech perception are based on a simple idea. The auditory mechanisms used for the perception of any acoustic event are used to perceive speech. There is no special mechanism required. Auditory regions of the brain, as well as areas associated with these regions, are sufficiently sophisticated to process and identify phonetic events.

Auditory theories are not free of problems. An auditory theory is only as good as the speech acoustic data presented for auditory processing. For example, an auditory analysis that distinguishes between a /p/ and a /b/ must have some brain representation of the acoustics of the two sounds, and especially how the acoustics differ between the two. The only way the brain can have such an acoustic representation is to learn what makes a /p/ and what makes a /b/. This learning, presumably in the earliest stages of speech and language acquisition, must be based on a consistent speech acoustic characteristic that is always produced when a person — any person — says a /p/ or /b/ (or any speech sound). And, if the speech acoustic signal demonstrates this consistency for each speech sound in a language, where in the brain are these representations stored, and what is the form of the stored data?

Speech scientists have entertained the idea that the brain stores templates (also called prototypes) of the acoustic characteristics of speech sounds (sometimes, of the acoustic characteristics of syllables). Templates can be thought of as “ideal” representations of these speech sound acoustic characteristics. When a speaker produces a /b/, for example, as the first sound in the word “badgers” (/bædʒɚz/), the speech acoustic signal entering the auditory system is compared with all the stored templates to determine a match or mismatch. A match to the /b/ template is equivalent to perception of the /b/ sound. In auditory theories of speech perception, the “objects” of perception are the acoustic characteristics of speech sounds, perhaps represented in the brain as acoustic templates.
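
The template idea can be made concrete with formant values like those discussed earlier in the chapter. In the sketch below, each template is an (F1, F2) pair approximating adult-male averages (after Hillenbrand et al., 1995), and an incoming vowel is “perceived” as the nearest template. This is an illustration of the concept only; no one proposes that vowel perception is this simple.

    # Match a measured (F1, F2) pair to the nearest stored vowel template.
    import math

    VOWEL_TEMPLATES = {        # approximate adult-male (F1, F2) in Hz
        "i":  (342, 2322),     # "beat"
        "ae": (588, 1952),     # "bat"  (IPA /æ/)
        "a":  (768, 1333),     # "bog"  (IPA /ɑ/)
        "u":  (378, 997),      # "boot"
    }

    def perceive(f1, f2):
        return min(VOWEL_TEMPLATES,
                   key=lambda v: math.dist((f1, f2), VOWEL_TEMPLATES[v]))

    # The eyeball estimates for /æ/ from Figure 11-2 find the right template:
    print(perceive(600, 1700))   # -> "ae"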

Matching Speech Perception and Speech Production Without a Special Mechanism

An influence of speech perception skills on speech production skills does not require a special mechanism to match the two. But speech perception skills may still play a crucial role in speech production skills. After all, the speech acoustic signal you produce is heard not only by a listener, but by the producer as well. The speech perception mechanism is your monitor for the quality of your speech production. This idea has application in the speech and hearing clinic, where children with delayed mastery of speech sounds (see Chapter 15) are sometimes thought to have poor auditory representations (i.e., templates) of the speech sounds they produce incorrectly. The clinical application is called auditory training, or ear training, and is implemented by stimulating the child’s auditory system with multiple, correct repetitions of the incorrectly produced sounds. Improvement in speech perception skills is assumed to result in improved auditory representations of the incorrectly produced sounds, which become the basis for improved speech production skills.

Is it true that each of the many speech sounds within a language has unique and consistent speech acoustic characteristics no matter who produces the sounds? Some scientists say “yes” to this question (e.g., Diehl, Lotto, & Holt, 2004), and others say “no” (e.g., Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967). Other scientists place themselves somewhere between these two positions, arguing that speech perception is based on articulatory gestures but assisted by speech acoustic data (Fowler, Shankweiler, & Studdert-Kennedy, 2016).

Motor Theory and Auditory Theory:  A Summary

Figure 11–4 presents a schematic summary of the primary difference between the motor theory of speech perception and an auditory theory of speech perception. Part A shows the speech acoustic signal as the input to the auditory system. The speech signal is the sentence, “The blue dot is a normal dot.” There are 19 speech sounds in this sentence — in the phonetic transcription of the sentence (/ðəbludɑtɪzeɪnɔrmldɑt/),1 each symbol is assumed to be a separate speech sound. The middle image is a view of the surface of the left hemisphere of the brain, widely considered as the hemisphere that contains the tissue used to produce and perceive speech. The oval encloses the perisylvian speech and language areas, including Broca’s and Wernicke’s areas as well as auditory cortex (see Chapter 2). The right side of Figure 11–4A presents a “black box”2 containing the special mechanism that decodes the encoded speech acoustic signal. The decoded signals for each speech sound are the articulatory movements that produced the speech sounds. The objects of speech perception in the motor theory are these articulatory movements.

A similar summary diagram is shown in Figure 11–4B for an auditory theory of speech perception. The left and middle images are the same as in Figure 11–4A; the input is the speech acoustic signal and the relevant brain areas are indicated by the tissue within the oval. On the right side of the figure is a single sound segment, the vowel /ɑ/ in “spot” (/spɑt/), which is analyzed by auditory mechanisms that are not specialized for speech perception. The auditory cortex analyzes each sound, which it delivers to a long-term memory bank of templates, each one representing a different sound segment. Using /ɑ/ as an example, the auditory analysis is delivered to a mechanism that compares the analysis to various vowel templates. The best match — in this case the acoustic template for /ɑ/ — is perceived. In an auditory theory of speech perception, the objects of perception are these best matches to acoustic templates that contain the ideal acoustic characteristics for each sound segment.

Top-Down Influences:  It Is Not All About Speech Sounds

Perhaps the most obvious way to imagine how we perceive speech is to assume that each sound — whatever the object of speech perception may be — is analyzed as it enters the auditory system.
1 /eɪ/ is italicized to indicate that this diphthong (as in the word “take” /teɪk/) is considered a single sound even though it is represented by two symbols.
2 “Black box” is a term used to designate an unknown mechanism for a hypothesized or known process. In this case, the black box is the special mechanism for perceiving speech.

Figure 11–4.  Schematic summary of the primary difference between the motor theory of speech perception and an auditory theory of speech perception. A. Motor
theory, showing the speech signal input (left ), the left hemisphere speech and language areas (middle), and the “black box” special mechanism for converting the
acoustic signal into an articulatory representation (right ). The conversion is a speech- and species-specific mechanism. In the motor theory, the objects of speech per-
ception are the articulatory events that produced the acoustic input. B. Auditory theory, left and middle images same as in (A), right image showing a vowel identified
by its acoustic properties. General auditory mechanisms perform the analysis. The objects of speech perception are the acoustic characteristics of each speech sound.

When these sound-by-sound analyses are completed, the sounds are put together to identify the spoken word. For example, recognition of the word “badgers,” which consists of the five sounds /b/, /æ/, /dʒ/, /ɚ/, and /z/, takes place by analysis of the sequence of sounds, followed by a process that groups the sequenced sounds to determine if they form a word. A process such as this requires very large storage in long-term memory of all the acoustic sound sequences that can form words. Each time an auditory (acoustic) analysis is performed on a sequence of sounds, the grouping of a specific sequence can be matched to a word pattern in memory, if it exists.

There is good evidence that the process just described is not the way word recognition happens. Listeners are not passive perceptual beings — they are active in the process of speech perception. Listeners use the acoustic analysis of the first or second sound of a word and then begin to search quickly all the words known to them that may be good “word recognition candidates.” This process is called “searching the lexicon.” When this search produces a likely word match, the listener moves on to the acoustic analysis of the next word and another active word search as previously described. An important aspect of this process is that listeners make word identification choices before they have completed the acoustic analyses of all the sounds in the word (an excellent review of these findings is found in Gelfand, Christie, & Gelfand, 2014). The word choices can be made this way because other aspects of the communication setting, including the words already recognized, the conversational setting, and even the very general topic being discussed, help a listener to find the word being analyzed — to predict it, in a sense — before all the component sounds have been analyzed. When you think about it, this is a much more efficient way to perceive speech compared with an analysis of all sounds in a word before a word decision can be made.

The use of various kinds of processes and knowledge, such as searching a specific part of the lexicon (e.g., all words beginning with /b/), using situational context, the topic under discussion, and other sources of information to guide word-choice decisions, is called top-down processing. It is “smart” perceptual processing, much more efficient than a passive process of analyzing incoming data in steps and building up a perception. This latter approach is called bottom-up processing, and most scientists believe that as a primary psychological process it is a poor model for any form of perception, including speech perception.

In the example previously given, in which the acoustic analysis of the first one or two speech sounds of a spoken word is used to initiate a focused, active lexical search, both bottom-up and top-down processes are used. The bottom-up part of the process is the initial acoustic analysis, and the top-down part is the lexical search supplemented by other sources of knowledge that contribute to word identification.
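
The interplay of bottom-up and top-down processing described above can be caricatured in a few lines of code. The mini-lexicon, the use of spelled prefixes as stand-ins for analyzed sounds, and the topic weights are all invented for illustration.

    # Bottom-up: narrow the lexicon with the sounds heard so far.
    # Top-down: rank the surviving candidates by fit to the topic.
    LEXICON = ["badgers", "bat", "beat", "bit", "boot", "dad", "deed"]
    TOPIC_FIT = {"badgers": 0.9, "bat": 0.4, "beat": 0.1}  # topic: animals

    def candidates(heard_so_far):
        return [w for w in LEXICON if w.startswith(heard_so_far)]

    def best_guess(heard_so_far):
        pool = candidates(heard_so_far)
        return max(pool, key=lambda w: TOPIC_FIT.get(w, 0.0)) if pool else None

    print(candidates("ba"))   # ['badgers', 'bat']
    print(best_guess("ba"))   # 'badgers', chosen before all sounds are heard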
Speech Intelligibility

“Speech intelligibility” describes the degree to which speech is understandable. The idea of a degree of intelligibility is appropriate. Speech can be intelligible or unintelligible by degrees; it is not an either-or phenomenon. Everyone has had the experience of not quite understanding what someone has said or catching just a few words of an utterance that is for the most part unintelligible.

Speech intelligibility is an outcome of the processes of speech perception regardless of the theory of speech perception to which you pledge allegiance. The degree to which an utterance is intelligible depends on many variables. Let’s first assume that an utterance is perfectly intelligible when spoken in a quiet environment. In a noisy room, the intelligibility of the same utterance may decrease to a degree depending on the level of the noise. Speak that same utterance over a wireless transmission system (e.g., a cellphone) with limited quality, and the intelligibility may be affected (especially if the listener has a hearing loss — see later in this chapter).

These examples are based on a hypothetical reference utterance with perfect intelligibility, heard in an optimal listening environment. Other variables may affect speech intelligibility as well, as is the case when the person speaking the utterance has a speech impairment or the listener has a hearing loss.

Speech intelligibility tests have been developed to measure the degree of intelligibility loss among speakers or listeners with communication impairments. For example, the speech of individuals with cleft palate, neurological disorders, and congenital hearing impairment is unintelligible to various degrees. Speech intelligibility tests, in which the listeners have normal hearing, provide an index of the individual’s intelligibility deficit. This index can provide objective data to track progress due to surgery (as in the case of cleft palate), speech therapy (as in the case of neurological disease), and amplification (as in the case of speakers with hearing loss who are fitted with hearing aids).

Speech intelligibility can be measured using scaling techniques, word lists, and sentences. In addition, intelligibility for specific speech sounds has been measured using phonetic or orthographic transcription.

Scaling techniques use a number scale, such as a 7-point scale with 1 = least impaired and 7 = most impaired.

Another version of scaling, called Visual Analog Scaling, shows the listener a continuous line with numbers ranging from 1 to 100. One end of the scale is defined as completely unintelligible and the other end as completely intelligible. The listener hears an utterance and operates a slider to place a pointer on the scale that corresponds to the perceived degree of intelligibility.

Word and sentence tests are often-used measures to index speech intelligibility. Listeners hear a list of words or sentences and write down (or enter into a computer) what they heard. As one example, when 50 words are presented and the listener writes the correct word for 40 of them, speech intelligibility is indexed as 80%. Sentence intelligibility works the same way. In a sentence test having an overall number of 100 words, correct orthographic transcription of 50 words is indexed as 50% speech intelligibility. In some cases, an index of the percentage of speech sound intelligibility is desired. Long passages of speech such as reading or conversation are transcribed orthographically or using phonetic transcription to obtain a count of the number of sounds that are heard correctly for the entire passage. For example, a measure called Percentage of Consonants Correct (PCC) (Shriberg, Austin, Lewis, McSweeney, & Wilson, 1997) is the ratio of the number of consonants heard correctly to the total number of consonants in the passage, expressed as a percentage (PCC = [number of consonants correct ÷ total number of consonants] × 100). PCC is used frequently to document phonetic development and disorders in children.

Overall, speech intelligibility measures are useful for indexing the degree to which a person’s speech or hearing loss affects the transmission of information between speaker and listener. Speech-language pathologists and audiologists value these measures due to their straightforward clinical application.
tests are applied frequently in clinical settings.
Chapter Summary

The theory of speech acoustics, formulated by Gunnar Fant, states that the output of the vocal tract (that is, the speech acoustic signal) is the product of the acoustic characteristics of a sound source (the vibrating vocal folds) combined with a sound filter (the vocal tract).

The sound source consists of energy at the F0 (the rate of vibration of the vocal folds) plus energy at harmonic frequencies, which are whole-number multiples of the F0.

The sound filter can be described as the resonant frequencies of the vocal tract, which change depending on the shape of the tract. The vocal tract, like any tube (pipe) resonator, has multiple resonant frequencies.

The shape of the vocal tract is changed by motions of the jaw, tongue, lips, and pharynx.

Vowels can be described acoustically by the first three resonant frequencies of the vocal tract tube. Speech scientists call these resonances formants. Different vowels are heard because different shapes of the vocal tract produce different formant frequencies.

Because the vocal tract resonates like a tube, or pipe, shorter vocal tracts have higher formant frequencies; longer vocal tracts have lower formant frequencies. This explains why children have higher formant frequencies than adult women, who have higher formant frequencies than adult men.

Speech acoustics is important to assistive listening devices, theories of speech perception, speech synthesis and recognition, and language development.

The motor theory of speech perception states that speech is perceived by a special, species-specific mechanism in the human brain; the objects of speech perception are the articulatory movements that generated the acoustic signal.

Auditory theories of speech perception state that the speech acoustic signal for any speech sound is sufficiently stable to be analyzed reliably by general auditory mechanisms; the objects of speech perception are the acoustic characteristics of speech sounds.

Many speech perception theorists believe that the sound-by-sound analysis of the speech acoustic signal entering the auditory system is supplemented by top-down processes.

Top-down processes are essential to an efficient speech perception process; the listener’s knowledge and expectations allow her to identify words before the completion of the sound-by-sound analysis.

Speech intelligibility measures use scaling techniques or word and sentence lists to estimate a person’s ability to hear speech or the effect of a speech disorder on the ability of others to perceive their speech. These tests are applied frequently in clinical settings.

References

Beckers, G. J. L., Nelson, B. S., & Suthers, R. A. (2004). Vocal-tract filtering by lingual articulation in a parrot. Current Biology, 14, 1592–1597.

Diehl, R. L., Lotto, A. J., & Holt, L. L. (2004). Speech perception. Annual Review of Psychology, 55, 149–179.

Fant, G. (1960). Acoustic theory of speech production. The Hague, the Netherlands: Mouton.

Flanagan, J. L. (1972). Speech analysis, synthesis, and perception (2nd ed.). Berlin, Germany: Springer-Verlag.

Fowler, C. A., Shankweiler, D. P., & Studdert-Kennedy, M. (2016). “Perception of the speech code” revisited: Speech is alphabetic after all. Psychological Review, 123, 125–150.

Galle, M. E., & McMurray, B. (2014). The development of voicing categories: A quantitative review of over 40 years of infant speech perception research. Psychonomic Bulletin and Review, 21, 884–906.

Gelfand, J. T., Christie, R. E., & Gelfand, S. A. (2014). Large-corpus phoneme and word recognition and the generality of lexical context in CVC word perception. Journal of Speech, Language, and Hearing Research, 57, 297–307.

Habib, M. B. (2019). New perspectives on the origins of the unique vocal tract in birds. PLoS Biology, 17, e3000184. https://doi.org/10.1371/journal.pbio.3000184

Hillenbrand, J. M., Getty, L. A., Clark, M. J., & Wheeler, K. (1995). Acoustic characteristics of American English vowels. Journal of the Acoustical Society of America, 97, 3099–3111.

Hixon, T. J., Weismer, G., & Hoit, J. D. (2020). Preclinical speech science: Anatomy, physiology, acoustics, perception (3rd ed.). San Diego, CA: Plural Publishing.

Klatt, D. H. (1987). Review of text-to-speech conversion for English. Journal of the Acoustical Society of America, 82, 737–793.

Kluender, K. R., Diehl, R. L., & Killeen, P. R. (1987). Japanese quail can learn phonetic categories. Science, 237, 1195–1197.

Kuhl, P. K., & Miller, J. D. (1975). Speech perception by the chinchilla: Voiced-voiceless distinction in alveolar plosive consonants. Science, 190, 69–72.

Kuhl, P. K., & Miller, J. D. (1978). Speech perception by the chinchilla: Identification functions for synthetic VOT stimuli. Journal of the Acoustical Society of America, 63, 905–917.

Liberman, A. M., Cooper, F. S., Shankweiler, D. P., & Studdert-Kennedy, M. (1967). Perception of the speech code. Psychological Review, 74, 431–461.

Milenkovic, P. (2001). TF32 [Computer software]. Madison, WI: Author.

Patterson, D. K., & Pepperberg, I. M. (1994). A comparative study of human and parrot phonation: Acoustic and articulatory correlates of vowels. Journal of the Acoustical Society of America, 96, 634–648.

Peterson, G. E., & Barney, H. L. (1952). Control methods used in a study of the vowels. Journal of the Acoustical Society of America, 24, 175–184.

Shriberg, L. D., Austin, D., Lewis, B. A., McSweeney, J. L., & Wilson, D. L. (1997). The percentage of consonants correct (PCC) metric: Extensions and reliability data. Journal of Speech, Language, and Hearing Research, 40, 708–722.

Stevens, K. N. (1998). Acoustic phonetics. Cambridge, MA: MIT Press.

Story, B. H. (2019). History of speech synthesis. In W. F. Katz & P. F. Assmann (Eds.), The Routledge handbook of phonetics (pp. 31–55). London, UK: Routledge.
12
Phonetics

Introduction

Speech sounds are the phonetic components of language (Chapter 3). Speech sounds are often referred to as speech sound segments to indicate that words can be broken down into their individual, component sounds. In American English, examples of speech sounds include the vowels in words such as “bead” and “dot” (“ee” and “ah,” respectively), the nasals in words such as “Mom” and “never,” and the “f” sound in “rough.” In these examples, the speech sounds do not always match the orthographic representations of the words. The “ah” in “dot” is represented orthographically by an “o,” and the “f” in “rough” is represented by “gh.” The mismatch between orthography and sound does not apply to all languages; in languages such as Finnish and Japanese, the match between the written and spoken forms of words is very good (but not always perfect).

Phonetic transcription is a tool for representing speech sounds by means of a special set of symbols. A trained transcriber uses the symbols, drawn from the International Phonetic Alphabet (IPA), to record the sounds of speech independently of the words they form. High-quality phonetic transcription requires extensive training. Speech-language clinicians and researchers who have this training make extensive use of phonetic transcription to generate a record of produced or perceived speech sounds. For example, a speech-language pathologist (SLP) who generates a record of a child’s phonetic inventory — all the sounds the child produces — uses phonetic transcription to generate the record. Or, a researcher who studies dialect variation within a language uses phonetic transcription to record all the sound variants in different dialects of a specific language. A good example of this is the many ways the “eh” sound in words such as “bed,” “head,” and “lead” is spoken in American, British, Scottish, and Irish English, as well as among dialect variations within any of these languages.

The term “phonetics” is often broken down into three subareas. These areas are articulatory phonetics, acoustic phonetics, and perceptual phonetics. Articulatory phonetics is the study of speech movements associated with speech sounds. For example, a speech scientist may be interested in documenting tongue movements for the vowel /u/ in words (as in “boot”) and how the movements are modified by variables such as speaking rate, voice loudness, and speech style (casual or formal). Acoustic phonetics is the study of the acoustic characteristics of a vowel like /u/ in words spoken in different speaking conditions (Chapter 11). Articulatory and acoustic phonetics are not the same thing, because knowledge of the articulatory characteristics of a speech sound does not guarantee precise knowledge of the acoustic characteristics of that same sound. Finally, perceptual phonetics is the study of how listeners hear articulatory and acoustic characteristics of speech sounds. For example, a vowel such as /u/ may be heard differently when spoken in a word at a fast versus slow speaking rate.


This chapter deals with all three types of phonetics, but its focus is on perceptual phonetics, because the IPA is a tool for recording heard sounds. Nevertheless, an IPA symbol that represents a heard sound implies something about the articulatory and acoustic phonetics of the sound, as discussed in the material of this chapter. The IPA is a universal tool for all people who study languages and are interested in phonetic descriptions. No matter the dialect, the language, the potential speech and/or language disorder, even the sounds produced by babbling infants, the IPA is meant to be universally applicable and usable for the transcription of speech sounds produced by any speaker.

The purpose of this chapter is not to make the reader a transcription user. Rather, it presents the concepts that support use of the IPA for transcription of speech sounds and provides examples of transcriptions. A “convention” (a rule we can all agree on) is followed in this chapter when IPA symbols are used. When the symbol is not the same as its English orthography counterpart, the symbol is accompanied by a word example that includes the sound. For example, /ʃ/ “shave,” /ɛ/ “bet,” /dʒ/ “jazz.” Cases in which the orthography and phonetic symbol match, such as the /p/ in “pack,” are not followed by a word example.

International Phonetic Alphabet

The history of the IPA extends back to the late 19th century, when it was developed for precisely the reasons noted before — to create a universal symbol system for the speech sounds of the world’s languages. Over the years, the IPA has been revised several times for better accuracy as well as addition of sounds that were not included in the original version. In the early 1990s, the IPA was adapted by the International Clinical Phonetics and Linguistics Association (ICPLA) for specific use in clinical settings (see Shriberg, Kent, McAllister, & Preston, 2019).

Vowels and Their Phonetic Symbols

An inventory of vowels in languages of the world is shown in Figure 12–1.1 The four sides of this diagram form a vowel quadrilateral. The vowel quadrilateral shows (theoretically) where vowels can be articulated using tongue heights and tongue forward-backward positions within the oral cavity. This is illustrated in Figure 12–2, where the quadrilateral is superimposed on the oral cavity, and the “corner vowels” are indicated by IPA symbols (see later in chapter). Different vowels can also be made by adjusting the shape and length of the lips.

Figure 12–1.  The International Phonetic Association vowel diagram. Vowels of American English are enclosed by red circles. The tongue height dimension of the quadrilateral is shown on the vertical axis and the tongue advancement dimension (front-back) on the horizontal axis.

In Figure 12–1, phonetic symbols for American English vowels are circled in red. Table 12–1 lists these phonetic symbols, each of which is paired with a word containing the vowel sound. Vowels are typically categorized using three descriptors: tongue height, tongue advancement, and lip rounding. The goal of the following discussion is not to promote learning of the IPA symbol system, but rather to use the phonetic symbols to make broader points about phonetics and its application to language studies and speech disorders.

Tongue Height (High Versus Low Vowels)

Tongue height is a description of the height of the tongue relative to the fixed boundary of the hard palate (roof of the mouth). High vowels have a tongue position very close to the hard palate, and low vowels have a tongue position relatively far from the hard palate. The upper left and right panels of Figure 12–3 show the height of the tongue for two superimposed vowel pairs — /i/ versus /æ/ (left) and /u/ versus /ɑ/ (right).

1 Technically, there are more vowels than those shown in Figure 12–1. Some languages like Japanese use differences in duration for the same vowel sound as different phoneme categories. For example, Japanese has a short /i/ “ee” that contrasts with a long /i/.

Figure 12–2.  The human vocal tract (the airway from the vocal folds to the lips) with the vowel quadrilateral superimposed on the oral cavity. The quadrilateral shows the area in the oral cavity in which the tongue moves for different vowels.

Table 12–1.  List of Phonetic Symbols for Each American English Vowel, Together With a Word Containing the Vowel Sound

Front Vowels     Central Vowels      Back Vowels
/i/ (beet)       /ə/ (agree)         /u/ (boot)
/ɪ/ (bit)        /ɚ/ (brother)*      /ʊ/ (book)
/e/ (bait)       /ɝ/ (bird)*         /o/ (boat)
/ɛ/ (bet)        /ʌ/ (buck)          /ɔ/ (bought)
/æ/ (bat)                            /ɑ/ (bog)

Note. The “r”-colored vowels (identified by asterisks) are not included in Figure 12–1.

The tongue heights for /i/ and /u/ are very close to the hard palate; in fact, these vowels are the highest of the American English vowels. In contrast, tongue heights for /æ/ and /ɑ/ are relatively far from the hard palate and are called low vowels; these two vowels are the lowest of the American English vowels. Note in Figure 12–1 how other American English vowels fall between the highest and lowest vowels along the vertical axis.

Vowel height differences are made not only by the tongue, but also by the mandible (lower jaw). The tongue is attached to the mandible, but not completely. Upward movements of the mandible carry the tongue to the hard palate, in the direction of high vowels. In many cases, the movements of the tongue and mandible are in the same direction, and much of the difference in vowel height is due to mandible movements with smaller contributions of tongue movement. The IPA description of vowels is based on tongue positions, so mandible positions are not considered further.

Vowel Phonetics Trivia

The sentence, “The vowel quadrilateral shows (theoretically) where vowels can be articulated using tongue heights and tongue forward-backward positions within the oral cavity,” is a simplification of vowel phonetics. It is more accurate to say that the diagram shows the vowels of the world’s languages that are currently known to function in a categorical fashion (hence, “theoretically”). “Categorical fashion” means the vowels can function as phonemes, which are described in Chapter 3. Certainly, vowel-like sounds may be produced “between” some of the symbols shown in Figure 12–1 but, to date, they have not been identified or confirmed as actual phonemes in a language. In addition, the true vowel inventory of the world’s languages includes lip articulation, changes in the shape of the pharynx, and possible changes in voice (laryngeal behavior). Two famous phoneticians, Peter Ladefoged and Ian Maddieson, estimated that a truly complete inventory of vowels in languages of the world may total 100 to 200 vowels (Ladefoged & Maddieson, 1996). The vowel diagram in Figure 12–1 reflects a 2018 update by the IPA; the known phonetics of the world’s languages are apparently always in transition and accumulating.

Tongue Advancement (Front Versus Back Vowels)

Tongue advancement is a description of the extent to which the tongue is forward or back in the vocal tract. As shown in the two bottom panels of Figure 12–3, the tongue blade and tongue dorsum — the parts of the tongue extending approximately 20 to 25 mm behind the tongue tip — can be placed as far forward as the front of the hard palate. For the most back position, the tongue blade/dorsum is pulled back about 15 mm from the most front position. Vowels with a tongue blade/dorsum forward in the vocal tract are called front vowels; vowels with a tongue blade/dorsum pulled back are called back vowels.

Vowels with a tongue blade/dorsum forward in the vocal tract are called front vowels; vowels with a tongue blade/dorsum pulled back are called back vowels. These forward-backward positions of the tongue blade/dorsum are illustrated in the lower two panels of Figure 12–3 for /i/ versus /u/ (left), and /ɑ/ versus /æ/ (right). The tongue position for the vowel /i/ is more advanced (more front) than the tongue position for /u/, as is the tongue position for /æ/ compared with the tongue position for /ɑ/. The difference for the latter pair of vowels is subtle.

Figure 12–3.  Tongue positions for /i/ versus /æ/ and /u/ versus /ɑ/ (left and right upper panels) showing differences in tongue height between front (/i/-/æ/) and back (/u/-/ɑ/) vowels. Tongue positions for /i/ versus /u/ and /æ/ versus /ɑ/ (left and right lower panels) showing differences in tongue advancement between high (/i/-/u/) and low (/æ/-/ɑ/) vowels.

Small Movements, Big Effects

Vowels are produced in a mini-world of tongue position differences. The same point on the tongue — such as on the tongue blade — may differ in position by no more than 20 mm for the most front high vowel (/i/) and most back vowel (/u/). Measure 20 mm, and you will see what we mean; these differences in position are surprisingly small for the big effect of easily hearing the difference between an /i/ and /u/, or between any other pair of vowels in which the position differences are even smaller. For readers more comfortable in the world of inches, 20 mm is about 0.8 inch.

Lip Rounding

Lip rounding is a description of the configuration of the lips for vowel production. In many languages, some vowels are described as rounded, in contrast to other vowels described as unrounded. Rounded vowels are produced with a narrow opening between the lips, and sometimes with the lips protruded, to create a narrow air channel. For example, most (if not all) textbooks on the phonetics of American English describe /u/ as a rounded vowel. Ask a friend — preferably someone in your grandparents’ generation — to say word pairs like “boot”-“beat” or “food”-“feed.” Observe the contrast between the two vowels in the formation of the speaker’s lips. Make the observations from the front (to see the narrow opening between the lips for /u/) and from the side (to see the protruded lips). Other vowels in American English, most notably /ʊ/ (“book”) and /o/ (“boat”), are also described as rounded, although not to the same degree as /u/. The remaining vowels of American English are considered to be unrounded.

This Is Not Your Grandparent’s Vowel

Why ask a grandparent to speak the words? And why not include /æ/ in the corner vowels? Let’s take the grandparent question first. Across generations, there are always changes in the way sounds are produced. Take /u/, for example. College-age speakers especially, and even people approaching age 40 years, have done two things to this vowel. One, they have moved the tongue forward to produce it, in the direction of /i/. Two, they have stopped rounding their lips for the vowel. So, if you ask a friend to do the “boot”-“beet” “food”-“feed” thing, they may not show much lip rounding for /u/. Second, let’s consider /æ/. It may be a corner vowel in English, but it is so odd — pronounced so differently in different dialects, and absent from many vowel inventories in languages of the world — that it may be better to leave open the question of the lowest, most forward vowel position in American English. Take a look at the /æ/-/ɑ/ contrast in Figure 12–3; it is not easy to see much of a difference between the tongue positions for these two vowels.

Vowels of American English Compared With Vowels of Other Languages

Figure 12–1 shows a total of 28 vowel symbols, of which 12 are used to describe the vowels of American English. American English has a relatively “dense” vowel system, meaning it has a lot of vowel categories relative to many other languages. Greek, for example, has only five vowels, as does Spanish. These languages, like many others, have “sparse” vowel systems. The relative density or sparseness of a vowel system has no effect on communication efficiency within a native language.

American English has several vowels that are relatively rare among languages of the world. These include /ɪ/ “bit,” /ɛ/ “bet,” /ʊ/ “book,” and /æ/ “hat.” These vowels often present pronunciation problems for adult speakers learning English whose native language (including such languages as Greek, Spanish, French, Korean, and Mandarin Chinese) does not include at least one of these vowels.

Finally, in virtually all languages of the world, the most extreme vowels — the highest and most front, the highest and most back, and the lowest and most back — define the limits of vowel articulation. These vowels are /i/, /u/, and /ɑ/, respectively (see Figure 12–2). In theory, these vowels enclose the remainder of the vowels in any vowel system.

Tricky Vowels

Vowels are tricky. Uniform agreement among phoneticians on the sound associated with a particular phonetic symbol may be difficult to obtain. However, whatever disagreements exist are probably few in number. Organize a group of 50 phoneticians in a room, show them a vowel symbol, and ask each phonetician to say the sound. Your author believes 90% of the symbols in Figure 12–1 will elicit the same vowel from each phonetician. The trick is, what does “same” mean in this context? Most American English vowels are produced in many different dialects — does the vowel /ɛ/ (“bet”) sound the same in Northern, Southern, and Western dialects? Almost certainly not. But listeners can assign the spoken vowel to the same category — they are all examples of the /ɛ/ in “bet” — even when hearing the dialect-bound difference between them. Vowels are tricky for additional reasons as well. Take the vowels /ɑ/ and /ɔ/, for example. These are separate vowel categories in dialects spoken in places such as Philadelphia and Baltimore (e.g., /dɑk/ “dock” versus /dɔg/ “dog”) but not so much in places such as Southern California (/dɑk/ “dock” versus /dɑg/ “dog”). So, in SoCal, are the vowels in “cot” and “caught” the same or different? And what about the Bostonian or Pittsburgher who says /hɔt/ “hot” when almost everyone else in the country says /hɑt/ “hot”? Vowels are tricky.
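The three descriptors just laid out, tongue height, tongue advancement, and lip rounding, amount to a small feature bundle for each vowel symbol. As a rough sketch of that idea (ours, not the textbook’s; the feature values are simplified readings of Figure 12–1), a few American English vowels can be coded and queried like this:

```python
# Simplified (height, advancement, rounding) features for four
# "corner-ish" American English vowels discussed in this chapter.
VOWEL_FEATURES = {
    "i": ("high", "front", "unrounded"),  # "beet"
    "u": ("high", "back", "rounded"),     # "boot"
    "æ": ("low", "front", "unrounded"),   # "bat"
    "ɑ": ("low", "back", "unrounded"),    # "bog"
}

def describe(vowel: str) -> str:
    height, advancement, rounding = VOWEL_FEATURES[vowel]
    return f"/{vowel}/ is a {height}, {advancement}, {rounding} vowel"

for v in VOWEL_FEATURES:
    print(describe(v))  # e.g., "/i/ is a high, front, unrounded vowel"
```

A full table would add the remaining American English vowels from Table 12–1, but even this fragment captures how two features (height and advancement) locate a vowel on the quadrilateral while rounding rides along as a third, partly independent dimension.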

Consonants and Their Phonetic Symbols

An inventory of selected consonants in languages of the world is shown in Figure 12–4. Some phonetic symbols for consonant sounds in languages other than American English have been excluded from this image to simplify the accompanying discussion. Selected non-American English consonants are included to make a specific point.

Consonants are categorized by the IPA using three major descriptors: place of articulation, manner of articulation, and voicing.

In Figure 12–4, American English consonant symbols are shown in red. Selected consonants from other languages, which are not found in American English, are shown inside a green box.

Place of Articulation

Consonants are made by forming a constriction somewhere in the vocal tract, between the lips and the vocal folds. The location of the constriction is the consonant’s place of articulation. Place of articulation is shown on the horizontal axis, from left to right, with “bilabial” at the left-most extreme, and “glottal” (at the vocal folds) at the right-most extreme.

American English consonants are made with constrictions at the lips (/p/, /b/, /m/, and /w/²); between the upper teeth and lower lip (/f/, /v/); at the teeth or between the tongue and the teeth, called dentals (/θ/ “think,” /ð/ “these”); just in back of the teeth, called alveolars (/t/, /d/, /s/, /z/, /n/, /l/, and /ɹ/ [like the “r” sound in “rose”])³; along the hard palate behind the alveolars — hence postalveolars (/ʃ/ as in “shave,” /ʒ/ as in “beige” or “azure”); in the velar region (/k/ and /g/, with the constriction close to the location where the hard palate joins with the velum); and at the level of the vocal folds (glottis) (/ʔ/, like the first sound in “ever” with emphasis on the first syllable, and the sound heard at the end of words such as “right” in Cockney English; and /h/). American English consonants are not produced at the retroflexed, uvular, or pharyngeal places of articulation.

Manner of Articulation

If place of articulation is the “where” of consonants, manner of articulation is the “how.” Manners of articulation in American English include stops, fricatives, affricates, nasals, approximants, and flaps (taps). Consonants can have different manners of production, even at the same place of articulation.

Stop:  bilabial /p b/; alveolar /t d/; retroflex /ʈ ɖ/; velar /k g/; uvular /q ɢ/; glottal /ʔ/
Fricative:  bilabial /ɸ β/; labiodental /f v/; dental /θ ð/; alveolar /s z/; postalveolar /ʃ ʒ/; retroflex /ʂ ʐ/; velar /x ɣ/; uvular /χ ʁ/; glottal /h ɦ/
Affricate:  postalveolar /tʃ dʒ/
Nasal:  bilabial /m/; alveolar /n/; palatal /ɲ/; velar /ŋ/; uvular /ɴ/
Approximant:  bilabial /w/; alveolar /l ɹ/; palatal /j/
Flap (tap):  alveolar /ɾ/

Figure 12–4.  Chart showing selected phonetic symbols for consonants in languages of the world, adapted from the most recent revision from the International Phonetic Association. Place of articulation is on the horizontal axis, from the bilabial place (left-most column) to the glottal place (right-most column). Manner of articulation is shown on the vertical axis, from top to bottom in the order stops, fricatives, affricates, nasals, approximants, and flaps (taps). Consonants in American English are shown by red symbols. Selected consonants from other languages are enclosed by a green box. When two consonants have the same place and manner of articulation (e.g., /t/ /d/), the first consonant is voiceless, and the second is voiced.

²/w/ is a special case because it technically has two places of articulation, one at the lips and the other similar to the high-back vowel /u/.
³/ɹ/ and /l/ are also special cases because they can have places of articulation different from “alveolar” yet be heard as correct versions of these sounds.

Examples in American English include the different manners of articulation produced at the alveolar place of articulation. Two stop consonants (/t/, /d/) are alveolars. Stops, also called plosives, are produced by a brief, complete constriction, blocking the airstream from lungs to atmosphere for a brief time interval. Brief, in this case, means about 0.06 to 0.1 s (60–100 ms). Because the velopharyngeal port (the passageway between the throat and nasal cavities, which can be open or shut) is completely closed during this constriction, air pressure is built up and released suddenly, creating the signature stop “pop” when the constriction is released. Put your hand close to your lips, say /ɑpɑ/ “ahpah,” and feel the puff of air when the /p/ is released into the second /ɑ/.

Two fricatives (/s/, /z/) also have an alveolar place of articulation. Fricatives have a tight constriction, but not the airtight constriction characteristic of stops. Air from the lungs flows to the fricative constriction and results in a pressure buildup behind it, which forces air to flow through the narrow constriction passageway. As air is forced through the constriction, it makes a hissing noise. This hissing noise is a signature characteristic of fricatives. Produce an /s/ for a few seconds, and you will hear the hissing noise. Try the same exercise with other American English fricatives (e.g., /f/, /θ/ “think,” /ʃ/ “shave”).

Affricates have a manner of articulation that is like a stop followed by a fricative. Affricates are not just stops followed by fricatives, however, but a unique manner of articulation. In American English, /tʃ/ “chair” and /dʒ/ “judge” are made slightly posterior to (in back of) the alveolar place of articulation. In fact, these postalveolar affricates are the only affricates in American English. Many other languages have affricates at different places of articulation.

There is one nasal consonant, /n/, produced at the alveolar place of articulation. Nasals are produced by creating a complete constriction in the vocal tract but opening the velopharyngeal port so air can flow through the nasal passageways and to the atmosphere. Other nasals include the bilabial /m/ and the velar /ŋ/ “running.”

Approximant is a manner of articulation in the vicinity of the alveolar place — this type of speech sound is categorized as an alveolar in Figure 12–4. Approximants have a constriction that is not as tight as in fricatives but tighter than in vowels. The hissing noise associated with fricatives is not produced in most approximants because their constriction is not sufficiently narrow. Approximants produced at the alveolar place of articulation include /l/ “long” and /ɹ/ “right.” Note that /w/ is an approximant at the labial place of articulation, and /j/ “yes” is an approximant at the palatal place of articulation (the palatal place is behind the postalveolar place; see Figure 12–4).

Finally, flaps, also known as taps, are a type of stop consonant but are usually given status as a separate manner of articulation. One way to think of taps is as a very brief /d/, produced not by placing the tongue tip just behind the alveolar ridge and blocking the airstream for 100 ms, but rather by “flicking” the tongue tip against the alveolar ridge in a quick touch-and-release gesture. A good way to describe taps is by example. Words like /bʌɾɚ/ “butter” and /lɛɾɚ/ “letter,” in which the first syllable is stressed and the second syllable unstressed, are produced in American English with a tap separating the two syllables. If the stress pattern is reversed, the middle consonant sounds like a /t/.⁴

Consonant Voicing

The question, “What is the voicing status of a specific consonant?” asks whether the consonant is voiceless or voiced. Speech sounds categorized as “voiceless” are produced without vibration of the vocal folds. Speech sounds categorized as “voiced” are produced with vibration of the vocal folds.

In American English, the speech sounds of eight pairs of consonants are differentiated by their voicing status. Stop consonants are in pairs at each of the three places of articulation; one is voiced, and the other is voiceless (bilabial, /p/ /b/; alveolar, /t/ /d/; velar, /k/ /g/). Fricative pairs are differentiated by voicing at four of the five places at which fricatives are produced (labiodental, /f/ /v/; dental, /θ/ “think,” /ð/ “then”; alveolar, /s/ /z/; postalveolar, /ʃ/ “shave,” /ʒ/ “beige”). The single affricate pair /tʃ/ “chair” and /dʒ/ “judge” is differentiated by voicing (/tʃ/ voiceless, and /dʒ/ voiced). The fricative /h/, produced at the glottal place of articulation, is voiceless but does not have a voiced counterpart in American English.

In American English, all nasals and approximants are voiced.

Consonants of American English Compared With Consonants of Other Languages

Figure 12–4 shows 61 consonant symbols available to represent consonants used in languages of the world. American English uses 26 of these. The 26 consonants used in American English are an average number among languages of the world, similar to the number of consonants in German, Italian, Norwegian, and Turkish. Languages such as Estonian and Bulgarian have consonants that number in the upper 30s to low 40s.

⁴Taps are much less frequent for words like “butter” and “letter” in British English, in which the consonant that separates the two syllables is likely to be a full /t/ (/bʌtə/).
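Because every consonant in this system is fully specified by its place, manner, and voicing, the voiced-voiceless pairs described above can be recovered mechanically from a feature table. Here is a small sketch of that idea (our illustration, not the textbook’s), using a handful of the American English consonants:

```python
# (place, manner, voiced) triples for some American English consonants.
CONSONANTS = {
    "p": ("bilabial", "stop", False),         "b": ("bilabial", "stop", True),
    "t": ("alveolar", "stop", False),         "d": ("alveolar", "stop", True),
    "k": ("velar", "stop", False),            "g": ("velar", "stop", True),
    "f": ("labiodental", "fricative", False), "v": ("labiodental", "fricative", True),
    "s": ("alveolar", "fricative", False),    "z": ("alveolar", "fricative", True),
    "m": ("bilabial", "nasal", True),         "n": ("alveolar", "nasal", True),
}

def voicing_pairs(inventory):
    """Pair up consonants that share place and manner but differ in voicing."""
    pairs = []
    for c1, (p1, m1, v1) in inventory.items():
        for c2, (p2, m2, v2) in inventory.items():
            if (p1, m1) == (p2, m2) and not v1 and v2:
                pairs.append((c1, c2))  # (voiceless, voiced)
    return pairs

print(voicing_pairs(CONSONANTS))
# [('p', 'b'), ('t', 'd'), ('k', 'g'), ('f', 'v'), ('s', 'z')]
```

Extending the table to the full inventory would surface the remaining pairs, including dental /θ ð/, postalveolar /ʃ ʒ/, and the affricate pair /tʃ dʒ/; the nasals /m/ and /n/, being voiced-only, correctly pair with nothing.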

That’s a Lot of Consonants

As with vowels, the 61 consonant symbols underestimate the total number of consonants used in languages of the world. There are two reasons for this, one obvious and the other less so. First, the consonant symbols shown in Figure 12–4 are selected, as noted in the text. This decision was made to simplify the discussion of consonants; this is the obvious reason. The less obvious reason is that there are certainly consonant sounds in other languages that have not yet been identified, and there are variants of consonant symbols shown in Figure 12–4 that are the same phoneme category but have slightly different sounds. Such variants are called “allophones” of the phonemic category. For example, the stop consonant /t/, a phoneme category in American English, is produced in (at least) two ways. One is called aspirated (symbolized /tʰ/; see section, “Broad Phonetic Transcription”), and the other unaspirated (usually symbolized as /t/). The /tʰ/ occurs when the /t/ begins a word, as in “tough”; /t/ is usually found at the end of words such as “cat.” These allophones of the /t/ phoneme are all consonant sounds in American English. If you are tallying up the total number of consonant sounds in languages of the world, the number is certainly greater than the 61 symbols shown in Figure 12–4. Peter Ladefoged, the great British-American phonetician, thought there may be as many as 800 consonant sounds produced throughout languages of the world.

The green boxes in Figure 12–4 enclose consonants from other languages. Clearly, there are several places of articulation that are not used in the phonetic inventory of American English. Good examples are the uvular (at the back end of the soft palate) and pharyngeal stops and fricatives that are prominent in Arabic languages. Also, places used in English but restricted to one or two manners of articulation are used in other languages for an additional manner of articulation. German and Hebrew, for example, have velar fricatives; American English does not. Other examples are found in Figure 12–4.

The shaded boxes in Figure 12–4 have an interesting story. These are articulations that are considered “impossible” for consonant production. For example, a pharyngeal (place) nasal (manner) is considered impossible. This is because the necessary physiological condition for a nasal, which is airflow through the velopharyngeal port to the nasal cavities, cannot be met if the place of articulation is in back of the velopharyngeal port. In this case, the airflow is blocked before it can reach the velopharyngeal port; it is therefore not possible to produce a pharyngeal nasal.

Clinical Implications of Phonetic Transcription

The relative density or sparseness of a vowel system may affect a speaker’s ability to master the vowel system of a second language. American English, for example, has a high-front vowel pair /i/-/ɪ/ (“beat”-“bit”) and a high-back vowel pair /u/-/ʊ/ (“kook”-“cook”) that are notoriously difficult to learn for speakers whose native language does not have the vowels /ɪ/ and /ʊ/. American SLPs can offer services to people who are not native speakers of English and who want to improve their English pronunciation. Knowledge of different vowel systems and the IPA symbol system for vowel transcription is important for this “accent reduction” therapy. A component of the therapy is likely to be improvement of the client’s ability to both perceive and produce the difference between vowels, such as the /i/-/ɪ/ distinction, when only /i/ is part of the vowel system of the client’s native language.

Let’s turn the issue around and consider a native speaker of American English and her attempt to learn Swedish. Swedish has a denser vowel system than American English; there are between 15 and 17 vowels in Swedish. Swedish has four vowel pairs in which the contrast between them is based primarily on lip rounding versus no lip rounding. Figure 12–1 shows the American English vowel symbol /i/ to be very close to the symbol /y/, which is a rounded version of /i/ (and, in many Swedish dialects, a little lower and more forward than /i/). Similarly, Swedish /e/, also a vowel of American English, contrasts with its Swedish rounded version /ø/. (Try prolonging the vowels /i/ and /e/, then round your lips without moving your tongue — you will hear something like the Swedish /y/ and /ø/, respectively.) A native speaker of American English who is learning to speak Swedish might benefit from a Swedish-speaking SLP who is well trained in phonetic transcription and its application to vowels.

An understanding of the American vowel system may also be useful to an SLP who records vowel errors in children for the purpose of making a diagnosis between a speech delay of unknown origin and a speech disorder called childhood apraxia of speech (Chapter 15).

Speech delay of unknown origin rarely has vowel errors as a prominent problem; vowels are mastered relatively early in the course of speech sound development, even among children with delayed mastery of consonants. In childhood apraxia of speech, vowel errors may be a key diagnostic sign of the disorder. An accurate understanding and application of IPA transcription for vowels are required to describe a child’s vowel inventory and to compare that inventory to age expectations for vowel development.

Knowledge of consonant phonetics is directly relevant to the clinical practice of SLPs. SLPs must provide a transcription record of the sound patterns produced by children who are evaluated for possible speech delay. “Speech delay” is a term used in the evaluation of children’s speech to describe speech sound development that significantly lags the expected sound skills at a given age. For example, studies of typical speech sound development have established the sounds that should be mastered by age 5 years. The speech of a 5-year-old child who is evaluated for speech delay requires an accurate phonetic transcription of his correctly and incorrectly produced consonants. The inventory of correct and incorrect consonants can then be compared to the speech sound mastery of a typically developing child of the same age. Not only is a skilled transcription of consonants required for this comparison, but the universal nature of the IPA symbols allows the phonetic transcription of a child’s vowels and consonants to be understood by any SLP.

Broad Phonetic Transcription

The phonetic transcription described to this point in the chapter is called broad transcription. The phonetic symbols represent categories of sounds that often function as phonemes. Narrow phonetic transcription is a kind of fine-tuned transcription of these broad symbols. Narrow transcription can be especially useful in clinical settings.

For example, in broad transcription, a speech sound is either an /s/ or an /ʃ/ “shave.” In clinical phonetics, however, children and adults often produce sounds that are neither /s/ nor /ʃ/ but something in between the two; or, a sound may be recognized as an /s/ or /ʃ/ but not as a “good” version. How does the IPA handle such occurrences?

The basic symbol system of the IPA is supplemented by a series of symbols called diacritics, which allow narrow transcription. These symbols are meant to designate subtle changes in articulation that make a speech sound “different” from a “good” version of the sound, but not to the extent that the sound belongs to a different category. Diacritic symbols are almost always “attached” to the phonetic symbols described earlier, to indicate the subtle articulatory changes heard by a transcriber. The current chapter is a first introduction to phonetic transcription and cannot delve into the intricacies of diacritics. But one example can make the transcription process clear.

Suppose you are a trained phonetician and you hear a speech sound that seems like an /s/, but made with the tongue tip placed very close to the upper teeth. The speech sound is not interdental like /θ/ “thin” but seems close to it (see Figure 12–4). How do you transcribe this sound? One way is to use a diacritic symbol. Diacritic symbols are appended to an IPA vowel or consonant symbol to indicate a subtle modification of the speech sound. An /s/ that sounds too close to the teeth is said to be dentalized, or more forward than the expected alveolar place of articulation. The diacritic symbol for “dentalized” is a small bridge placed under the main symbol, and the case described here is transcribed as /s̪/ (dentalized /s/). The “dentalized” diacritic can be used with (in theory) any sound but is especially relevant to other alveolars such as /t/, /d/, and /n/. A dentalized /d/, for example, is transcribed as /d̪/. In-depth presentations of phonetic transcription and diacritics are available in Bauman-Waengler (2016) and Shriberg et al. (2019).
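Diacritics are not just print conventions; in digital text they are Unicode “combining” characters that attach to whatever symbol precedes them. The short sketch below (our illustration, not part of the textbook) builds the dentalized transcriptions just described; U+032A, the combining bridge below, is the IPA dentalization diacritic.

```python
# The IPA dentalized diacritic is the Unicode combining character
# U+032A (COMBINING BRIDGE BELOW). Appending it to a base consonant
# symbol yields the narrow (diacritic-marked) transcription.
DENTALIZED = "\u032a"

for base in ("s", "t", "d", "n"):
    print(f"dentalized /{base}/ is written [{base + DENTALIZED}]")
# dentalized /s/ is written [s̪]
# dentalized /t/ is written [t̪]  ... and so on
```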
Chapter Summary

The IPA is a transcription tool that allows listeners to use a universal set of symbols to represent heard speech sounds. The tool is applicable to any language, and it attempts to represent all the sounds known to occur in languages of the world. Phonetic transcription is a highly developed skill.

Vowels are described in phonetic terms along a high-low dimension (the distance between the tongue surface and the roof of the mouth, or hard palate), a front-back dimension (the advancement of the tongue toward the front of the mouth versus the retraction of the tongue toward the back of the mouth), and a lip rounding dimension (rounded versus unrounded lips).

Vowels vary among different dialects in the United States, and in the dialects of all other countries, as well. Listeners can hear differences in the pronunciation of a vowel even when they assign the different pronunciations to the same vowel category.

The vowel system of American English is different from the vowel systems of many other languages, including Greek, Spanish, and Swedish.

Consonants are represented by IPA symbols that vary along the dimensions of place of articulation, manner of articulation, and voicing status.

At a given place of articulation (such as alveolar), consonants are produced with different manners of articulation (such as stops, fricatives, affricates, nasals, and approximants).

American English consonants represent only a small proportion of consonants in the languages of the world.

IPA transcription is a useful clinical tool for SLPs so they can use a universal symbol system to document correct and incorrect speech sounds in individuals with speech and hearing disorders.

Diacritic symbols are used to show subtle modifications in the main phonetic symbols of the IPA.

References

Bauman-Waengler, J. (2016). Articulation and phonology in speech sound disorders: A clinical focus (5th ed.). Boston, MA: Pearson Education.
Ladefoged, P., & Maddieson, I. (1996). The sounds of the world’s languages. Oxford, UK: Blackwell.
Shriberg, L. D., Kent, R. D., McAllister, T., & Preston, J. L. (2019). Clinical phonetics (5th ed.). Boston, MA: Pearson Education.
13
Typical Phonological
Development

Introduction

This chapter presents information on typical phonetic and phonological development. As defined in Chapter 3, phonology is the study of the sound systems of languages. Casual observation of small children plus years of scientific research point to a common conclusion — children do not master all speech sounds at the same time. There is no mystery that some speech sounds are easy for the developing child, and some are very difficult. There is continuing mystery about why this is so.

The continuum of speech sound difficulty in the typically developing child is not the whole story of speech sound development. Phonology is the study of the sound system of a language. A speech sound in a particular language is linguistically effective because it plays a role relative to other speech sounds in the language. The speech sound is part of a system.

A phoneme is defined as a speech sound that changes the meaning of a word when it replaces another speech sound in the same position in the word. For example, in English, the speech sounds “b,” “p,” and “f” are phonemes because word examples can be identified in which meaning is changed when the sounds replace each other at the beginning of the word (“beer” versus “peer” versus “fear”). More precisely, a phoneme includes a class of sounds, all of which are treated as belonging to the same category (i.e., the same phoneme). For example, in English, the “b” sound is sometimes produced with the vocal folds vibrating throughout the entire sound, but at other times, vocal fold vibration may be delayed until just after the lip contact is released. These two ways to produce a “b” are phonetic variants or allophones of the phoneme “b.”

Other allophones of English “b” can be defined as well. In saying the word “beer,” for example, a speaker of American English may produce the “b” as if he or she is swallowing the sound (the way some country-western singers might produce the sound for extra emphasis in a song, or to identify their dialect as “real country”). No matter, in English, the swallowed “b” is still an allophone of the “b” phoneme. All three of the “b” allophones just described — the vocal folds vibrating throughout the entire sound, vocal fold vibration delayed until just after the lip contact is released, and the country-western swallow — are members of the same phoneme. It does not matter which one starts the word “beer”; the meaning of the word will be clear to a native speaker of American English.

Allophones are an important part of the “system” of sounds in a language.


The question of how children learn this aspect of phonology is made more complicated, and more interesting, by different languages having very different phoneme/allophone relationships. For example, in Uduk, a language spoken in parts of Ethiopia and Sudan, the country-western-type “b” and the “b” with continuous vocal fold vibration are members of different phoneme categories (Ladefoged & Maddieson, 1996). Swapping these sounds before the same vowel will create different words in Uduk but not in English. Somehow a child must learn not only the phonemes of her language but also the various allophones included within each phoneme category.

The child must also learn the phonotactics of her language. Phonotactics concern the sequences of sounds that form words. In a particular language, some sequences are allowed to form words; others are disallowed. These word-form rules are called phonotactic constraints. In English, a word can start with an “l,” “r,” “s,” and so on, but cannot start with the “ng” sound (the sound at the end of the word “rang”). The “ng” sound can occur, however, at the beginning of words in several languages (such as Swahili, a language spoken in a good portion of East Africa, and in Mandarin Chinese). The “ng” sound is a phoneme category in both English and Swahili, but the phonotactic constraints on the sound are different in the two languages.

Finally, the child must learn the prosodic characteristics of the language, often considered as a component of phonology. Prosody includes the melodic and rhythmic features of spoken language. Of particular interest for the present chapter is the role of prosody in languages such as English, where multisyllabic words — words with more than one syllable — have a lexical stress pattern. The lexical stress pattern of a word indicates which of the multiple syllables receive linguistic stress and which are unstressed. In the English word “cinnamon,” for example, the first syllable is stressed, and the next two are unstressed (dictionary entry: sin-uh-muhn, where bolding indicates the stressed syllable). Say the word to yourself several times, and you will see that the first syllable is more “prominent” as compared to the second and third syllables. Speakers stress syllables by producing them with slightly longer duration, higher pitch, greater loudness, and more precise articulation as compared to unstressed syllables. These factors combine to make a stressed syllable “stand out” for a listener — this is what is meant by the syllable being “more prominent.” Now say the word with stress on the second syllable — “sin-nah-muhn.” Strange, isn’t it? Children must learn the stress characteristics of multisyllabic words. Lexical stress patterns vary across languages, and this cross-linguistic variation often contributes heavily to what we hear as an “accent” when a nonnative speaker of English produces multisyllabic English words. Additional information on the development of lexical stress is not included in this chapter.

The bulk of this chapter is devoted to the pattern of sound learning in typically developing children. Some material on the learning of phonotactic constraints and phonological processes is also presented. It is important to keep in mind that the “typical” pattern presented in this chapter is an idealized description of normal phonological development. Not every typically developing child follows this pattern — there is a lot of individual variation in “normal” phonological development (Vihman, 2004).

Finally, comments are made at the end of the chapter on the interaction between vocabulary growth and phonological development. Recent scientific developments suggest that development of the sound system is not a foundation for the development of words. Rather, word learning may be the foundation for the development of phonology.

Phonetic and Phonological Development: General Considerations

The development of the speech sound system goes by different names. It is variously called speech sound development, articulatory (phonetic) development, and phonological development. Two of the terms, “speech sound development” and “articulatory (phonetic) development,” can be used interchangeably; “phonetic development” is used in this chapter to refer to these two terms. “Phonological development” refers to something different from phonetic development, as described below. The use of phonetic versus phonological development is not an academic exercise — it can have important implications for the diagnosis and treatment of developmental speech sound disorders (Chapter 15).

Phonetic and Phonological Development

What is the difference between speech sound, articulatory (phonetic), and phonological development? The terms “speech sound development” and “articulatory (phonetic) development” share a similar meaning, as discussed later. We use the term “phonetic development” as the cover term for this aspect of speech sound development. The primary distinction for this discussion is between phonetic and phonological development.

Phonetic Development

Phonetic development is the sequence of mastering the articulatory movements, positions, and shapes required to produce speech sounds. Phonetic development reflects the maturation of speech motor skills, also called articulatory skills. Can a child position and shape the tongue to produce an /s/? Can he move the tongue sufficiently forward to produce an /i/? Questions like these can be asked about any speech sound in a language.

Phonological Development

Phonological development refers to the role of speech sounds in the sound system of a language. Rather than asking if a child has the speech motor capability to produce an /s/, the question is, does the child understand the role of /s/ in the sound contrasts that can distinguish words, the role of /s/ in the morphophonemic aspects of a language, and the allowable sound sequences for word formation?

Examples of the three aspects of phonological development mentioned in the previous paragraph and their differences from phonetic development are as follows:

•  A young child produces a good /s/ for words such as “sip,” “Sue,” and “see,” but produces the same /s/ for words such as “ship,” “shoe,” and “she.” The use of /s/ for /ʃ/ may not indicate an absence of speech motor control skills (i.e., a phonetic issue) for /ʃ/, but rather the absence of recognition of the /s/-/ʃ/ contrast as phonemic. In this view, the /s/-/ʃ/ contrast has not yet become one of the phonemic contrasts in the phonological system of a child’s native language.

•  A young child produces a good /s/ and a good /z/ at the end of words but does not use the sounds appropriately when marking plurals for words such as “tacks” and “tags.” The child produces these words as /tæks/ and /tægs/, when they should be /tæks/ and /tægz/. The morphophonemic rule (that is, the interaction between morphemes and phonemes) in the phonological system of English is that plural -s is phonetically /s/ when it follows a voiceless stop (the “t” in “tack”) and /z/ when it follows a voiced stop (the “g” in “tag”). The error of applying the /s/ for the plural in both words is not due to a speech motor control problem, as evidenced by the child’s ability to produce words like “fuss” and “fuzz” with a good /s/ and /z/, respectively. The morphophonemic rule is phonological, not phonetic.

•  A child produces word forms with word-initial “ng” /ŋ/. The word-forms are protowords (phonetically consistent forms) discussed in Chapter 5. These are phonetic sequences that are used consistently to identify objects, people, or possibly actions, but are not “real” words in the child’s native language. The child’s use of word-initial “ng” violates the phonotactic rules of English. Word-initial “ng” cannot initiate a word but can appear in word-medial and word-final positions (as in “penguin” and “sing”).
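The plural rule in the second bullet is explicit enough to be written as a tiny procedure. The sketch below is ours, not the textbook’s, and it deliberately covers only the two cases discussed in the bullet (it ignores the third English plural allomorph, /əz/, used after sibilants as in “buses”):

```python
# Simplified English plural selection: the plural morpheme surfaces
# as /s/ after a voiceless final sound and as /z/ after a voiced one.
VOICELESS = {"p", "t", "k", "f", "θ"}  # voiceless finals (partial list)

def plural_allomorph(final_sound: str) -> str:
    return "s" if final_sound in VOICELESS else "z"

print(plural_allomorph("k"))  # "tack" ends in /k/ -> plural /s/, /tæks/
print(plural_allomorph("g"))  # "tag" ends in /g/  -> plural /z/, /tægz/
```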
Is it meaningful to consider a speech sound difference from the “correct” adult form as phonetic versus phonological? Many clinicians and scientists believe so. Typical speech development includes errors in individual sounds as well as patterns of sound errors, an example of the latter being the deletion of word-final consonants. When a speech sound error is regarded as phonetic in origin, there is a tendency to attribute the error to immature speech motor skills. In this view, speech motor skills for correctly produced speech sounds are sufficiently developed for some sounds, but not for others.

Patterns of speech sound errors in typical speech development suggest an alternative view for the origin of the errors. Take the case of a typically developing child who says “da” for “dog,” “ka” for “cat,” and “doh” for “those.” The phonetic view considers the child’s errors as articulatory problems (immature speech motor skills) in the production of the individual sounds /g/, /t/, and /z/ in the word-final position. In the phonological view of these errors, the sounds are omitted as a result of the phonological process of deletion of word-final stops. Phonological processes are regarded as one of many cognitively based language rules. The child probably has the speech motor skill to produce the sounds but produces the intended consonant-vowel-consonant (CVC) words as CV syllables to simplify the task of learning to produce the words. “Da” may not be correct phonetically, but the child’s use of this CV form clearly means “dog.”
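A phonological process like this one is, in effect, a rule that maps a target word form onto the child’s output form. As a sketch (ours, and deliberately oversimplified to one character per sound), final-consonant deletion can be stated directly:

```python
# Final-consonant deletion: a CVC target form surfaces as CV.
# (Oversimplified: each sound is represented by one character.)
def final_consonant_deletion(target: str) -> str:
    return target[:-1]

for target, gloss in (("dag", "dog"), ("kat", "cat"), ("doz", "those")):
    print(f"{gloss}: target {target} -> produced {final_consonant_deletion(target)}")
# dog: target dag -> produced da
# cat: target kat -> produced ka
# those: target doz -> produced do
```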
Typical Speech Sound Development

Figure 13–1 is a representation of the ages at which children learn the speech sounds of English. The process of speech sound development in a typically developing child is not predictable in its details.

From child to child, there is a fair degree of variation in the order in which sounds are learned, and the ages at which the sounds are “mastered.” For a large group of typically developing children, however, general trends in speech sound development can be identified. In fact, the steps in speech sound mastery that are regarded as “typical” enjoy broad agreement across studies (see summary in Smit, Hand, Freilinger, Bernthal, & Bird, 1990; and more recent summaries in Stein-Rubin & Fabus, 2012; and Bauman-Waengler, 2015). Figure 13–1 shows these trends. Studies on the developmental course of speech sound learning have included as few as 90 children (Bricker, 1967) and as many as 997 (Smit et al., 1990).

Figure 13–1.  A schematic summary of speech sound learning in typically developing children (bars spanning 24 to past 48 months for, top to bottom: vowels and diphthongs; nasals and glides; stops; lateral and rhotic [liquids]; fricatives and affricates; clusters). The left-hand edge of each bar indicates the age at which about half of typically developing children produce the indicated sounds correctly, and the right-hand edge is the age at which most children have learned correct production of the sound (90% to 95% of tested children). The right-hand arrows extending past 48 months indicate that mastery of some sounds (laterals, rhotics, fricatives, affricates, consonant clusters) may extend, on average, well past 4 years of age.

The ages shown in Figure 13–1 reflect the fact that the bulk of speech sound learning takes place between the ages of 2 and 4 years, even though learning begins before age 2 years and continues past age 4 years. The left-hand edge of each bar in Figure 13–1 indicates the age at which about half of typically developing children produce the indicated sounds correctly, and the right-hand edge is the age at which most children have learned correct production of the sound. For example, roughly half of typically developing children produce stop consonants correctly a little before 24 months, and nearly all children produce all stops correctly just after 48 months.

Vowels, diphthongs, nasals, and glides are learned early and probably mastered no later than shortly after the third birthday (36 months). Stops are also learned early but may have a lengthier period of development than vowels, nasals, and glides. The mastery of liquids — the “l” and “r” sounds — lags vowels, diphthongs, nasals, glides, and stops in two ways. First, the age at which half of typically developing children produce liquids correctly is nearly a year later than the age for the earlier-mastered sounds (compare in Figure 13–1 the starting age for liquids to that for the three sound categories shown above the liquids). Second, the development of liquids may extend well beyond 4 years of age, as indicated in Figure 13–1 by the arrow pointing past the 48-month landmark.

Mastery of fricatives and affricates lags that of liquids slightly and extends well past 48 months.

Finally, correct production of consonant clusters, such as the “sp” sounds in “spot,” the “skr” sounds in “scratch,” the “pl” sounds in “play,” and the “rst” sounds in “first,” begins to be mastered at around 3 years of age and may not be fully mastered until well past 4 years of age.

The last speech sounds acquired throughout the course of sound development are often referred to as “The Late Eight” (Bleile, 2018). These sounds include voiceless and voiced “th” (/θ/, /ð/), voiceless “s” and voiced “z” (/s/, /z/), “l” (/l/), “r” (/r/), “sh” (/ʃ/), and “ch” (/tʃ/). As Bleile notes, the majority of children with speech sound disorders have errors on one or more of these sounds. Some of the late-eight errors may be present past 8 or 9 years of age, the absolute upper limit for typical development of the sound system. Many children with these errors correct them spontaneously — the errors are “normalized” without therapy. A small number of early and late-age teenagers and even some adults continue to make these errors, most likely on /s/, /z/, /l/, and /r/. Late-eight errors that persist into the teenage and adult years are called persistent or residual errors (Flipsen, 2016).

Determination of Speech Sound Mastery in Typically Developing Children

The arrangement of speech sounds in Figure 13–1, from top to bottom, is from the earliest- to latest-appearing sounds in the typical child’s development of English speech sounds. This pattern of sound development can be considered a kind of “average” developmental pattern; as noted earlier, many departures from this “average” pattern are normal and not cause for concern. With this in mind, it is useful to consider how these average patterns were determined, and why it is important to be familiar with an average developmental sequence, even if it does not represent every typically developing child.

The basic research strategy for obtaining the information summarized in Table 13–1 is to select a set of words requiring a child to produce a target sound (e.g., the stop “b” or fricative “s”) in two or three positions-in-word. The term “position-in-word” is typically reserved for consonant production and refers to the location of a consonant as word-initial, word-medial, or word-final. For example, the word “baby” contains the “b” sound in the word-initial and word-medial positions, and the word “tub” has the “b” sound in the word-final position. The decision to test the production of a given sound in different positions-in-word emerged from clinical experience. This experience suggested that a child’s ability to produce a sound correctly in any one position did not necessarily mean that he could produce the sound correctly in other positions. Correct production of one allophone of a phoneme category does not guarantee correct production of all allophones of that category.

Table 13–1.  Example of a Transcription Analysis of a Speech Sound Production Task

Target /b/:  “baby” [beɪpi] (word-initial), [beɪpi] (word-medial); “tub” [tʌp] (word-final)
Target /s/:  “sun” [θʌn] (word-initial); “whistle” [wɪfə] (word-medial); “rice” [wɑɪf] (word-final)

When a complete set of words has been developed and judged to be familiar to children as young as 2 years of age, pictures or photographs of the object, person, or action are created for each word. The child is shown the picture or photograph and asked to say what he or she sees. Ideally, the child produces the word without hearing it spoken by the experimenter. When a child cannot name a picture, the experimenter provides a spoken model, and the child’s production of the word is an imitation of the experimenter’s model. Each child’s naming of a picture or photograph (or word imitation) is transcribed by a skilled phonetician, using the symbols of the International Phonetic Alphabet (Chapter 12). This transcription shows the sounds used by the child to produce the word. In some cases, the transcription may include allophonic detail (narrow phonetic transcription).

Possible Explanations for the Typical Sequence of Speech Sound Mastery

Two major explanations for the sequence of sound mastery shown in Figure 13–1 have been considered in the research and clinical literature. These explanations are not mutually exclusive; they may both explain, in small or large ways, the early learning of sounds such as vowels, stops, and nasals as compared to the later learning of sounds such as liquids and fricatives and of sound sequences such as consonant clusters. One explanation concerns the maturation of speech motor control capabilities. The other explanation concerns the maturation of auditory mechanisms for speech sound identification.
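The testing logic behind Table 13–1, one target sound probed at several positions-in-word and transcribed in IPA, maps naturally onto a small record structure. The sketch below is ours, not the textbook’s (the child forms echo the table, and the adult target forms are our own broad transcriptions):

```python
# Each probe: word, position-in-word, adult target form, child's form.
PROBES = [
    ("baby", "word-initial/medial", "beɪbi", "beɪpi"),
    ("tub",  "word-final",          "tʌb",   "tʌp"),
    ("sun",  "word-initial",        "sʌn",   "θʌn"),
    ("rice", "word-final",          "ɹɑɪs",  "wɑɪf"),
]

for word, position, target, child in PROBES:
    verdict = "correct" if child == target else "error"
    print(f"{word:5s} ({position}): /{target}/ -> [{child}]  {verdict}")
```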

Maturation of Speech Motor Control

Children do not control their posture, the use of their hands, or any other aspect of movement or body positioning in an adult-like way. The movements of an 18-month-old child are immature, becoming more refined and accurate over many years. As noted many years ago by the famous speech scientist R. H. Stetson (1872–1950), speech production is a collection of articulatory movements made audible (Stetson, 1951). Speech movements result in positions and shapes of articulatory structures. For example, an adult-like /s/ — a “good” /s/ — requires not only movement of the tongue to the correct location within the vocal tract, but also an overall positioning and shaping of the tongue. For the remainder of this chapter, the term “speech movement” is used to denote these aspects of motor control.

Like other movements of body structures, movements of the speech mechanism develop and become increasingly adult-like as the child matures. The term “speech motor control” denotes the concept of nervous system mechanisms (and the muscles they control) required for the execution of speech movements. Maturation of speech motor control refers to the way in which it changes over time from infant to adult-like control capabilities. Maturation of speech motor control is necessary for an individual to transition from infant- and toddler-like sound production to the fully intelligible speech of the typical adult.

For the child with less mature speech motor control, some speech movements are likely to be challenging, whereas others may be less so and perhaps even simple. Scientists have argued that fricatives and liquids require rather precise and difficult movements and positions and are therefore mastered later in development, when the child’s speech motor control is “up to the task.” Vowels, nasals, and stops are believed to require relatively simple speech motor control capabilities and can be produced accurately by younger children. According to some studies, very young children who are just beginning their speech sound development may avoid producing words composed of “difficult” sounds such as fricatives and liquids (Vihman, 2004). For example, at 2 years of age, a child may produce words such as “dog,” “cat,” “poppa,” and “no,” but not words such as “sun,” “shell,” and “rag.” Does the child lack the latter three words in her lexicon (have they not yet been learned?), or does she have the words but avoid saying them because the component sounds (such as /s/, /ʃ/, and /r/) require speech motor control abilities that are too advanced for her level of development? Many toddlers who do not say certain words know them as demonstrated in comprehension tasks. This knowledge is not yet useful for expressive (production) language.

What is the assumed difference between early versus advanced speech motor control skill? Figure 13–2 shows the difference in tongue shape for a sound mastered early (/t/, green surface) and one mastered late (/s/, black surface). The view is from the front looking back into the vocal tract, as if the mouth and teeth are transparent to reveal the contrasting tongue shapes, both of which are made at the alveolar place of articulation. Only the surface of the tongue at a specific location (a cross-section, along the width of the tongue) is shown. The shape for /t/ is more or less flat, with the tongue pressed against the alveolar ridge (the front of the hard palate) to form a complete blockage of the airstream for about 1/10th of a second. The shape for /s/ is more complex as shown by the groove in the central part of the tongue width. The groove is narrow and tight. When air flows from the lungs to the upper airways, pressure builds up behind the groove and forces air through it, creating the hissing noise typical of fricative consonants.

Figure 13–2.  Tongue shapes for /t/ (green) and /s/ (black). The view is from the front, looking into the mouth; the image is drawn as if the mouth and teeth are transparent to show the side-to-side shape of the tongue behind the teeth. The tongue is more or less flat for /t/ and grooved for /s/.

motor control than the flat-tongued shape for stop consonants. This may explain why /t/ is mastered earlier in speech sound development than /s/ (and in general, stop consonants are mastered earlier than fricatives).

Maturation of Perceptual Mechanisms for Processing Speech

One of the great controversies in the field of speech development concerns the relationship of what the child hears to what the child produces. At a general level, the ability to hear speech sounds, and more precisely to distinguish between different speech sounds, must be related to how a child learns to produce speech. An obvious example is the effect of significant hearing loss on a child’s speech production abilities. Babbling in babies with significant hearing loss at birth appears much later than babbling in hearing babies (Eilers & Oller, 1994). To the extent that auditory sensitivity predicts the complexity of babbling (von Hapsburgh & Davis, 2006), the child’s auditory capability is one of the foundations of speech sound development.

Speech-language pathologists are interested in a more specific question concerning the role of hearing in speech development: Is there a close match between how the child perceives speech sounds and how he produces them? For example, do children develop the ability to produce the distinction between “w” and “r” (as in words such as “right” versus “white”) only when they can hear the difference between “r” and “w”? More generally, do sounds such as liquids and fricatives appear later in speech development because children who are just beginning to learn speech have difficulty hearing and distinguishing contrasts such as /s/ versus /ʃ/ or /r/ versus /l/? Is the early appearance of sounds such as vowels, nasals, and stops explained by an early ability to hear the difference between them?

Is there evidence that the perception of fricatives, or liquids, is more challenging for younger, as compared to older, children? In a more general sense, is there evidence for the development of speech sound-perception skills as children are learning language? Or, is there evidence that little humans are endowed at birth with the ability to hear all distinctions between speech sounds?

The answer to the major question, of whether the perception of speech sounds is a developmental process (rather than being present in adult form, at birth), seems to be “yes.” The perception of speech sounds becomes more skilled as a child matures. Infants begin with the ability to discriminate between sounds not only of their own language but of other languages as well (see Chapter 5). At the end of the first year and into the second year of life, children’s ability to hear the difference between sounds that have a contrastive function in other languages, but not in English, diminishes and eventually disappears. At the same time, babies become especially sensitive to the important sound discriminations in their own language (Werker & Yeung, 2005).1 This not only shows a developing speech perception ability in childhood but also the influence on perceptual skill of the specific language being spoken in the child’s environment. As reviewed by Nittrouer (2002), children’s speech perception abilities continue to evolve past the first year of life. Humans at birth are not endowed with a “fixed” set of speech-sound perception abilities. Perceptual skills change and develop.

The more focused question of whether perceptual skill for specific sounds must be in place for the correct production of the sounds remains open. This is a complex issue, and the answer (at this point in time) is no more than a “maybe.” Late-developing speech sounds such as liquids, fricatives, and affricates are apparently not difficult to perceive for the child who is early in speech sound development and who has not mastered the production of these sounds (Vihman, 2004). Some very young children may have difficulty perceiving fricatives or liquids, but this is not a general finding: the perception of these sounds may be fine even though the child has not mastered their production. This latter finding weakens the case for a close relationship between a typically developing child’s sound perception and sound production capabilities. A study relevant to this issue is provided by Dugan, Silbert, McAllister, Preston, Sotto, and Boyce (2018), and an excellent review is provided by Preston, Irwin, and Turcio (2015).

This question is not only relevant to the typical development of speech but also has an important influence on explanations and treatment plans for developmental speech sound disorders. Speech sound disorders are conditions in which a child’s development of speech sounds does not occur within a typically normal age range. Children diagnosed with speech sound disorders learn sounds in the typical order (see Figure 13–1) but at a slower rate. As reviewed by Bankson and Bernthal (2004a), there is a history of research attempting to show a link between sounds misarticulated by children with speech sound disorders and their ability to perceive those specific sounds.

1 Werker and Yeung (2005) have reviewed evidence that in the second half of the first year of life, babies’ evolving ability to associate words with objects or actions is a “trigger” to organizing their greater sensitivity to the phonemes in their own language. In other words, early aspects of word learning prepare the baby to perceive sound categories that have importance to making distinctions between words.

The practical, clinical implication of this issue is that perceptual training may be a significant aspect of speech therapy for accurate production of a speech sound. Perceptual training may even occur before training to produce the sounds correctly. Some clinical research has demonstrated, in fact, that perceptual training of speech contributes to the elimination of speech sound production errors (Bankson & Bernthal, 2004a).

The Jury Is Sometimes Out and Sometimes In

Many speech-language pathologists and audiologists use auditory training as an integral part of speech therapy in children with speech sound errors. Even though the relevant research foundation for auditory training is not firm, we do not have a completely hung jury on the issue. As with the wide variability among children in learning a sound system, some children with speech sound disorders show improvement in speech production skills following perceptual training. The training may be “auditory bombardment,” in which the child hears her error sounds over and over. The theory is to “shape up” the sound category in perception so that it can be produced with reference to this stabilized perceptual category. Alternatively, therapy may consist of hearing perceptual contrasts between closely related sounds, such as the contrast between /s/ and /ʃ/, both voiceless fricatives that differ only by place of articulation. When a child is trained on the perception of this contrast and shows improvement in /s/ versus /ʃ/ production, two explanations are possible. One is that the auditory training has a general influence on auditory skills, which includes the skill required for the perception of any phonetic contrast. In this case, improvement in the /s/-/ʃ/ contrast and the resulting influence on good /s/ and /ʃ/ production is a consequence of an upgrade of overall auditory skill. A second explanation is that the auditory training is specific to the /s/-/ʃ/ contrast and does not transfer to other problem contrasts such as /w/-/r/. In either case (or maybe a little bit of both), the possibility that auditory training may contribute to improved articulation is worth the therapeutic effort.

Phonological Processes and Speech Sound Development

Phonological learning is not limited to the mastery of individual speech sounds. Children must also learn how words are formed from sequences of sounds (the phonotactic characteristics of a language), and for some languages, the proper prosodic pattern for words with multiple syllables (i.e., lexical stress patterns). In the course of learning these aspects of phonology, children make errors when producing word forms that presumably are intended to “match” the form produced by adults. For example, a child around the age of 2 years may say “da” when he sees the family pooch. The child is attempting to produce the word “dog,” which has a consonant-vowel-consonant (CVC) form. “Da” fails to match the adult form as it lacks the word-final consonant. Stated somewhat differently, the child uses a CV word form for a “target” CVC word form. It is as if, as mentioned earlier, the child simplifies the articulatory task of producing a CVC by eliminating one of the component sounds.

Part of phonological development is learning the correct matches to adult word forms by eliminating these simplifying processes. Notice how the simplification of CVC to CV syllables introduces a phonotactic constraint on the child’s speech production: the adult phonology allows words to be formed by CVC sequences, but the child restricts his own word forms to CV syllables.

The CV for CVC mismatch can be described as a phonological error. Within certain age limits, such errors are expected as a typical part of phonological development. An interesting characteristic of phonological development is found in the nature of these errors: they are not random but are systematic. Some examples make this clear.

For many children learning the phonology of their language, an error such as “da” for “dog” is not an isolated case of leaving off the final consonant of a specific word. Many typically developing children go through a phase of producing “target” CVC word forms by always (or nearly always) deleting the final consonant. “Dog” is produced as “da,” “cat” as “kah,” “bus” as “buh,” and so forth.

Scientists who study normal and delayed/disordered acquisition of phonology in children claim that groups of similar errors are the result of phonological processes. A phonological process is a rule that changes the expected word form (what an adult produces) to a different, simpler form. In the current example, the phonological process is one of final consonant deletion, in which all (or nearly all) CVC word forms are changed to CV forms. The rule is not “delete the ‘g’ at the end of the word ‘dog,’” but applies broadly across final consonants in any CVC word. Presumably, this phonological process simplifies the articulation of single-syllable, CVC word forms because CV forms are easier to produce.
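To make the rule-like character of a phonological process concrete, the sketch below applies final consonant deletion to any CVC word. The sketch is illustrative only (it is not from the textbook), and the segment representation is a simplification invented for this example.

```python
# A minimal sketch of final consonant deletion written as a rewrite
# rule. Words are represented as (segment, type) pairs, where type is
# "C" (consonant) or "V" (vowel); a real analysis would use phonetic
# transcription rather than ordinary spelling.

def final_consonant_deletion(segments):
    """Apply the rule "CVC -> CV": drop a word-final consonant."""
    if segments and segments[-1][1] == "C":
        return segments[:-1]
    return segments

# "dog" -> "da", "cat" -> "kah", "bus" -> "buh"
for word in (
    [("d", "C"), ("a", "V"), ("g", "C")],   # "dog"
    [("k", "C"), ("ah", "V"), ("t", "C")],  # "cat"
    [("b", "C"), ("uh", "V"), ("s", "C")],  # "bus"
):
    child_form = final_consonant_deletion(word)
    print("".join(segment for segment, _ in child_form))  # da, kah, buh
```

Note that the function never mentions a particular word: one general rule, not a collection of word-specific deletions, produces “da,” “kah,” and “buh” alike, which is exactly the sense in which the errors are systematic.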

The phonological process of cluster reduction changes words with a CCV(C) or CVCC form (where “CC” = consonant cluster) to CVC forms. For example, the CCVC form for the word “stop,” in which the consonant cluster is the word-initial “st,” is changed to “top” (a CVC form). Or, the CVCC form for the word “best” is changed to “bes” or “bet” (again, a CVC form). In both cases, the child simplifies the articulatory requirements for the adult CCVC or CVCC forms by “reducing” two successive consonants — a consonant cluster — to a single consonant.

Another phonological process is called stopping of fricatives. This process changes fricatives to stops, as when “sip” is produced as “tip.” Stopping of fricatives simplifies articulation by changing a sound thought to require advanced speech motor skill (the fricative “s”) to one of relatively simple speech motor skill (the stop consonant “t”).

English words with multiple syllables have varying stress on the syllables. Some syllables are heard as “prominent,” and others are heard as weak. In the word “banana,” for example, the first syllable “ba” is produced with very little stress and therefore does not sound prominent. In contrast, the second syllable “na” receives primary stress in the word and is heard as prominent. The third syllable is also produced with little stress and is not very prominent. The stress pattern of this three-syllable word can be described as weak-strong-weak. The phonological process of unstressed syllable deletion reduces the number of syllables in a multisyllabic word by eliminating an unstressed syllable, typically the first “weak” syllable of the word. “Banana” is produced as “nana” (strong-weak), “elephant” as “elphant” (strong-strong), and “incredible” as “creble” (strong-weak).2 Presumably, the articulation of multisyllabic words can be simplified by eliminating one (or more) of the syllables from the production.

Are phonological processes merely descriptions of the way children modify word forms during typical phonological acquisition, or are they the result of biological tendencies unique to the human ability to communicate by speech? The difference between these two possibilities is important. Behavioral regularities can always be described and stated as a formula or “rule” such as “CVC becomes CV,” as in the phonological process of final consonant deletion. This description does not prove, however, the presence of a mechanism in the heads of little humans that takes a CVC word form and changes it to a CV form as a biologically directed part of speech sound development. Nevertheless, some scientists regard phonological processes as biologically based, cognitive mechanisms that guide, at least in part, the natural course of phonological acquisition.

What kind of scientific evidence supports the biological view of phonological processes? One observation, that the same phonological processes are seen among children learning very different languages (Vihman, 2004), has been used to support the biological perspective. Even when different languages have different phonetic inventories and use them in different ways (i.e., have different phonemes and allophones), the same phonological processes tend to change adult word forms to simpler forms. This suggests something “universal” about phonological processes, something applying to all languages even when other components such as phonemes vary from language to language. The concept of a “universal” language characteristic is almost always an important piece of a belief in a biological basis of speech and language.

Like the mastery of individual speech sounds, different phonological processes seem to have a schedule of appearance (and disappearance) during the course of phonological acquisition. For example, the process of final consonant deletion is typically seen in the early stages of acquisition and disappears before the age of 3 years (Stoel-Gammon & Dunn, 1985). The concept of disappearance of a phonological process is important: the mastery of the phonological system of a language is characterized not only by learning speech sounds but also by the elimination of processes that create mismatches between child and adult word forms. There are gains and losses along the child’s pathway to “correct” phonological behavior.

Based on available data, certain processes seem to disappear early in typical speech sound development; others persist to later ages. The process of final consonant deletion has already been noted as disappearing relatively early in the course of typical phonological development. Cluster reduction may not disappear until later in phonological development, around age 3 and a half years (Cohen & Anderson, 2011). Stated otherwise, cluster reduction observed in a child’s speech past the age of 3 years is not considered atypical, but deletion of word-final consonants past the age of 3 years suggests a phonological delay. Bankson and Bernthal (2004b, pp. 245–249) have an excellent review of phonological processes during the typical course of phonological acquisition.
2 When my son Ben was learning the sound system of English, his multisyllabic word productions were changed by the phonological process of unstressed syllable deletion. For a long time, he referred to “The Incredible Hulk,” one of his favorite destructive superheroes, as “creblhulk” (three syllables). I combine “hulk” with the first part of the word because he obviously treated the two words as one. Ben had no idea, at this point in his phonological development, that “incredible” was a word that could be separated from “hulk.”
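The systematic character of unstressed syllable deletion, the process behind “nana” for “banana” (and “creblhulk” in the footnote above), can be sketched in the same rule format. The sketch below is an invented simplification, not a clinical analysis; each syllable is marked “S” (strong) or “W” (weak).

```python
# A minimal sketch of unstressed syllable deletion. Each word is a
# list of (syllable, stress) pairs, with stress "S" (strong) or "W"
# (weak). The rule deletes a word-initial weak syllable, as in
# "banana" (weak-strong-weak) -> "nana" (strong-weak).

def unstressed_syllable_deletion(syllables):
    """Delete the first syllable when it is unstressed ("W")."""
    if len(syllables) > 1 and syllables[0][1] == "W":
        return syllables[1:]
    return syllables

banana = [("ba", "W"), ("na", "S"), ("na", "W")]
child_form = unstressed_syllable_deletion(banana)
print("".join(syllable for syllable, _ in child_form))  # prints: nana
```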

Phonological Development and Word Learning

This chapter has focused on speech sound development. Speech sounds have been treated as independent units to be learned, such as the typical age of mastery of /g/ or /s/. In addition, mastery of the sound system has been shown to be part of word learning, as in the case of the development of word forms with “allowable” sound sequences (phonotactics) and simplification of word forms (phonological processes). This description of phonological development implies a directionality between sound learning and word learning: sounds are learned, and words are built from them. In different terms, vocabulary growth is dependent on sound mastery.

In recent years, this logic has been reversed. The results of several studies suggest that word learning leads sound learning. The growth of vocabulary requires finer and finer articulatory distinctions between sounds to distinguish the new lexical entries. For example, the child may add words such as “sign” and “shine” to the lexicon at about the same time, perhaps before the child is making a sharp articulatory distinction between “s” /s/ and “sh” /ʃ/. The distinction between these two sounds is one of place of articulation (alveolar for /s/, palato-alveolar for /ʃ/). These late-mastered sounds may both be distortions, with imprecise places of articulation. The acquisition of these new words is thought to promote greater articulatory distinction between the sounds — to get their places of articulation correct and thus match the adult form of the words. In this sense, vocabulary growth may “shape” sound system growth.

This idea has implications for speech-language therapy, as described in Chapter 15. The way in which phonological development is thought of — words built from sounds versus sounds built from words — may have important implications for treatment of developmental speech sound disorders. Gierut (2016) has an excellent review of the relationship between vocabulary growth and mastery of the speech sound system.

Chapter Summary

Phonetic and phonological development in typically developing children includes the development of both speech motor skills (phonetic skills) and the sound system (phonology) of a language.

Phonemic contrasts, allophones of the phoneme categories, morpho-phonemic rules, phonotactic constraints, and prosody are components of phonological development.

The development of speech motor control determines the development of phonetic skills, which are sometimes referred to as articulatory skills.

Examples are provided of the potential independence of speech motor and phonological development; both of these contribute to development of speech sounds.

The order in which speech sounds are mastered is based on research using large numbers of typically developing children; the age ranges over which specific sounds are mastered are averages, and many typically developing children do not follow a fixed pattern of sound development.

Speech sound development begins around 1 year of age and is often complete by 5 or 6 years of age; for some children, speech sound development continues until 8 or 9 years of age.

Speech sounds that are mastered early in development include vowels, diphthongs, nasals, glides, and stops; laterals, rhotics, fricatives, and affricates are learned later in speech sound development.

Consonant clusters are mastered later in the course of speech sound development.

The order of mastery of specific sounds may be explained by the development of speech motor skill, perceptual skill, and cognitive skills.

Phonological processes are important in the development of speech sounds; the processes often result in mismatches between a child’s production of a word and the “target” adult form.

Many phonological processes produce “errors” early in speech sound development, but the processes actually simplify the child’s task of producing words; such processes are referred to as simplification processes.

Throughout the course of speech sound development, simplification processes disappear, which allows the child to produce a word that is a good match to the adult form of the word.

References

Bankson, N. W., & Bernthal, J. E. (2004a). Etiology/factors related to phonologic disorders. In J. E. Bernthal & N. W. Bankson (Eds.), Articulation and phonological disorders (5th ed., pp. 139–200). Boston, MA: Pearson Education.

Bankson, N. W., & Bernthal, J. E. (2004b). Phonological assessment procedures. In J. E. Bernthal & N. W. Bankson (Eds.), Articulation and phonological disorders (5th ed., pp. 201–267). Boston, MA: Pearson Education.

Bauman-Waengler, J. (2015). Articulation and phonology in speech sound disorders: A clinical focus (6th ed.). Boston, MA: Pearson Education.

Bleile, K. M. (2018). The late eight (3rd ed.). San Diego, CA: Plural Publishing.

Bricker, W. A. (1967). Errors in the echoic behavior of preschool children. Journal of Speech and Hearing Research, 10, 67–76.

Cohen, W., & Anderson, C. (2011). Identification of phonological processes in preschool children’s single-word productions. International Journal of Language and Communication Disorders, 46, 461–488.

Dugan, S. H., Silbert, N., McAllister, T., Preston, J. L., Sotto, C., & Boyce, S. E. (2018). Modelling category goodness judgments in children with residual sound errors. Clinical Linguistics and Phonetics, 24, 1–21.

Eilers, R. E., & Oller, D. K. (1994). Infant vocalizations and the early diagnosis of severe hearing impairment. Journal of Pediatrics, 124, 199–203.

Flipsen Jr., P. (2016). Emergence and prevalence of persistent and residual speech errors. Seminars in Speech and Language, 36, 217–223.

Gierut, J. (2016). Nexus to lexis: Phonological disorders in children. Seminars in Speech and Language, 37, 280–290.

Ladefoged, P., & Maddieson, I. (1996). The sounds of the world’s languages. Oxford, UK: Blackwell.

Nittrouer, S. (2002). From ear to cortex: A perspective on what clinicians need to understand about speech perception and language processing. Language, Speech, and Hearing Services in the Schools, 33, 237–252.

Preston, J. L., Irwin, J. R., & Turcio, J. (2015). Perception of speech sounds in school-aged children with speech sound disorders. Seminars in Speech and Language, 36, 224–233.

Smit, A. B., Hand, L., Freilinger, J. J., Bernthal, J. E., & Bird, A. (1990). The Iowa articulation norms project and its Nebraska replication. Journal of Speech and Hearing Disorders, 55, 779–798.

Stein-Rubin, C., & Fabus, R. (2012). A guide to clinical assessment and professional report writing. Clifton Park, NY: Delmar.

Stetson, R. H. (1951). Motor phonetics: A study of speech movements in action (2nd ed.). Amsterdam, the Netherlands: North Holland Publishing.

Stoel-Gammon, C., & Dunn, C. (1985). Normal and disordered phonology in children. Baltimore, MD: University Park Press.

Vihman, M. M. (2004). Later phonological development. In J. E. Bernthal & N. W. Bankson (Eds.), Articulation and phonological disorders (5th ed., pp. 105–138). Boston, MA: Pearson Education.

von Hapsburgh, D., & Davis, D. L. (2006). Auditory sensitivity and the prelinguistic vocalizations of early-amplified infants. Journal of Speech, Language, and Hearing Research, 49, 809–822.

Werker, J. F., & Yeung, H. H. (2005). Infant speech perception bootstraps word learning. Trends in Cognitive Sciences, 9, 519–527.
14
Motor Speech Disorders in Adults
Introduction

Motor speech disorders are a group of speech disorders resulting from damage to the central nervous system (cerebral hemispheres and their contents, the cerebellum, brainstem, and spinal cord) or peripheral nervous system (nerves leading from the brainstem or spinal cord to and from muscles). The causes of this damage include a range of neurological diseases, including degenerative diseases (like Parkinson’s disease or multiple sclerosis), strokes, tumors, inflammatory conditions, and other conditions.

The clinical and research history of motor speech disorders is unique in the field of communication sciences and disorders. The history is unique because there is a well-accepted classification system for different types of motor speech disorders. Many other speech and language disorders do not enjoy the benefit of agreement on their classification. The classification of motor speech disorders assumes that damage to different parts of the brain produces different — and unique — speech symptoms.

Table 14–1 summarizes terms from Chapter 2 that are relevant to the classification of motor speech disorders.

Classification of Motor Speech Disorders

Figure 14–1 presents a simple classification scheme for motor speech disorders in adults. Motor speech disorders include two major subcategories, one being dysarthria and the other apraxia of speech. Dysarthria is a motor speech disorder in which neurological disease results in weakness, paralysis, or incoordination among the muscles of the speech mechanism (Darley, Aronson, & Brown, 1975; Duffy, 2013). These muscle problems result in poor control of movements of the lips, tongue, jaw, velum, larynx, and respiratory structures. The poor speech movement control results in speech production problems.

SLPs are able to identify most cases of dysarthria by listening to a short sample of speech. Occasionally, speech of the hearing impaired or of adults with apraxia of speech (see later in this chapter) may be confused with dysarthria.

Apraxia of speech in adults (AAS, for “adult apraxia of speech”)1 is thought to be a planning (also called programming) disorder resulting from neurological damage within the cortex and subcortical nuclei such as the basal ganglia. Muscle paralysis, weakness, and/or incoordination are not thought to be present in AAS. Rather, the patient has difficulty with the plan for production of the utterance. The planning deficit may include problems with the order of consecutive syllables in an utterance, and problems with instructions from the cortex for the correct timing of muscle contractions throughout an utterance.

1 In much of the clinical and research literature on apraxia of speech in adults, the acronym “AOS” (apraxia of speech) is used. However, “AAS” is used in the textbook to be consistent with more recent literature in which the adult version of the disorder is compared and contrasted with childhood apraxia of speech (CAS; see Chapter 15).

Table 14–1.  Terms From Chapter 2 That Are Relevant to the Current Chapter on Motor Speech Disorders in Adults

Central nervous system:  Cerebral hemispheres and their contents, cerebellum, brainstem, spinal cord.

Peripheral nervous system:  Nerves attached to brainstem and spinal cord that carry information to and from the central nervous system; cranial nerves serve structures of the head and neck, spinal nerves serve structures of the limbs and torso.

Gray matter:  Clusters of neuron cell bodies. The cerebral cortex and the cortex of the cerebellum are composed of cell bodies; clusters of gray matter in the basal ganglia and thalamus, the brainstem, below the cerebellar cortex, and spinal cord, are called nuclei.

White matter:  Bundles of myelinated axons connecting one or more areas of gray matter.

Substantia nigra:  A nucleus (group of neuron cell bodies) in the midbrain (top part of brainstem) that manufactures dopamine, a neurotransmitter critical to control of movement.

Motor neuron:  The cell body of a neuron whose axon carries information to muscles to control their contraction time, force, and coordination with other muscles. Groups of motor neurons are found in the primary motor cortex and in other regions of the central nervous system such as the basal ganglia, brainstem, and spinal cord.

Upper motor neuron:  The pathways from the cortical motor neurons to motor nuclei in the brainstem or spinal cord.

Lower motor neuron:  The pathways from the motor nuclei in the brainstem or spinal cord to muscles of the head and neck (via cranial nerves) and limbs and torso (via spinal nerves).

Figure 14–1.  A simple classification scheme for motor speech disorders in adults. Dysarthria (subtypes: flaccid, spastic, ataxic, hypokinetic, hyperkinetic, mixed, and unilateral upper motor neuron) reflects weakness, paralysis, or incoordination of the speech muscles; apraxia of speech reflects a planning problem with no muscle deficit.


Dysarthria

The problem in dysarthria is thought to be limited to neuromuscular execution. The speech disorder is not thought to be a partial or significant result of problems with the symbolic component of language, as in aphasia (see Chapter 9). In dysarthria the patient knows what she wants to say, plans the utterance in a normal way, but fails to produce it normally because of the muscle problems noted earlier.

Subtypes of Dysarthria

The subtypes of dysarthria in Figure 14–1 constitute the Mayo Clinic System for Classification of Dysarthria. Darley, Aronson, and Brown (1975), clinician-scientists who worked at the Mayo Clinic where they developed the classification system, believed that each subtype had a unique “sound” that could be related directly to the location of damage within the nervous system. In fact, Darley et al. (1975) claimed that in dysarthria, the sound of a patient’s speech had a “localizing” value. In this view, skilled clinicians can identify the location of neurological damage by listening to a patient’s speech. In the 1960s, when Darley et al. developed the classification system, imaging techniques such as computerized axial tomography (CAT) scans and magnetic resonance imaging (MRI) were not available to identify the location of brain damage. Localization of neurological damage by listening to a patient’s speech was a major contribution to medical diagnosis.

The Mayo Clinic Classification System for Motor Speech Disorders

Throughout the 1960s, a monumental study of motor speech disorders took place at the Mayo Clinic in Rochester, Minnesota. Darley, Aronson, and Brown (1975) listened to tapes of over 200 patients with different neurological diseases, and based on what they heard made precise estimates of the severity of different speech characteristics. For example, Darley, Aronson, and Brown (hereafter, DAB) knew that a very common speech characteristic in motor speech disorders was imprecise consonants. The term was meant to indicate speech in which the consonant sounds appeared to be articulated in a noncrisp, imperfect way.2 DAB also knew, from listening to many patients with motor speech disorders, that the loss of crisp consonant articulation may range from a very subtle consonant problem to an obvious loss of articulatory ability. They therefore used a seven-point, equal-appearing interval scale to record their impressions of the severity of each patient’s consonant articulation ability.

The two ends of the scale are given labels — scientists call this “anchoring” the scale — with the number 1 indicating normal consonant articulation and the number 7 a very severe deviation from normal consonant articulation — severely imprecise consonants. The numbers in between these two extremes indicate different degrees of imprecise consonants.

DAB made interval-scale estimates for a total of 38 characteristics of speech. The 38 perceptual dimensions were selected to represent the different components of the speech production process (speech breathing, phonation, velopharyngeal function, articulation). The selected dimensions were based on DAB’s extensive experience with motor speech disorders and the authors’ knowledge of aspects of speech most likely to be impaired as a result of neurological disease. A few of the dimensions scaled by DAB, along with their definitions (as given by Darley, Aronson, & Brown, 1969), are listed in Table 14–2.

The more than 200 patients studied by DAB were not a random sample of people with motor speech disorders seen at the Mayo Clinic but were selected to represent six major disease types. These types were brainstem disease, stroke, Parkinson’s disease, cerebellar disease, Huntington’s disease, and amyotrophic lateral sclerosis.

For each disease type, DAB summarized their perceptual analysis of the 38 dimensions with a statistical procedure designed to detect patterns among the perceptual dimensions. According to DAB’s hypothesis, each disease was expected to have unique patterns among the 38 perceptual dimensions. Because each of the diseases they studied has specific and unique locations of brain damage, DAB’s hypothesis of a unique “sound” (in the broad sense, not in the sense of a unique phoneme sound “problem”; see immediately above) for each of these diseases was equivalent to a hypothesis of a unique-sounding speech for damage to different parts of the brain.

2 Note that many of these speech sound errors were not substitutions of one sound for another, such as a clear /ʃ/ (“shave”) for /s/ (“save”) error. Documentation of these kinds of error requires narrow phonetic transcription (Chapter 12). Sound substitutions such as /ʃ/ for /s/ are also heard in dysarthria.
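The logic of the anchored, equal-appearing interval scale can be illustrated with a small sketch. The ratings below are hypothetical values invented for illustration, not data from the Mayo Clinic study.

```python
# Hypothetical equal-appearing interval ratings for one patient.
# Each perceptual dimension is rated on an anchored 1-7 scale:
# 1 = normal, 7 = very severe deviation from normal. If several
# listeners rate the same speech sample, the mean rating summarizes
# the severity of each dimension.

ratings = {
    "imprecise consonants": [5, 6, 5],  # one rating per listener
    "breathy voice": [2, 1, 2],
    "hypernasality": [4, 4, 3],
}

for dimension, scores in ratings.items():
    mean = sum(scores) / len(scores)
    print(f"{dimension}: mean severity = {mean:.1f}")
```

A set of such means across all 38 dimensions forms the kind of perceptual profile that DAB compared across disease types.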

Table 14–2.  Selected Perceptual Dimensions and Their Definitions

Imprecise consonants:  Consonant sounds lack precision. They show slurring, inadequate sharpness, distortions, and lack of crispness. There is clumsiness in going from one consonant sound to another.

Strained-strangled voice:  Voice (phonation) sounds strained or strangled (an apparently effortful squeezing of voice through the glottis).

Harsh voice:  Voice is harsh, rough, and raspy.

Breathy voice:  Continuously breathy, weak, and thin.

Distorted vowels:  Vowel sounds are distorted throughout their total duration.

Prolonged intervals:  Prolongation of interword or intersyllabic intervals.

Hypernasality:  Voice sounds excessively nasal. Excessive amount of air is resonated by nasal cavities.

Monopitch:  Voice is characterized by a monopitch or monotone. Voice lacks normal pitch and inflectional changes. It tends to stay at one pitch level.

Monoloudness:  Voice shows monotony of loudness. It lacks normal variations in loudness.

Excess and equal stress:  Excess stress on usually unstressed parts of speech, e.g., (a) monosyllabic words and (b) unstressed syllables of polysyllabic words.

Note.  These perceptual dimensions were prominent in the Mayo Clinic analysis of patients with motor speech disorders. The definitions are reproduced verbatim from Darley, Aronson, and Brown (1969). Some dimension names have been reordered (e.g., “distorted vowels” was “vowels distorted” in Darley et al. [1969]).

Based on their analysis of the perceptual data, DAB confirmed their hypothesis by identifying six unique dysarthrias. These include the first six listed in Figure 14–1. The unique subtypes were flaccid, spastic, ataxic, hypokinetic, hyperkinetic, and mixed dysarthria.

The Nervous System Is More Complicated Than That

Darley, Aronson, and Brown stated their hypothesis of a strong link between the location of nervous system damage (neurologists call this “site of lesion”) and speech symptoms, knowing that a given neurological disease does not have a single site of lesion. What Darley, Aronson, and Brown meant was primary site of lesion. For example, in Parkinson’s disease, the primary site of lesion is in the midbrain (mesencephalon) where cells in the substantia nigra deteriorate and die. The death of these cells deprives the central nervous system of dopamine, a neurotransmitter critical to control of movement. Other lesion sites, however, have also been identified in Parkinson’s disease, including the cerebellum, basal ganglia, and brainstem (Joutsa, Horn, Hsu, & Fox, 2018). The lack of simple lesion-location/disease combinations applies to other neurological diseases as well. Keep this in mind when considering the idea that listening to the speech of someone with a neurological disease is a straightforward way to know where the lesion is.

These six dysarthrias can be described by their most severely affected perceptual dimensions, the location of the relevant brain damage, and the muscle control problems thought to produce the abnormal and distinguishing speech symptoms. The reader should refer back to Table 14–2 for explanations of the perceptual dimensions discussed later.

Flaccid Dysarthria

The distinguishing perceptual dimensions in flaccid dysarthria included breathy voice, hypernasality, and imprecise consonants. An overall impression of the typical speaker with flaccid dysarthria is one of a weak, somewhat nasal voice with weak (noncrisp) articulation.

The group of patients with flaccid dysarthria studied by DAB had damage in the brainstem (blue-shaded areas in Figure 14–2) or in the cranial nerves exiting the brainstem to innervate muscles of the speech mechanism (yellow arrows pointing from the brainstem in the direction of muscles). Damage to the brainstem motor neurons that innervate speech muscles results in paralysis or weakness of the muscles, as well as atrophy (often called “wasting”) of muscle tissue. Similar problems occur with damage to the nerves that carry the motor neuron commands to the muscles. For example, weakness of the laryngeal muscles will prevent firm closure of the vocal folds during each cycle of phonation, producing the breathy voice noted earlier. Similarly, weakness of muscles of the velopharyngeal port will result in hypernasality; weakness of the jaw, tongue, and lips will cause imprecise consonants.

Figure 14–2.  Sagittal (from the side) view of the inner wall of the right hemisphere. The brainstem is shaded blue; yellow-orangish arrows directed away from the pons and medulla represent cranial nerves that control the muscles of the speech mechanism (muscles of the jaw, lips, tongue, soft palate, pharynx, and larynx). An arrow below the brainstem represents spinal nerves that control muscles of the limbs and torso. Corpus callosum, in red, is shown as a landmark.

Flaccid dysarthria is the result of lower motor neuron disease. “Lower motor neuron” refers to the motor nuclei in the brainstem and the cranial nerves that carry information from these nuclei to muscles. The term “lower motor neuron” contrasts with “upper motor neuron,” which is discussed later in the section “Spastic Dysarthria.”

Spastic Dysarthria

Imprecise consonants, monopitch, and reduced stress were the three most impaired perceptual dimensions in spastic dysarthria. Many of these patients also had a slow speaking rate and a strained-strangled voice quality. The extremely slow speaking rates of patients with spastic dysarthria distinguished them from patients with flaccid dysarthria (most patients with flaccid dysarthria had normal speaking rates), and the strangled, harsh voice quality was very different from the breathy voice quality of flaccid dysarthria.

Patients with spastic dysarthria typically had damage somewhere along the fiber tracts connecting motor cells in the cortex to motor neuron cells in the brainstem. Recall from Chapter 2 that the cortex consists of massive amounts of gray matter — clusters of neuron cell bodies. These cell bodies are connected to other groups of cell bodies in the brain by white matter, formed from bundles of axons (fiber tracts). In Figure 14–3, cortical motor cells for muscles of the speech mechanism such as the jaw, lips, tongue, velum, and pharynx are represented by the two upper, brown dots. The two brown dots located in the brainstem represent the motor neurons for these muscles. The yellow-orange arrows originating in the cortical cells and ending on these brainstem motor neurons represent a fiber tract (white matter) called the corticobulbar tract. Part of this fiber tract terminates in motor nuclei of the pons, another part of the tract terminates in motor nuclei of the medulla.3 Patients with spastic dysarthria studied by DAB had lesions on both sides of the brain, that is, in the left and right corticobulbar tracts.

Figure 14–3.  Sagittal (from the side) view of the inner wall of the right hemisphere. Two brown circles in the cortex indicate motor cells controlling articulators such as the jaw and tongue, and two brown circles in the pons and medulla indicate motor neurons that are connected directly via cranial nerves to muscles of the speech mechanism. The yellow-orangish lines connecting the cortical motor neurons and brainstem motor neurons represent the corticobulbar tract.

3 Part of the corticobulbar tract also terminates in the midbrain, where motor nuclei are located but are not associated with speech.

The corticobulbar tract transmits motor commands from the cortex to cells in the brainstem. Because the fiber tract is above the motor neurons in the brainstem, damage to it is referred to as upper motor neuron disease. According to DAB, damage to this fiber tract — but not to the motor neuron cells in the brainstem — results in spastic dysarthria.4

Upper motor neuron disease usually results in very specific changes to the affected muscles, such as hypertonic (excessive) tone, which makes them stiff. The muscles also have overly sensitive reflexes (hyperreflexia), causing them to contract with unusual force when stretched even a small amount. Hypertonic muscles have difficulty causing movement of structures (such as the jaw), and hyperreflexive muscles result in unstable muscle contraction and movement. Thus, movement is impaired in many muscles of the speech mechanism, such as muscles of the tongue, lips, soft palate, larynx, and muscles that control jaw opening and closing.

DAB believed these excessively contracted, stiff muscles were responsible for many of the primary speech symptoms of spastic dysarthria. For example, the “monopitch” characteristic of spastic dysarthria was caused by excessive stiffness of the laryngeal muscle responsible for voice pitch changes. Similarly, the “strained-strangled” voice quality was caused by excess tone of laryngeal muscles, resulting in overly tight vocal fold closure during phonation. Imprecise consonants were the result of difficulty in moving structures, such as the tongue, into the proper positions for the articulation of speech sounds. The perception of “weak stress” was due to the patient’s inability to produce stress distinctions between unstressed and stressed syllables, as in words like “about” in which the first syllable is unstressed. This inability was thought to be related to the difficulty of adjusting the aspects of speech production (e.g., pitch, loudness, and duration) that are used to create stress differences.

Ataxic Dysarthria

The primary perceptual features of ataxic dysarthria were imprecise consonants, excess and equal stress, and irregular articulatory breakdown. In English speech production, syllables alternate between ones with relatively long duration and ones with relatively short duration. For example, in the sentence, “The party is off the hook,” there are two relatively long syllables (“par” and “hook”) and five relatively short syllables (the two “the’s”; the “ee” in “party,” “is,” and “off”). Long syllables are typically stressed and short syllables unstressed. Speakers with ataxic dysarthria tend to equalize all syllable durations in an utterance by making each syllable long and fairly loud. This is what is meant by “excess and equal stress.” “Irregular articulatory breakdown” is a term coined by DAB to capture the fluctuating nature of the speech problem in ataxic dysarthria. Speakers with ataxic dysarthria may sound fairly normal for short stretches of speech and suddenly produce a clearly dysarthric string of syllables.

Listeners often describe ataxic dysarthria as drunk-sounding speech. This impression is due partially to the excess and equal stress on the speaker’s syllables, which sounds to listeners as if each syllable is being “metered out” on a strict time schedule rather than following the normal speech rhythm of alternating long and short syllables. Speech-language pathologists refer to this perceptual impression as “scanning speech” (each syllable being “scanned” carefully and then produced as if disconnected from the next syllable). The impression of drunken-sounding speech is probably also promoted by the tendency of speakers with ataxic dysarthria to produce a speech melody (intonation) with markedly exaggerated pitch changes, giving their speech an “out of control” quality.

Ataxic dysarthria results from damage to the cerebellum or its connecting fiber tracts. In Figure 14–4, cell bodies within the cerebellum are represented by brown dots, and the connecting fiber tracts are shown as yellow-orange lines ending in arrowheads. These arrowheads point away from the cerebellum to other parts of the brain or are directed from other parts of the brain into the cerebellum.

Damage to the cerebellum causes a number of well-known, general neurological symptoms. For example, patients with cerebellar damage have difficulty maintaining a steady rhythm, even when asked to open and close their forefinger and thumb in a repetitive, simple way. Patients also have difficulty controlling the force of their muscle contractions, which results in actions performed with either excessive or insufficient force. It is as if the patient has difficulty scaling muscle contraction to the needs of the task. This may explain the tendency of people with ataxic dysarthria to produce sequences of syllables with excess and equal stress (see Box, “Spanish and Ataxic Dysarthria”).

4 Upper motor neuron disease also refers to damage to the corticospinal tract connecting cortex to motor neurons in the spinal cord. The damaged corticospinal tract and its possible relationship to dysarthria is not discussed further in this chapter.

Figure 14–4.  Sagittal (from the side) view of the inner wall of the right hemisphere. The arrows show the interconnections between the cerebellum and structures of the central nervous system. The arrows show that the cerebellum is connected to the spinal cord, the brainstem, and the cortex, sending and receiving information from all three major components of the CNS.

Spanish and Ataxic Dysarthria

Darley, Aronson, and Brown developed their classification system based on speech production of American English speakers with neurological disease. The reasoning for a prominent “excess and equal stress” dimension in ataxic dysarthria is solid, but only for languages (such as English) with syllable sequences that vary in a long-short-long-short pattern. Many languages have syllable sequences in which each syllable has roughly the same duration; syllables do not vary in a long-short-long-short pattern. Spanish is such a language, as are Finnish and several Asian languages (Korean, Mandarin Chinese, Japanese). Is “equal and excess stress” relevant to the dysarthria classification of a Spanish speaker with cerebellar disease? The same question can be asked of many of the speech dimensions that contribute to the Mayo classification system but do not apply in the same way across different languages. The answers to these questions are not known with any certainty; cross-linguistic studies of dysarthria are in their infancy (Kim & Choi, 2017; Liss, Utianski, & Lansford, 2013).

Hypokinetic Dysarthria

Monopitch, reduced stress, and monoloudness were the three prominent perceptual dimensions for patients with hypokinetic dysarthria. In addition, many speakers with Parkinson’s disease (PD) had imprecise consonants and “short rushes of speech,” a tendency for sudden, very rapid, and “mumbly” sequences of syllables. Speakers with hypokinetic dysarthria are also perceived as having an extremely weak, soft voice.

Hypokinetic dysarthria is associated almost exclusively with Parkinson’s disease.5 Parkinson’s disease involves cell death in a midbrain nucleus called the substantia nigra (see blue oval in Figure 14–5 for approximate location). Cells in the substantia nigra are of critical importance because they manufacture the chemical dopamine that is used as a neurotransmitter in parts of the brain responsible for movement control (as well as other parts of the brain involved in memory and the experience of pleasure). Dopamine manufactured in the substantia nigra is delivered to the basal ganglia, the group of cells above the brainstem but deep within the cerebral hemispheres. The dopamine pathway from the substantia nigra to the basal ganglia is indicated in Figure 14–5 by the upward-pointing arrows. The basal ganglia play an important role in movement control; to perform that role, they need an adequate supply of dopamine. Loss of dopamine due to cell death in the substantia nigra is responsible for movement problems in Parkinson’s disease.

The patient with Parkinson’s disease often has a resting tremor, usually in the hand but sometimes in other body structures. A resting tremor occurs when the hand is not moving; the tremor is not seen when the hand is moving to accomplish a goal such as twisting off a beer cap. Body structures such as the arms and legs, as well as structures of the speech mechanism, have a rigid quality — they are stiff and resist movement when displaced (as when an examiner pulls or pushes on the arm or leg). Movements, when produced, are slow and small. At times, the patient has difficulty initiating movement, as if he or she is “frozen” in place.

DAB thought that the top-ranked perceptual dimensions in hypokinetic dysarthria of monopitch, reduced stress, and monoloudness, as well as overall reduced loudness and imprecise consonants, could be explained by slow and small respiratory, laryngeal, and articulatory movements resulting from the loss of dopamine in the basal ganglia.

Figure 14–5.  Sagittal (from the side) view of the inner wall of the right hemisphere. The blue oval shows the location of the substantia nigra in the midbrain (mesencephalon) and the arrows the direction of dopamine delivery to the basal ganglia where it is used as a critical neurotransmitter in the control of movement.

5 Certain neurological diseases produce symptoms like those of Parkinson’s disease yet do not qualify for a specific diagnosis of the disease. Patients with these diseases are often referred to as having “parkinsonism.” Patients with parkinsonism may also have hypokinetic dysarthria.

Hyperkinetic Dysarthria

Hyperkinetic dysarthria is a result of several different diseases of the basal ganglia. DAB studied hyperkinetic dysarthria in two of these diseases, Huntington’s disease and dystonia, both described later in this chapter.

The basal ganglia are a complex group of cells, composed of several separate but interconnected nuclei. Figure 14–6 shows a midsagittal view of the right hemisphere in which an oval indicates the approximate location of the nuclei of the basal ganglia. The region outlined by the oval is deep within the cerebral hemispheres, above the brainstem (Chapter 2, Figures 2–6 and 2–7). Damage to any one of these nuclei may produce somewhat unique neurological symptoms, far too many to cover in this chapter. Here the general characteristics are presented for the two basal ganglia diseases studied by DAB.

Huntington’s disease is a genetic disorder in which movement difficulties are among the first symptoms. Later in the course of the disease, patients experience severe cognitive and psychiatric disturbances. The movement symptoms are dominated by chorea, in which a series of twitches, jerks, and sudden movements give the patients the appearance of being in constant motion. Muscle tone may be continuously variable in Huntington’s disease, sometimes being normal but often ranging from excessively stiff (hypertonic) to floppy (hypotonic) in a short period of time. The movement symptoms are sometimes described as having an ataxic component — inability to control the range and force of movements, and difficulty in producing a steady rhythm.

Dystonia is a basal ganglia disease in which muscle contraction builds up and is sustained with excessive force for unusually long intervals. Muscle contractions in dystonia are overly strong (hypercontraction) for the task at hand, often resulting in an unintended, sustained posture of the trunk, arm, hand, eyelids, or jaw. Hypercontraction in dystonia often occurs as an exaggeration of a purposeful movement to accomplish a task, rather than occurring at random times or at rest. For example, in oromandibular dystonia, a hypercontraction of the jaw muscles occurs when the patient begins to speak, or perhaps when he chews, but typically does not occur when the patient is not using the jaw for a specific purpose. Similarly, spasmodic dysphonia is a dystonia that affects the vocal folds by closing them forcefully when the patient attempts to produce voice (i.e., to phonate).

Figure 14–6.  Sagittal (from the side) view of the inner wall of the right hemisphere. The large blue oval encompasses the regions above the midbrain and below the cortex where the nuclei of the basal ganglia are located. The basal ganglia include several interconnected nuclei.

The forceful closing is the result of overly strong and sustained contractions produced by the muscles that close the vocal folds (see Chapter 10 and Chapter 18). This closing spasm of the vocal folds interrupts vocal fold vibration, essentially preventing the patient from producing voice. Like oromandibular dystonia, the laryngeal spasms occur when a person phonates, not at random times.

The most affected perceptual dimensions in these two forms of hyperkinetic dysarthria were imprecise consonants, prolonged intervals, and variable rate in patients with Huntington’s disease, and imprecise consonants, distorted vowels, and harsh voice quality in dystonia. Both groups of patients had irregular articulatory breakdowns, as in ataxic dysarthria.

The hyperkinetic dysarthria in Huntington’s disease and dystonia is thought to be the result of overly strong and sustained contractions of speech muscles, which make it difficult to move from one speech sound to the next. The sustained contractions in dystonia “hold” articulators in one position when they should be moving smoothly and quickly to the next speech sound. The constant and variable movements in Huntington’s disease are thought to prevent the muscles of the speech mechanism from making precise, consistent movements. This loss of control causes speech to sound inconsistent, with fluctuating voice and articulation characteristics.

Mixed Dysarthria

In each of the five dysarthria types summarized previously, the lesion causing the problem was thought to be in one major region of the brain. Flaccid dysarthria was a result of brainstem or cranial nerve damage (lower motor neuron disease), spastic dysarthria from damage in the cortex or the tract that carries information from the cortex to the brainstem motor nuclei (upper motor neuron disease), ataxic dysarthria from cerebellar disease, hypokinetic dysarthria from substantia nigra lesions (usually Parkinson’s disease), and hyperkinetic dysarthria from basal ganglia lesions (Huntington’s disease).

Some neurological diseases are known to have damage to two or more of these major brain regions. For example, patients with multiple sclerosis (MS) often have upper motor neuron and cerebellar lesions. For these patients, a dysarthria with both spastic (upper motor neuron disease) and ataxic (cerebellar) characteristics might be expected. DAB believed their perceptual analyses of speakers with MS were consistent with a mixed, spastic-ataxic dysarthria.

DAB regarded amyotrophic lateral sclerosis (ALS) as another neurological disease associated with a mixed dysarthria. In ALS, death of motor neurons in the brainstem (lower motor neuron disease) is combined with lesions in the fiber tract connecting cortical motor cells with brainstem motor neurons (upper motor neuron disease). In this case, the mixed dysarthria is of the flaccid-spastic type.

In theory, a mixed dysarthria may consist of any combination of the five major categories described previously. Whether or not patients with any combination of a mixed dysarthria (such as a flaccid-hyperkinetic dysarthria) have lesions consistent with these perceptual impressions is unknown. Surprisingly, there are no brain imaging studies linking dysarthria categories with documented site of lesion.

Unilateral Upper Motor Neuron Dysarthria

As described earlier, spastic dysarthria is thought to be the result of bilateral (both sides) lesions to the corticobulbar tract. Most of these lesions are the result of stroke, where a loss of blood flow to the fibers connecting cortical cells to brainstem motor neurons results in damage to or destruction of the fibers. There are cases, however, in which a stroke affects only one side of the brain (unilateral damage), leaving the corticobulbar tract on the other side intact and functional. In particular, some strokes may produce a loss of blood flow to a very small part of the corticobulbar tract, resulting in a typically mild and often transient motor speech disorder called unilateral upper motor neuron (UUMN) dysarthria.

UUMN dysarthria was not part of the original DAB classification system for motor speech disorders. As pointed out by Duffy (2015), UUMN dysarthria was probably left out of the classification system because the speech characteristics were so mild. In addition, the mild dysarthria often resolved over a short period of time following a stroke. The mild characteristics of UUMN dysarthria include imprecise consonants, irregular articulatory breakdowns, harsh voice, and slow speaking rate (Duffy, 2015). The impression of imprecise consonants dominates the speech of persons with UUMN dysarthria.

The Dysarthrias:  A Summary

Table 14–3 lists the categories of dysarthria in the Mayo Clinic classification system, along with the prominent perceptual dimensions heard by DAB for each type. Some of the perceptual dimensions were prominent for several dysarthria types (e.g., imprecise consonants) and some were uniquely prominent in only one type (e.g., distorted vowels). Table 14–3 also includes prominent perceptual characteristics of unilateral upper motor neuron dysarthria.

Table 14–3.  Mayo Clinic Classification of Dysarthria With Prominent Perceptual Impressions for Each Category

Classification: Prominent Perceptual Impressions

Flaccid: Breathy voice, imprecise consonants, hypernasality
Spastic: Imprecise consonants, monopitch, reduced stress, slow rate
Ataxic: Imprecise consonants, excess and equal stress, irregular articulatory breakdowns
Hypokinetic: Monopitch, reduced stress, breathy voice
Hyperkinetic (Huntington's disease): Imprecise consonants, prolonged intervals, variable rate
Hyperkinetic (dystonia): Imprecise consonants, distorted vowels, harsh voice
Mixed: Any combination of above, e.g., spastic-flaccid dysarthria (as in ALS), spastic-ataxic dysarthria (as in MS)
Unilateral upper motor neuron dysarthria: Imprecise consonants, slow speaking rate, harsh voice, irregular articulatory breakdowns (all mild)
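The diagnostic logic captured in Table 14–3 can be pictured as a matching problem: given the perceptual impressions a listener reports, which categories share the most prominent features? The Python sketch below is a toy illustration of that idea, not a clinical tool; the feature sets are copied from four rows of Table 14–3 (the hyperkinetic, mixed, and UUMN rows are omitted for brevity), and the overlap-counting rule is our own invented simplification, not part of the Mayo Clinic system.

# Toy matcher: rank dysarthria types by how many of a listener's
# impressions overlap with the prominent impressions in Table 14-3.
# The overlap count is an invented simplification, not a clinical method.

TABLE_14_3 = {
    "flaccid": {"breathy voice", "imprecise consonants", "hypernasality"},
    "spastic": {"imprecise consonants", "monopitch", "reduced stress",
                "slow rate"},
    "ataxic": {"imprecise consonants", "excess and equal stress",
               "irregular articulatory breakdowns"},
    "hypokinetic": {"monopitch", "reduced stress", "breathy voice"},
}

def rank_candidates(heard):
    """Return (dysarthria type, shared-impression count), best match first."""
    heard = set(heard)
    scores = {name: len(heard & feats) for name, feats in TABLE_14_3.items()}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)

for name, score in rank_candidates({"monopitch", "reduced stress",
                                    "breathy voice"}):
    print(name, score)  # "hypokinetic" ranks first for this impression set

Real Mayo Clinic ratings are scaled along each dimension rather than simply present or absent, and, as the box later in this section points out, listener agreement on these dimensions is only modest; the sketch shows only that the table encodes a mapping from heard impressions toward likely sites of nervous system damage.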

When the speech of a person with dysarthria is heard, listeners do not separate their perceptual impressions into the individual perceptual dimensions used by DAB to scale their prominence. Listeners hear a "whole" (integrated) percept. The point made by DAB, following analysis of all the perceptual dimensions and how they clustered differently in each of the dysarthrias listed in Figure 14–1 and Table 14–3, was that listening to the speech of a person with dysarthria provided a good clue to the location of damage within the nervous system. This was the localizing value of careful listening.

Some dysarthrias may have natural recovery, such as the improvement of speech during recovery from stroke. Some dysarthrias may become increasingly worse, as in degenerative neurological diseases such as multiple sclerosis, Parkinson's disease, and ALS. In most cases of dysarthria, SLPs are effective in finding treatment strategies to improve communication skills.

Identification of Dysarthria Type by Listening:  A Brief Natural History

In the mid- to late 1960s, when Darley, Aronson, and Brown were developing their classification system for dysarthria, the idea of identifying site of lesion (location of damage) by listening to a patient's speech was a big deal. Brain imaging techniques were in their infancy, making the perceptual expertise of SLPs a significant contribution to diagnosis of site of lesion. Two developments have lessened the role of SLPs in the medical diagnosis of site of lesion, and by extension, the disease in persons with dysarthria. First, in many cases, the contemporary use of highly sophisticated imaging techniques (CAT, MRI, positron-emission tomography [PET]) allows for detailed visualization of brain structures and accurate location of site of lesion. Second, identification of disease type from listening to speech, and the inference to site of lesion, are not very reliable. Several studies have reported a relatively low level of agreement among professionals for both individual perceptual dimensions and dysarthria type (see review in Bunton, Kent, Duffy, Rosenbek, & Kent, 2007). Nevertheless, the Mayo Clinic classification system is in contemporary use by both clinicians and scientists.

Apraxia of Speech

The other major subcategory of motor speech disorders is AAS. AAS is a controversial disorder, mostly because professionals have not agreed on its defining characteristics (Ballard et al., 2016). AAS has been related to lesions in various parts of the central nervous system, most often in the left cerebral hemisphere (Graff-Radford et al., 2013). It is possible that there are several types of AAS, but in this chapter the possible subtypes are not discussed.

In their original description of the Mayo Clinic classification system, DAB claimed that AAS was different from dysarthria. Unlike dysarthria, the speech characteristics of AAS were not the result of paralysis, weakness, or incoordination of the speech muscles. Despite the absence of muscle problems in AAS, the patients had articulatory errors as well as other types of speech abnormalities (Darley, Aronson, & Brown, 1975). These speech abnormalities typically appeared following a stroke or surgery affecting the left hemisphere of the brain. Because muscle problems did not seem to explain AAS, the speech problems had to be explained on a different basis.

The perceptual impressions of apraxia of speech included very slow speaking rate (see earlier, description of spastic dysarthria), a tendency to produce speech as if the component syllables were "pulled apart" from each other (see earlier, description of ataxic dysarthria), and imprecise consonants and vowels. In addition, when asked to produce a word or sentence, people diagnosed with AAS appeared to search for the right articulatory configuration to begin the utterance, as if they were unsure of the correct way to produce the initial speech sound(s). Patients configured their lips and tongue in a certain position, hesitated, and tried another configuration as if searching through several attempts to "get it right" before beginning the utterance. DAB described this searching as articulatory groping. The patients also had greater articulatory difficulty when trying to produce a multisyllabic word (such as "statistical") as compared with the single syllable at the beginning of the word (e.g., "statistical" versus "stat"). The multisyllabic word was more likely to elicit sound errors and articulatory groping (word length effect).

DAB proposed AAS as a motor speech programming problem. In this kind of disorder, the speech problems are not the result of deficits in the direct control of the speech muscles but rather in the plan for their control. Evidence in studies of speech motor control supports the separation of neural planning (programming) processes from execution of an act (Maas & Mailend, 2012). A programming problem is something like a problem with computer code. A computer in which the hardware is in perfectly good shape does not perform its tasks correctly when the software code contains errors (see Box, "The Code: Can It Be Fixed?"). Apraxia of speech is also diagnosed in children, as discussed in Chapter 15.

The Code:  Can It Be Fixed?

The analogy of a well-functioning computer running defective software to the speech-motor planning problem in AAS is based on the meaning of the term "praxis." Praxis, a Greek word, means the process of producing a skill. DAB categorized the disorder as apraxia of speech to represent the presumed deficits in the process — that is, the plan for an articulatory sequence, just as lines of computer code are the process for the computer's action. Other kinds of apraxia occur in persons who have had strokes and are recovering. For example, given a toothbrush and asked to show how to use it, a patient may raise the toothbrush to his face and hesitate as if he is not sure how to continue the act of brushing teeth. Then, the patient may act but in the wrong way: he may use the toothbrush to perform a hair-brushing gesture.

How does a clinician go about fixing (or reducing) the programming problem in AAS? A lot of discussion surrounds this question (e.g., McNeil, Ballard, Duffy, & Wambaugh, 2016), but at the current time the evidence for effective clinical approaches is not convincing.

Chapter Summary

Motor speech disorders in adults are a group of speech disorders caused by damage to the central and/or peripheral nervous system.

Many neurological diseases are associated with a motor speech disorder.

Classification of motor speech disorders was formalized by Mayo Clinic clinician-scientists in the late 1960s and early 1970s; the classification system is widely used and accepted by speech-language pathologists and researchers.

The classification for motor speech disorders was based primarily on perceptual impressions of the patient's speech.

The classification system for motor speech disorders included a group of dysarthrias, associated with problems in the control of speech muscles, and a planning disorder in which muscle control was more or less intact, but the ability to plan (program) articulatory sequences was impaired.

The classification categories for dysarthria included flaccid, spastic, ataxic, hypokinetic, hyperkinetic, and mixed types.

The classification category for the planning disorder was apraxia of speech.

The classification is based on groups of perceptual impressions, called "perceptual dimensions," which were scaled from normal to most severe by the Mayo Clinic clinician-scientists as they listened to audiotape recordings of a paragraph-level passage.

The patients who read the passage included persons with known neurological diseases, including brainstem disease, damage to the corticobulbar and corticospinal tracts, damage to the cerebellum, damage to dopamine-producing cells in the midbrain, and damage to the basal ganglia.

Patients with multiple sclerosis (damage to both the corticobulbar/corticospinal tracts, and to the cerebellum) and amyotrophic lateral sclerosis (damage to both the corticobulbar/corticospinal tracts, and to the brainstem) were also studied.

The general theory of the Mayo Clinic classification system is based on the idea that by listening to a patient's speech, a trained clinician-scientist can make a likely estimate of the location of neurological damage, and by extension the patient's neurological disease.

References

Ballard, K. J., Azizi, L., Duffy, J. R., McNeil, M. R., Halaki, M., O'Dwyer, N., . . . Robin, D. A. (2016). A predictive model for diagnosing stroke-related apraxia of speech. Neuropsychologia, 81, 129–139.

Bunton, K., Kent, R. D., Duffy, J. R., Rosenbek, J. C., & Kent, J. F. (2007). Listener agreement for auditory-perceptual ratings of dysarthria. Journal of Speech and Hearing Research, 50, 1481–1495.

Darley, F. L., Aronson, A. E., & Brown, J. R. (1969). Differential diagnostic patterns of dysarthria. Journal of Speech and Hearing Research, 12, 246–269.

Darley, F., Aronson, A., & Brown, J. (1975). Motor speech disorders. Philadelphia, PA: Saunders.

Duffy, J. R. (2015). Motor speech disorders: Substrates, differential diagnosis, and management (4th ed.). St. Louis, MO: Mosby Elsevier.

Graff-Radford, J., Jones, D. T., Strand, E. A., Rabinstein, A. A., Duffy, J. R., & Josephs, K. A. (2013). The neuroanatomy of pure apraxia of speech. Brain and Language, 129, 43–46.

Joutsa, J., Horn, A., Hsu, J., & Fox, M. D. (2018). Localizing parkinsonism based on focal brain lesions. Brain, 141, 2445–2456.

Kim, Y., & Choi, Y. (2017). A cross-linguistic study of acoustic predictors of speech intelligibility in individuals with Parkinson's disease. Journal of Speech, Language, and Hearing Research, 60, 2506–2518.

Liss, J. M., Utianski, R., & Lansford, K. (2013). Cross-linguistic application of English-centric rhythm descriptors in motor speech disorders. Folia Phoniatrica et Logopaedica, 65, 3–19.

Maas, E., & Mailend, M.-L. (2012). Speech planning happens before speech execution: Online reaction time methods in the study of apraxia of speech. Journal of Speech, Language, and Hearing Research, 55, S1523–S1534.

McNeil, M. R., Ballard, K. J., Duffy, J. R., & Wambaugh, J. (2016). Apraxia of speech: Theory, assessment, differential diagnosis, and treatment: Past, present, and future. In P. H. H. M. van Lieshout, B. Maassen, & H. Terband (Eds.), Speech motor control in normal and disordered speech: Future developments in theory and methodology (pp. 195–221). Rockville, MD: ASHA Press.
15
Pediatric Speech Disorders I

Introduction

Many writers have discussed the history of how the field of Communication Sciences and Disorders has viewed speech sound disorders in children (see Bankson, Bernthal, & Flipsen, 2017). The history is interesting because the vast majority of children seen in speech and hearing clinics for delayed or different speech sound development do not present other symptoms that clearly point to the cause of the problem. The explanation for delayed or different speech sound development in these children has therefore been the cause of much speculation and debate; speculation and spirited debate always make for an interesting scientific and clinical history.

What kinds of conditions would clearly explain a developmental speech sound disorder? Three conditions are immediately suggested, including hearing impairment, structural (anatomical) problems with the speech mechanism, and neuromuscular problems associated with a known form of disease of the central and/or peripheral nervous system.

A significant hearing impairment has an effect on a child's speech sound development. The details of the hearing impairment may not account for the details of the speech sound problems, but in a general sense, the child with hearing impairment has a strong likelihood of delayed or different (as compared to the typically developing child) development of the speech sounds of her language.

Structural problems in the speech mechanism can also result in developmental articulation disorders. For example, a child born with a cleft palate may have problems closing the velopharyngeal port even after surgery is performed to close the palate and reattach the relevant muscles in the correct configuration. This child can be expected to have difficulty with obstruents, the speech sounds requiring a positive oral pressure for correct production. When the child attempts to produce these sounds, air leaks through the ineffective VP port and the resulting speech sounds are incorrect. The same child may try to compensate for this structural problem in a way that introduces yet another error into her developmental speech sound profile (see Chapter 19).

Finally, a child born with a neurological disease such as cerebral palsy, or who suffers some other form of brain insult (as a result of surgery, traumatic brain injury, and other diseases), may have difficulty moving the articulators, laryngeal muscles, and/or muscles of the respiratory system, any or all of which may contribute to delayed or different speech sound development.

This chapter presents information on two pediatric speech sound disorders. By definition, pediatric speech sound disorders arise in childhood. The present chapter considers two such disorders whose cause has yet to be identified. These disorders are called speech delay (SD) and childhood apraxia of speech (CAS).

Stuttering is also a developmental speech disorder of currently unknown origin.


There are well-founded suspicions, however, that stuttering is best classified as a developmental motor speech disorder — that a speech motor control problem is the underlying basis of stuttering even if it is not the only factor that determines stuttering behavior (Smith & Weber, 2017). We have chosen to present material on stuttering in a separate chapter (Chapter 17).

CAS is also considered a motor speech disorder by many clinicians and scientists, largely because some of the speech symptoms are said to resemble those in adults with known brain lesions who have been diagnosed with adult apraxia of speech (AAS) (Ad Hoc Committee on Childhood Apraxia of Speech, 2007). There is some preliminary evidence for central nervous system dysfunction in CAS (e.g., Fiori et al., 2016). This evidence, however, requires extensive, speculative inferences from the tentative results of a few brain imaging studies to the speech behavior in CAS. Some scientists have argued that current evidence does not support a brain basis for CAS (Liégeois & Morgan, 2012; Morgan & Webster, 2018). The evidence for a neural basis in CAS is also not nearly as strong as it is in stuttering, and there is much uncertainty and controversy about the diagnosis of CAS (Ad Hoc Committee on Childhood Apraxia of Speech, 2007). In the current chapter, CAS is considered a speech sound disorder of unknown origin.

Speech sound disorders with unknown causes constitute the majority of all childhood speech disorders. The term "speech delay" refers to speech sound development that lags typical development (see Chapter 13) without a clear explanation for the delay. "Speech delay" is not the only term used to designate this category of childhood speech sound disorders. Some authors use the terms "phonological delay" or "articulatory disorders" to refer to delayed mastery of speech sounds with unknown cause (e.g., Eecen, Eadie, Morgan, & Reilly, 2018). In this chapter, we use "speech delay" to represent all of these terms. Some comments are made, however, concerning potential implications of the difference between the terms "speech delay," "articulatory delay," and "phonological delay."

CAS refers to delayed and disrupted speech sound development that includes speech sound patterns, prosodic characteristics, and continued (throughout development) severity not seen in children with speech delay. A child may have speech characteristics that do not clearly suggest a diagnosis of either speech delay or CAS. Clinicians and scientists do not always agree on the specific speech characteristics that fit speech delay versus CAS. As described in the section "Childhood Apraxia of Speech," the disorder may be due to a deficit in planning speech sound sequences, in much the same way as hypothesized for the adult version of the disorder (AAS).

An understanding of pediatric speech sound disorders can benefit from a classification system. The classification system may include the cause of the disorder, known or presumed, the severity of the disorder, the natural history of the disorder (if and how it changes over time), and subtypes within a single named disorder (e.g., subtypes of motor speech disorders). A detailed classification system for pediatric speech sound disorders, supported by some data as well as reasonable speculation, has been published by Shriberg and colleagues (2010). Some of the ideas from this classification system are used in this chapter.

A final introductory point is the role of speech intelligibility in a speech disorder. A basic question in almost any speech disorder is, to what extent does it affect a speaker's intelligibility? How difficult is it to understand what the speaker is saying? This is a central concern in pediatric speech disorders, regardless of the underlying cause.

Speech Delay

Estimates of the prevalence of speech delay are as high as 15.6% in children aged 3 years. By age 6 years, many of these children "catch up" to typical development norms. Prevalence estimates of speech delay drop to about 4% at age 6 years, reflecting a positive outcome for many children who were diagnosed with speech delay at age 3 years. This leaves a significant number of children who enter first grade with speech characteristics that are noticeably delayed relative to expectations for their age. (See Vick, Campbell, Shriberg, Green, Truemper, Rusiewicz, & Moore [2014] for a review of the prevalence data and Flipsen [2016] for a summary of data showing a decrease in the prevalence of speech sound disorders throughout grade, middle, and high school.)

Speech sounds are mispronounced by typically developing children, who are partially or largely unintelligible at certain times during their speech development. The expectation for children who have typically developing sound systems is for a decreasing number of speech sound errors and increasing speech intelligibility as the child gets older. The child with speech delay also mispronounces individual speech sounds, but at ages when these sounds are mastered by the majority of children. Children with speech delay are therefore more unintelligible than they should be at a specific age.

For example, a 4-year-old child who is diagnosed with speech delay has more speech sound errors and lower speech intelligibility than a typically developing child of the same age. When a significant degree of unintelligible speech seems to be unusual for a child's age, parents or teachers may refer the child to an SLP for formal evaluation. A typical age of referral is between 4 and 4½ years (Shriberg & Kwiatkowski, 1994). Of course, many children with speech delay correct their speech sound errors over time, without therapy, and become fully intelligible.

Very generally, speech delay can be defined as a childhood disorder in which speech sound errors and phonological processes reflect immaturity of speech development for a child of a given age. This impression of age-inappropriate speech skills by parents or teachers is not very precise, especially in the absence of an obvious cause for the delay, as well as the significant variability in speech development among typically developing children. What seems to be excessively unintelligible speech for a typically developing 5-year-old may reflect nothing more than a different path to full intelligibility; in a year or two, the child may have no speech sound errors. This is one reason for a formal evaluation of a child's speech when a delay is suspected. A decision to obtain a formal evaluation may be prompted by parents, or by a teacher who has the impression that a child is not learning speech sounds in a typical way.

Diagnosis of Speech Delay

A speech-language pathologist is most likely to initiate the evaluation by conducting a standardized articulation test. Several formal articulation tests are available to speech-language pathologists. Most such tests are based on the speech sound production of a large sample of children whose ages range from 2 years 0 months (hereafter, 2;0) to as high as nearly 22 years. These data are called "norms," because they reflect typical development as defined by the test. Most often, the norms are in the form of ages at which a high percentage of tested children produce a sound correctly. For example, one often-used test of articulation reports the average age of mastery for /s/ in the word-initial position as 5 years (Goldman & Fristoe, 2015). "Average age" means that a criterion percentage (e.g., 85%) of all children tested produced the sound correctly. Some children master /s/ by age 3 years; others may not master it until age 8 years. But a percentage criterion is a useful way to answer the question, "At what age does the typically developing child master the /s/ sound?" The same question can be asked and answered about any speech sound or combination of speech sounds. Slightly different norms are given for males and females to recognize the typically faster mastery of the sound system by female children as compared with male children.

The tested speech sounds are classified in one of four ways: correct, substitution, omission, or distortion. A correct speech sound is self-explanatory (e.g., a "g" in the word "dog"). A substitution is the replacement of the correct sound with another phoneme (e.g., a "d" for "g" substitution, resulting in "dod"). An omission is the absence of a sound that is present in the correctly produced word (e.g., an omitted "g" in "dog," giving "da" or "daw," depending on dialect). Finally, a distortion is a speech sound having the characteristics of the target but produced unclearly, like a poor version of the sound (e.g., a stop consonant like a "g" produced slightly in front of the place of articulation for "g").

The results of formal articulation tests are compared to the norms; the scores for a child are totaled across all tested sounds. This total score is expressed as a deviation from the average total score for the typically developing norms. For example, a 4-year-old child evaluated for speech delay may have a total sound production score of 85, which is compared to the total score norms for typically developing 4-year-old children. In this example, we assume the 4-year-old norm is 100. Depending on the score criterion recommended in a specific test of articulation, the total score of 85 may deviate sufficiently from the norm to merit a diagnosis of speech delay.

Another approach to estimating a child's articulatory mastery is to compute a measure of percentage of consonants correct (PCC). Imagine a child engaged in a conversation, a recording of which is made for analysis. The recording of the child's speech (the "speech sample") is transcribed using the IPA (Chapter 12) to obtain a count of all consonants within the sample, including correctly and incorrectly produced consonants. The correctly produced consonants are expressed as a percentage of the total number of consonants in the sample. The PCC measure was originally described by Shriberg and Kwiatkowski (1982) and has been updated and refined several times (Shriberg, Austin, Lewis, McSweeny, & Wilson, 1997).

At 5 years of age, typically developing children have PCC scores of 90% to 95%. PCC varies with several factors, including age and type of speech material used to extract the measure. As described earlier, the measure as originally developed was taken from conversational speech samples, but it has also been used in the analysis of single words (e.g., Fabiano-Smith & Hoffman, 2018).
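Because PCC reduces to a single ratio, its arithmetic can be stated exactly. The sketch below is a minimal illustration rather than a clinical implementation: it assumes the transcription work has already been done, so that every consonant token in the sample carries a correct/incorrect judgment, and the function name and example numbers are hypothetical.

def percentage_consonants_correct(judgments):
    """PCC = 100 * (correct consonant tokens) / (total consonant tokens).

    `judgments` holds one boolean per consonant token in the transcribed
    speech sample (True = the token was produced correctly).
    """
    if not judgments:
        raise ValueError("sample contains no consonant tokens")
    return 100.0 * sum(judgments) / len(judgments)

# Hypothetical conversational sample: 50 consonant tokens, 41 correct.
sample = [True] * 41 + [False] * 9
print(f"PCC = {percentage_consonants_correct(sample):.1f}%")  # 82.0%

The arithmetic is deliberately trivial; the substance of the measure lies upstream, in the narrow IPA transcription that decides whether each consonant token counts as correct.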

PCC can also be used to diagnose speech delay. Much like the score from a standardized articulation test, there is no firm PCC cutoff to separate children with typically developing speech sound mastery from those with speech delay. The clinician who uses the PCC score to confirm a diagnosis of speech delay must adopt a criterion percentage to make the decision. For example, original data reported by Shriberg and Kwiatkowski suggested a cutoff PCC of around 85%. Children with a PCC score below 85% were candidates for a diagnosis of speech delay.

Quantitative Measures of Speech Delay and Speech Intelligibility

Articulation tests such as the Goldman-Fristoe Test of Articulation (GFTA) and PCC are quantitative because they use numbers to estimate the goodness of articulatory skills. This is in contrast to qualitative estimates of a child's speech such as "excellent," "good," "poor," and so forth. Earlier in the chapter, speech intelligibility was cited as a primary issue for children with speech delay. When quantitative measures of articulation (standardized articulation test or PCC) and speech intelligibility (e.g., the number of words heard correctly) are available for the same children, is there a relationship between them that allows a reliable estimate of speech intelligibility from the articulation score? This is a basic statistical question: can y (speech intelligibility score) be predicted from x (articulation score)?

Among children diagnosed with speech delay, there is, at best, a modest correlation between PCC and a measure of speech intelligibility (Shriberg & Kwiatkowski, 1982). More recent studies of children with speech sound disorders fail to show a convincing relationship between PCC and speech intelligibility: "Improvements in severity (as measured by PCC) were noted in some of the children, but these improvements did not translate into improvements in intelligibility" (Lousada, Jesus, Hall & Joffe, 2014, p. 593). Knowing the PCC for an individual child may be a weak predictor of speech intelligibility (a numerical sketch of this prediction question appears below, just before the box "It Just Doesn't Add Up").

Speech Delay and Individual Speech Sounds

Children diagnosed with speech delay often have the most pronounced delay for consonants mastered late in the course of typical speech sound development (Shriberg et al., 1997). In children with speech delay, the "late eight" /s/, /z/, /r/, /l/, /θ/ ("thin"), /ð/ ("those"), /ʃ/ ("shine"), and /tʃ/ ("chop") may show more delay than sounds mastered early in development such as /b/, /d/, /g/, /p/, /t/, /k/, /m/, /w/, and /n/. Speech delay for the late eight, being more likely and lasting a longer time than speech delay for early mastered sounds, may have a disproportionate effect on speech intelligibility.¹
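The prediction question flagged in the preceding section (can y, an intelligibility score, be estimated from x, a PCC score?) can be made concrete with a small numerical sketch. The nine (PCC, intelligibility) pairs below are invented for illustration, and the computation is ordinary Pearson correlation and simple linear regression from Python's standard statistics module (version 3.10 or later), not a method taken from the studies cited in this section.

import statistics

# Invented (PCC %, intelligibility %) pairs for nine hypothetical children.
pcc = [62, 68, 71, 75, 80, 83, 87, 90, 94]
intelligibility = [55, 70, 58, 78, 66, 85, 72, 90, 80]

r = statistics.correlation(pcc, intelligibility)
slope, intercept = statistics.linear_regression(pcc, intelligibility)

print(f"Pearson r = {r:.2f}")
print(f"Predicted intelligibility at PCC 85: {slope * 85 + intercept:.0f}%")

Even when r is clearly positive, individual children sit well off the fitted line, so a prediction for any one child carries wide uncertainty; that scatter is the quantitative face of the Lousada et al. (2014) finding that severity gains measured by PCC need not translate into intelligibility gains.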

It Just Doesn't Add Up

Speech intelligibility tests were developed many years ago, in the dark ages of landlines, to evaluate the quality of telephone transmission. The idea was to have tests that "added up" the quality of each speech sound in a list of words to obtain a percentage of the words recognized by a crew of listeners (Weismer, 2008). This seems to make sense, but as it turns out it does not work very well when speech intelligibility tests are used to estimate the severity of a person's speech disorder.

Speech intelligibility tests are used routinely to estimate the severity of a speech problem in persons with cleft palate, hearing impairment, and motor speech disorders (dysarthria and apraxia of speech), as well as in children with speech sound disorders. Study after study has shown that counts of "phonemes correct" do not match overall intelligibility scores (reviewed in Weismer, 2008). This is consistent with the more recent findings of Lousada et al. (2014) for children, and with results of a study conducted by Ertmer (2010) on the speech intelligibility of children with hearing loss. Why doesn't it add up? It is not yet entirely clear, but a good guess is that the connections between sounds — how you get from one sound to another within a word — are just as important as the quality of individual sounds. In addition, speech intelligibility is not simply the "sum" of individual sounds, but as discussed in Chapter 11 makes use of top-down processes in which words are often identified before all the sounds have been analyzed.
¹ If speech delay were the same for all speech sounds, it might seem reasonable to expect each sound to make an equal contribution to speech intelligibility problems. This hypothesis is not correct, however, because the frequency of occurrence varies across the speech sounds of a language. For example, /t/, /d/, and /n/ are frequently occurring sounds in American English, whereas /θ/, /ʃ/, and /h/ occur relatively infrequently (Mines, Hanson, & Shoup, 1978). /θ/ and /ʃ/, two of the infrequently occurring sounds in this example, are also among the late-eight sounds. Perhaps the late-eight difficulties are not so important to speech intelligibility problems in children with speech delay? Of course, it is never that simple. /s/, a late-eight sound, is among the most frequently occurring speech sounds.

A therapy plan for a child with speech delay may include focused work on sounds thought to have a large effect on speech intelligibility.

Speech Delay:  Phonetic, Phonological, or Both?

Earlier, it was noted that some professionals use the diagnostic term "phonological disorders" for developmental speech sound disorders with no known cause. The view of speech sound errors as phonetic (articulatory) or phonological may have important implications for clinical practice.

A child diagnosed with multiple speech sound errors may be treated with articulatory practice (also called traditional articulation therapy; see Hegarty, Titterington, McLeod, & Taggart, 2018) as part of his speech therapy. A central component of this approach is repetition of each error sound as a way to establish and refine the speech motor control required for its correct production. An assumption of this approach is that extensive movement and placement practice is likely to result in mastery of an incorrectly produced speech sound (Powell, Elbert, Miccio, Strike-Roussos, & Brasseur, 1998; Lousada et al., 2013). The effect of articulatory practice, in this view, is similar to the expected effect of practice on any skill (e.g., throwing a football; keyboarding).²

When speech sound errors are considered phonological, the emphasis in therapy is on the sound system, rather than on individual sounds and phonetic practice (Brumbaugh & Smit, 2013). Children who receive phonological therapy may be trained to recognize and produce minimal pair contrasts. A minimal pair is formed by words that differ by a single feature, such as consonant voicing, place of articulation, and manner of articulation. Examples of minimal pairs are "pack-back" (word-initial voicing), "pack-tack" (word-initial place of articulation), and "sack-tack" (word-initial manner of articulation). Minimal pair therapy addresses the component of the sound system that makes phoneme contrasts — the sounds that change the word meaning when exchanged in the same word position. Treatment of phonological processes that have not disappeared from the child's sound system is another example of phonological therapy. For example, when a child deletes consonants from the word-final position, the specific "missing" sounds are not treated one by one. Rather, the child is exposed to groups of CVC words in which the word-final C varies across several consonant types (e.g., /t/, /g/, /s/, /n/). The object of the therapy is the child's mastery of the CVC word form, which includes many different word-final consonants. The assumption is that exposure to the CVC words generalizes across consonant types, eliminating the phonological process of final consonant deletion in these word forms.

Residual and Persistent Speech Sound Errors

In the majority of typically developing children, mastery of the speech sound system is complete around 8 years of age. A small number of children have speech sound errors that extend past this age and into the teenage years, and possibly into and beyond young adulthood. The terms residual speech sound errors and persistent speech sound errors have been used to describe articulatory errors lasting past the age of complete speech sound mastery.

As pointed out by Flipsen (2016), clinicians and researchers have tended to use the terms "residual" and "persistent" speech sound errors interchangeably to classify children who have these long-lasting errors. Although the terms are now thought to classify children with partially different histories of speech sound errors, the two groups share a characteristic — the speech sounds in error are transcribed as distortions, rather than substitutions or omissions.

As discussed earlier, a distortion is recognizable as the intended but poorly produced speech sound. For example, the word-initial [s] in [sɪn] "sin" is produced with the tongue too far forward in the mouth (like a lisp) or in a "slushy" way. The perceived word "sin" is recognized as such, not as "thin" or "shin." On the other hand, a substitution is heard as a different sound than the one intended, as when the speaker says "shin" for the intended "sin" (a /ʃ/ for /s/ substitution). An omission is the absence of a sound in the intended word. For example, omission of the /l/ in "slow" results in a production transcribed as [soʊ], and omission of the /ŋ/ in "sing" is spoken as [sɪ] (possibly with a nasalized /ɪ/, hence [sɪ̃]).
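The correct/substitution/omission taxonomy used throughout this section can be made concrete by aligning a target transcription with what the child produced. The sketch below is illustrative only: it assumes the two transcriptions have already been aligned segment by segment (with a None marking an omitted segment), and it cannot detect distortions, which require narrow, diacritic-marked transcription rather than a plain symbol comparison.

def classify_segments(target, produced):
    """Label each target segment as correct, a substitution, or an omission.

    `target` and `produced` are aligned lists of IPA symbols; None in
    `produced` marks a segment the child left out. Distortions are not
    detectable here because they live in narrow transcription detail.
    """
    labels = []
    for t, p in zip(target, produced):
        if p is None:
            labels.append((t, "omission"))
        elif p == t:
            labels.append((t, "correct"))
        else:
            labels.append((t, f"substitution ({p} for {t})"))
    return labels

# "slow" /sloʊ/ with the /l/ omitted -> [soʊ]
print(classify_segments(["s", "l", "oʊ"], ["s", None, "oʊ"]))
# "sin" /sɪn/ spoken as "shin": a /ʃ/ for /s/ substitution
print(classify_segments(["s", "ɪ", "n"], ["ʃ", "ɪ", "n"]))

Counting the "correct" labels over all consonant tokens in a sample is exactly what a PCC computation does; the hard clinical work lies in producing the transcription and alignment that this sketch takes for granted.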

² There are cognitive contributions to establishing and refining any motor skill. The analogy between articulatory practice and (for example) throwing a football is not meant to eliminate cognitive skill, such as knowing the relationship between articulatory movement and placement with acoustic results, or between arm motion and grip on the football with the location of the ball when it is thrown.

If speakers with residual errors are different from speakers with persistent errors even though they both produce distortions of speech sounds, how are they distinguished?

Residual errors are usually distortions of /r/ and /s/, with [w]-like errors for /r/ and [θ]-like (or "slushy" [s]-like) errors for /s/. The "-like" part of these error patterns is important: casual listening to a distorted /r/ may at first seem like a [w] for /r/ substitution, but closer listening reveals a [w]-like sound with rhotic (/r/-like) qualities. The /r/ distortion often sounds as if it is between an /r/ and a /w/. /s/ distortions have the same "in-between" quality.

Many clinicians and researchers believe that /r/ and /s/ residual errors reflect an incomplete process of speech motor learning and are usually unresolved articulatory imperfections from a previously diagnosed and treated speech delay (Flipsen, 2016). The diagnosis of speech delay earlier in the child's developmental history was made because of multiple speech sound errors, not just /r/ and /s/ errors.

Persistent errors are thought to be /r/ and /s/ distortions that were present earlier in a child's speech sound development but were not treated, because the remainder of the speech sounds were learned in a typical way. In other words, the child was not diagnosed with speech delay because the errors were limited to only one or two speech sounds, both typically mastered late in the speech-sound learning process. There is, however, some evidence that children who are thought to have persistent errors may have other speech sound errors in addition to /r/ and/or /s/. This may also distinguish these children from those who are described as having residual errors.

Children with residual or persistent errors may recover spontaneously to produce distortion-free /r/ and /s/. However, up to 25% of children who have residual or persistent errors around 9 years of age may not recover spontaneously and may require services to correct these errors. The distinction between residual and persistent speech sound errors may be important when therapy is undertaken to eliminate distortions of these sounds.

Another potential explanation for residual and persistent speech sound errors is that a child has subtle perceptual problems specific to /r/-/w/ and /s/-/θ/ (or /s/-/ʃ/) distinctions. In a recent study, children aged 9 to 14 years with residual /r/ errors were able to hear the difference between /r/ and /w/ with the same proficiency as children without residual errors (Preston, Irwin, & Turcios, 2015). Other studies, reviewed by Preston et al., have found some evidence of subtle perceptual problems in children with speech sound disorders.

The possibility of perceptual difficulties in children with residual errors is relevant to treatment options. If residual /r/ and /s/ errors have a basis in poor perception of phoneme-specific contrasts (e.g., the /r/-/w/ contrast), "ear training" makes sense as part of a therapy program to correct the production errors. Ear training — listening to many examples of /r/-/w/ pairs — may establish a better representation of the sounds as different phoneme categories. Once the categories are established by improvements in perceptual skill for specific contrasts, the child uses the categories to produce the sounds correctly.

Additional Considerations in Speech Delay and Residual and Persistent Speech Sound Errors

Speech sound disorders, including both speech delay and residual errors, have been associated with causes and effects other than those presented. For example, some authors (review in Eaton, 2015) have suggested that subtle deficits in cognitive abilities may be a factor in residual errors. "Cognitive abilities" include the ability to process, represent, store, and retrieve information and to focus on relevant stimuli and exclude irrelevant stimuli. Of special interest to the presence of residual speech sound errors, these cognitive abilities must be employed to compare speech sound output, such as a residual error, to stored representations of the sound. This is called self-monitoring. The success of self-monitoring depends on the ability to focus on the speech output, to use memory to access the representation of the sounds in the brain, and to make a proper comparison between the output and the stored representation. In theory this makes sense, but research to date has not produced clear results on deficiencies in cognitive abilities in children with residual errors, or on the effectiveness of therapy techniques to correct residual errors based on training of self-monitoring skills. Still, cognitive skills as a partial explanation of residual errors and as a potential target in therapy deserve further research effort.

Children with residual errors may also experience challenges in social settings and academic performance. Hitchcock, Harel, and Byun (2015) conducted a survey filled out by parents whose children had /r/ errors. The survey focused on the effect of the /r/ errors on the child's social and academic life. The ratings suggested that the greatest impact of the residual errors was on social interactions, especially for children over 8 years of age. The findings of this survey are consistent with research suggesting that children with residual errors are judged more negatively than children with typically developing articulation (Crowe Hall, 1991).

In another survey, adults who as children had a history of speech sound disorders received lower grades in high school and had fewer years of post–high school education as compared to children who had typically developing speech sounds (Felsenfeld, Broen, & McGue, 1994). Children diagnosed with speech delay as they enter grade school may also have delayed literacy skills (literacy skills include reading, writing, and spelling) relative to children with typically developing speech sound skills (Haylou-Thomas, Carroll, Leavett, Hulme, & Snowling, 2017). Delayed literacy skills can have profound effects on academic success.

These studies, taken together, suggest a connection between childhood speech sound disorders and the quality of social and academic aspects of life. A clear cause-and-effect relationship cannot be established from these studies, and certainly not all children with speech sound disorders have, or as adults will have, social and academic problems. The trends in these studies, however, point to the potential value of speech therapy to correct speech sound disorders and minimize their effect on the quality of life.

Speech Delay and Genetics

Finally, there is interest in the possibility of a genetic basis for speech delay. A simplified explanation for this interest is the possibility of inheritance of a predisposition for speech delay. Researchers acknowledge the unlikely case of a single gene "explaining" speech delay. Rather, a larger group of genes under the influence of variables in the environment is thought to contribute to speech and language development (Lewis, Shriberg, Freebairn, Hansen, Stein, Taylor, & Iyengar, 2006; Peterson, McGrath, Smith, & Pennington, 2007); a disruption in these speech/language genes may result in speech delay. Variation in the environment (e.g., extensive language stimulation versus minimal language stimulation) almost certainly modifies if, and how, the speech/language genes affect speech sound development. An interaction between multiple genes with an environmental basis for speech/language development makes the identification of specific speech-language genes very challenging.

Given the complexity of human genetic material and its interaction with environmental factors, why do researchers explore a genetic predisposition for speech delay? As reviewed by Felsenfeld (2002), children with speech sound disorders (including speech delay) are more likely to have family members with a history of speech sound disorders when compared to children whose speech sound development is typical. The reasoning is that the likelihood of children with speech delay having family members with speech sound disorders should be the same as for children with typically developing speech sounds, if there were no hereditable (genetic) susceptibility to speech sound disorders.

This line of reasoning is complicated by the effect of environment on speech sound development. As previously noted, the familial pattern of speech sound disorders may reflect a language stimulation environment that is similar in multiple generations of a family. Perhaps parents in a family do not direct a good deal of spoken language to their babies, much in the same way that the parents' parents did with them. Across generations, this style of language stimulation may be an environmental influence on speech-language genes that results in a high probability of speech sound disorders.

Evidence in support of a hereditable component for speech sound disorders has also been gathered from studies in which the tendency for speech sound disorders among adopted children was tied to a history of speech sound disorders in a biological parent, as compared with an adoptive parent (Felsenfeld & Plomin, 1997). This finding is consistent with a genetic component in childhood speech sound disorders but does not rule out an environmental effect as well.

Childhood Apraxia of Speech

CAS is a diagnostic term for children with a speech sound disorder that shares some characteristics with speech delay and has unique characteristics as well. "Praxis" is a Greek word meaning "doing," or "action." "Apraxia" is "not doing." Most patients diagnosed with CAS are able to "do" speech behaviors, albeit with difficulty. This is why some clinicians and researchers prefer the term "dyspraxia" to categorize the disorder. In this chapter we use the acronym CAS to designate childhood apraxia of speech.

Apraxia is a diagnostic term that has been in use for many years to describe an inability to perform certain motor behaviors in the absence of muscle weakness or other muscle problems.

Praxis is defined as the ability to perform . . . skilled or learned movements. Apraxia refers to the inability to carry out . . . praxis movements in the absence of elementary motor, sensory, or coordination deficits that could serve as the primary cause. (Park, 2017, p. 317)

A nearly identical definition for apraxia is provided by Zadikoff and Lang (2005, p. 1480), who also point out that apraxia may coexist with "elementary" (e.g., weakness, loss of sensation) muscle disorders.

For example, a limb may be weak but capable of producing an appropriate gesture, even if slow and reduced in magnitude. A patient with apraxia and limb weakness is likely not to produce the appropriate gesture, even a slow and small one.

The 2007 American Speech-Language-Hearing Association (ASHA) Committee report on CAS proposed the following definition of the disorder:

Childhood apraxia of speech (CAS) is a neurological childhood (pediatric) speech sound disorder in which the precision and consistency of movements underlying speech are impaired in the absence of neuromuscular deficits (e.g., abnormal reflexes, abnormal tone). CAS may occur as a result of known neurological impairment, in association with complex neurobehavioral disorders of known or unknown origin, or as an idiopathic neurogenic speech sound disorder. The core impairment in planning and/or programming spatiotemporal parameters of movement sequences results in errors in speech sound production and prosody.

Apraxia as a sign of neurological disease has been recognized for many years. Long before the term "apraxia" was used to describe speech deficits, it referred to a neurologically based disorder of limb and orofacial movement in which patients are unable to produce voluntary actions that may nevertheless be produced spontaneously. The apraxic patient, asked to demonstrate how to brush his hair (either miming the action or asked to use a brush), may have difficulty initiating the proper gestures (or using an actual brush properly), make stop-and-go movements around his head that are not hair-brushing movements, and show a good deal of frustration with his inability to respond appropriately to the request. Similar difficulties may be observed when asking the patient to open and close the jaw, stick out his tongue, or purse his lips. These latter movement problems are called orofacial nonverbal apraxias, or simply oral apraxias. Another example of orofacial nonverbal apraxia is the patient who, when asked to "Show me how you whistle," hesitates before attempting to respond to the request, and once movement begins may have difficulty narrowing the lips and forming them into the shape required for a whistle. The patient seems to "grope" for the required lip configuration and makes several attempts to get it right.

Interestingly, the diagnosis of oral nonverbal apraxia does not mean the patient is also diagnosed with apraxia of speech, and vice versa: a patient diagnosed with apraxia of speech does not necessarily have oral nonverbal apraxia (Whiteside, Dyson, Cowell, & Varley, 2015).

In adults, oral nonverbal apraxia and/or apraxia of speech are almost always the result of a known brain lesion or known neurological disease. Studies such as that by Whiteside et al. (2015) included 50 patients who had suffered a stroke and had documented brain lesions. Adult stroke survivors are the most frequent participants in studies of apraxia of speech.

CAS Compared With Adult Apraxia of Speech (AAS)

AAS is covered in greater detail in Chapters 9 and 14, but a sketch of the speech problem is required to introduce the speech characteristics in CAS. The characteristics of AAS include hesitation in the initiation of speech; groping for articulatory postures, as if searching for the correct articulatory position and shape for a specific sound; production of multisyllabic words as if each syllable is "pulled apart," with a robotic-sounding effect due to equal stress on each syllable; vowels and consonants that have unusually long durations; and inconsistent sound errors such as substitutions and distortions. Many clinicians and scientists have commented on the inconsistency of these sound errors: when asked to repeat the same word many times, the sound errors are not the same on each repetition, and a sound error on one repetition may be produced correctly on the next repetition (Bislick, McNeil, Spencer, Yorkston, & Kendall, 2017).

Adults with apraxia of speech are also likely to make more sound errors and to show more initiation problems and articulatory groping as a sound sequence increases in complexity. For example, the three words "please," "pleasing," and "pleasingly" increase in complexity by virtue of the number of syllables in the words: "pleasing" is more complex than "please," and "pleasingly" is more complex than "pleasing." The key to understanding this phenomenon is to focus on errors made on, or increased hesitation before, the initial sounds (/p/ and /l/) of each word. Adults with apraxia of speech tend to make more word-initial errors with an increasing number of syllables following the word-initial syllable. Articulation of the word-initial /p/ and /l/ in "please-pleasing-pleasingly" is affected by the number of syllables following it, suggesting that the beginning of multisyllabic words is not produced independently of the middle and end of the word. Another example of a difference in phonetic complexity is the word pair "sit" and "split." The /spl/ consonant cluster is considered more complex than the singleton /s/; otherwise, the two words share the same phonetic segments (/s/ as the word-initial sound, /ɪ/ and /t/ as the final two sounds).

Adults with apraxia of speech typically have greater difficulty with the word-initial /s/ for words such as "split" as compared with "sit."

Selected speech characteristics in AAS are listed in Table 15–1 (a complete review of these characteristics is found in McNeil, Robin, & Schmidt, 2009). These characteristics are not observed in every patient diagnosed with apraxia of speech and may even appear and disappear in the speech of a single patient.

Table 15–1.  Selected Speech Characteristics of AAS

Characteristic: Brief Description

Initiation difficulties: Patient hesitates before the onset of speech, as if he cannot get started; he appears to "grope" for the correct articulatory configuration for the first sound.
Loss of stress contrasts*: Syllables in multisyllabic words have roughly equal duration and seem "pulled apart" from surrounding syllables, making speech sound robotic.
Vowels and consonants have long durations: Adults with apraxia of speech have slow speaking rates, which means that the durations of speech sounds are longer than normal.
Sound errors are substitutions and distortions: Speech sound errors are replacements of one sound with another (like phonemic errors) or poorly produced sounds (phonetic errors).
Sound errors are inconsistent: Multiple repetitions of one word by a patient do not always have sound errors that are the same. The sound errors are variable in type. A word such as "save" may have a substitution for /s/ in one repetition ("shave" for "save") and a distortion in another repetition (a sound between an "s" and "sh" for "save"). Another repetition may have a correct "s."
Increased hesitation, groping, and articulatory errors with increased phonetic complexity: Multisyllabic words "bring out" apraxic errors more than single-syllable words; more complex syllables ("spl" versus single-consonant "s") also bring out more apraxic errors.

*This speech characteristic in adult apraxia of speech is specific to languages such as English (and, e.g., Dutch, French, and Russian) in which there are "long" and "short" syllables within multisyllabic words. In the English word "elephant," the first syllable is stressed, the second and third are unstressed (the first syllable is longer than the following two syllables). Try saying "elephant" with equal duration for each syllable to get an idea of the "robotic speech" characteristic of adult apraxia of speech. There are languages (e.g., Spanish, Korean, Mandarin Chinese), however, in which successive syllables are not long or short but have roughly equal duration. The characteristics of apraxia of speech have not been defined well (or at all) in these languages.

Speech Motor Programs

One explanation for the phonetic complexity effect depends on the concept of a motor program. Motor programs are thought to be the organizational processes in the brain that prepare execution of an action. The speech motor program is assumed to contain plans for placement and configuration of the articulators as well as the timing of muscle contractions to produce these articulatory goals. A program is not the same as the action; the program (sometimes called the "plan") is the representation in the brain of the action and may exist without its execution. The idea of speech motor programs is controversial, and even the nature of what is programmed (phonemes? syllables? whole multisyllabic words?) is debated.

Even with this controversy, the concept of speech motor programs enjoys widespread acceptance among clinicians and scientists who study apraxia of speech. Motor programs are thought to take time to assemble, even if the scale of the time is in micro- or milliseconds (typical for brain-time processes, which are fast). The programming time is assumed to be longer for complex actions compared with simpler actions. For example, the motor program for the production of "pleasingly" takes more time to assemble than the programming for "please," as does the programming of "split" compared with "sit." One hypothesized proof of motor programming and the complexity effect is that reaction times to say the word "split" are greater than reaction times to say "sit." Similarly, reaction times are longer for "pleasingly" versus "please." Ballard, Tourville, and Robin (2014) provide a review of the programming hypothesis in adult apraxia of speech.
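The plan-versus-execution distinction, and the claim that more complex utterances take longer to program, can be sketched in code. In the toy model below, the "plan" is nothing more than an ordered list of syllables, and assembly time grows with the number of items to be sequenced; the linear growth rule and the millisecond constants are invented stand-ins chosen only to mimic the reaction-time pattern just described, not measured values.

# Toy model of motor-program assembly: the "plan" is an ordered list of
# syllables, and each additional item adds sequencing time. The 80 ms
# base and 40 ms increment are invented constants, not data.

def assembly_time_ms(plan, base=80, per_item=40):
    """Hypothetical programming time, growing with plan length."""
    return base + per_item * len(plan)

for word, plan in [("please", ["pliz"]),
                   ("pleasing", ["pli", "zɪŋ"]),
                   ("pleasingly", ["pli", "zɪŋ", "li"])]:
    print(f"{word}: {assembly_time_ms(plan)} ms to assemble")
# Longer plans yield longer simulated latencies, mirroring the complexity
# effect: "pleasingly" > "pleasing" > "please".

The plan in this model exists whether or not it is ever executed, which is the point of the programming account: a program can be intact, slow to assemble, or faulty independently of the muscles that would carry it out.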

takes more time to assemble than the programming for “please,” as does the programming of “split” compared with “sit.” One hypothesized proof of motor programming and the complexity effect is that reaction times to say the word “split” are greater than reaction times to say “sit.” Similarly, reaction times are longer for “pleasingly” versus “please.” Ballard, Tourville, and Robin (2014) provide a review of the programming hypothesis in adult apraxia of speech.

CAS:  Prevalence and General Characteristics

In a 2007 report posted at the ASHA website (https://www.asha.org/practice-portal/clinical-topics/childhood-apraxia-of-speech/), CAS was estimated to have a prevalence of between 0.1% and 0.2%, or between 1 and 2 per 1,000 children with the disorder. Boys are diagnosed with CAS more often than are girls. The prevalence estimate is an educated guess, because fixed criteria for the diagnosis have not been determined, or at least have not been widely agreed upon in clinical and research settings. Excessive diagnosis of CAS has been discussed in the literature, although it is hard to know how to identify an incorrect diagnosis (a “false-positive” diagnosis) when the criteria for diagnosing the disorder have not been firmly established.
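To make the prevalence arithmetic concrete, here is a minimal worked sketch (in Python). The 0.1% to 0.2% range is the ASHA (2007) estimate quoted above; the school-district size is a hypothetical figure chosen only for illustration.

def expected_cases(prevalence_percent, population):
    # Expected number of existing cases implied by a point prevalence
    return prevalence_percent / 100 * population

district_children = 25_000  # hypothetical number of children in one school district

for prevalence in (0.1, 0.2):
    per_thousand = prevalence / 100 * 1_000
    cases = expected_cases(prevalence, district_children)
    print(f"{prevalence}% = {per_thousand:g} per 1,000 -> "
          f"about {cases:.0f} expected cases among {district_children:,} children")

# Output:
# 0.1% = 1 per 1,000 -> about 25 expected cases among 25,000 children
# 0.2% = 2 per 1,000 -> about 50 expected cases among 25,000 children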
CAS is thought to occur as part of the developmental delays and disorders in several genetic conditions. CAS can be said to have a known origin when it is part of a known disease or syndrome (such as fragile X syndrome, discussed in Chapter 8). Because the majority of diagnosed CAS cases are not associated with known genetic disorders, this chapter includes CAS as a developmental speech sound disorder of unknown origin.

CAS:  Speech Characteristics

Children diagnosed with CAS often have a severe articulatory disorder. These children are likely to have many speech sound errors, possibly including frequent vowel errors. The vowel errors are notable because vowels are mastered early in the course of speech sound learning and are typically not observed as errors in children diagnosed with speech delay. As in AAS, speech sound errors are inconsistent in CAS. A specific sound segment may be misarticulated in different ways as a child repeats the same word multiple times. The inconsistent errors may be omissions, substitutions, or distortions; for some repetitions of a word, the speech sound may be produced correctly. Children diagnosed with CAS are likely to be quite unintelligible as a result of the severe articulatory disorder.3

Additional speech characteristics in CAS include lengthened connections between speech sounds (that is, longer transitioning movements between successive sounds, similar to the “pulled apart” syllables in AAS), disturbed prosody, articulatory groping, increased articulatory errors with increased phonetic complexity, and “unusual” phonetic errors such as the omission of word-initial consonants (/æt/ for “cat”). Table 15–2 lists selected speech characteristics in CAS.

This list of speech characteristics in children with CAS is in many cases similar or identical to the list of characteristics described earlier for AAS. This seems to validate the use of the term “apraxia” for both the adult and child versions of this speech sound disorder. In fact, the speech sound disorder in CAS is thought to be the result of a motor programming disorder, as it is in the adult form of apraxia of speech (ASHA, 2007).

There are interesting differences between the adult and child versions of apraxia of speech. With very few exceptions, the adult version is associated with known neurological diseases such as stroke, Parkinson’s disease (review in Presotto, Rosenfeld Olchik, Schumacher Shuh, & Reider, 2015), and dementia (e.g., Alzheimer’s disease; review in Cera, Ortiz, Bertolucci, & Minett, 2013). In most cases of stroke, a region of damaged brain tissue (a lesion) is seen in brain images. In Parkinson’s disease, lesion locations in the midbrain and frontal lobe have been demonstrated at autopsy. Lesion locations throughout the central nervous system have also been demonstrated in Alzheimer’s disease and other types of dementia.

Brain lesions in CAS have been difficult or impossible to locate. Imaging studies have suggested size differences in certain locations of the cerebral hemispheres of children with CAS as compared with typically developing brain areas (ASHA, 2007). In at least one study, children with CAS appear to have different connections between brain areas compared with children who do not have speech sound errors (Fiori et al., 2016).

3 An example of the poor speech intelligibility of children diagnosed with CAS is found in Namasivayam et al. (2015, Figure 2, p. 539). Speech intelligibility scores for word or sentence lists (number of correctly heard words/sentences by a group of listeners) that are less than 50% are typically considered to indicate a severe intelligibility deficit. In Namasivayam et al., the average intelligibility scores for children with CAS are no greater than 30% and as low as 7%.
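The footnote’s intelligibility score can be expressed as a simple calculation: the percentage of target words that listeners identify correctly. The sketch below (Python) shows the shape of that calculation; the target words and listener transcriptions are invented for illustration, not taken from Namasivayam et al., and the 50% severity cutoff is the one described above.

target_words = ["cat", "soup", "rabbit", "window", "banana"]

# Each listener writes down the word heard for each of the child's productions.
listener_responses = [
    ["cat", "shoe", "wabbit", "window", "nana"],     # listener 1
    ["bat", "soup", "rabbit", "winnow", "bandana"],  # listener 2
]

correct = sum(
    heard == target
    for responses in listener_responses
    for heard, target in zip(responses, target_words)
)
total = len(target_words) * len(listener_responses)
score = 100 * correct / total

print(f"Intelligibility score: {score:.0f}%")  # Intelligibility score: 40%
print("Severe deficit" if score < 50 else "Above the severe-deficit cutoff")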

Table 15–2.  Selected Speech Characteristics in Childhood Apraxia of Speech

Characteristic:  Brief Description

Severe articulatory disorder:  Many speech sounds in error, including vowels; children often have severe intelligibility deficits.

Sound errors are inconsistent:  Multiple repetitions of one word by a child do not always have sound errors that are the same. The sound errors are variable in type.

Lengthened durations between adjacent syllables or individual sounds:  Speech perceived as slow and lacking smoothness.

Disturbed prosody:  Speech characteristics that distinguish stressed from unstressed syllables, which include pitch, loudness, and duration, are atypical; melody of whole utterances may seem atypical.

Increased hesitation, groping, and speech sound errors with increased phonetic complexity:  Multisyllabic words “bring out” apraxic errors more than single-syllable words; more complex syllables (“spl” versus “sit”) also bring out more errors than phonetically simple syllables.

Unusual speech sound errors (unusual in the process of normal speech sound development):  Vowel errors may be common; word-initial consonants may be omitted.
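The “sound errors are inconsistent” rows in Tables 15–1 and 15–2 lend themselves to a simple tabulation: classify how a single target sound is realized on each repetition of the same word and count the error categories. The sketch below (Python) is one illustrative way to do this; the transcriptions are invented and the classification rules are simplified, not a standardized clinical procedure.

from collections import Counter

# Realizations of target /s/ across five repetitions of "save";
# "" marks an omission, "s~sh" marks a between-category distortion.
productions = ["s", "sh", "s~sh", "", "th"]

def classify(realization, target="s"):
    if realization == target:
        return "correct"
    if realization == "":
        return "omission"
    if "~" in realization:
        return "distortion"
    return "substitution"

tally = Counter(classify(p) for p in productions)
print(dict(tally))
# {'correct': 1, 'substitution': 2, 'distortion': 1, 'omission': 1}

# Several different error categories for the same target in the same word
# is the token-to-token inconsistency described for CAS (and AAS).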

CAS and Overlap With Other Developmental Delays

Some children with speech delay, as discussed previously, also have language and reading delays. In children with CAS, language learning delays tend to be more common and more severe than in children with speech delay. Problems in language comprehension and language expression, as well as literacy skills, have been linked with CAS. The clinical opinion of many SLPs is that the speech and language deficits associated with CAS are likely to persist into the teenage years and even into adulthood. The implications of these co-occurring speech and language disorders for academic and work-life success are clear and point to the need for effective speech and language therapies for children with CAS. Reviews of the co-occurring speech and language deficits in CAS are available in ASHA (2007), Peterson et al. (2007), Gillon and Moriarty (2007), and Zaretsky, Velleman, and Curro (2010).

There are many treatments available for the child who is diagnosed with CAS. Almost all treatment strategies include a focus on the reduction of articulatory errors for individual sound segments, in the same way as treatment for speech delay. The focus on correction of articulatory errors does not seem to reflect the view of CAS as a programming disorder — that is, there is no attempt to modify or correct the program for an articulatory sequence. It is not clear how a motor program can be modified or corrected, although an argument can be made that practice of multisyllabic words can achieve this goal. A therapy like this for children with CAS has been developed by Murray, McCabe, and Ballard (2012) and has shown some preliminary, promising results. Morgan, Murray, and Liégeois (2018) provide a comprehensive review of therapies for CAS.

CAS and Genetics

As in speech delay, children with CAS are more likely than typically developing children to have family members with a history of CAS (or other speech sound disorders). The interest in the genetic basis of CAS was significantly motivated by study of a multigenerational family in Great Britain (Lai, Fisher, Hurst, Vargha-Khadem, & Monaco, 2001). Severe speech problems with characteristics resembling apraxia of speech were identified in roughly 50% of the family members, across generations and in both children and adults. Many of the family members with apraxia of speech also had intellectual disability (Chapter 8).

Study of the genetic profile of each of the family members revealed that those with apraxia of speech shared an abnormality of a single gene, on the seventh chromosome. The common gene abnormality among the family members with apraxia of speech suggested the presence of a “speech and language gene.” During embryonic development, this gene abnormality was hypothesized to disturb the regulation of other genes that are important to the proper development of brain structures thought to be active in speech and language development.

Since the original publication of the Lai et al. (2001) report, several other gene abnormalities have been proposed to be important in developmental speech and language disorders. Some of these speech and language disorders are characteristic of other diseases in which there are many other problems. In these diseases, the speech and language problems in CAS do not stand alone, as in a “pure” version of CAS. At this point in time, it seems clear that multiple genes play a role in “pure” cases of CAS (see Box, “What Is ‘Pure’ CAS?”). A review of genetic factors in CAS is found in Worthey et al. (2013).

What Is “Pure” CAS?

In theory, “pure” CAS is a speech sound disorder with a neurological basis. Pure CAS, if it exists, is a speech motor control disorder. The control of the nervous system over movements of the articulators, larynx, and respiratory system is compromised, resulting in the selected speech characteristics listed in Table 15–2. Other speech characteristics may also be observed in pure CAS, but they are all the result of a speech motor control problem. More often, as described in the text, children with CAS have other developmental delays and problems including language and literacy delays. If most children with CAS have these additional problems that cannot be explained by a pure disorder of speech motor control, why are scientists interested in the identification of the small number of children with the pure variety? The reason is that pure cases may allow the identification of the genetic basis of the speech motor control component of CAS. Stated in a different way, perhaps the pure cases have a more focused genetic basis — one or two gene abnormalities. The genetic basis of the more frequent “CAS plus language disorders” may be harder to pin down because of the presumed larger set of genes at work in the disorder (Morgan, Fisher, Scheffer, & Hildebrand, 2017).

Chapter Summary

Speech delay and CAS are presented in this chapter as two speech sound disorders of unknown origin, meaning that factors such as abnormal structural characteristics of the speech mechanism, documented neurological disease, intellectual disability, and/or hearing loss are ruled out and therefore do not explain the speech disorder.

Speech delay is a childhood speech sound disorder with high prevalence and unknown origin; the characteristics of the disorder include speech sound mastery that is delayed relative to age and sex norms, as documented by standardized tests of articulation or by other measures.

Children with speech delay are most likely to have speech sound errors for later-learned, as compared with earlier-learned, sounds, which presumably reflects the more complex speech motor control skills required for the later-mastered sounds.

Possible explanations for speech delay include (but may not be limited to) immature speech motor control, delayed learning of phonological rules, and delayed speech perception skills, or some combination of the three.

The speech intelligibility problem in speech delay is partly, but not completely, a result of speech sound errors. The speech intelligibility deficit that is due to the speech sound errors is greater than expected for a child’s age.

Speech delay may be associated with problems in socialization and academic performance in grade school and possibly into middle and high school.

Some children have /r/ and/or /s/ errors, called residual errors, past the age (around 8 or 9 years) at which speech sound mastery is typically completed.

A smaller number of children have persistent speech sound errors for several speech sounds in addition to /r/ and /s/ errors.

Speech delay is frequently a communication component of a more general language delay, which may include language comprehension and/or expression delay.

Children with speech delay are more likely to have delays in reading ability compared with children who have typically developing sound production.

Research evidence points to a genetic component in speech sound disorders, including speech delay; the genetic component is likely to involve multiple genes that interact with the environment to make a child more susceptible to speech delay, compared with children who do not have these genes.

CAS is a relatively rare, severe speech sound disorder in which inconsistent, multiple speech sound errors and prosodic disturbances are thought to be core features.

Vowel errors, hesitation before initiating speech, and groping for articulatory positions and shapes are also speech characteristics observed in CAS.

The speech characteristics of CAS are similar (but not identical) to the speech characteristics seen in AAS.

As in AAS, the core and other speech characteristics in CAS are hypothesized to be the result of a planning/programming problem in the brain mechanisms that control speech production.

CAS is a disorder in which the diagnosis does not have good-to-excellent reliability; there is disagreement about the precise characteristics that point to a diagnosis of CAS, and apparently many diagnoses of the disorder turn out to be incorrect (“false positives”).

Children with CAS are frequently diagnosed with expressive and/or receptive language delay and with poor literacy skills.

Children with CAS and language and literacy delays are at risk for poor academic and career achievement.

As with speech delay, there is interest in a possible genetic basis for CAS.

Research publications over the last 20 years have demonstrated the likelihood of a group of genes that are associated with typical speech and language development and that, when mutated (when a gene is changed from its typical properties), predispose a child to CAS as well as language and literacy delays.

Speech-language treatments for CAS show promise and are still in the early stages of development and evaluation.

References

Ad Hoc Committee on Childhood Apraxia of Speech. (2007). Childhood apraxia of speech [Technical report]. Rockville, MD: American Speech-Language-Hearing Association.

Ballard, K. A., Tourville, J. A., & Robin, D. A. (2014). Behavioral, computational, and neuroimaging studies of acquired apraxia of speech. Frontiers in Human Neuroscience, 8, 1–9.

Bankson, J. E., Bernthal, N. W., & Flipsen, Jr., P. (2017). Articulation and phonological disorders (8th ed.). New York, NY: Pearson.

Bislick, L., McNeil, M., Spencer, K. A., Yorkston, K., & Kendall, D. L. (2017). The nature of error consistency in individuals with acquired apraxia of speech and aphasia. American Journal of Speech-Language Pathology, 26, 611–630.

Brumbaugh, K. M., & Smit, A. B. (2013). Treating children ages 3–6 who have speech sound disorders: A survey. Language, Speech, and Hearing Services in the Schools, 44, 306–319.

Cera, M. L., Ortiz, K. Z., Bertolucci, P. H. F., & Minett, T. S. C. (2013). Speech and orofacial apraxias in Alzheimer’s disease. International Psychogeriatrics, 25, 1679–1685.

Crowe Hall, B. J. (1991). Attitudes of fourth and sixth graders toward peers with mild articulation disorders. Language, Speech, and Hearing Services in the Schools, 22, 334–340.

Eaton, C. T. (2015). Cognitive factors and residual speech errors: Basic science, translational research, and some clinical frameworks. Seminars in Speech and Language, 36, 247–256.

Eecen, K. T., Eadie, P., Morgan, A. T., & Reilly, S. (2018). Validation of Dodd’s model for differential diagnosis of childhood speech sound disorders: A longitudinal community cohort study. Developmental Medicine and Child Neurology, 61, 689–696.

Ertmer, D. J. (2010). Relationships between speech intelligibility and word articulation scores in children with hearing loss. Journal of Speech, Language, and Hearing Research, 53, 1075–1086.

Fabiano-Smith, L., & Hoffman, K. (2018). Diagnostic accuracy of traditional measures of phonological ability for bilingual preschoolers and kindergarteners. Language, Speech, and Hearing Services in the Schools, 49, 121–134.

Felsenfeld, S. (2002). Finding susceptibility genes for developmental disorders of speech: The long and winding road. Journal of Communication Disorders, 35, 329–345.

Felsenfeld, S., Broen, P. A., & McGue, M. (1994). A 28-year follow-up of adults with a history of moderate phonological disorder: Educational and occupational results. Journal of Speech and Hearing Research, 37, 1341–1353.

Felsenfeld, S., & Plomin, R. (1997). Epidemiological and offspring analyses of developmental speech disorders using data from the Colorado adoption project. Journal of Speech, Language, and Hearing Research, 40, 778–791.

Fiori, S., Guzzetta, A., Mitra, J., Pannek, K., Pasqualliero, R., Cipriani, P., . . . Chilosi, A. (2016). Neuroanatomical correlates of childhood apraxia of speech. Neuroimage Clinical, 12, 894–910.

Flipsen, Jr., P. (2016). Emergence and prevalence of persistent and residual speech errors. Seminars in Speech and Language, 36, 217–223.

Gillon, G. T., & Moriarty, B. C. (2007). Childhood apraxia of speech: Children at risk for persistent reading and spelling disorder. Seminars in Speech and Language, 28, 48–57.

Goldman, R., & Fristoe, M. (2015). Goldman-Fristoe Test of Articulation — Third edition (GFTA-3). Circle Pines, MN: American Guidance Service.

Haylou-Thomas, M. E., Carroll, J. M., Leavett, R., Hulme, C., & Snowling, M. J. (2017). When does speech sound disorder matter for literacy? The role of disordered speech errors, co-occurring language impairment and family risk of dyslexia. Journal of Child Psychology and Psychiatry, 58, 197–205.

Hegarty, N., Titterington, J., McLeod, S., & Taggart, L. (2018). Intervention for children with phonological impairment: Knowledge, practices, and intervention intensity in the UK. International Journal of Language & Communication Disorders, 53, 995–1006.

Hitchcock, E. R., Harel, D., & McAllister Byun, T. (2015). Residual speech errors in school-aged children: A survey study. Seminars in Speech and Language, 36, 283–294.

Lai, C. S. L., Fisher, S. E., Hurst, J. F., Vargha-Khadem, F., & Monaco, A. P. (2001). A forkhead domain gene is mutated in a severe speech and language disorder. Nature, 413, 519–523.

Lewis, B. A., Shriberg, L. D., Freebairn, L. A., Hansen, A. J., Stein, C. M., Taylor, H. G., & Iyengar, S. K. (2006). The genetic bases of speech sound disorders: Evidence from spoken and written language. Journal of Speech, Language, and Hearing Research, 49, 1294–1312.

Liégeois, F. J., & Morgan, A. T. (2012). Neural bases of childhood speech disorders: Lateralization and plasticity for speech functions during development. Biobehavioral Review, 36, 439–458.

Lousada, M., Jesus, L. M. T., Capelas, S., Margaça, C., Simões, D., Valente, . . . Joffe, V. (2013). Phonological and articulation treatment approaches in Portuguese children with speech and language impairments: A randomized controlled intervention study. International Journal of Language & Communication Disorders, 48, 172–187.

Lousada, M., Jesus, L. M. T., Hall, A., & Joffe, V. (2014). Intelligibility as a clinical outcome measure following intervention with children with phonologically-based speech-sound disorders. International Journal of Language & Communication Disorders, 49, 584–601.

McNeil, M. R., Robin, D. A., & Schmidt, R. A. (2009). Apraxia of speech: Definition, and differential diagnosis. In M. R. McNeil (Ed.), Clinical management of sensorimotor speech disorders (2nd ed., pp. 249–268). New York, NY: Thieme.

Mines, M. A., Hanson, B. F., & Shoup, J. E. (1978). Frequency of occurrence of phonemes in conversational English. Language and Speech, 21, 221–241.

Morgan, A., Fisher, S. E., Scheffer, I., & Hildebrand, M. (2017). FOXP2-related speech and language disorders (2016 Jun 23 [Updated 2017 Feb 2]). In M. P. Adam, H. H. Ardinger, R. A. Pagon, et al. (Eds.), GeneReviews [Internet]. Seattle, WA: University of Washington, Seattle (1993–2018). Retrieved from https://www.ncbi.nlm.nih.gov/books/NBK368474/

Morgan, A. T., Murray, E., & Liégeois, F. J. (2018). Interventions for childhood apraxia of speech. Cochrane Database of Systematic Reviews, 2018(5), CD006278. https://doi.org/10.1002/14651858.CD006278.pub3

Morgan, A. T., & Webster, R. (2018). Aetiology of childhood apraxia of speech: A clinical practice update for paediatricians. Journal of Paediatrics and Child Health, 54, 1090–1095.

Murray, E., McCabe, P., & Ballard, K. J. (2012). A comparison of two treatments for childhood apraxia of speech: Methods and treatment protocol for a parallel group randomized control trial. BMC Pediatrics, 12, 112.

Namasivayam, A. K., Pukonen, M., Goshulak, D., Hard, J., Rudzicz, F., Rietveld, T., . . . van Lieshout, P. (2015). Treatment intensity and childhood apraxia of speech. International Journal of Language & Communication Disorders, 50, 529–546.

Park, J. E. (2017). Apraxia: Review and update. Journal of Clinical Neurology, 13, 317–324.

Peterson, R. L., McGrath, L. M., Smith, S. D., & Pennington, B. F. (2007). Neuropsychology and genetics of speech, language, and literacy disorders. Pediatric Clinics of North America, 54, 543–561.

Powell, T. W., Elbert, M., Miccio, A. W., Strike-Roussos, C., & Brasseur, J. (1998). Facilitating s production in young children: An experimental evaluation of motoric and conceptual treatment approaches. Clinical Linguistics & Phonetics, 12, 127–146.

Presotto, M., Rosenfeld Olchik, M., Schumacher Shuh, A. F., & Reider, C. R. M. (2015). Assessment of nonverbal and verbal apraxia in patients with Parkinson’s disease. Parkinson’s Disease (Article ID 840327). https://doi.org/10.1155/2015/840327

Preston, J. L., Irwin, J. R., & Turcios, J. (2015). Perception of speech sounds: I. School-age children with speech sound disorders. Seminars in Speech and Language, 26, 224–233.

Shriberg, L. D., Austin, D., Lewis, B. A., McSweeny, J. L., & Wilson, D. L. (1997). The percentage of consonants correct (PCC) metric: Extensions and reliability data. Journal of Speech, Language, and Hearing Research, 40, 708–722.

Shriberg, L. D., & Kwiatkowski, J. (1982). Phonological disorders III. A procedure for assessing severity of involvement. Journal of Speech and Hearing Disorders, 47, 256–270.

Shriberg, L. D., & Kwiatkowski, J. (1994). Developmental phonological profiles: I. A clinical profile. Journal of Speech, Language, and Hearing Research, 37, 1100–1126.

Smith, A., & Weber, C. (2017). How stuttering develops: The multifactorial dynamic pathways theory. Journal of Speech, Language, and Hearing Research, 60, 2483–2505.

Vick, J. C., Campbell, T. F., Shriberg, L. D., Green, J. R., Truemper, K., Rusiewicz, H. L., & Moore, C. A. (2014). Data-driven subclassification of speech sound disorders in preschool children. Journal of Speech, Language, and Hearing Research, 57, 2033–2050.

Weismer, G. (2008). Speech intelligibility. In M. J. Ball, M. R. Perkins, N. Müller, & S. Howard (Eds.), Handbook of clinical linguistics (pp. 568–582). Oxford, UK: Blackwell.

Whiteside, S. P., Dyson, L., Cowell, P. E., & Varley, R. A. (2015). The relationship between apraxia of speech and oral apraxia: Association or dissociation? Archives of Clinical Neuropsychology, 30, 670–682.

Worthey, E. A., Raca, G., Laffin, J. J., Wilk, B. M., Harris, J. M., Jakielski, K. J., . . . Shriberg, L. D. (2013). Whole-exome sequencing supports genetic heterogeneity in childhood apraxia of speech. Journal of Neurodevelopmental Disorders, 5, 29.

Zadikoff, C., & Lang, A. E. (2005). Apraxia in movement disorders. Brain, 128, 1480–1497.

Zaretsky, E., Velleman, S. L., & Curro, K. (2010). Through the magnifying glass: Underlying literacy deficits and remediation potential in Childhood Apraxia of Speech. International Journal of Speech-Language Pathology, 12, 58–68.
16
Pediatric Speech Disorders II

Introduction

Motor speech disorders in children are often discussed separately from motor speech disorders in adults. Several educated guesses for the separate discussion can be offered. First, motor speech disorders in adults, as discussed in the professional literature and reviewed in Chapter 14, are almost always the result of acquired neurological disease. In these cases, a previously healthy adult has a stroke or other acute condition affecting the brain. Dysarthria and apraxia of speech may also result from neurological deficits resulting from degenerative diseases such as Parkinson’s disease, amyotrophic lateral sclerosis, and multiple sclerosis. Dysarthria in adults may also be an outcome of a traumatic brain injury; traumatic brain injury and motor speech disorders in children are discussed later in the chapter.

When these conditions, and other neurological diseases in adults, affect the neurological substrate of the speech mechanism, the damage is to a fully mature system, one in which speech motor control has been established. Presumably, the speech motor control skills developed and maintained over a lifetime may be used by adults, to some degree, to compensate for the loss of control associated with the acquired neurological damage.

In contrast, motor speech disorders in children are associated with known or suspected neurological damage present at birth or throughout childhood. The neurological damage may be acquired in early childhood, as in the case of a child who has a brain tumor removed surgically or experiences a penetrating or closed-head injury. It is safe to say that in the case of childhood neurological disease, the development of speech motor control occurs within the context of brain mechanisms different from those of healthy adults or of adults who acquire a neurological disease after typical development of speech motor control. We therefore expect speech behavior, developed within the context of atypical brain mechanisms, to look different from speech behavior of someone whose mechanisms are mature and then damaged by an acquired disease process. “Expect” is an important word in the previous sentence: the difference between speech motor control in children born with neurological disease (or acquiring it very early in life) versus speech motor control in previously healthy adults who acquire neurological disease is a hypothesis. Firm data to support this hypothesis are not yet available.

This chapter describes motor speech disorders in children with cerebral palsy, in children with traumatic brain injury, and in children who have had tumors surgically removed. The reader is encouraged to review Table 14–1 in Chapter 14 as preparation for the current chapter. Speech disorders in hearing impairment, many cases of which are the result of neurological disease, are discussed in Chapter 23.


Childhood Motor Speech Disorders:  Cerebral Palsy

Cerebral palsy is the most common motor disability in childhood, with an estimated prevalence of 1.5 to 4 cases per 1,000 live births (Stavsky, Mor, Mastrolia, Greenbaum, Than, & Erez, 2017). The estimated prevalence varies by race, country, and other factors beyond the scope of this chapter; the range of 1.5 to 4 is thought to encompass all of these varying estimates (see Stavsky et al., 2017). Motor speech disorders are found in a minimum of 20% and possibly as many as 50% of children born with cerebral palsy (Nordberg, Miniscalco, Lohmander, & Himmelmann, 2013). The motor speech disorder in cerebral palsy is almost always dysarthria, which results in speech intelligibility deficits. In a small number of cases, apraxia of speech may co-occur with dysarthria.
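A hypothetical worked example may help fix these two estimates in mind: cerebral palsy prevalence (1.5 to 4 per 1,000 live births) and the proportion of those children with motor speech disorders (20% to 50%). The birth-cohort size in the sketch below (Python) is invented for illustration.

births = 100_000  # hypothetical annual birth cohort

for cp_per_1000 in (1.5, 4.0):
    cp_cases = births * cp_per_1000 / 1_000
    low, high = 0.20 * cp_cases, 0.50 * cp_cases
    print(f"{cp_per_1000} per 1,000 -> {cp_cases:.0f} children with cerebral palsy; "
          f"{low:.0f} to {high:.0f} with a motor speech disorder")

# 1.5 per 1,000 -> 150 children with cerebral palsy; 30 to 75 with a motor speech disorder
# 4.0 per 1,000 -> 400 children with cerebral palsy; 80 to 200 with a motor speech disorder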
Cerebral palsy is a childhood disorder resulting from brain damage incurred before, during, or shortly following birth. The term “cerebral palsy” includes several different types. A unifying feature of cerebral palsy across types is the impairment of posture and movement. When the impairment of movement is also observed in the speech mechanism, dysarthria is a likely result.

Roughly 40% to 45% of children with cerebral palsy have intellectual impairment, which may range from mild to severe (Reid, Meehan, Arnup, & Reddihough, 2018). In addition, about 40% of children with cerebral palsy have some degree of hearing loss (Foo, Guppy, & Johnston, 2013; Weir, Hatch, McRackan, Wallace, & Meyer, 2018). Intellectual disability and hearing loss are both associated with speech production problems. Speech breathing, voice, prosodic, and sound errors in cerebral palsy may therefore have more complicated causes than a purely speech motor control problem.

Subtypes of Cerebral Palsy

The types of cerebral palsy have been classified in many different ways since the disease was first described in the mid-19th century (Morris, 2007). In this chapter, we adopt a classification system that is in common clinical use and that fits with a discussion of speech disorders in cerebral palsy. The main types of cerebral palsy are spastic, athetoid/dyskinetic, ataxic, and mixed.1 Recent estimates of the occurrence of these types are 85% to 90% for the spastic type, 7% for the dyskinetic type, and 4% for the ataxic type (Wimalasundera & Stevenson, 2016).2 The most common mixed type is a combination of spasticity and dyskinesia. Other subtypes have been discussed in the literature, but they are rare and not discussed here.

A brief review of neuroanatomical structures provides a framework for the following description of cerebral palsy types. The corticobulbar and corticospinal tracts (fiber bundles) connect cells in the primary motor cortex to motor cells in the brainstem (corticobulbar) and spinal cord (corticospinal). The basal ganglia, a group of subcortical nuclei with connections to cortical cells, play a major role in the planning, initiation, and execution of movements; the basal ganglia also inhibit movement when it is not appropriate for an action. Finally, the cerebellum is located beneath the cerebral hemispheres, just behind the brainstem. The cerebellum plays an important role in the coordination of movements and in the smoothness with which they are executed.

The Spastic Type of Cerebral Palsy

Spastic cerebral palsy is the outcome of damage to the corticobulbar and/or corticospinal tracts. The damage is likely to occur early in fetal development. Damage to these tracts produces excessive muscle tone, which results in stiffness and weakness in affected structures (such as a limb or the jaw). The excessive muscle tone may be chronic. The constant contraction of wrist, arm, and leg muscles may result in distorted postures of these structures, even when they are not being used. Damage to the corticobulbar and corticospinal tracts also results in hypersensitive reflexes, which may be triggered by a purposeful movement. These reflexes can interfere with the movement and its goal.

The Dyskinetic Type of Cerebral Palsy

Dyskinetic cerebral palsy is the result of damage to the basal ganglia. Dyskinesias are uncontrolled body movements, either at rest or during purposeful movements.

1 Some scientists and clinicians add to this classification the number of limbs involved. For example, “spastic diplegia” indicates the spastic type with symptoms observed in two limbs, “spastic quadriplegia” symptoms observed in all four limbs, and “spastic hemiplegia” symptoms observed on just one side of the body (usually in both limbs).
2 Percentages vary in different surveys published in the literature. In all surveys, however, the spastic type of cerebral palsy clearly occurs with the greatest frequency. As reviewed by Bugler, Gaston, and Robb (2019), the occurrence of the spastic type of cerebral palsy among all cases in six major studies (including their own) ranged from 81% to 100% (average across all six studies = 91% spastic type, based on a total of 4,385 children diagnosed with cerebral palsy).

Dyskinesias are sometimes described as writhing, uncontrolled movements and have the potential to interfere with proper movement control to achieve a goal. A simple example of a disturbance in achieving such a goal is the involuntary, uncontrolled movement of the arm that interferes with the act of raising a fork to the mouth.

The Ataxic Type of Cerebral Palsy

This relatively rare type of cerebral palsy results from damage to the cerebellum. The function of the cerebellum is to control posture and balance, coordination of muscle contraction during voluntary movement, and control of the force of muscle contraction to guarantee accuracy and scale of movement (how big or small the movement is depends on the goal of a movement). Cerebellar damage often results in disturbance of these functions.

A child may be diagnosed with the ataxic type of cerebral palsy when his feet are widely planted when walking and he is unsteady. The diagnosis is reinforced if accuracy and force of movement are poorly controlled. Walking may be delayed in babies with suspected ataxia, and other milestones (e.g., sitting up) may also be delayed.

The Mixed Type of Cerebral Palsy

Children diagnosed with the mixed type of cerebral palsy are assumed (or known, as revealed by imaging studies) to have damage in multiple parts of the brain. As reviewed earlier, the mixed type of cerebral palsy is infrequent, the most likely type being a spastic-dyskinetic form of the disease.

Dysarthria in Cerebral Palsy

The dysarthria in cerebral palsy is often described by the Mayo Clinic classification system, described in Chapter 14. Children diagnosed with the spastic type of cerebral palsy are typically judged to have spastic dysarthria, those with the dyskinetic type to have hyperkinetic dysarthria (dysarthria resulting from involuntary movements during attempts to move articulators), and so forth. As in adult dysarthria, the diagnosis of dysarthria type is made by perceptual analysis. Also as in adults, the dysarthria diagnosis is not necessarily matched to the diagnosis of cerebral palsy type. Thus, a child diagnosed with the spastic type of cerebral palsy may be perceived with hyperkinetic dysarthria (or vice versa).

An important question is the extent to which the dysarthria of a specific type in children has the same speech characteristics as the adult dysarthria of the same type. For example, does spastic dysarthria in children with cerebral palsy sound the same, and on careful analysis have the same characteristics, as spastic dysarthria in adults who have suffered a stroke?

Keep in mind an important caution throughout the description of speech characteristics in different types of childhood dysarthria. For a particular type of dysarthria, not every characteristic applies to each child diagnosed with the type. There is a good deal of variation from child to child in speech characteristics of any single dysarthria type. This is similar to the case of adult dysarthrias.

Spastic Dysarthria

According to Workinger (2005), children with the spastic type of cerebral palsy have speech breathing and phonation problems, with less-affected articulation. The speech breathing and laryngeal problems result in a weak voice and a strained voice quality (Solomon & Charron, 1998). The strained voice quality, the result of unusually strong muscle force in the larynx during phonation, may be a compensation for the muscular problems in the respiratory system and larynx. If the natural result of spasticity in the larynx — excessive muscle tone, for example — prevents the larynx from closing during phonation, laryngeal muscles may work too forcefully to overcome excessive “leaks” of airflow through the vibrating vocal folds. Similarly, the stiff muscles of the respiratory system that compress the lungs may have difficulty generating the pressures required for phonation. The low lung pressures and consequently low airflows are tightly metered through the larynx to conserve the limited air supply, resulting in the strained voice quality.

More recent research (Lee, Hustad, & Weismer, 2014; Levy, Chang, Ancelle, & McAuliffe, 2017; Schölderle, Staiger, Lampe, Strecker, & Ziegler, 2016) points to a more prominent role of disordered articulation in children with the spastic type of cerebral palsy and dysarthria. The outcome measure of interest in these studies is speech intelligibility; increases in the articulatory deficit result in decreases in speech intelligibility. Another aspect of articulatory problems noted by Lee et al. and Levy et al. is an abnormally slow speaking rate in children with spastic dysarthria. Slow speaking rates stretch out the duration of speech sounds, which may contribute to problems with speech intelligibility.

The results of Lee et al. (2014) and Levy et al. (2017) are consistent with results reported by Platt, Andrews, Young, and Quinn (1980), who performed detailed analysis of sound errors and speech intelligibility in young and older adults with the spastic type of cerebral palsy.

On average, approximately 20% of all sounds in the words, including consonants and vowels, were misarticulated. The average speech intelligibility for these speakers was 59%. Platt et al.’s results suggest a substantial role for speech sound errors in speech intelligibility.

Are the data on speech sound errors reported for adults with the spastic type of cerebral palsy applicable to children with spastic dysarthria? A convincing answer is not available. No one has done a study of children with spastic dysarthria, similar to the study of Platt et al. (1980), in which detailed phonetic transcription was performed to identify speech sound errors. However, Platt et al. compared the types of phonetic errors they obtained to the same kind of analysis reported by Byrne (1959) many years ago for children with cerebral palsy. The pattern of errors in both studies was the same, suggesting (tentatively) that the adult data may provide a partial model for speech sound errors in children with cerebral palsy. The reason for the exaggerated caution in the previous sentence is the strong possibility of change in a child’s dysarthria characteristics as he grows older, which may change the similarity between the adult and child forms of spastic dysarthria in cerebral palsy (Schölderle et al., 2016).

Prosody is also affected in spastic dysarthria. The effect is a reduced ability to change the pitch of the voice to produce the melody (intonation) of speech, and the pitch contrasts for syllables in multisyllabic words. Pitch contrasts at the sentence level are used to convey meaning (as in the difference between statements and questions) and emotion. Pitch contrasts for syllables in multisyllabic words are an important component of listener word recognition. A word such as “copycat” has two stressed syllables, “cop” and “cat,” surrounding an unstressed syllable. The unstressed “ee” in “copy” is very short, with a lower pitch compared with the two stressed syllables. The pitch differences are part of lexical stress as indicated in dictionaries.

Intelligibility Is More Than Understanding Words

Prosody is tricky. Melodies of statements versus questions are different, but not always. In general, “who,” “what,” “where,” “when,” and “why” questions have falling pitch at the end of an utterance. Exceptions to this for “wh-” questions occur when a speaker does not quite get the question, or is in disbelief at the question: “When do I get my allowance?” (typical falling pitch) versus “When do I get my allowance?” (rising pitch at the end, parent response in disbelief that the question was asked). The inability to control the melody of utterances can interfere with a listener’s understanding of subtle shades of meaning between these utterances that share the same speech sounds. This potential for communication failure extends to mood, which is conveyed both subtly and in dramatic ways by pitch change across utterances. Happy, sad, bored, angry, and hopeful are all communicated by variations in the melody of utterances. The lesson is: Intelligibility is more than understanding words.

Compensation in the Speech Mechanism

Like limb motor control strategies, adjustment of the speech mechanism for changing or changed conditions is not unusual. For example, when speakers are asked to increase speaking rate, they make articulatory movements that are smaller than movements at “habitual” (normal) speaking rates. This strategy allows the production of more syllables per unit time — smaller movements require less time. Speech therapy that utilizes variations in speaking rate is based on the idea that movements at a slowed rate are larger than at habitual rates. A slowed rate gives clients a better chance to produce the articulatory movements required to position articulatory structures (the tongue, lips, soft palate) for a correct speech sound. Another example of compensation in persons with speech motor control problems is a tightening of the larynx to restrict the limited airflow coming from the lungs. Individuals adjust one part of the speech mechanism — in this case, the tightness of closure during vocal fold vibration — to compensate for another part (the respiratory system) that is not functioning well. In some cases, the adjustment can be excessive — like the strain-strangled voice in spastic dysarthria that attempts to compensate for weak airflow from the lungs.

Dyskinetic Dysarthria

Damage to the basal ganglia occurs in 10% to 20% of children with cerebral palsy. The damage occurs later in fetal development as compared with the earlier damage thought to occur in the corticobulbar and corticospinal tracts.

As reviewed earlier, damage to the basal ganglia results in loss of control of speech structures.

Articulatory movements are directed to unintentional locations within the vocal tract, and sudden, unintentional movements disrupt these movements. Also, slow, unplanned changes occur in structures that are intended to maintain articulatory and phonatory positions and shapes for a limited amount of time. The writhing, changing configuration of the lips for vowel sounds is one example of this motor control problem. Another example is the background muscular tone in the larynx that allows the vocal folds to vibrate for phonation with controlled pitch, loudness, and quality. The inability to stabilize these background forces results in a lack of phonatory control. These motor control problems — really, a group of motor control problems — are called dyskinesias (impairments of voluntary movement). The older literature on types of cerebral palsy uses the term “athetoid cerebral palsy” to describe these movement problems. Currently, dyskinetic cerebral palsy is the preferred term.

The involuntary, often constant movements in the dyskinetic form of cerebral palsy are frequently observed in structures of the head and neck. Many children with dyskinetic cerebral palsy have random, involuntary movements of the mouth and tongue. The involuntary movements of the head and neck, and especially of speech mechanism structures such as the lips and tongue, are thought by many scientists and clinicians to play a major role in the dysarthria observed among many children with the dyskinetic type of cerebral palsy.

Dysarthria in the dyskinetic type of cerebral palsy shares many characteristics with spastic dysarthria. Both types have many speech sound errors and similar patterns of errors across different speech sounds (Platt et al., 1980). The lack of voice stability in dyskinetic dysarthria is different from the strain-strangled voice quality in spastic dysarthria. Dyskinetic dysarthria may have a more intermittent hypernasality (sometimes too nasal, other times not) compared with spastic dysarthria, in which the hypernasality may be more or less constant.

Selected characteristics of spastic and dyskinetic dysarthria in children, observed directly or expected from analyses of adult data, are summarized in Table 16–1. As stated above, not all characteristics are seen in each child, and the severity of the characteristics varies across children even when they share a common speech diagnosis such as spastic dysarthria.

Table 16–1.  Summary of Speech Characteristics of Children With the Spastic and Dyskinetic Types of Cerebral Palsy and Dysarthria

Spastic type:
Speech breathing problems (difficulty generating positive lung pressure and adequate airflows)
Weak voice; strain-strangled quality
Speech sound errors (consonants and vowels) contribute heavily to speech intelligibility deficits
Slow speaking rate (lengthened sound durations)
Chronic hypernasality
Pauses (dysfluency)

Dyskinetic type:
Instability of phonatory (voice) pitch, loudness, and quality
Speech sound errors (consonants and vowels)
Intermittent hypernasality
Speech breathing problems
Pauses (dysfluency)
Prosody problems

Note.  Data from adults with the spastic type of cerebral palsy and dysarthria contribute to this summary, mostly for the component of speech sound errors. Data from “Dysarthria of Adult Cerebral Palsy: I. Intelligibility and Articulatory Impairment,” by L. G. Platt, G. Andrews, M. Young, and P. T. Quinn, 1980, Journal of Speech and Hearing Research, 23, pp. 28–40; “Dysarthria in Adults With Cerebral Palsy: Clinical Presentation and Impacts on Communication,” by T. Schölderle, A. Staiger, R. Lampe, K. Strecker, and W. Ziegler, 2016, Journal of Speech, Language, and Hearing Research, 59, pp. 216–229; Platt, Andrews, and Howie, 1980; and review of the adult literature in “Acoustic-Phonetic Contrasts and Intelligibility in the Dysarthria Associated With Mixed Cerebral Palsy,” by B. M. Ansel and R. D. Kent, 1992, Journal of Speech and Hearing Research, 35, pp. 296–308.

Ataxic Dysarthria

The ataxic type of cerebral palsy reflects damage to the cerebellum, or to the fiber tracts that connect the cerebellum with other parts of the central nervous system. Although ataxic disorders in children who have a variety of diseases are not exceedingly rare, the ataxic type of cerebral palsy is. An estimate of the prevalence of the disorder is 0.1 cases per 1,000 live births (Musselman et al., 2014).

The data on the dysarthria of the ataxic type of cerebral palsy are limited, largely due to the rareness of the disorder. In the small number of ataxic children studied by Workinger (2005), the dysarthria was judged to be mild. However, among dysarthric children with the spastic, dyskinetic, and ataxic types of cerebral palsy, Nordberg, Miniscalco, and Lohmander (2014) reported that the most severe speech symptoms were observed for the ataxic children. The ataxic children had more severe consonant errors compared with the spastic and dyskinetic children.

The number of children in the Workinger (2005) and Nordberg et al. (2014) studies is too small to draw firm conclusions concerning the typical characteristics of the dysarthria in the ataxic type of cerebral palsy. It is possible that these match speech characteristics that are well documented in the adult literature on ataxic dysarthria (Chapter 14). It is also possible that descriptions of ataxic dysarthria in children with cerebral palsy can be estimated from ataxic dysarthria in children who have had tumors removed from their brainstem or cerebellum. These cases are reviewed in the next sections.

Childhood Motor Speech Disorders:  Traumatic Brain Injury and Tumors

Traumatic Brain Injury

The diagnostic category traumatic brain injury (TBI) is separated into closed-head injuries and penetrating head injuries. Closed-head injuries include blows to the head that leave the skull more or less intact but result in brain damage. In penetrating head injuries, an object (such as a bullet or bomb fragment) pierces the skull and enters the brain. Here we limit our discussion to closed-head injuries.

In the United States, the prevalence of childhood TBI has been estimated at 2.5%; approximately 18% of children with lifelong effects from a TBI have dysarthria and language deficits (Haarbauer-Krupa, Lee, Bitsko, Zhang, & Kresnow-Seddaca, 2018). Note the phrase, “lifelong effects,” in the preceding sentence. This is an important consideration in the likelihood of dysarthria in TBI. Morgan, Mageandran, and Mei (2009) reported on an 8-year series of 1,895 children with a recent TBI and identified only 22 diagnosed with dysarthria. However, Morgan et al. point out that children with more severe head injuries — the ones likely to have significant lifelong deficits — were also more likely to have dysarthria.

TBI occurs as a result of motor vehicle accidents, falls, sports-related accidents, and other mishaps. Most children with a TBI show improvement following the accident, and many show great gains over time and regain nearly normal function. This recovery includes speech and language function, and nearly full speech function may be recovered even when a child is mute shortly after the accident (Campbell & Dollaghan, 1995).

The brain damage in TBI is often diffuse, meaning it includes widespread parts of the brain. Recall that the brain is protected from the bony casing of the skull by the fluid surrounding it and a thick, hide-like covering, which is the outer layer of the meninges. In TBI, the brain may be twisted and rotated at high accelerations within this protective environment, resulting in injury to the fiber tracts connecting various cell groups (nuclei). The axons — the parts of brain tissue that make up the fiber tracts — are stretched and in some cases “sheared” by the twisting and rotating motion. There also can be focused regions of damage in the brain, as when the brain tissue beneath the point of impact on the skull is damaged.

Children who suffer TBI are likely to have speech, language, and cognitive symptoms that interfere with their academic and social lives. One clinic’s experience over a 5-year period suggests that approximately 15% of the children seen for TBI-related services will have some form of dysarthria (Hodge & Wellman, 1999).

Prevalence and Incidence — Part I

The term “prevalence” refers to the existing number of cases of a disorder or condition at a given time; “incidence” refers to the number of “new” cases over some time period, such as at birth, or over the course of a year. Analyses of prevalence and incidence are ways to estimate the “true” numbers in the whole population — think, for example, of doing a study that sampled the entire population in the United States for prevalence of fourth ventricle tumors. Thus, percentages vary across investigators because they are estimates, not true values.
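The distinction in this box reduces to two different fractions: existing cases over the population at one point in time (prevalence) versus new cases over the population during a period (incidence). The counts in the sketch below (Python) are invented for illustration.

population = 50_000          # children in a hypothetical registry
existing_cases = 125         # children living with the condition on January 1
new_cases_this_year = 40     # children first diagnosed during the year

point_prevalence = existing_cases / population
annual_incidence = new_cases_this_year / population

print(f"Point prevalence: {100 * point_prevalence:.2f}% "
      f"({1_000 * point_prevalence:.1f} per 1,000)")
print(f"Annual incidence: {100 * annual_incidence:.2f}% "
      f"({1_000 * annual_incidence:.1f} per 1,000 per year)")

# Point prevalence: 0.25% (2.5 per 1,000)
# Annual incidence: 0.08% (0.8 per 1,000 per year)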

Dysarthria in TBI

Speech breathing, voice, velopharyngeal function, and articulation are likely to be affected in children with TBI and dysarthria (Morgan et al., 2009). The Mayo Clinic system categories (Flaccid, Spastic, Ataxic, Hyperkinetic, Hypokinetic, Mixed) have been used to describe the speech of these children, but in all likelihood, the “mixed” type is the most frequent. This reflects the diffuse brain injury in TBI. Chronic hypernasality, slow speaking rate, voice quality abnormalities, prosodic abnormalities (including inappropriate pauses), and imprecise consonants and vowels are heard frequently in this clinical group.

Results of a brain imaging study suggest that the corticobulbar tract is frequently damaged in children with TBI and dysarthria (Liégeois, Tournier, Pigdon, Connelly, & Morgan, 2013). As discussed earlier, the spastic type of cerebral palsy is associated with damage to the corticobulbar and corticospinal tracts. Based on the imaging data of Liégeois et al., spastic dysarthria alone or in combination with another dysarthria type might be expected to occur frequently among children with TBI and dysarthria. When diffuse injury affects both cerebral hemispheres or the tracts between cortical motor cells and motor cells in the brainstem and/or spinal cord, the dysarthria is severe. In these cases, there is a poor outlook for improvement of the dysarthria.

Prevalence and Incidence — Part II

In the series of children analyzed by Morgan et al., 1.2% were diagnosed with dysarthria. This series of children was a consecutive one — the children were not selected for the study based on a characteristic other than being diagnosed with a TBI. This means that the sample of 1,895 children included mild, moderate, and severe cases. When cases are chosen only for severe TBI, a greater percentage of children are diagnosed with dysarthria. A different series of patients with TBI, over a 5-year period, estimated a 15% occurrence of dysarthria (Hodge & Wellman, 1999). Children in Hodge and Wellman came to a speech-language clinic seeking services, whereas children in the Morgan et al. (2009) study were taken from a social medicine registry in Australia, and no doubt included many cases in which services were not pursued. Thus, the Morgan et al. sample probably included many more mild cases (hence, not seeking services for speech and language problems) compared with Hodge and Wellman. When Morgan et al. extracted from the total sample only those children who had been referred for services to a speech-language pathologist, 14% of the children were diagnosed with dysarthria, a number very similar to the one reported by Hodge and Wellman. These percentages are estimates of prevalence, not incidence, and many variables affect the estimates.
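The two estimates discussed in this box come from the same 22 dysarthria cases divided by different denominators. The sketch below (Python) shows that arithmetic; the 22 and 1,895 figures are from Morgan et al. (2009) as cited in the text, while the size of the referred subsample is back-calculated here for illustration and is not reported in this chapter.

dysarthria_cases = 22
all_tbi_children = 1_895   # full consecutive series, all severities

print(f"Full series: {100 * dysarthria_cases / all_tbi_children:.1f}%")
# Full series: 1.2%

# Restricting the denominator to children referred to a speech-language
# pathologist raises the percentage; about 157 referred children would
# yield the 14% figure (hypothetical back-calculation).
referred_children = 157
print(f"Referred subsample: {100 * dysarthria_cases / referred_children:.1f}%")
# Referred subsample: 14.0%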
Brain Tumors

Tumors in the region of the fourth ventricle are called posterior fossa tumors; they are relatively rare. The posterior fossa is a cradle within the skull base in which the posterior structures of the brain — the brainstem, fourth ventricle, and cerebellum — are contained. These structures are immediately below the occipital lobes of the cerebral hemispheres. The occipital lobes are the most posterior of the four lobes of the cerebral hemispheres.

Posterior fossa tumors are typically cancerous and are removed surgically. Often the surgery is followed by radiation and/or chemotherapy.

Dysarthria in Brain Tumors

Approximately 50% of childhood brain tumors are located in and around the fourth ventricle. The fourth ventricle is the diamond-shaped cavity located between the posterior wall of the brainstem and the anterior wall of the cerebellum (see Chapter 2, Figure 2–8). The fourth ventricle is one of the cavities in the brain through which cerebrospinal fluid flows. A tumor in this area exerts pressure on brainstem and cerebellar structures and causes disruptions of the functions served by those structures. Two of those important functions are speech and swallowing.

The cerebellum plays a major role in the coordination of muscles of the respiratory system, the larynx, and the articulators to produce speech smoothly and with syllable sequences that sound “seamless.” A disease process that interferes with cerebellar function disrupts coordination and makes speech sound choppy, as if consecutive syllables, and even sounds within a syllable, are pulled apart.

The cranial nerves are connected directly to muscles of the larynx and articulators and issue the final commands to control contraction characteristics such as force and timing. Muscle contraction strength is weak and ineffective when cranial nerves that serve head and neck muscles are affected by a disease process.

Many children with posterior fossa tumors have dysarthria. Shortly after surgical removal of the tumor, approximately 10% to 33% of the children have a condition called cerebellar mutism (compare Morgan et al., 2011, to Tamburrini, Frassanito, Chieffo, Massimi, Caldarelli, & Di Rocco, 2015). In this stage of postsurgical recovery, children are mute (unable to speak) for a variable period of time. The duration of mutism can be 2 weeks or more, although some children with cerebellar mutism begin to speak within hours or a few days after the surgery.

Children who are mute and then regain an ability to speak usually have dysarthria, which typically improves but not necessarily to the point of normal speech. The disruption of cerebellar and cranial nerve speech functions by a posterior fossa tumor may be expected to result in a flaccid dysarthria, or a cerebellar (ataxic) dysarthria, or a mixed flaccid-ataxic dysarthria (Chapter 14). Some authors endorse this expectation, based on their data (Morgan et al., 2011), whereas others do not (De Smet et al., 2012). Children who are not mute following surgery often regain normal speech abilities even when dysarthria is present during the early days of their recovery.

Treatment Options and Considerations

The end-goal of speech-language pathologists when treating children with neurologically based speech disorders is to improve speech intelligibility. Publications in the speech-language pathology literature provide preliminary evidence that speech-language therapy can result in positive gains in the outcome goal of improved intelligibility.

In one approach, speech-language pathologists focus initially on simple strength and coordination of structures of the speech mechanism. The idea behind this first step is to provide a foundation for the more complex motor control requirements of speech production. These therapy activities are called oro-motor nonspeech exercises, because they are done in the absence of phonation (voice) or articulatory movements.

For example, the lips and tongue can be exercised by having a child compress his lips together as forcefully as possible or push his tongue against a resistance (e.g., using tongue protrusion to move a barrier placed in front of the lips). The analogy to increasing strength in an arm or leg by exerting high limb force and pressing against resistances is direct. Nonspeech exercise is also extended to muscles of the respiratory system, which are exercised by forceful exhalation through an airflow resistance. The airflow resistance is provided by a tube containing a float that is displaced by incoming air, or a wire mesh screen held in place by a face mask. Forceful exhalation exercises rib cage muscles used to raise lung pressures required to vibrate the vocal folds and achieve an effective voice loudness.

When a child succeeds in completing multiple trials of the nonspeech exercises, a layer of motor complexity is added to the task. One version of this is to train the child to coordinate the initial part of expiratory airflow with phonation of a sustained “ah.” Although this seems to be a very simple task, many children with either cerebral palsy or a TBI are challenged by it. When the training of coordinating speech breathing with phonation results in good performance, a child may be trained to open her jaw wide during phonation to produce a louder sound.

These exercises are hierarchical, in the sense that separate training tasks are sequenced to build on simple skills and make them increasingly complex. In the sequence just described, there is no mention of direct training of articulatory skills. Some therapeutic programs may, in fact, not have a major goal of training articulatory skills. Rather, the foundation of stronger speech muscles, coordination of speech breathing, and increased vocal loudness are thought to have a “spreading effect” on articulatory skill. Even in the absence of training to produce (for example) good fricative, vowel, rhotic, and lateral sounds, the improved foundation of speech motor skills leads to improved articulatory behavior. The overall effect of the therapy is better speech intelligibility — the long-term outcome goal of speech therapy.

A small amount of research evidence is available to support the connection between a stronger foundation of speech motor skills and improved speech intelligibility. In both young (5 to 11 years old) and older children (12 to 18 years) with cerebral palsy, speech intelligibility for single words improved between 9% and 14% following the therapy previously described (Pennington, Miller, Robson, & Steen, 2010; Pennington, Roelant, Thompson, Robson, Steen, & Miller, 2013). Not every child in these studies had improved intelligibility following the therapy, but the overall improvement for the groups of children is encouraging.
fully as possible or push his tongue against a resistance Whether or not the examples of speech therapy in
(e.g., using tongue protrusion to move a barrier placed children with cerebral palsy can be generalized to chil-
in front of the lips). The analogy to increasing strength dren with tumors of the fourth ventricle or with a TBI
in an arm or leg by exerting high limb force and press- is unknown; the appropriate studies are not available.
ing against resistances is direct. Nonspeech exercise is Clark (2003) provides a review of treatment in neuro-
also extended to muscles of the respiratory system that logical speech disorders.
support speech breathing (Chapter 10). These muscles When a child has severe dysarthria, and is either
can be strengthened by a program of forceful exhala- completely unintelligible or barely intelligible, therapy
When a child has severe dysarthria, and is either completely unintelligible or barely intelligible, therapy approaches exist to provide the child with communication options. Augmentative and alternative communication (AAC) technology can supplement (augmentative) or substitute (alternative) for the severely impaired speech skills. AAC can be low-tech, such as an alphabet board that allows users to spell words or a picture board to convey simple ideas, or high-tech, such as speech synthesizers, controlled by hand, a light pointer, or eye movements, like the one used by the famous physicist Stephen Hawking. AAC options are adapted to each user's needs and capabilities, under the guidance of a specialist, to make communication available to those children who cannot speak.

Chapter Summary

Motor speech disorders in children are discussed separately from motor speech disorders in adults because the effects of neurological disease on a child's speech are in the context of a developing brain, while in adults the effects are in the context of a previously developed and mature brain.

Cerebral palsy is the most common motor disability in childhood, and it has been estimated that dysarthria is present in 20% to 50% of the diagnosed cases.

The major types of cerebral palsy are spastic, dyskinetic, and ataxic, with the majority of children with cerebral palsy having the spastic type.

The spastic type of cerebral palsy results from damage to the corticobulbar and corticospinal tracts, the dyskinetic type from damage to the basal ganglia, and the ataxic type from damage to the cerebellum and the tracts connecting it to other parts of the central nervous system.

The types of dysarthria in children with cerebral palsy are often matched to those in adults with dysarthria: spastic dysarthria in the spastic type, hyperkinetic dysarthria in the dyskinetic type, and ataxic dysarthria in the ataxic type.

The perceptual analysis used to diagnose the dysarthria type may not match the diagnosed type of cerebral palsy (e.g., spastic dysarthria may be diagnosed in a child diagnosed with the dyskinetic type of cerebral palsy).

Spastic dysarthria in cerebral palsy is thought to result from stiff muscles, which are the result of excessive tone in speech structures such as the larynx and tongue.

The speech characteristics in spastic dysarthria include sound errors (consonants and vowels) and problems with prosody; these characteristics result in speech intelligibility deficits.

Hyperkinetic dysarthria in cerebral palsy is thought to result from uncontrolled movements of the speech structures and muscle tone that is constantly changing.

The speech characteristics in dyskinetic dysarthria are similar to those in spastic dysarthria, including problems with speech intelligibility.

Ataxic dysarthria in cerebral palsy is thought to result from coordination problems among the articulators, larynx, and respiratory system.

The speech characteristics in ataxic dysarthria are not well understood because of the small occurrence of the ataxic type of cerebral palsy and a lack of relevant studies.

Traumatic brain injury (TBI) associated with a closed head injury results in diffuse damage to brain tissue, as well as focal (one part of the brain) injuries.

Children with TBI often have cognitive, social, and speech and language problems, many of which improve over time; approximately 15% of children with TBI have dysarthria.

The dysarthria in TBI is likely to be a mix of the Mayo categories and to include slow speaking rate, hypernasality, abnormal voice quality and prosodic deficits, and speech sound errors.

Posterior fossa tumors are located in and around the fourth ventricle and may result in muteness immediately after surgery, followed by a period of recovery of speech skills; many children regain nearly normal speech following surgery, but some children have dysarthria as a long-term communication disorder.

In preliminary studies, speech therapy has been shown to be effective in children with dysarthria, especially when the focus of the therapy is improvement of speech intelligibility by working on basic strength and coordination skills as the foundation for more complex speech motor tasks.

References

Ansel, B. M., & Kent, R. D. (1992). Acoustic-phonetic contrasts and intelligibility in the dysarthria associated with mixed cerebral palsy. Journal of Speech and Hearing Research, 35, 296–308.

Bugler, K. E., Gaston, M. S., & Robb, J. E. (2019). Distribution and motor ability of children with cerebral palsy in Scotland: A registry analysis. Scottish Medical Journal, 64, 16–21.

Byrne, M. (1959). Speech and language development of athetoid and spastic children. Journal of Speech and Hearing Disorders, 24, 231–240.

Campbell, T. F., & Dollaghan, C. A. (1995). Speaking rate, articulatory speed, and linguistic processing in children and adolescents with severe traumatic brain injury. Journal of Speech and Hearing Research, 38, 864–875.
Clark, H. M. (2003). Neuromuscular treatments for speech and swallowing: A tutorial. American Journal of Speech-Language Pathology, 12, 400–415.

De Smet, H. J., Catsman-Berrevoets, C., Aarsen, F., Verhoeven, J., Mariën, J., & Paquier, P. F. (2012). Auditory-perceptual speech analysis in children with cerebellar tumours: A long-term follow-up study. European Journal of Pediatric Neurology, 16, 434–442.

Foo, Y., Guppy, M., & Johnston, L. M. (2013). Intelligence assessments for children with cerebral palsy: A systematic review. Developmental Medicine and Child Neurology, 55, 911–918.

Haarbauer-Krupa, J., Lee, A. H., Bitsko, R. H., Zhang, X., & Kresnow-Sedacca, M. J. (2018). Prevalence of parent-reported traumatic brain injury in children and associated health conditions. JAMA Pediatrics, 172, 1078–1086.

Hodge, M. M., & Wellman, L. (1999). Management of children with dysarthria. In A. J. Caruso & E. A. Strand (Eds.), Clinical management of motor speech disorders in children (pp. 209–280). New York, NY: Thieme.

Lee, J., Hustad, K. C., & Weismer, G. (2014). Predicting speech intelligibility with a multiple speech subsystems approach in children with cerebral palsy. Journal of Speech, Language, and Hearing Research, 57, 1666–1678.

Levy, S. E., Chang, Y. M., Ancelle, J. A., & McAuliffe, M. J. (2017). Acoustic and perceptual consequences of speech cues for children with dysarthria. Journal of Speech, Language, and Hearing Research, 60, 1766–1779.

Liégeois, F., Tournier, J.-D., Pigdon, L., Connelly, A., & Morgan, A. T. (2013). Corticobulbar tract changes as predictors of dysarthria in childhood brain injury. Neurology, 80, 926–932.

Morgan, A. T., Liégeois, F., Liederkerke, C., Vogel, A. P., Hayward, R., Harkness, W., . . . Vargha-Khadem, F. (2011). Role of cerebellum in fine speech control in childhood: Persistent dysarthria after surgical treatment for posterior fossa tumour. Brain and Language, 117, 69–76.

Morgan, A. T., Mageandran, S.-D., & Mei, C. (2009). Incidence and clinical presentation of dysarthria and dysphagia in the acute setting following pediatric traumatic brain injury. Child: Care, Health, and Development, 36, 44–53.

Morris, C. (2007). Definition and classification of cerebral palsy: A historical perspective. Developmental Medicine and Child Neurology, 49, 3–7.

Musselman, K. E., Stoyanov, C. T., Marasigan, R., Jenkins, M. E., Konczak, J., Morton, S. M., & Bastian, A. J. (2014). Prevalence of ataxia in children. Neurology, 82, 80–89.

Nordberg, A., Miniscalco, C., & Lohmander, A. (2014). Consonant production and overall speech characteristics in school-aged children with cerebral palsy and speech impairment. International Journal of Speech-Language Pathology, 16, 386–395.

Nordberg, A., Miniscalco, C., Lohmander, A., & Himmelmann, K. (2013). Speech problems affect more than one in two children with cerebral palsy: Swedish population-based study. Acta Pædiatrica, 102, 161–166.

Pennington, L., Miller, N., Robson, S., & Steen, N. (2010). Intensive speech and language therapy for older children with cerebral palsy: A systems approach. Developmental Medicine and Child Neurology, 52, 337–344.

Pennington, L., Roelant, E., Thompson, V., Robson, S., Steen, N., & Miller, N. (2013). Intensive dysarthria therapy for younger children with cerebral palsy. Developmental Medicine and Child Neurology, 55, 464–471.

Platt, L. G., Andrews, G., Young, M., & Quinn, P. T. (1980). Dysarthria of adult cerebral palsy: I. Intelligibility and articulatory impairment. Journal of Speech and Hearing Research, 23, 28–40.

Reid, S. M., Meehan, E. M., Arnup, S. J., & Reddihough, D. S. (2018). Intellectual disability in cerebral palsy: A population-based retrospective study. Developmental Medicine and Child Neurology, 60, 687–694.

Schölderle, T., Staiger, A., Lampe, R., Strecker, K., & Ziegler, W. (2016). Dysarthria in adults with cerebral palsy: Clinical presentation and impacts on communication. Journal of Speech, Language, and Hearing Research, 59, 216–229.

Solomon, N. P., & Charron, S. (1998). Speech breathing in able-bodied children and children with cerebral palsy: A review of the literature and implications for clinical intervention. American Journal of Speech-Language Pathology, 7, 61–78.

Stavsky, M., Mor, O., Mastrolia, S. A., Greenbaum, S., Than, N. G., & Erez, O. (2017). Cerebral palsy — Trends in epidemiology and recent development in prenatal mechanisms of disease, treatment, and prevention. Frontiers in Pediatrics, 5, 21. doi:10.3389/fped.2017.00021

Tamburrini, G., Frassanito, P., Chieffo, D., Massimi, L., Caldarelli, M., & Di Rocco, C. (2015). Cerebellar mutism. Child's Nervous System, 31, 1841–1851.

Weir, F. W., Hatch, J. L., McRacken, T. R., Wallace, S. A., & Meyer, T. A. (2018). Hearing loss in pediatric patients with cerebral palsy. Otology and Neurotology, 39, 59–64.

Wimalasundera, N., & Stevenson, V. L. (2016). Cerebral palsy. Practical Neurology, 16, 184–194.

Workinger, M. S. (2005). Cerebral palsy resource guide for speech-language pathologists. Clifton Park, NY: Thomson Delmar Learning.
17
Fluency Disorders

Introduction

A diagnosis of stuttering is made when a child or adult cannot maintain the normal flow of speech even when there is no firm evidence of a disease process or structural problem affecting the peripheral speech mechanism (respiratory system, larynx, upper articulators) or the brain. Typically, the diagnosis is made by listening to the child (or adult) and counting dysfluencies.

Historically, stuttering has been mysterious and intriguing because it is a speech problem of unknown cause. In keeping with contemporary scientific and clinical literature on fluency disorders, we use the term "people who stutter" (PWS), among whom are children who stutter (CWS) and adults who stutter (AWS).

Stuttering behavior is variable. PWS may have severe episodes of stuttering in some situations but not others. PWS may enjoy extended periods of apparent fluency, preceded and followed by long periods of severe stuttering behavior. Within a matter of minutes, certain types of speech may be produced with severe stuttering, whereas other types (e.g., singing, perhaps talking to a pet) may be completely fluent. Stuttering is therefore not a fixed set of symptoms, but rather a variable set of behaviors. For children who are diagnosed with stuttering around age 3 years and who continue to stutter past the age of 5 or 6 years and into the teenage years and adulthood, nonspeech behaviors may be associated with stuttering (Table 17–1, discussed later).

Incidence and Prevalence of Stuttering

A great deal of research has been devoted to the incidence and prevalence of "childhood-onset fluency disorder" (Yairi & Ambrose, 2013). Childhood-onset fluency disorder (hereafter, developmental stuttering), classified in the Diagnostic and Statistical Manual of Mental Disorders (American Psychiatric Association [APA], 2013) as a neurodevelopmental disorder, begins in very early childhood, perhaps as early as 2 years of age. "Incidence" is the percentage of new diagnoses of a disorder within the population, over a given time period (such as a year). "Prevalence" is an estimate of the percentage of the population who stutter at a particular point in time. The contrast between incidence and prevalence of all cases of stuttering (including CWS and AWS) is particularly important to understanding the developmental nature of the disorder.

The focus in this chapter is on developmental stuttering because the great majority of cases are diagnosed well before the age of 6 years. One estimate of the average onset age of stuttering is 33 months.

A conservative estimate of the incidence of stuttering is 5% to 8% of the population (Yairi & Ambrose, 2013). As previously noted, this estimate includes all children and adults newly diagnosed within a fixed period of time.


Table 17–1.  A Brief, Natural History of Stuttering

Typical dysfluencies (age range 1.5–6 years)
  Core behaviors: low number of dysfluencies; part-word or single-syllable repetitions; possible sound prolongations and "tense" pauses; revisions more frequent over time
  Secondary behaviors: none

Borderline stuttering (age range 1.5–6 years)
  Core behaviors: high number of dysfluencies; multiple syllable repetitions
  Secondary behaviors: rare

Beginning stuttering (age range 2–8 years)
  Core behaviors: rapid, multiple repetitions with tension; blocks; difficulty initiating words
  Secondary behaviors: escape behaviors (e.g., eyeblinks, "um's"); frustration (awareness)

Intermediate stuttering (age range 6–13 years)
  Core behaviors: blocks; repetitions, prolongations
  Secondary behaviors: escape behaviors to terminate blocks; avoidance behaviors (sounds, words, situations)

Advanced stuttering (age range >14 years)
  Core behaviors: long, tense blocks; repetitions, prolongations; lip, jaw tremors
  Secondary behaviors: sophisticated escape and avoidance behaviors

Note. It has been argued that a group of children, perhaps about one-third of all children who are diagnosed with childhood stuttering, do not begin stuttering behavior with easy repetitions, as summarized in text and this table. Rather, these children show initial symptoms that look more like intermediate or advanced stuttering, with tense blocks and struggle behavior. This sudden onset of more severe symptoms, rather than the gradual onset and progression summarized in this table, does not seem to predict which children will have persistent stuttering and which children will have natural recovery. See Guitar (2005) and Yairi and Ambrose (2005) for more details. Adapted from Stuttering: An Integrated Approach to Its Nature and Treatment, by T. J. Peters and B. Guitar, 1991, Baltimore, MD: Williams & Wilkins; Stuttering: An Integrated Approach to Its Nature and Treatment (3rd ed.), by B. Guitar, 2005, Baltimore, MD: Lippincott Williams & Wilkins.

In contrast, the prevalence of stuttering — all CWS and AWS — is 1% of the population, or even a bit lower (Yairi & Ambrose, 2013).

The lower prevalence, compared with the incidence, reveals a critical characteristic of stuttering: roughly 80% of children who are diagnosed with stuttering as toddlers recover without therapy. This "natural recovery," which currently is not well understood, results in a higher incidence than prevalence. Many children included in the incidence estimate do not contribute to the prevalence estimate — they become fluent and are not included in the estimate of "percentage of people who currently have the disorder." Children who do not recover fluency in childhood are said to have persistent stuttering.

Diagnoses of stuttering in toddlerhood are made about equally for boys and girls. Girls, however, are much more likely than boys to recover naturally. Roughly 80% of children who recover fluency after the early diagnosis are girls. Because of the difference in recovery rates between girls and boys, the ratio of males to females who stutter in teenage years and adulthood is about 4:1.

There is little evidence that the incidence and prevalence of stuttering vary in a significant way across countries, cultures, races, and socioeconomic groups. Thus, the population of any of these groups (countries, cultures, and so forth) can be multiplied by 1% to arrive at the number of people who currently stutter in that group (see http://www.nsastutter.org/ and https://www.stutteringhelp.org/).
Genetic Studies

Research since the 1960s and extending to the present time leaves little doubt that stuttering or a predisposition to stuttering is transmitted from parent to child (Etchell, Civier, Ballard, & Sowman, 2018; Smith & Weber, 2017). As reviewed by Peters and Guitar (1991), "family tree" studies from the 1960s and 1970s showed that PWS were far more likely to have a first-degree relative who stuttered, when compared with a control group of fluent speakers. Twin studies (see Andrews, Morris-Yates, Howie, & Martin, 1991; Felsenfeld et al., 2000) added to the evidence for a genetic component in stuttering. The results of these studies were striking. If one member of an identical twin pair stuttered, the likelihood of stuttering in the other member of the pair was about 70%. In contrast, if one member of a fraternal (nonidentical) twin pair stuttered, there was only a 25% to 30% likelihood of stuttering in the other member of the pair. Identical twins are genetically the same; fraternal twins are genetically no more similar than any siblings from the same family. The much higher likelihood of stuttering in both members of identical twins, as compared to fraternal twins, provides strong evidence of a genetic component in stuttering. Frigerio-Domingues and Drayna (2017) provide an up-to-date review of twin studies and stuttering.

The specific nature of the genetic component in developmental stuttering remains unknown. Candidate genes for stuttering have been suggested but for the time being are best regarded as hypotheses rather than settled fact (Frigerio-Domingues & Drayna, 2017; Yairi & Ambrose, 2013). Whatever genetic component exists in developmental stuttering is probably more a predisposition for stuttering, rather than a guarantee of stuttering (see Chapters 7 and 15 for similar observations on a genetic component in pediatric language disorders and speech sound disorders, respectively). This is consistent with the high but not perfect occurrence of stuttering in identical twins. As noted by Smith and Weber (2017), the "expression" of genes is shaped by the environment. If there is a stuttering gene (or genes), it does not guarantee stuttering in a child who has received the gene(s) from a parent.

Diagnosis of Developmental Stuttering

Dysfluencies are a normal part of speech and language learning in toddlers. How, then, is stuttering diagnosed in a young child? How are normal dysfluencies distinguished from dysfluencies that suggest a diagnosis of developmental stuttering?

Ambrose and Yairi (1999) used the term "stuttering-like dysfluencies" (SLDs) to identify instances of dysfluency that are typically not seen in children who develop fluent speech. More specifically, children perceived to be stuttering, by parents or other caretakers, made multiple repetitions of single syllables (in multisyllabic words) or of single-syllable words. "Multiple repetitions" means at least three to five syllable repetitions (e.g., "buh-buh-buh-buh . . . "). These repetitions were produced easily, with no apparent struggle, and rapidly. Children developing fluent speech often produced only a single repetition of a single syllable or a single-syllable word.

Ambrose and Yairi (1999) computed the number of SLDs per 100 syllables spoken by fluent children and by children with suspected stuttering. Children with typically developing fluency produced, on average, less than one SLD per 100 syllables spoken. Children with suspected stuttering onset produced at least three such dysfluencies per 100 syllables. Although there were other types of SLDs, multiple syllable repetitions were the dominant sign of the onset of developmental stuttering. Types of SLDs are listed in Table 17–2.

The Natural History of Developmental Stuttering

Developmental stuttering can have a sudden onset in toddlers, with symptoms that develop slowly or rapidly. Because the development of stuttering occurs throughout the same time period as speech-sound and language development, a potential interaction between fluency and speech and language skills has been discussed in the literature (see Smith & Weber, 2017, for a summary). These interactions are discussed later in this chapter.

Table 17–2.  Types of Stuttering-Like Dysfluencies (SLDs) Observed in Children Who Stutter (CWS)

Sound, syllable, and/or single-syllable repetition: "b-b-b-but . . . "; "buh-buh-buh . . . " (for "but")
Whole-word repetition: "but, but, but . . . "
Sound prolongations: "Ffffffffffffine"
Blocks: inaudible stoppage of speech with mouth open or closed; inability to initiate sounds

Note. Adapted from https://www.asha.org/practice-portal/clinical-topics/childhood-fluency-disorders/
Early symptoms of developmental stuttering, such as the easy repetitions described earlier, to advanced stuttering behavior can be described as a sequence of behaviors. We call this the natural history of persistent stuttering. The abbreviated description given here owes much to information published by Peters and Guitar (1991), Guitar (2005), Yairi and Ambrose (1999), and Yairi and Seery (2015).

Table 17–1 provides a summary of this history. As in all cases of normal and disordered speech and language development, child-to-child variability is to be expected. The age ranges and characteristic symptoms listed in Table 17–1 are general guidelines to the development of stuttering, rather than fixed milestones.

Table 17–1 contains columns labeled Core and Secondary Behaviors. The distinction is important in the natural history of developmental stuttering. Core behaviors are actual speech behaviors produced by a person who stutters. These include syllable and word repetitions, "blocks" (stoppages) in the stream of speech, and prolongations of speech sounds.

Secondary behaviors are learned behaviors in PWS. These are reactions to the difficulty of producing a fluent stream of speech. Secondary behaviors include almost anything with the purpose of avoiding or escaping an ongoing or anticipated SLD (Peters & Guitar, 1991; Guitar, 2005). Secondary behaviors include "um"s, hesitations, and body part movements such as blinking or turning the head (as well as others).

Stage I:  Typical Dysfluencies

As noted earlier, toddlers have dysfluencies that are typically single repetitions of syllables or words. Examples of these typical dysfluencies are part-word ("mi-milk") or whole-word ("milk-milk") repetitions and single repetitions of syllables in multisyllabic words (as in "do-doggie eat"). Sound prolongation ("sssssssee") is another type of typical dysfluency. Many scholars interested in the development of stuttering have suggested that these early dysfluencies make sense because children are learning complex aspects of language such as syntax, vocabulary, and grammatical morphology. In addition, children are trying to output this complicated new stuff of language through a system — the speech mechanism — that is undergoing growth and maturation of motor control. Some hesitations, repetitions, and movement prolongations should be expected, as with the initial learning of any complicated skill.

The distinction between typical dysfluencies and SLDs is important — children who are diagnosed with developmental stuttering are best served when they receive clinical services at the earliest possible age (Conture, 2001).

Table 17–1 contrasts "typical" dysfluencies like the "mi-milk" or "do-doggie" examples given, with SLDs such as "Mi-mi-mi-milk" or "Do-do-do-doggie."

Even when a young (preschool) child produces multiple repetitions and is considered for a diagnosis of developmental stuttering, the repetitions are likely to sound relaxed and without struggle. As the fluency problem develops, frequent signs of struggle with the flow of speech are important in the diagnosis of stuttering as a potentially persistent problem.

Research suggests that typical dysfluencies are fairly rare when actually counted. In Table 17–1, the entry "low number of dysfluencies" for Stage I indicates the relative rareness of dysfluent events in typical speech-language development.¹

Note in Table 17–1 the fairly wide age range for typical dysfluencies during speech and language development. Peters and Guitar (1991) summarize research suggesting that, as typically developing children master speech production skills, part-word and whole-word repetitions become less frequent, and revisions of the repetitions more frequent. Children age 2 years and 5½ years may both have "normal" dysfluencies, but the younger child repeats more, whereas the older child has few repetitions but more revisions (e.g., "I was going . . . when I left I went . . . and before I left . . . ").

Stage II:  Borderline Stuttering

Borderline stuttering covers the same age range as the period of "typical" dysfluencies (Table 17–1). Borderline stuttering is distinguished from typical dysfluency in the following two ways: (a) dysfluency becomes more frequent in children with borderline stuttering, compared with typically developing children, and (b) part- or whole-word repetitions include three or more consecutive repetitions in borderline stuttering.

¹ The actual number of dysfluencies per 100 words that distinguishes typically dysfluent children from those who are diagnosed later with stuttering varies somewhat in different studies. Peters and Guitar (1991) summarized research to the date of publication of their text and suggested that typically developing children had fewer than 10 dysfluencies per 100 words spoken, whereas children who stutter (or who would eventually be diagnosed with stuttering) had 16 to 20 dysfluencies per 100 words. Yairi and Ambrose (1991) used a much more stringent criterion, separating beginning stutterers from those with "typical" dysfluencies with a criterion of 3 dysfluencies per 100 words spoken — more than 3/100, and the child was considered to be atypically dysfluent.
For the most part, the child who meets criteria for borderline stuttering shows no secondary behaviors. Occasionally, the child with borderline stuttering may begin to show some frustration with the repetitions.

As noted by Guitar (2005), there is substantial gray area between the stages of typical dysfluency and borderline stuttering. The behaviors outlined here for the two stages may blend together and even shift over time. Sometimes the child may seem to have typical dysfluencies; other times, SLDs suggest borderline stuttering. The child who fits criteria for borderline stuttering may, in fact, become fluent in the near future; a smaller number of children in this category continue to show stuttering behavior.

Stage III:  Beginning Stuttering

Beginning stuttering, which may be diagnosed across a wide age range of 2 to 8 years but rarely after about 6 years of age, differs from borderline stuttering in several ways. First, the multiple, part-, or whole-word repetitions are produced very rapidly. These repetitions sound less controlled than the easy, relaxed repetitions heard in borderline stuttering. The repetitions of beginning stuttering also appear to be produced with tension, indicating signs of struggle with the flow of speech.

Another sign of tension in beginning stuttering is the presence of blocks — the complete stoppage of speech. The child seems to get "stuck" on a speech sound. Blocks may occur with the child's mouth open, or closed, but it is clear that he or she is trying to produce speech but cannot release the air required to maintain the flow of syllables. Blocks may last for a second or two, or much longer.²

Beginning stuttering is often accompanied by the onset of secondary behaviors. Now the child who stutters incorporates escape behaviors into his or her attempt to speak. These behaviors are meant to assist the child in reestablishing the flow of speech when it is interrupted by a series of repetitions or blocks or sound prolongations (Peters & Guitar, 1991). Eye blinks, head nods, or even slapping a hand against the body, are used to "release" a stuttering episode, to trigger the resumption of the smooth flow of speech. The child who meets the criteria for the beginning stage of stuttering shows awareness of his or her fluency problems and may feel a good deal of frustration with the inability to produce fluent speech.

Stage IV:  Intermediate Stuttering

Intermediate stuttering is an elaboration of behaviors seen in beginning stuttering. Blocks, repetitions, and sound prolongations are heard in the child's speech with increasing frequency and severity. In the intermediate stage of stuttering, the child uses escape behaviors to release blocks and other dysfluencies, and begins to use and refine avoidance behaviors. Avoidance behaviors may be associated with specific speech sounds and/or words, communication partners, and communication situations. For example, the child in the intermediate stage of developmental stuttering may have sufficient experience with dysfluency to know that the "m" sound is particularly difficult and may begin to avoid words beginning with this sound. Or, the child may have a history of dysfluency with a person's name, and avoid saying it.

Avoidance of specific sounds and words reflects anticipation of upcoming dysfluencies, and often leads to a search for a different way to communicate the same thought. Such behavior may involve revisions, hesitations, and insertions of "um's" to give the child time to find an "easier" way to produce the message. Children in the intermediate stage of stuttering begin to connect their stuttering with negative consequences, and, like all of us, look for ways to avoid the negative feelings.

Stage V:  Advanced Stuttering

The age range of 14 years and older for advanced stuttering (see Table 17–1) includes individuals who have been dysfluent for many years and have substantial experience with the entire spectrum of stuttering behaviors. The core behaviors include long, tense blocks, part- and whole-word repetitions, and visible lip and jaw tremors (Smith & Weber, 2017). A distinguishing characteristic of advanced stuttering is the sophistication of learned and practiced escape and avoidance behaviors. For example, one of the authors of the Peters and Guitar (1991) textbook recounts a behavior he used, as a young man, to avoid placing food orders in restaurants.

² When I was an undergraduate student, I had a clinical assignment in a summer residential program for AWS. The day I was introduced to one of the young men assigned to work with me, the client began to introduce himself and when he came to his name, he had an open-mouth block that lasted well over 30 seconds.
He knew that the pressure to speak his order quickly would cause him to be dysfluent. He waited until he saw the server approaching the table, let his friends know what he wanted to order, and then excused himself to go to the rest room. Personal accounts from adults who stutter reveal many of these types of avoidance behaviors (https://www.mnsu.edu/comdis/kuster/PWSspeak/PWSspeak.html).

Recovery of Fluency

An important component of the natural history of stuttering is the trajectory of the stages outlined in Table 17–1. For example, when a child appears to be in Stage III of this natural history, with multiple repetitions, blocks, and evidence of frustration in attempts to deal with dysfluency, does he necessarily move through the final two stages and into advanced stuttering?

This question has occupied a good number of scientists whose findings agree in a general sense, but perhaps not specifically. Yairi and Ambrose (1999) concluded that many children diagnosed with stuttering around the age of 3 years stop stuttering within a year or two (or even sooner) and become fluent speakers. This recovery from the diagnosis of stuttering in young children seems to be a natural process, one that does not require (although may be helped by) clinical treatment.

Recall that about four of five children who are diagnosed with stuttering recover, and become fluent speakers (Yairi & Ambrose, 1999). The question concerning the natural, or spontaneous, recovery of stuttering has focused on the factors that may explain why children recover fluency after an initial diagnosis of developmental stuttering. One factor is clearly the sex of the child: females are much more likely to recover fluency compared with males (see section "Incidence and Prevalence of Stuttering").

Recovery of fluency may also be predicted by family history (Walsh et al., 2018; Yairi, 2007). Children with a documented family history of persistent stuttering are less likely to recover fluency than children whose family history of persistent stuttering is less certain. These facts suggest that persistent stuttering has a genetic component (see sections "Genetic Studies" and "Possible Causes of Stuttering").

Recovery of fluency may also be linked to language skills and temperament at the time of diagnosis, but the research findings on these factors are much less convincing than the sex and family history factors. Other factors, such as age at onset of stuttering symptoms, type of dysfluency (e.g., repetitions versus tense blocks), the child's phonological development (development of the sound system), and a child's speech motor skills may also be important factors in persistent stuttering versus recovery from stuttering. Factors that appear to contribute to recovery of fluency versus persistence of stuttering are summarized in Figure 17–1 (see Ambrose & Yairi, 1999; Conture, 1999; Nippold, 2001, 2002; Walsh et al., 2018; Watkins, Yairi, & Ambrose, 1999; and Yairi & Ambrose, 2005, for additional information on the controversy surrounding factors that account for persistence versus recovery from stuttering).

Possible Causes of Stuttering

Three theories of stuttering are considered here: psychogenic, learning, and biological theories. The theories are labeled this way for convenience of presentation; they are not necessarily mutually exclusive in their proposed explanations. The first two theories are presented only briefly. These theories provide historical perspective but among contemporary researchers and clinicians are not widely accepted.

Psychogenic Theories

Psychogenic theories explain stuttering as a neurosis. The neurotic basis of stuttering may include hostility, repression of unwanted feelings, phobias, as well as other psychological constructs often associated with Freudian psychopathology (see summaries of psychogenic theories of stuttering in Owens, Metz, & Haas, 2003, and Ramig & Shames, 2006). In contemporary thinking about stuttering, a psychogenic basis for the disorder is largely discounted because evidence of common personality traits among PWS is not compelling, and psychotherapy does not seem to lessen stuttering symptoms. PWS seem to be just like everyone else, except that they stutter.

This conclusion may need to be qualified by considering the differences between underlying causes of a disorder, and psychological outcomes of having a disorder. Tran, Blumgart, and Craig (2011), in a review of the literature on psychological issues among PWS, failed to find a persuasive case for psychological problems as the cause of stuttering. Results of their own study, however, in which AWS filled out a questionnaire concerning their levels of anxiety, mood, and other psychological states, suggested differences from a control group. PWS were more likely (by self-report) than control participants to have anxiety and negative mood states. Tran et al. regarded these findings as indicators of the potential effect of stuttering on psychological well-being, rather than the cause of stuttering.
[Figure 17–1.  Variables likely to be associated with persistent versus recovered stuttering in children. The figure is a flowchart in which "Beginning stuttering" branches into two outcomes: "Persistent (~20%)" and "Recovered (~80%)." Persistence is associated with male sex and with family members who have a history of stuttering; recovery is associated with female sex and a lack of family history. Age at onset, type of SLDs, temperament, and developmental speech sound and/or language disorders (versus typical speech and language development) are listed, with question marks, as possible but less certain factors for each outcome.]

Learning Theories

Learning theories of stuttering propose that some children learn to be dysfluent during a period of typical dysfluencies. A child develops an association between normal dysfluencies and fear of speaking; when the dysfluencies are "released," an immediate reduction of fear reinforces dysfluency, making it more likely to occur in the future. Over time, this pattern of dysfluency, fear, release of the dysfluency with its reduction in anxiety and fear, leads to chronic stuttering as a learned habit.

Learning theories of stuttering are controversial. There is little doubt that as stuttering develops, emotions such as fear, anxiety, and shame increase as the child finds it more difficult to produce fluent speech and be an effective communicator. Moreover, it makes sense that chronic, speaking-related emotions complicate and possibly undermine the child's attempts to produce fluent speech.

As a child who stutters matures through adolescence and young adulthood, and has lengthy experience with advanced stuttering, a deeply rooted association of frustration, anxiety, fear, and shame with the act of speaking may be established. It is easy to see how this association, together with the reduction or elimination of the negative emotions when fluency follows repetitions or blocks, is viewed as a likely setting for stuttering as learned behavior. To add to this logic, reviews of the efficacy (effectiveness) of speech therapy in CWS and AWS indicate that therapies using rather simple principles of learning can reduce stuttering to a significant degree (Bothe, Davidow, Bramlett, & Ingham, 2006). If we follow the logic used earlier to discount neurotic theories of stuttering, the relative success of many of these "learning" therapies could be taken as "proof" for stuttering as learned behavior.

Biological Theories

A biological theory of stuttering implies a physical or physiological basis for the disorder (or both). Biological explanations for stuttering focus on brain differences between PWS and fluent speakers, because there is no evidence that articulators such as the tongue, lips, and jaw are different in PWS than in fluent speakers. In most cases, these hypothesized brain differences are thought to be present at birth.

Over the past several decades, three types of evidence have been used to support a brain basis for childhood-onset fluency disorder.
Differences in brain anatomy and brain physiology between CWS and children who are fluent constitute two of the three types of evidence. The third type of evidence, which may be the basis for the first two, suggests that stuttering is a genetic disorder. The three types of evidence are not independent. A genetic basis for stuttering may include a difference in the development of speech motor control mechanisms in the central nervous system.

Anatomical Differences

Sophisticated brain imaging techniques are available to address the question of anatomical differences between PWS and fluent speakers. Studies have examined the relative size of gray matter in brain locations known to be part of the speech production network, as well as the connections (white matter) between these locations. A related question has been the possibility of different patterns of anatomical asymmetry in the cerebral hemispheres of PWS compared with fluent speakers. Summaries of this work are found in Etchell, Civier, Ballard, and Sowman (2018) and Ingham, Ingham, Euler, and Neumann (2018).

Recall from Chapter 2 the size differences between structures in the left and right hemispheres: in a fluent speaker, left-side structures that are part of the speech and language network are often larger than corresponding right-side structures, apparently consistent with the left-side dominance for speech and language structure.

The findings of several relevant studies of AWS are not always consistent, but Etchell et al. (2018) conclude that compared to fluent adults, two structures in the right hemisphere of AWS have more gray matter. These structures are the right-hemisphere areas in the frontal lobe corresponding more or less with Broca's area in the left hemisphere, and the planum temporale, an auditory area important for speech perception. In other words, the asymmetries observed in fluent speakers may be reduced or not observed in PWS.

At first glance, these claims for the brain structures in PWS contradict the typical lateralization of speech structures to the left hemisphere. As argued by Etchell et al. (2018), though, the conclusion makes sense if poorly functioning speech motor control structures in the left hemisphere, which presumably are one cause of stuttering, are compensated for by activity in the analogous structures of the right hemisphere. It is as if the right hemisphere structures take over part or all of the job of fluent speech production, and the assumption of greater speech activity requires more gray matter (see Foundas, Bollich, Corey, Hurley, & Heilman, 2001; and Foundas et al., 2003).

In another example, the size difference in many humans between the left- and right-hemisphere planum temporale — larger in the left hemisphere, consistent with left lateralization of speech and language — has been shown to be eliminated or even reversed in AWS, but not in CWS. This makes it unclear if the loss of asymmetry in AWS is due to the origin of stuttering (e.g., is part of "programmed brain growth" in PWS) or is the result of chronic stuttering into adulthood, which results in changes in brain anatomy (Chang, Erickson, Ambrose, Hasegawa-Johnson, & Ludlow, 2008).

White matter differences between PWS and fluent adults (or the comparison of CWS and fluent children) include differences in the corpus callosum (the fiber tract that connects the two hemispheres). Specifically, the corpus callosum has been reported to have greater volume in CWS compared with fluent children (Etchell et al., 2018). This may be consistent with increased gray matter on the right side of the brain of PWS. Compensation by the right side of the brain for dysfunctional left-side speech structures may require that more information be transferred to the right hemisphere during speech. To accommodate this increased flow of information, a larger interhemispheric (between hemispheres) fiber tract may be needed.

There is also evidence that white matter connections between brain structures for speech and language, such as between Wernicke's and Broca's areas or between cortical and subcortical areas for speech motor control, are less well developed in CWS compared with fluent children (Chang, Zhu, Choo, & Angstadt, 2015). Poor development of white matter tracts that are essential to the speech motor control brain network may play a role in childhood stuttering.

Physiological (Functional) Differences

Functional neuroimaging studies (using, for example, the functional magnetic resonance imaging [fMRI] technique described in Chapter 2) have revealed differences in brain activity patterns for speech production in PWS, compared with fluent people. Brown, Ingham, Ingham, Laird, and Fox (2005) examined brain regions for fluent speakers as they produced speech and observed a small "core" set of active regions. These regions included Wernicke's and Broca's areas, among others known to be involved in the motor control of structures such as the lips, tongue, and larynx. PWS showed activation during speech of the same "core" regions but had either stronger- or weaker-than-normal activity in those areas. In addition, regions of the brain associated with auditory processing, typically active in fluent speakers, showed little activity in PWS.
This last finding is provocative in light of the planum temporale asymmetries discussed previously. Both the anatomical and physiological findings therefore suggest a role of the brain's auditory processing centers in stuttering, and perhaps specifically for the perception of speech and language and its integration with speech motor control.

Speech Motor Control and Stuttering

The possibility of developmental stuttering as a consequence of immature or dysfunctional speech motor control has been gaining traction over the past few decades. Execution of speech movements, such as tongue speed and coordination of two or more articulators, is believed to be affected in toddlers who begin to stutter. As stated by Smith and Weber (2017), "Disfluencies arise when the motor commands to the muscles are disrupted, and normal patterns of muscle activity required for fluent speech are not generated" (p. 2487).

Deficits in the planning of speech sound sequences are also thought to be a component of the speech motor control deficit. As discussed previously for apraxia of speech in adults (Chapter 14) and in children (Chapter 15), the distinction between the planning and execution components of speech motor control is important. Execution is the direct control over movements of the speech mechanism (such as the tongue) by cells in the primary motor cortex. Planning is the preparation of a program for the execution of movements, which includes (at least) the selection of speech sounds, their ordering, and the commands for intended movements. A speech motor control plan can be assembled without executing the plan — it is like a mental representation of what is intended. In contrast, execution of the plan is the result of the commands from the primary motor cortex that have direct control over the muscles. Smith and Weber (2017) say, "It has been hypothesized that the underlying speech motor deficit in adults with persistent stuttering is a failure to form stable underlying motor programs for speech" (p. 2487). Smith and Weber support the idea of immature speech motor programs by citing their own work on articulatory stability (variability of articulator movements over multiple repetitions of a short phrase). Fluent adults did not improve their stability over many trials of the repetitions because their speech was already planned at a mature level. Conversely, AWS and fluent children showed improvement over trials, presumably in the stability of the program. When the planned speech sound sequence is stable, so are the executed movements. AWS improved because they had immature speech motor programs and therefore had room to improve. Fluent children improved because their speech motor control is still maturing; they too had room to improve.

A speech motor control perspective on developmental stuttering is compelling for several reasons. First, it explains why articulatory movements of PWS are different from the articulatory movements of a fluent speaker, even during fluent utterances (Zimmerman, 1980).

Second, it is consistent with different types of stuttering, most notably clonic versus tonic SLDs. "Clonic" is a term that describes rhythmic, repetitive movements of a body part. Multiple, consecutive sound or syllable repetitions can be considered a type of speech clonus (noun form of "clonic"). Similarly, "tonic" is a term that describes a muscle contraction that is sustained for a relatively long period of time. The long "blocks" seen in advanced stuttering can be regarded as a form of speech tonus.

The terms "clonus" and "tonus" are, in fact, used to describe signs of certain neurological diseases, but a caution is in order: although the terms have been used to describe the SLDs of multiple repetitions and blocks, respectively, they are not used to link stuttering with specific neurological diseases (Schwartz & Conture, 1988). However, these two SLD types fit with unintended and uncontrolled aspects of speech production that seem explained better by a speech motor control deficit than by (for example) a learning theory.

A third piece of evidence concerns the hypothesized speech motor planning component of a speech motor control deficit. Stuttering is not a linguistic, equal-opportunity speech disorder. As reviewed by Anderson, Pellowski, and Conture (2005), stuttering is most likely to occur on low-frequency words, one of the first three words of an utterance, function words in young CWS versus content words in older CWS and AWS, and longer, grammatically complex utterances (for grammatical complexity, see Melnick & Conture, 2000).

An argument can be made that each of these four linguistic conditions requires more speech motor planning skill than its opposite. For example, lipstick and rabbit are low- and high-frequency words, respectively, likely to be known by a 4-year-old. Frequently used words such as rabbit are produced so many times by children, and therefore planned so many times, that the plan becomes more or less automatic. In contrast, a low-frequency word like lipstick that is said fewer times may require more active planning, and thus be subject to programming demands that challenge speech motor control maturity in CWS. Similarly, longer and more grammatically complex utterances such as Where does he go when he is hungry? are likely to require more sophisticated planning skills than the shorter and less complex Where does he go?
Finally, an explanation of developmental stuttering as a speech motor control deficit may seem incompatible with the possibility that CWS and AWS are more likely to stutter when they are anxious (e.g., Davis, Shisca, & Howell, 2007). How does a speech motor control perspective on stuttering accommodate the effect of anxiety on either speech motor planning and/or execution? Although this question has no current answers, two observations from the research literature are relevant to an increased understanding of why motor planning and execution might be expected to be sensitive to fluctuating states of anxiety.

The quality of any motor control task, including speech production, is likely to be sensitive to a person's anxiety level at the time the task is performed. For example, the fine motor control and precise coordination required for finger movements during skilled piano performance deteriorate to various degrees when the person playing the piano is anxious (Kotani & Furuya, 2018). The cognitive resources associated with planning and execution of finger movements may be compromised by anxiety. This suggestion has been made directly for the case of speech motor planning and execution by Hennessy, Dourado, and Beilby (2014), who showed in an experimental task that severity of stuttering in a verbal task was related to the current anxiety state in the person who stutters. Nonverbal responses (pressing a button to indicate an answer to the same question posed in a verbal-response task) did not show this relationship between anxiety and button-press errors. Hennessy et al. concluded that the relationship between variations in stuttering severity and variations in anxiety was specific to speech, and probably was due to anxiety competing for response resources with speech motor planning and execution.

Speech Motor Control and Developmental Stuttering:  A Summary

Smith and Weber (2017) suggest that the onset and progression of stuttering in children can be explained by the brain anatomy and physiology differences previously described, and (importantly) by the child's ability to find a compensation for these differences. Two young children who are diagnosed with stuttering at age 3 years, for example, may have equal levels of immaturity in brain structures and physiology for speech. One child may not recover fluency and have persistent stuttering into adulthood; the other may become fluent within months. This hypothetical comparison illustrates how genetic predispositions for stuttering do not guarantee persistent stuttering. Both children may have the same, genetically determined, immature brain structures and physiology for speech. The child who becomes fluent does so because an environmental influence "shapes" the genetic predisposition away from persistent stuttering and toward recovery of fluency (Smith & Weber, 2017).

Acquired (Neurogenic) Stuttering

Adults who have been fluent throughout their lives (with some exceptions) and who acquire brain damage from (for example) a stroke (Theys, van Wieringen, Sunaert, Thijs, & De Nil, 2011) or traumatic brain injury (Penttilä, Korpijaakko-Huuhka, & Kent, 2019) may have stuttering as a speech-language problem. It is called "neurogenic" stuttering to recognize its cause: a documented brain injury. Neurogenic stuttering is rare and is not well understood.

Two questions are often asked about neurogenic stuttering. First, are the symptoms of neurogenic stuttering like those seen in developmental stuttering? Second, which neurological diseases have neurogenic stuttering as a possible symptom, and when neurogenic stuttering occurs in one of these diseases, which parts of the brain are likely to be damaged? The summary that follows is based on the reviews cited previously, as well as articles cited in those reviews.

Symptoms of Neurogenic Stuttering Compared With Developmental Stuttering

A significant amount of attention has been devoted to the similarity (or dissimilarity) of stuttering symptoms in acquired (neurogenic) as compared to developmental stuttering. The reason for making this comparison is, in a broad sense, to determine if the two kinds of stuttering are the "same" thing. Think of the comparison as an experimental hypothesis: If symptoms in acquired and developmental stuttering are clearly different, the disorders may share a name (stuttering) but are different types of communication disorders. This has implications for both the clinical management of the disorders and their underlying theories. But if the symptoms in the two disorders are the same, stuttering may be viewed as the same phenomenon in both children and adults. This potential outcome, in which stuttering behavior is essentially the same in developmental and acquired versions of the disorder, may be regarded as consistent with a biological view of all stuttering behavior.

As noted earlier, stuttering episodes in developmental stuttering are not found equally at word beginnings and endings, or on content versus function words. Early case reports of acquired (neurogenic) stuttering suggested that patients may not follow this pattern, but rather stutter frequently at word endings and on function words.
tern, but rather stutter frequently at word endings and on function words. More recent data, however, suggest as much similarity as difference for the actual types and locations of dysfluencies observed for developmental and acquired stuttering (Theys, van Wieringen, & De Nil, 2008). Types and locations of dysfluencies do not seem to reveal a clear-cut distinction between developmental and acquired stuttering.
Another well-known feature of developmental stuttering, at least in the intermediate and advanced stages (see Table 17–1), is the presence of secondary characteristics that include struggle and release behaviors associated with repetitions, prolongations, and blocks. Some case reports (articles written to describe a single patient's behavior, common in the medical literature) indicated that secondary characteristics did not occur in acquired stuttering. Other cases show struggle, avoidance, and release behaviors similar to those seen in the later stages of developmental stuttering and in AWS. Theys et al. (2008) concluded that the presence versus the absence of secondary behaviors does not seem to provide a reliable distinction between acquired and developmental stuttering.
In summary, evidence collected so far has not shown a clear distinction between the core or secondary characteristics of acquired and developmental stuttering. A future analysis may reveal a clear distinction, but for the present it seems best to regard the symptoms of the two forms of the disorder as overlapping.

Treatment Considerations

Many different behavioral techniques exist for the treatment of stuttering, in both children and adults. In children past the age of 9 or 10 years, and in adults, several of these techniques seem to work for many, as evaluated by very strict scientific criteria (Baxter et al., 2016; Bothe, Davidow, Bramlett, & Ingham, 2006). The techniques work not only in the therapy session, where a person who stutters can reduce or eliminate stuttering episodes under controlled circumstances and with the help of a clinician, but also in real-world talking situations. PWS who have been treated by an SLP have a real hope that stuttering, and the various feelings and behaviors associated with it, can be brought under some degree of control.
Evaluation of treatment effectiveness in preschool children is complicated by the 80% spontaneous recovery rate of children who have an initial diagnosis of beginning stuttering. Recovery of fluency may be the result of treatment, or may be due to natural recovery (Saltuklaroglu & Kalinowski, 2005). Of course, treatment may hasten recovery in children who at a later date have spontaneous recovery. A review of techniques used to treat developmental stuttering in preschool children is provided by Shenker and Santayana (2018).

Chapter Summary

Stuttering is a speech-language disorder in which the smooth, fluent stream of speech is interrupted by repetitions, blocks, complete stoppages of speech, and revisions; its cause is unknown.
Developmental stuttering begins in early childhood and usually involves a slow progression of symptoms in which the "typical" dysfluencies of early speech and language development increase in frequency and severity as the child matures.
A "natural history" of developmental stuttering describes how these symptoms change from the early, typical dysfluencies to the later blocks, repetitions, and especially struggle behaviors that are characteristic of advanced stuttering.
As many as 80% of children suspected of having a stuttering disorder recover with or without therapy. The likelihood of recovery from childhood stuttering is greater for girls, as compared to boys, and for children who do not have a first-degree relative who has or had a stuttering problem.
There are several different theories of stuttering.
Psychological theories, which regard stuttering as an expression of a neurosis, were popular at one time but do not have much scientific support.
Learning theories are based on the assumption that the normal dysfluencies of early childhood become a chronic pattern as the child learns to associate stuttering episodes with negative outcomes, and the "release" of the stuttering episodes with positive outcomes; in these theories, the child becomes conditioned to stutter.
Biological theories hold that there is some brain difference or dysfunction that explains stuttering; in contemporary scientific discussions, biological theories are very much intertwined with the idea of a genetic basis for stuttering.
The genetic basis for stuttering is supported by (a) the greater likelihood of stuttering among relatives of PWS, as compared to relatives of fluent speakers; and (b) the greater occurrence of stuttering in monozygotic as compared to dizygotic twins.
Dysfluencies are sensitive to the linguistic structure of an utterance; these linguistic factors can be integrated with biological theories of stuttering.
Neurogenic stuttering is the term used to describe the onset of stuttering in adulthood, usually as a result of neurological disease such as stroke or head injury.
The symptoms of neurogenic stuttering are in some ways like those of developmental stuttering, but in some cases, there may be differences (such as the absence of struggle behaviors in neurogenic stuttering, at least according to some reports).
Evidence exists in the scientific literature that developmental stuttering can be treated successfully.

References

Ambrose, N., & Yairi, E. (1999). Normative dysfluency data for early childhood stuttering. Journal of Speech, Language, and Hearing Research, 42, 895–909.
Anderson, J. D., Pellowski, M. W., & Conture, E. G. (2005). Childhood stuttering and dissociations across linguistic domains. Journal of Fluency Disorders, 30, 219–253.
Andrews, G., Morris-Yates, A., Howie, P., & Martin, N. (1991). Genetic factors in stuttering confirmed (Letter). Archives of General Psychiatry, 48, 1034–1035.
Baxter, S., Johnson, M., Blank, L., Cantrell, A., Brumfitt, S., Enderby, P., & Goyder, E. (2016). The state of the art in non-pharmacological interventions for developmental stuttering. Part 1: A systematic review of effectiveness. International Journal of Language and Communication Disorders, 50, 676–718.
Bothe, A. K., Davidow, J. H., Bramlett, R. E., Franic, D. M., & Ingham, R. J. (2006b). Stuttering treatment research 1970–2005: II. Systematic review incorporating trial quality assessment of pharmacological approaches. American Journal of Speech-Language Pathology, 15, 342–352.
Bothe, A. K., Davidow, J. H., Bramlett, R. E., & Ingham, R. J. (2006a). Stuttering treatment research 1970–2005: I. Systematic review incorporating trial quality assessment of behavioral, cognitive, and related approaches. American Journal of Speech-Language Pathology, 15, 321–341.
Brown, S., Ingham, R. J., Ingham, J. C., Laird, A. R., & Fox, P. T. (2005). Stuttered and fluent speech production: An ALE meta-analysis of functional neuroimaging studies. Human Brain Mapping, 25, 105–117.
Chang, S. E., Erickson, K. I., Ambrose, N. G., Hasegawa-Johnson, M. A., & Ludlow, C. L. (2008). Brain anatomy differences in childhood stuttering. NeuroImage, 39, 1333–1344.
Chang, S. E., Zhu, D. C., Choo, A. L., & Angstadt, M. (2015). White matter neuroanatomical differences in young children who stutter. Brain, 138, 694–711.
Conture, E. G. (2001). Stuttering: Its nature, diagnosis, & treatment. Needham Heights, MA: Allyn & Bacon.
Davis, S., Shisca, D., & Howell, P. (2007). Anxiety in speakers who persist and recover from stuttering. Journal of Fluency Disorders, 40, 398–417.
Etchell, A. C., Civier, O., Ballard, K. J., & Sowman, P. F. (2018). A systematic literature review of neuroimaging research on developmental stuttering between 1995 and 2016. Journal of Fluency Disorders, 55, 6–45.
Felsenfeld, S., Kirk, K. M., Zhu, G., Statham, D. J., Neale, M. C., & Martin, N. G. (2000). A study of the genetic and environmental etiology of stuttering in a selected twin sample. Behavior Genetics, 30, 359–366.
Foundas, A. L., Bollich, A. M., Corey, D. M., Hurley, M., & Heilman, K. M. (2001). Anomalous anatomy of speech-language areas in adults with persistent developmental stuttering. Neurology, 57, 207–215.
Foundas, A. L., Corey, D. M., Angeles, V., Bollich, A. M., Crabtree-Hartman, E., & Heilman, K. M. (2003). Atypical cerebral laterality in adults with persistent developmental stuttering. Neurology, 63, 1640–1646.
Frigerio-Domingues, C., & Drayna, D. (2017). Genetic contributions to stuttering: The current evidence. Molecular Genetics and Genomic Medicine, 5, 95–102.
Guitar, B. (2005). Stuttering: An integrated approach to its nature and treatment (3rd ed.). Baltimore, MD: Lippincott Williams & Wilkins.
Hennessy, N. W., Dourado, E., & Beilby, J. M. (2014). Anxiety and speaking in people who stutter: An investigation using the emotional Stroop task. Journal of Fluency Disorders, 40, 44–57.
Ingham, R. J., Ingham, J. C., Euler, H. A., & Neumann, K. (2018). Stuttering treatment and brain research in adults: A still unfolding relationship. Journal of Fluency Disorders, 55, 106–119.
Kotani, S., & Furuya, S. (2018). State anxiety disorganizes finger movements during musical performance. Journal of Neurophysiology, 120, 439–451.
Melnick, K. S., & Conture, E. G. (2000). Relationship of length and grammatical complexity to the systematic and nonsystematic speech errors and stuttering of children who stutter. Journal of Fluency Disorders, 25, 21–45.
Nippold, M. A. (2001). Phonological disorders and stuttering in children: What is the frequency of co-occurrence? Clinical Linguistics and Phonetics, 15, 219–228.
Nippold, M. A. (2002). Stuttering and phonology: Is there an interaction? American Journal of Speech-Language Pathology, 11, 99–110.
Owens, R. E., Metz, D. E., & Haas, A. (2003). Introduction to communication disorders: A life span approach (2nd ed.). Boston, MA: Allyn & Bacon.
Penttilä, N., Korpijaakko-Huuhka, A. M., & Kent, R. D. (2019). Disfluency clusters in speakers with and without neurogenic stuttering following traumatic brain injury. Journal of Fluency Disorders, 59, 33–51.
Peters, T. J., & Guitar, B. (1991). Stuttering: An integrated approach to its nature and treatment. Baltimore, MD: Williams & Wilkins.
Ramig, P. R., & Shames, G. H. (2006). Stuttering and other disorders of fluency. In N. B. Anderson & G. H. Shames (Eds.), Human communication disorders: An introduction (7th ed., pp. 183–221). Boston, MA: Pearson Education.
Saltuklaroglu, T., & Kalinowski, J. (2005). How effective is therapy for childhood stuttering? Dissecting and reinterpreting the evidence in light of spontaneous recovery rates. International Journal of Language and Communication Disorders, 40, 359–374.
Schwartz, H., & Conture, E. (1988). Subgrouping young stutterers. Journal of Speech and Hearing Research, 31, 62–71.
Shenker, R. C., & Santayana, G. (2018). What are the options for the treatment of stuttering in preschool children? Seminars in Speech and Language, 39, 313–323.
Smith, A., & Weber, C. (2017). How stuttering develops: The multifactorial dynamic pathways theory. Journal of Speech, Language, and Hearing Research, 60, 2483–2505.
Theys, C., van Wieringen, A., & De Nil, L. (2008). A clinician survey of speech and non-speech characteristics of neurogenic stuttering. Journal of Fluency Disorders, 33, 1–23.
Theys, C., van Wieringen, A., Sunaert, S., Thijs, V., & De Nil, L. F. (2011). A one year prospective study of neurogenic stuttering following stroke: Incidence and co-occurring disorders. Journal of Fluency Disorders, 44, 678–687.
Tran, Y., Blumgart, E., & Craig, A. (2011). Subjective distress associated with chronic stuttering. Journal of Fluency Disorders, 36, 17–26.
Walsh, B., Usler, E., Bostian, A., Mohan, R., Gerwin, K. L., Brown, B., . . . Smith, A. (2018). What are predictors for persistence of childhood stuttering? Seminars in Speech and Language, 39, 299–312.
Watkins, R., Yairi, E., & Ambrose, N. (1999). Early childhood stuttering. III: Initial status of expressive language abilities. Journal of Speech, Language, and Hearing Research, 42, 1125–1136.
Yairi, E. (2007). Subtyping stuttering. I: A review. Journal of Fluency Disorders, 32, 165–196.
Yairi, E., & Ambrose, N. G. (1999). Early childhood stuttering. I: Persistency and recovery rates. Journal of Speech, Language, and Hearing Research, 42, 1097–1112.
Yairi, E., & Ambrose, N. G. (2005). Early childhood stuttering. Austin, TX: Pro-Ed.
Yairi, E., & Ambrose, N. G. (2013). Epidemiology of stuttering: 21st century advances. Journal of Fluency Disorders, 38, 66–87.
Yairi, E. H., & Seery, C. H. (2015). Stuttering: Foundations and clinical applications (2nd ed.). New York, NY: Pearson.
Zimmerman, G. (1980). Articulatory dynamics of fluent utterances of stutterers and nonstutterers. Journal of Speech and Hearing Research, 23, 95–107.
18
Voice Disorders

Introduction

Chapter 10 describes the role of the vibrating vocal folds as the primary sound source for speech. The vocal folds, contained within the cartilage framework of the larynx, vibrate periodically and generate a tone whose pitch is proportional to the rate of vibration. The production of tone by the vibrating vocal folds is called phonation.
Perceptual impressions of voice production include pitch, loudness, and quality. The physical (acoustic) bases of these perceptual impressions are fundamental frequency (F0), intensity (amount of sound energy), and spectrum (mix of periodic and noise characteristics produced by the vibrating vocal folds).
The first sign of a voice disorder is often a perception of voice abnormality, either by the person producing the voice or by listeners. The pitch may seem too low for the speaker's gender and/or age, the loudness too soft, or the quality too rough or breathy. Alternatively, a first sign of a voice disorder may be a sense of pain, unusual effort, or tightness in the neck when producing phonation.
Voice disorders are not only a concern for their social implications — listeners often do not "like" abnormal-sounding voices — but also for their potential to affect careers. Teachers with serious voice problems cannot teach, or can teach but with reduced effectiveness. Actors, tour guides, singers, athletic coaches, and other professionals are greatly affected by a voice disorder. These effects include lost workdays; voice disorders have an economic impact.
Voice disorders have many different causes. A selected group of voice disorders is presented in this chapter. The term dysphonia is used to indicate voice characteristics that sound abnormal. More specifically, dysphonia is the auditory impression of abnormality in the pitch, loudness, and/or quality of the voice. Dysphonia may also be used to describe a voice perceived as being produced with unusual effort.
Dysphonia may or may not have an obvious cause. The present chapter focuses on adult voice disorders; pediatric voice disorders are discussed briefly at the end of the chapter. (See https://www.asha.org/PRPSpecificTopic.aspx?folderid=8589942600&section=Overview for a comprehensive overview of voice disorders.)

Epidemiology of Voice Disorders

How prevalent are voice disorders in the general population, and are there specific factors that make people more or less likely to have had (or have) a voice disorder? Among people who have been diagnosed with a voice disorder, which voice disorders are most common?
Roy, Merrill, Gray, and Smith (2005) conducted a survey in which approximately 6.5% of the respondents claimed they were experiencing a voice disorder. In a survey of 14,794 young adults, Bainbridge, Roy,
Losonczy, Hoffman, and Cohen (2017) found that 6% reported having a voice problem over the previous 12 months.
A prevalence estimate of 6% to 7% for voice disorders among the general population is startling. In a state such as Wisconsin, which has a population of approximately 5,800,000 people, Roy et al.'s work (2005) suggests that about 350,000 people have had a voice problem over a 12-month period. Voice disorders can be short term (such as laryngitis), or longer term, as described later in this chapter.
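The arithmetic behind the Wisconsin estimate is easy to verify. The following minimal Python sketch simply applies the 6% figure to the state population given above; the numbers come from the passage itself, not from any additional data source:

    # Back-of-the-envelope check of the prevalence arithmetic in the text.
    wisconsin_population = 5_800_000
    prevalence = 0.06                  # ~6% report a voice problem in a 12-month period
    affected = wisconsin_population * prevalence
    print(f"{affected:,.0f}")          # 348,000 -- roughly the 350,000 cited above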
The prevalence of voice disorders reported by surveyed individuals increases with age, is greater with a family history of voice disorders, is greater for females compared with males, and is higher among professional voice users (e.g., teachers; Roy et al., 2005) than in the general population.

Initial Steps in the Diagnosis of Voice Disorders

Patients are referred by primary physicians, or refer themselves, to a voice clinic because of a perceived change in voice production and/or a sense of extreme effort during phonation. The voice therapist goes through a series of steps to diagnose the presence and type of voice disorder. These steps include a case history, perceptual evaluation of the client's voice, viewing of the vocal folds via a laryngeal mirror (Chapter 10), recording of vocal fold motion during phonation by videostroboscopy, and measurement of basic voice parameters. Information gathered in these assessments contributes to a diagnosis, which in turn is the basis of a treatment plan. The treatment plan may include behavioral therapy, which involves direct modification of voice, and/or indirect therapy such as counseling, which addresses psychological aspects of the voice disorder. The treatment plan may also include surgery to correct a structural problem of the vocal folds. Behavioral and surgical treatments are combined for certain diagnoses.

Case History

As in any type of medical setting, a case history is critical to accurate differential diagnosis of a disease or condition. Differential diagnosis is the systematic identification of the cause or causes of a symptom (or symptoms) by ruling out likely candidates. For example, when a patient reports hoarseness for 2 weeks, many possible causes may explain the voice disorder. A careful case history can rule out some or most of these candidates prior to more in-depth testing.
The voice therapist wants to know if a patient is a professional voice user or has a recent or chronic history of unusual voice usage, such as overuse of the voice (e.g., shouting, screaming), if there is extended use of the voice every day, as in teachers (Martins, Pereira, Hidalgo, & Tavares, 2014), and if there is use of the voice in an unusual way for long periods of time. Professional voice users may develop inflammation of the vocal folds or vocal fold fatigue from excessive, high-effort phonation. Excessive, high-effort phonation can also occur when the voice is pushed to its operating limits, as in the parent who screams at a child, or a child who screams when he or she plays.
The voice therapist also wants to know about the patient's history of tobacco and alcohol use, current or previous use of therapeutic and/or recreational drugs, and if there is a history of frequent laryngopharyngeal reflux (LPR, a technical name for acid indigestion). Smoking and drinking alcohol (especially in combination), certain drugs, and LPR are all known to be causes of inflammation and in some cases permanent tissue change in the vocal folds.

Perceptual Evaluation of the Voice

Voice therapists are trained to listen carefully for perceptual signs (voice characteristics) associated with specific disorders. The terms used most often to describe disordered voice characteristics include, but are not limited to, rough, breathy, weak, strain, hoarse, spasm, pitch, and loudness; overall severity is also an important perceptual evaluation of a voice disorder (Kempster, Gerratt, Verdolini Abbott, Barkmeier-Kraemer, & Hillman, 2009).
Perceptual impressions, even those of carefully trained voice therapists, do not provide specific information concerning the underlying cause of a voice disorder. For example, the impression of a breathy voice may be the result of vocal fold paralysis, the aging process, or vocal nodules. Or, a breathy voice quality may not be associated with an underlying disease — it may be simply one of the variants of voice quality heard in the population. Perceptual impressions of voice quality are an important step in the diagnostic process, especially when they serve the purpose of narrowing down hypotheses of the underlying cause of a voice disorder.

Viewing the Vocal Folds

Indirect laryngoscopy is a technique used to view the vocal folds (Chapter 10, Figure 10–1). The examiner places the mirror close to the back of the throat while holding the tongue with a gauze pad. A strong light
is aimed at the mirror and reflected to illuminate the vocal folds. The laryngeal mirror examination is performed first as the patient is breathing normally, rather than during phonation. This allows the examiner to see obvious lesions on the vocal folds, such as nodules, polyps, or other growths, and may also provide evidence of vocal fold paralysis, if it exists.
The patient is then asked to phonate while the examiner views the vocal folds. The individual cycles of vocal fold vibration are not visible to the examiner because they are faster than the time-resolving ability of the human eye. If there is a lesion on one or both vocal folds, however, the examiner may be able to see whether or not it interferes with vocal fold closure during phonation. Videostroboscopy provides a slow-motion view of the vibrating vocal folds that allows an examiner a more detailed evaluation of possible causes of voice disorders. (Google "laryngeal videostroboscopy" for video clips of vibrating vocal folds.)

Measurement of Basic Voice Parameters

The present discussion focuses on three acoustic measurements that are used often to describe voice characteristics: F0 (perceptual correlate = pitch), intensity (perceptual correlate = loudness), and spectrum (perceptual correlate = voice quality).

F0 (Pitch)

F0 is the number of cycles of vocal fold vibration completed in 1 second. The perceptual correlate of F0 is pitch. All other things being equal, as F0 increases so does voice pitch.
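Because F0 is simply a rate (cycles of vocal fold vibration per second), it can be estimated from a recorded voice signal. The sketch below shows one common, simplified approach (autocorrelation) in Python with NumPy; the 120-Hz synthetic "voice" and all parameter values are invented for illustration, and clinical analysis programs use more robust variants of this idea:

    # Minimal F0 estimation by autocorrelation (illustrative sketch).
    import numpy as np

    fs = 16000                                   # sampling rate (Hz)
    t = np.arange(0, 0.2, 1 / fs)                # 200 ms of signal
    voice = sum(np.sin(2 * np.pi * 120 * k * t) / k for k in range(1, 11))

    # Autocorrelation peaks at lags equal to whole periods of vibration.
    ac = np.correlate(voice, voice, mode="full")[voice.size - 1:]
    lo, hi = int(fs / 400), int(fs / 60)         # search only a 60-400 Hz pitch range
    lag = lo + np.argmax(ac[lo:hi])
    print(f"estimated F0 = {fs / lag:.1f} Hz")   # ~120 Hz for this synthetic voice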

Figure 18–1.  Plot of fundamental frequency (F0) across age for males (blue points) and females (pink points). Age is on the x-axis (0 to 100 years), and F0 in Hz is on the y-axis. Adapted from Hixon, T. J., Weismer, G., and Hoit, J. D. (2020). Preclinical speech science: Anatomy, physiology, acoustics, perception (3rd ed.). San Diego, CA: Plural Publishing.

F0 has been widely studied in the normal population. A graphic summary of F0 data by age, for both males (curve with blue points) and females (curve with pink points), is shown in Figure 18–1. Age is shown on
the x-axis, F0 in Hz on the y-axis. The curves reflect data from many sources in the literature, including Baken and Orlikoff (2000), Kent (1976), Nishio and Niimi (2008), and Lee, Potamianos, and Narayanan (1999).
F0 is related to the size of a person's larynx — in general, the larger the larynx, the lower is the F0. Prior to puberty, the F0 values of males and females are similar, because sex-specific anatomical characteristics of the larynx do not appear before puberty.1 Around puberty, the larynges of both males and females grow, with more dramatic growth in males. This is reflected in Figure 18–1, in which male F0 lowers substantially around puberty relative to female F0. Postpuberty F0 values are close to adult values, which remain relatively constant until old age. Female adults between ages 20 and 70 years have F0 values between 190 and 210 Hz. Over this same age range, F0 for males ranges between 115 and 135 Hz. This sex-related difference in F0 explains why males are typically perceived to have lower-pitched voices than females.
In old age, the F0 of females decreases (probably as a result of hormone changes), and the F0 of males increases (probably as a result of increased stiffness of the vocal folds and the cartilages of the larynx).
Diagnosis of a voice disorder in which pitch seems abnormal can make use of the average data shown in Figure 18–1. These data are a rough guide to "normal." However, people with healthy voices, at any given age, have a range of F0 values. The F0 values in voice problems with unusual pitch are usually very different from average F0 values such as those shown in Figure 18–1.

Intensity (Loudness)

Voice intensity is not precisely the same as voice loudness. Voice intensity refers to the amount of acoustic energy generated by the vibrating vocal folds and modified as the energy passes through the vocal tract and exits the lips. Voice loudness refers to a listener's perception of the voice. Greater voice intensity typically results in greater voice loudness.
Voice intensity is likely to change with the loudness of background noise, as well as with the distance between the speaker and the listener. For example, at a party where many people are talking, voice intensity must be greater than usual to be heard above the loudness of the background noise. Voice intensity is also likely to change with the distance between a speaker and the listener. To maintain a constant loudness for the listener as the distance between speaker and listener increases, voice intensity must increase as well. This is because sound energy decreases as it travels over distance. An increased intensity of the voice is a normal adjustment when speaking to someone located at a distance.2
How do SLPs judge a person's voice loudness as typical or atypical? Like F0, there is a great deal of speaker-to-speaker variation in voice intensity (and hence, perceived voice loudness) that is accepted by listeners as "normal." In most cases, however, voice loudness that calls attention to itself is not subtle. The case of an unusually soft voice, or the less-frequent case of a chronically loud voice, is sufficiently noticeable to eliminate the need for precise measures of voice intensity as a requirement to perform further diagnostic tests.
The diagnostic evaluation of voice intensity is further complicated by factors that are internal to the speaker. An individual may speak with what appears to be normal loudness but complain about the effort required to make herself heard, even when the communication setting is quiet. Or, a speaker may feel as if he is exerting a normal effort for voice intensity but in fact sound too soft. The case history includes questions to determine the individual's judgment of effort for the production of loudness appropriate for the communication situation.

Voice Spectrum (Voice Quality)

An acoustic spectrum is defined as the relative amplitudes of the many frequency components that make up a sound. In the sound generated by the vibrating vocal folds, the voice spectrum has energy at the F0, as well as a series of harmonics (sometimes called overtones). These harmonic components are found at frequencies 2 times, 3 times, 4 times, . . . n times the F0.3

1  Subtle signs of sexual dimorphism in the human larynx may begin to appear several years before puberty, and these may account for the slightly lower F0 values seen for males, as compared to females, around 8 to 9 years of age.
2  The relationship between sound intensity and distance from the sound source is described by the inverse square law. This law states that sound intensity decreases from its source at a rate proportional to the inverse of the square of the distance from that source (in formulese, Sound intensity ∝ 1/d², where d = distance from the source). Also, the relationship between sound intensity and loudness, like that between frequency and pitch, is not one-to-one.
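A short worked example of the inverse square law in footnote 2, as a Python sketch (the distances are arbitrary illustration values, not measurements from the text):

    # Intensity falls with the square of distance: I ~ 1/d^2.
    import math

    def intensity_ratio(d1, d2):
        """Relative intensity at distance d2 compared with distance d1."""
        return (d1 / d2) ** 2

    ratio = intensity_ratio(1.0, 2.0)             # doubling the distance
    print(ratio)                                  # 0.25 -- one quarter the intensity
    print(f"{10 * math.log10(ratio):.1f} dB")     # about -6 dB per doubling of distance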
3  For example, a voice spectrum for a speaker whose F0 = 100 Hz has harmonics at 200, 300, 400, 500, . . . n × F0. The amplitudes of the harmonics decrease with increasing frequency.
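Footnote 3's harmonic series can be generated directly. In this Python sketch, the -12 dB-per-octave amplitude roll-off is an assumption (a commonly cited approximation for the spectral slope of the glottal source), not a value given in this chapter:

    # Harmonics of a voice with F0 = 100 Hz, per footnote 3.
    import math

    f0 = 100.0
    for n in range(1, 6):
        freq = n * f0                      # harmonics at n x F0: 100, 200, 300, ...
        level = -12.0 * math.log2(n)       # assumed -12 dB/octave source roll-off
        print(f"harmonic {n}: {freq:.0f} Hz, {level:.1f} dB re the first harmonic")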
The precise details of the voice spectrum are related to voice quality. For example, breathy voices have fewer frequency components (harmonics) compared with normal voices, as well as a substantial degree of noise (aperiodic energy), which is not typical of the normal voice. Strained voice qualities (often called "pressed" qualities) have many harmonics with excessive intensity. Voice spectra are obtained with speech analysis computer programs and are used clinically as an objective, acoustic measure of voice quality.
What is the value of objective measures of voice production? Scientists and clinicians may prefer acoustic measures of voice production, compared with perceptual measures, which are notoriously unreliable (Kreiman, Gerratt, Kempster, Erman, & Berke, 1993). Acoustic measures of voice production may be better suited than perceptual measures as metrics of the effectiveness of voice therapy.
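As a concrete (and much simplified) illustration of how a speech analysis computer program obtains a voice spectrum, the Python sketch below applies a fast Fourier transform (FFT) to a windowed stretch of a synthetic, vowel-like signal. The signal, its 110-Hz F0, and the added noise are all invented for the example; real clinical software adds calibration, averaging, and measures such as harmonics-to-noise ratio:

    # Computing an amplitude spectrum of a voice-like signal with an FFT.
    import numpy as np

    fs = 16000
    t = np.arange(0, 0.2, 1 / fs)
    f0 = 110.0
    # Periodic source (harmonics at n x F0) plus a little aperiodic energy,
    # loosely mimicking the mix of periodic and noise components in a voice.
    voice = sum(np.sin(2 * np.pi * f0 * k * t) / k for k in range(1, 20))
    voice = voice + 0.05 * np.random.randn(t.size)

    windowed = voice * np.hanning(t.size)
    amplitude_db = 20 * np.log10(np.abs(np.fft.rfft(windowed)) + 1e-12)
    freqs = np.fft.rfftfreq(t.size, 1 / fs)

    # The strongest low-frequency component should sit at (or near) F0.
    low = freqs < 1000
    peak = freqs[low][np.argmax(amplitude_db[low])]
    print(f"strongest component below 1 kHz: {peak:.1f} Hz")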

Classification/Types of Voice Disorders

Voice disorders can be classified in several different ways but do not always fit neatly into one classification or another. The classifications presented here are not mutually exclusive, as explained later. Specific examples of voice conditions and pathology are presented and discussed within the framework of different classification systems. Table 18–1 provides a summary of ways to classify voice disorders; each of these classification approaches is discussed in the next sections.

Table 18–1.  A Summary of Alternative Classifications of Voice Production and Voice Disorders

Hypo-hyperfunctional:  A description of voice types based on the levels of contraction of laryngeal muscles and how those levels affect the closing phase and closed phase of vocal fold vibration. The hypo-hyperfunctional continuum includes many normal voice types; voice types that are chronically close to or at the ends of the continuum are often diagnosed as dysphonia.

Phonotrauma:  A classification based on voice disorders resulting from excessive phonatory behaviors such as constant and effortful talking (e.g., teachers, actors), singing (e.g., professional vocalists), and overdriving the phonatory mechanism (e.g., yelling, screaming). Such phonatory behaviors may result in benign mass lesions on the vocal folds, such as nodules and polyps. The dysphonia resulting from phonotrauma may be of the hypo- or hyperfunction type. Chronic hyperfunction (e.g., excessive screaming) may lead to nodules, which can result in a hypofunctional-sounding voice, which makes the individual exert excessive muscular force (hyperfunction) to overcome the difficulty in closing the vocal folds due to the benign, bilateral masses.

Organic voice disorders:  A classification of voice disorders that are the result of benign vocal fold masses. These include (but are not limited to) nodules, polyps, cysts, and granulomas. These benign masses have the potential to interfere with vocal fold vibration by preventing adequate closure and stiffening the cover of the vocal folds. The classification of organic voice disorders may include disorders classified as resulting from phonotrauma.

Functional voice disorders:  Dysphonia in the absence of observable pathology in the larynx, or known neurological disease. Functional voice disorders are in some cases called psychogenic voice disorders, meaning that psychological issues are partly or largely responsible for the voice disorders. Muscular tension dysphonia (MTD) is an example of a functional voice disorder in which some of the cases are regarded as psychogenic voice disorders. Puberphonia is another functional disorder that is considered to be psychogenic.

Neurological voice disorders:  Voice disorders in which a known or suspected disease/condition of the peripheral or central nervous system is the cause of dysphonia. A neurological voice disorder may be found in diseases of the central nervous system (e.g., stroke, and degenerative diseases such as Parkinson's disease and multiple sclerosis) or damage to a peripheral nerve that supplies the muscles of the larynx, as in unilateral vocal fold paralysis. Spasmodic dysphonia is a voice disorder that has a suspected neurological cause.

Cancer of the larynx:  Cancer of the larynx, in which malignant tumors grow within the larynx, often on or in the vocal folds, causes dysphonia that worsens as the cancer spreads.
The Hypo-Hyperfunctional Continuum

The hypo-hyperfunctional continuum of vocal fold vibration is best understood by a simple illustration of how the opening, closing, and closed phases of vocal fold cycles can be changed by muscular activity within the larynx.
Figure 18–2 is a schematic drawing of three consecutive cycles of vocal fold vibration. The trace shows the space between the two vocal folds (that is, the glottis) as they vibrate over time. The glottal space increases as the vocal folds move apart and decreases as they come together. As the trace moves up on the graph, the vocal folds are separating (opening); as it moves down, they are coming together (closing). The horizontal line at the bottom of the trace shows the portion of the cycle during which the vocal folds are closed. One cycle of vocal fold vibration is marked.

Figure 18–2.  A schematic drawing of three cycles of vocal fold vibration, drawn as the opening and closing of the vocal folds over time. The horizontal lines show the portion of each cycle during which the vocal folds are closed.

Two parts of a single vocal fold cycle, the closing phase and the closed phase, are critical to understanding the functional variations of vocal fold vibration. The closing phase is the time from the maximum opening of the glottis (at the top of the trace for each cycle) to the instant of vocal fold closure (the left-hand edge of the horizontal lines at the bottom; Figure 18–2). The closed phase is the portion of the cycle during which the vocal folds are completely closed, as described earlier.
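These two intervals can be measured from any sampled record of glottal opening, for example, a glottal-area trace from high-speed imaging. The Python sketch below invents a single 8-ms cycle in which the folds are apart for 55% of the cycle; the trace shape and all values are illustrative assumptions, not clinical data:

    # Measuring closing phase and closed phase from a synthetic glottal-area cycle.
    import numpy as np

    fs = 100_000                               # samples per second
    n = int(0.008 * fs)                        # one 8-ms cycle (F0 = 125 Hz)
    n_open = int(n * 0.55)                     # folds apart for 55% of the cycle
    area = np.zeros(n)
    area[:n_open] = np.sin(np.pi * np.arange(n_open) / n_open)   # open-close gesture

    closing_samples = n_open - np.argmax(area)          # maximum opening -> closure
    closed_fraction = np.mean(area == 0)                # portion of cycle fully closed
    print(f"closing phase: {1000 * closing_samples / fs:.2f} ms")
    print(f"closed phase: {100 * closed_fraction:.0f}% of the cycle")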

Figure 18–3.  The hypo-hyperfunctional voice continuum, running from hypofunction (insufficient muscular tension: slow closing phase, short closed phase, "breathy"/weak voice quality) through a range of "normal" voice qualities centered on average "normal" phonation, to hyperfunction (excessive muscular tension: overly fast closing phase, too-long closed phase, "pressed" voice quality). See text for additional detail.

The closing phase and closed phase intervals are the basis of the hypo-hyperfunctional continuum (Figure 18–3). During normal vocal fold vibration, the closing phase occurs quickly. Once the vocal folds close, they remain so for a significant portion of the cycle — nearly 40% to 50% of the entire cycle time, which is called
the period of vocal fold vibration. In certain voice disorders, an excess of muscular tension in the larynx may result in an excessively fast closing phase and an overly long closed phase. The excess muscular tension is called hyperfunction and results in a voice quality that sounds overly effortful and strained. A hyperfunctional voice quality is often called "pressed," as if the speaker is pressing the vocal folds together too tightly.
In contrast, insufficient muscular tension during vocal fold vibration results in a very slow closing phase and an overly short closed phase. This is called laryngeal hypofunction and is associated with a breathy, weak voice quality.
The hypo-hyperfunctional continuum shown in Figure 18–3 reflects the concept of continuous variation in voice qualities between two extreme endpoints. One endpoint (hypo) reflects too little muscular tension in the larynx, the other endpoint (hyper) too much muscular effort. Any combination of closing phase speed and closed duration may occur due to different amounts of muscular effort, resulting in many different voice qualities between breathy/weak and strained/pressed. Because the hypo-hyperfunctional voice continuum includes normal voice production, it is not so much a classification of voice disorders as a range of phonation styles. The more extreme parts of the range are often associated with voice disorders.

Direct Control of the Closing and Closed Durations?

When you change voice quality from breathy to pressed, the adjustments do not reflect conscious control of specific laryngeal muscles. If you make the strange decision to approach someone on the street and ask, "Can you slow up the closing phase of your vocal fold vibration so we can see how it affects your voice?" in all likelihood a puzzled look will follow. But ask someone to gradually change voice quality between breathy, through "normal," and then with increasing tension (the layperson will not understand the term "pressed") — most people understand, and they can do it. How do they do it? Most likely, they imitate a series of voice images, from auditory memories, connected with the terms "breathy," "normal," and "tense." Like a continuous movement between smiling and frowning, people can do it easily but cannot state conscious muscular strategies for how it is done.

As shown in Figure 18–3, the hypo-hyperfunctional continuum of voice qualities includes a wide range of "normal" voices. An ideal voice does not exist, either for the person producing voice or for the person listening to voice. A range of acceptable voice qualities is produced among different people.
An individual uses different voice qualities for different circumstances. Many of these voice qualities vary within the normal range, but some may be outside the normal range, temporarily, to fit a situation, to make a point, to convey a message that supplements words. For example, speakers may use extreme hypofunction to produce a breathy voice outside the normal range to comfort someone or express tender emotion. Anger may be expressed with pressed voice resulting from extreme hyperfunction.
Extremely breathy or pressed voice is therefore not unusual in certain situations. These voice qualities become a clinical issue when they are used chronically — when they are a person's typical voice quality.

Phonotrauma

The vocal folds consist of delicate tissues that may be damaged by certain phonation and lifestyle behaviors. Damaged vocal fold tissues interfere with the vibratory motions that produce normal voice qualities. The term "phonotrauma" is used to describe damaged vocal fold tissues (lesions) and the resulting dysphonia due to excessive phonatory behaviors and/or other causes. The lesions caused by phonotrauma are referred to as benign vocal fold lesions, to differentiate them from precancerous or cancerous lesions.
Phonotrauma results from behaviors such as overuse of high-intensity voice (as in some kinds of singing or acting), chronic screaming (or sometimes a single episode of very intense screaming), chronic throat clearing, and chronic use of a hyperfunctional voice quality. Vocal fold tissues may also be damaged when they are chronically exposed to environmental agents such as tobacco and alcohol, in individuals who experience chronic reflux, and in persons who have chronic cough and/or throat clearing. The phonation behaviors that result in phonotrauma are usually associated with hyperfunctional voice disorders — far to the right in the continuum of voice use (see Figure 18–3).
For the discussion that follows, Figure 18–4 can be used as a reference for the appearance of healthy vocal folds. The photo on the left shows the vocal folds open, during inhalation. The point of the "V" at the bottom of the frame is the most anterior part of the vocal folds, where they are attached to the inside of the thyroid cartilage.
Figure 18–4.  Normal vocal folds as seen through an endoscope; the point of the "V" is the front attachment of the vocal folds to the inside of the thyroid cartilage. Left, vocal folds open for inhalation; note the straight edge of both vocal folds at their medial boundary (next to the glottis). The space between the vocal folds is the glottis. Right, vocal fold closure for the closed phase during phonation; the incomplete rectangle shows firm closure front to back, with just a small opening at the back. Photos courtesy of Professor Susan Thibeault, Department of Surgery, University of Wisconsin Clinical Sciences Center.

The posterior attachment of the vocal folds is to the arytenoid cartilages. The glottis is the space between the vocal folds.
Notice the smooth, straight edge of each vocal fold as it extends from anterior to posterior along its border with the glottis. Each vocal fold is a mirror image of the other — they give the appearance of symmetry. The photo on the right shows the vocal folds during the closed phase of phonation. The vocal folds are pressed together firmly, from front to back, with only a slight opening at the very back of the glottis.

Vocal Nodules

Chronic overuse of the voice, whether in singing, screaming, or cheerleading, may result in growths on the vocal folds called vocal nodules (sometimes called singers' nodules). Vocal nodules are callus-like lesions resulting from chronic slamming together of the vocal folds during phonation. The nodules develop in much the same way as calluses develop on a gymnast's or baseball player's hands, or a guitarist's fingertips. In the early stages, the growths are soft and blister-like. As phonotrauma continues, the growths develop the hard, fibrous texture of a callus.
The location and appearance of vocal nodules are shown in Figure 18–5. The photo on the left shows bilateral nodules on the open vocal folds. The nodules are the small "bumps" along the edge of each vocal fold, about one third of the distance between the front and back of the glottis. The bumps interrupt the smooth edges of the vocal folds, and are located at the same location on the two folds — they are symmetrical.
Nodules occur one-third of the distance between the front and back of the vocal folds because the highest collision forces occur at this location when the vocal folds come together for each cycle of vibration. When these forces are chronically excessive, as in behaviors such as yelling, screaming, and constant talking, nodules may develop and interfere with normal phonation.
Vocal nodules are likely to change voice quality due to incomplete closure during vocal fold vibration. In Figure 18–5, the folds appear to be closed at the location of the nodules, but not in front (forward toward the thyroid cartilage) or back (toward the arytenoid cartilages). Voice quality may be breathy and noisy due to the escape of air through these openings in the vocal folds.
Figure 18–5.  Vocal fold nodules: Two endoscopic views of the vocal folds. Left, vocal folds open for inhalation, with bilateral nodules indicated by pointers; right, vocal folds during the closed phase of vocal fold vibration; the nodules prevent full closure, as seen in front of and in back of the point of contact between the nodules. Photos courtesy of Professor Susan Thibeault, Department of Surgery, University of Wisconsin Clinical Sciences Center.

Attempts to overcome the effect of vocal nodules on vocal fold closure with additional, excessive muscular effort may add strain to the voice quality (Leonard, 2009). The phonotrauma that resulted in the nodules induces more phonotrauma in an attempt to overcome the poor closure due to the presence of the nodules. This may become a vicious cycle of voice behavior which increases the size of the nodules and their effect on vocal fold vibration.
The nodules also stiffen the outer layers of the vocal folds, interfering with their motion and affecting voice quality. Nodules grow in the outer layer of the vocal folds, restricting the independent motion of the different tissue layers. Partial loss of the wave-like motion of the outer layer of vocal fold tissue relative to inner layers has a significant effect on voice quality.

Vocal Fold Polyps

Vocal fold polyps are the result of phonotrauma but have a tissue structure unlike the callus-like nodules. Polyps are softer and often larger than nodules, and may occur as a result of long-term phonotrauma or even from a single instance of extreme vocal use, such as a particularly intense scream or cheer. Polyps are often unilateral, unlike the typically bilateral nodules. Polyps interfere with phonation in much the same way as nodules, preventing firm vocal fold closure and interfering with the wave-like motion of the vocal fold cover.

Other Benign Vocal Fold Lesions

Benign vocal fold lesions are not limited to nodules and polyps. Other phonotrauma-related damage to vocal fold tissue, including the temporary inflammation of viral laryngitis or the long-term inflammation due to chronic LPR, can result in dysphonia. Cysts (fluid-filled sacs) may also occur on or in the vocal fold. These benign lesions may interfere with vocal fold closure and/or stiffen the top layer of the vocal folds. Excellent sources for information on benign vocal fold lesions are Altman (2007), Naunheim and Carroll (2017), and Sapienza and Ruddy (2017).

Treatment of Phonotrauma

Vocal nodules are often treated with vocal rest — if the patient does not talk for a period of time, nodules may disappear, much as calluses disappear when the irritating cause is removed. Polyps are more likely to be treated surgically, although they may also be treated with behavioral voice therapy.
Even when a benign vocal fold lesion is treated surgically, usually by removal of the mass as in the case of polyps, voice therapists play an important role after the surgery. Information on vocal hygiene — using the voice properly, avoiding overuse, restricting talking
time — can be structured for the patient to achieve voice recovery. Vocal hygiene programs are relevant to the behavioral treatment of vocal fold nodules and are often successful (Hosoya et al., 2018).

Organic Voice Disorders

Dysphonias that are classified as organic voice disorders are often the result of phonotrauma that leads to benign mass lesions — growths on the vocal folds such as nodules, polyps, and cysts (Carding et al., 2017). The presence of a benign mass lesion is a reason to classify a voice disorder as organic; a misuse of the voice is a reason to classify the cause of a voice disorder as phonotrauma. This is an example of overlap between classification categories for voice disorders.

Functional Voice Disorders

Dysphonia can exist in the absence of observable pathology on or around the vocal folds. The phonation problem may include an inability to produce voice, an extremely weak, whispery voice, or a voice interrupted by apparent spasms. When neurological disease and mass lesions are ruled out as explanations for dysphonia, a functional voice disorder may be diagnosed — one not explained by organic pathology.
Individuals diagnosed with functional dysphonia are often professional voice users. This includes, but is not limited to, teachers, actors, clergy, singers, and tour guides. Extremely talkative and unusually loud individuals may also be at risk for a functional voice disorder (Bastian & Thomas, 2016).
The term "psychogenic voice disorder" may be used to classify a functional voice disorder with roots in a psychiatric disorder. As such, psychogenic voice disorders may be considered a subtype of functional voice disorders. According to the famous speech-language pathologist Arnold Aronson (1990), a psychogenic voice disorder "is a manifestation of one or more types of psychological disequilibrium, such as anxiety, depression, conversion reaction, or personality disorder, that interferes with normal volitional control over phonation" (p. 131).4
The diagnosis of a functional versus psychogenic voice disorder may have important implications for treatment. A patient diagnosed with a functional voice disorder may be treated by a voice therapist who has experience in training patients to regain a normal voice; psychiatric concerns are not significant, and the therapeutic focus is on voice behaviors. A patient diagnosed with a psychogenic voice disorder may be best treated with combined psychiatric and voice therapy.
There is no clear-cut distinction between functional and psychogenic voice disorders. Patients with functional voice disorders are likely to have varying psychological issues such as anxiety and depression (Andrea, Dias, Andrea, & Figueira, 2017; Rosen, Heuer, Levy, & Sataloff, 2003). A voice disorder called muscular tension dysphonia (MTD) illustrates the potential role of psychological issues in a well-known functional voice disorder.

Muscular Tension Dysphonia

MTD is a voice disorder in which vocal fold vibration is disturbed by excessive tension in head and neck muscles. Many of these muscles attach to laryngeal cartilages. Hypercontraction of the muscles during phonation may distort the position and shape of the cartilages, the effect of which is to squeeze the vocal folds front to back and side to side, preventing normal vibration.
Figure 18–6 shows the closed phase of vocal fold vibration for phonation in a speaker with healthy vocal folds (left) and in a speaker with MTD (right). This view shows how the laryngeal cartilages and the vocal folds are squeezed together in MTD (the incomplete rectangle shows the comparative lengths of the vocal folds during phonation by the two individuals). Individuals with MTD often experience voice fatigue and neck pain when phonating.
Although a large-scale study of the prevalence and demographics of MTD has not been published, an estimate of gender and age distribution can be made from various publications. da Cunha Pereira, de Oliveira Lemos, Gadenz, and Cassol (2018), in a review of treatment success in MTD, found that 68% of 252 individuals diagnosed with MTD were female. The age range of these individuals was 18 to 84 years. Dietrich, Verdolini Abbott, Gartner-Schmidt, and Rosen (2008) reported similar results among 68 individuals with MTD: 82% were female within an age range of 18 to 68 years. Similar gender and age data were reported by Eastwood, Madill, and McCabe (2014). Taken together, MTD seems to be diagnosed largely in women, across young to elderly adulthood.

4  Dr. Aronson lived from 1928 to 2018.
Figure 18–6.  Left, closed phase during normal vocal fold vibration (vocal folds together for phonation, no excessive contraction). Right, closed phase during vocal fold vibration in an individual with muscular tension dysphonia (hypercontraction; vocal folds compressed side to side and front to back). Photos courtesy of Professor Susan Thibeault, Department of Surgery, University of Wisconsin Clinical Sciences Center.

Voice disorders are due to a wide variety of causes, ranging from the clearly organic, where there is a known lesion on the vocal folds or neurological disease, to functional disorders in which no underlying physical cause is present and the disorder is likely to respond to behavioral voice therapy. MTD seems to have characteristics of both organic and functional voice disorders. Hypercontraction of neck muscles may be the result of a speaker trying to compensate for an organic problem in the larynx (e.g., a mass lesion resulting from phonotrauma), and/or a symptom of underlying psychological issues (such as anxiety and depression, all of which may be associated with personality traits). As stated by Van Houtte, Van Lierde, and Claeys (2011), MTD is "the bridge between the purely functional voice disorders . . . and prominent organic disorders" (p. 205).
Speakers with MTD have strained voice qualities that are perceived to be produced with excessive effort. Hoarseness and breathiness may also be heard. In some cases, individuals with MTD may be aphonic — unable to produce phonation even when attempting to use the voice.
There is controversy surrounding the diagnosis of MTD. Most often, MTD is confused with a disorder called spasmodic dysphonia, even when experienced voice therapists, voice scientists, and otolaryngologists make the diagnoses (Ludlow et al., 2018). Spasmodic dysphonia is described later in the section, "Neurological Voice Disorders."

Treatment of MTD

The following information on treatment of MTD is based on Andreassen, Litts, and Randall (2017), da Cunha Pereira et al. (2018), Ramig and Verdolini (1998), and the ASHA website on voice therapy (https://www.asha.org/PRPSpecificTopic.aspx?folderid=8589942600&section=Treatment).
Voice therapy for MTD can be direct or indirect. In a direct approach, the therapist works on the individual's ability to produce a better voice. Indirect approaches may involve education about voice production and vocal hygiene (taking good care of the voice mechanism), and counseling when anxiety and depression are believed to play a significant role in MTD.
The general consensus is that voice therapy for MTD is often successful, but that a specific technique among several available has not emerged as a clear choice for maximally effective treatment (Andreassen et al., 2017).

Voice Therapy by Straw

A therapy technique called "semi-occluded vocal tract" has shown promise in the treatment of MTD. Recall from Chapter 10 that phonation is initiated when the vocal folds are brought together and then blown apart when the air pressure below them (in the trachea) is sufficiently greater than the air pressure above them (in the vocal tract). As long as this pressure difference is maintained, the vocal folds vibrate, each cycle of vibration defined by an opening, closing, and closed phase. To a large degree, the force with which the vocal folds close against each other increases as the pressure difference across the vocal folds increases. In MTD, part of the phonation problem is that the excessive tension in head and neck muscles results in overly forceful closure of the vocal folds. Using the semi-occluded vocal tract technique, individuals learn to reduce the excessive closing force by phonating into a straw. Phonation into the straw (usually submerged in water) creates a partial "block" of the air coming through the vocal tract, which raises the vocal tract pressure to a value closer to the pressure below the vocal folds. This reduces the pressure across the vocal folds and therefore reduces the force of vocal fold closing. The reduced force is thought to lead to a less-tense voice, which helps the patient learn a more relaxed voice quality. Straw phonation is gradually eliminated during therapy, with a goal of transferring the reduced-force vocal fold closing to more natural speech.
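The pressure reasoning in the box can be made concrete with a few lines of arithmetic. The values below are illustrative assumptions only, not measurements from this chapter; the point is that raising the vocal tract pressure shrinks the pressure difference across the vocal folds, and with it the closing force.

```python
# Illustrative sketch of the pressure reasoning behind straw phonation.
# All pressures in cm H2O; the specific values are assumed for illustration.

subglottal = 8.0   # pressure below the vocal folds (in the trachea)

# Open vocal tract: pressure above the folds is near atmospheric (0).
open_tract = 0.0
print(subglottal - open_tract)   # 8.0 -> large transglottal pressure

# Straw phonation partially blocks the outflow, raising vocal tract
# pressure toward the subglottal value.
straw_tract = 5.0
print(subglottal - straw_tract)  # 3.0 -> smaller transglottal pressure,
                                 # hence less forceful vocal fold closure
```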
Puberphonia

Another functional voice disorder that responds well to treatment is puberphonia (sometimes called mutational falsetto). This is a disorder of voice pitch in which a postpubescent male with a normal-sized larynx produces phonation with a very high F0, well outside the range typical for adult males. Examination of the larynx fails to show any abnormality or underdevelopment of laryngeal structures that would be consistent with the high-pitched voice.
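For a rough sense of the pitch mismatch involved, the sketch below compares an assumed F0 with an approximate adult male speaking range; both numbers are illustrative assumptions rather than values from this chapter.

```python
# Illustrative check of an F0 value against an approximate adult male
# speaking range (both values assumed for illustration).
typical_male_f0 = (85.0, 155.0)   # Hz, approximate adult male speaking range

observed_f0 = 220.0               # Hz, assumed for a speaker with puberphonia

low, high = typical_male_f0
print(low <= observed_f0 <= high) # False: well outside the typical male range
```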
Treatment of Puberphonia

Puberphonia, a functional voice disorder considered to have psychogenic origins, is treatable. Once patients are shown the ability to generate a male-appropriate voice pitch, there seems to be little problem in maintaining a gender-appropriate pitch (Roy, Peterson, Pierce, Smith, & Houtz, 2017).

Neurological Voice Disorders

Neurologic disorders that affect phonation are seen frequently in voice clinics (De Bodt et al., 2015). Degenerative neurologic diseases, such as Parkinson's disease, multiple sclerosis, and amyotrophic lateral sclerosis, often have a voice problem as a prominent symptom. The same can be said of strokes in adults, head injuries in children and adults, and congenital neurologic diseases such as cerebral palsy. The focus in this chapter is on dysphonia in two neurological disorders: vocal fold paralysis and spasmodic dysphonia.

Unilateral Vocal Fold Paralysis

Vocal fold paralysis has many causes, including inflammatory conditions, neck or chest trauma, neck or chest tumors, and surgical procedures. A diagnosis of vocal fold paralysis implies an absence of innervation by the nervous system to one or both folds. There are also cases of vocal fold paresis ("paresis" means "weak"), in which the affected fold is not completely paralyzed but is weakened to varying degrees by dysfunction of the nerves supplying the larynx (Syamal & Benninger, 2016). Many of the symptoms of vocal fold paralysis and vocal fold paresis are similar; therefore, the following discussion focuses on paralysis.

The nerves that control contraction of laryngeal muscles exit the brainstem and run down the neck and chest, where they are susceptible to compression trauma (e.g., from upper chest injuries) and accidental surgical damage (e.g., during removal of part or all of the thyroid gland). Nerves supplying the larynx are on both sides of the neck; surgical or traumatic injuries may affect only one side, resulting in unilateral vocal fold paralysis.

Figure 18–7 shows two images of unilateral vocal fold paralysis, one during inhalation (left) and the other when the speaker was attempting to phonate (right). The paralysis of the left vocal fold (right, in the image) is due to injury to the nerve that innervates the main muscle of the vocal folds as well as the muscles that open, close, and compress them.5

5 The description of the specific nerve paralysis responsible for the vocal fold appearance applies to one branch of the nerves that supply the larynx. There is a nerve that supplies a single muscle of the larynx — the muscle that stretches the vocal fold. This nerve can also be paralyzed, but it is not considered further in this chapter.
Figure 18–7.  Unilateral vocal fold paralysis. Left, paralysis of the left vocal fold during inhalation; right, the paralyzed vocal fold during the closed phase of phonation, as the speaker attempts to bring the vocal folds together. Note the inability of the paralyzed vocal fold to reach the midline for closure. Photos courtesy of Professor Susan Thibeault, Department of Surgery, University of Wisconsin Clinical Sciences Center.

The image on the left shows the vocal folds open for inhalation. Compared with the healthy, right vocal fold, the paralyzed fold is shorter; in many cases, a paralyzed vocal fold also appears to have less mass than a healthy fold.

The image on the right shows the position of the paralyzed vocal fold during the closed interval of vocal fold vibration. Note the lack of contact between the two vocal folds and the slightly curved ("bowed") appearance of the paralyzed fold.

A paralyzed vocal fold can vibrate for phonation. This is because the rapid opening-and-closing motions of the vocal folds for phonation are controlled by air pressures and airflows, not by direct muscular contractions (see Chapter 10 and the earlier discussion). However, the loss of muscular control of the paralyzed vocal fold makes it difficult to achieve adequate vocal fold closure for each cycle of vibration. The paralyzed vocal fold vibrates, but weakly.

The loss of muscular control of the paralyzed vocal fold affects voice quality. As reviewed by Samlan and Story (2017), a paralyzed vocal fold is likely to result in a breathy, weak, and strained voice quality. The strained component of the voice quality reflects an individual's attempt to overcome the inability to achieve good closure of the folds by means of excessive effort. People with a paralyzed vocal fold may experience fatigue when phonating for extended periods of time; it is hard work to produce voice when the vocal folds cannot close effectively.

Treatment of Unilateral Vocal Fold Paralysis

Treatment of unilateral vocal fold paralysis combines direct voice treatment with surgical techniques. Direct voice therapy focuses on improving speech breathing to manage the airflow problems of a "leaky" glottis, and on voice exercises to achieve better vocal fold closure for the closed phase of vibration. Effort exercises, such as pushing against a fixed surface (e.g., a wall) or pulling up on the underside of a chair while sitting in it, may be used to evoke closure of the glottis during phonation.

Surgical techniques to improve phonation in unilateral vocal fold paralysis may include the injection of biomaterials into the paralyzed fold to "plump it up." This provides a larger mass against which the healthy fold can make contact for phonation. Another popular surgical technique is to reposition the paralyzed fold closer to the midline, so it is easier for the healthy fold to contact it during vibration. The measure of success of these surgical techniques is an improved voice quality, as well as better swallowing function (see Chapter 20).

Bilateral vocal fold paralysis, a rare disorder in which the nerves serving the muscles of the larynx are damaged on both sides of the neck, is a life-threatening condition. Because these nerves control the single muscle of the larynx that separates (abducts) the vocal folds, paralysis on both sides prevents separation of the folds for inhalation. A tracheostomy (an opening in the neck, below the vocal folds) is a common way to provide a patient with an alternate airway to sustain life. Another surgical approach is a permanent repositioning of one or both vocal folds away from the midline to restore the natural airway (Naunheim, Song, Franco, Alkire, & Shrime, 2017).

Spasmodic Dysphonia

Spasmodic dysphonia (SD) is a rare voice disorder regarded by a majority of voice specialists as a neurological disease. The estimated prevalence of SD is 1 in every 100,000 people. This estimate is complicated by disagreement among professionals as to the nature, and even the existence, of the disorder as a neurologically based voice disorder (Hintze, Ludlow, Bansberg, Adler, & Lott, 2017).

Patients diagnosed with SD typically have intermittent, irregularly occurring voice spasms when they attempt to phonate; they may also have voice tremors (shaking voice). The spasms are related to massive and sustained hypercontraction of laryngeal muscles. In some cases, hypercontraction is observed in muscles above the larynx, such as the tongue. Patients with SD appear to be exerting tremendous effort to initiate and maintain phonation.

SD is a controversial disorder. The reader may notice a similarity between the symptoms of SD and those of MTD. The two diagnoses (SD implying a neurologic cause, MTD a psychological cause) are often in dispute. Voice specialists who are asked to make a diagnosis of one or the other disorder, based on audio recordings of speech or views of the vocal folds during phonation, often disagree. Even the case histories and other characteristics of the individuals with the two disorders can be very similar. For example, in both disorders, a large percentage of patients are women, typically professional and/or frequent voice users in the middle years of life. Many patients with either of the two diagnoses report a traumatic event preceding the onset of voice symptoms.

Differences between SD and MTD include the following: (a) SD does not seem to respond to voice therapy in the way described previously for MTD; (b) the symptoms of SD seem to be sensitive to specific speech sounds, whereas the symptoms of MTD are more constant regardless of which speech sounds are produced; and (c) a significant number of patients with SD have a voice tremor, whereas patients with MTD do not.

Treatment of Spasmodic Dysphonia

SD, as mentioned earlier, does not respond well to behavioral voice treatment. Instead, SD is treated by injection of the neurotoxin botulinum toxin into the vocal folds or other muscles of the larynx. Botulinum toxin inhibits the release of the neurotransmitter acetylcholine at the junction of motor nerves and muscle fibers. Acetylcholine is required for the contraction of muscles. By inhibiting the release of acetylcholine to laryngeal muscles, laryngeal spasms are less likely to occur, allowing patients to phonate more or less normally until the effect of the drug wears off.6

Cancer of the Larynx

The vocal folds are affected in about 50% of laryngeal cancers, but cancer can occur at any site in the larynx. Cancer of the larynx is almost always a disease of adulthood and is substantially more frequent in males than in females. At age 60 years, the 2005 incidence in males and females was approximately 35 in 100,000 and 5 in 100,000, respectively (Schultz, 2011).
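A short calculation makes these incidence figures concrete; the cohort size below is an arbitrary assumption chosen for illustration.

```python
# Expected new cases implied by the 2005 incidence figures for
# 60-year-olds reported by Schultz (2011).
male_rate = 35 / 100_000
female_rate = 5 / 100_000

cohort = 100_000  # assumed number of 60-year-olds of each sex, for illustration
print(cohort * male_rate)       # 35.0 expected cases among males
print(cohort * female_rate)     # 5.0 expected cases among females
print(male_rate / female_rate)  # 7.0 -> males affected at seven times the rate
```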
In most cases, cancer of the larynx is a result of long-term tobacco and/or alcohol use. Other environmental factors (such as exposure to certain chemicals) also contribute to the risk of laryngeal cancer.

Often, the initial symptoms of laryngeal cancer are voice changes that become increasingly severe over time. The dysphonia in laryngeal cancer has been described as hoarse, rough, and irregular. Cancerous lesions on the vocal folds interfere with the motions essential to normal phonation. The dysphonia worsens as the lesions increase in size.

Treatment of Cancer of the Larynx

An initial bout of dysphonia may not be regarded as unusual by an individual. Voice changes such as laryngitis may interfere with phonation for several days or even longer but may not seem to warrant a visit to an otolaryngologist. Even persistent voice changes coupled with swallowing problems may not seem like compelling reasons to see a physician.

6 Botox injections in laryngeal muscles are effective for about 3 to 4 months. Patients return for periodic injections to maintain the ability to phonate without spasms.



Laryngeal cancers in early stages may be treated by surgeries that remove the lesion while preserving sufficient tissue for voice production and swallowing. More advanced stages of laryngeal cancer require more extensive surgeries, ranging from removal of large parts of the larynx to removal of the entire larynx. This latter procedure, called a laryngectomy, requires the creation of a new airway because the path through the larynx is eliminated. The trachea is attached via a small opening in the front of the neck, providing the ability to breathe for life.

Surgeries for the removal of laryngeal cancers are performed with an objective of restoring the ability to produce speech with a sound source. Most often, a valve connecting the trachea with the esophagus is inserted during surgery. The valve connects air within the trachea to the esophagus; when pressure in the respiratory system raises the tracheal pressure, air is forced through the valve into the esophagus. The esophageal pressure causes sound-producing vibration of the ring of muscles at the top of the esophageal tube, resulting in a sound source for the articulation of speech sounds. This sound source does not have the quality of vibrating vocal folds, but it is sufficient for intelligible speech. Other techniques for a substitute sound source following removal of the larynx are also available (Sapienza & Ruddy, 2017).

Pediatric Voice Disorders

Many of the vocal fold conditions/diseases that result in dysphonia in children are similar to those described previously for adults. The summary that follows is based on a selected group of relevant publications (Lee, Roy, & Dietrich, 2018; Possamai & Hartley, 2013; Smillie, McManus, Cohen, Lawson, & Wynne, 2014; Smith, 2013; Verdolini Abbott, 2013). A comprehensive textbook on voice disorders in children has been written by Kelchner, Brehm, and Weinrich (2014).

Prevalence of Childhood Voice Disorders

The prevalence of voice disorders in children has been estimated to be 4% to 6%. Some authors believe the prevalence may be even greater because a childhood dysphonia may not be regarded as a health concern. The prevalence estimate of 4% to 6% is similar to the prevalence estimate of voice disorders in the adult population (Bainbridge et al., 2017).

The child larynx is not simply a scaled-down version of the adult larynx (Chapter 10). Most importantly, the layer structure of adult vocal folds is not fully developed in children. Benign vocal fold masses are, in adults, typically located within the cover of the vocal folds ​— the outer two layers. In adults, it is relatively easy to remove a mass such as a polyp and reattach the cover to preserve near-normal phonation. In children, the absence of well-defined outer layers increases the difficulty of a surgeon's task to remove a mass without damaging the developing layers.

Types of Childhood Voice Disorders

Vocal nodules are the most frequent cause of dysphonia in childhood. The cause of vocal nodules in children is similar to their cause in adults — phonotrauma, resulting from excessive closing forces during vocal fold vibration and subsequent damage to the outer edges of vocal fold tissue. The dysphonia heard in cases of childhood vocal nodules is often described as hoarseness.

Other benign vocal fold masses, such as polyps and cysts, are also found in children. Like nodules, these masses have the potential to interfere with adequate vocal fold closure. And as with nodules, polyps, and cysts in adults, the interference with closure may result in compensation with excessive muscular effort to achieve closure. The original problem — poor closure of the vocal folds due to a phonotrauma-related mass — is magnified by attempts to overcome the dysphonia with increased phonotraumatic behaviors.

For children under the age of about 12 years, vocal nodules are more common among boys than girls. There is evidence that children — and perhaps especially boys — with extraverted, talkative, and immature behavior styles are more likely to develop vocal nodules. Similar evidence of a link between personality type and vocal nodules exists for adults (Roy, Bless, & Heisey, 2000).

Treatment of Childhood Voice Disorders

The treatments described earlier for adult voice disorders are, in many cases, also applied to childhood voice disorders. Vocal fold nodules are first treated with behavioral therapy to minimize talking for a period of time and to teach and practice the use of a "normal" voice. When nodules do not respond to voice therapy, the masses may be removed surgically. Polyps and cysts are more likely than nodules to be treated surgically and followed up with voice therapy.

Chapter Summary

There are many different types of voice disorders, with many different causes.

Dysphonia is the term used to describe a voice that seems abnormal in pitch, loudness, and/or quality. Based on survey data, it is estimated that approximately 6% to 7% of the adult population experiences dysphonia over any 12-month period.

Dysphonia is more common among professional voice users (teachers, singers, actors) than in the general population and can have profound effects on social, emotional, and employment aspects of life.

A case history, perceptual evaluation, laryngoscopic study of the vocal folds, and measurement of acoustic and aerodynamic parameters are important to the accurate diagnosis of a voice disorder.

Important acoustic measures of the voice include fundamental frequency (F0), intensity, and a voice spectrum; the perceptual correlates of these measures are pitch, loudness, and quality, respectively.

Voice disorders are classified in several ways that overlap; the different classifications are not independent.

Voice disorders can be classified along a hypo-hyperfunctional continuum that ranges from very weak to overly forceful closure of the vocal folds during their vibration.

Phonotrauma is a classification for a group of voice disorders resulting from excessive use of the voice that results in damage to vocal fold tissue, which in some cases may lead to nodules, polyps, and other lesions on the vocal folds.

Organic voice disorders are classifications of dysphonias caused by benign mass lesions on the vocal folds such as nodules and polyps, which typically are the result of phonotrauma.

Functional voice disorders are dysphonias in which there is no laryngeal pathology that explains the disorder, as in MTD; a subset of functional voice disorders, called psychogenic voice disorders, are thought to have their roots in psychological disorders.

Neurological voice disorders are those in which damage to peripheral nerves or structures of the central nervous system affects vocal fold vibration as a result of weakness, paralysis, or dyscoordination of laryngeal muscles; laryngeal spasms during phonation may also reflect a central nervous system disorder.

Laryngeal cancer may occur anywhere within the larynx; lesions of the vocal folds occur in nearly half of all laryngeal cancers and interfere with vocal fold vibration, and therefore result in dysphonia.

Treatment for voice disorders ranges from direct therapy to reduce or eliminate dysphonia (including voice exercise and laryngeal manipulation), to indirect therapy to counsel patients on issues that may be associated with dysphonia, to surgical techniques to remove benign masses and cancerous lesions.

Pediatric voice disorders such as benign vocal fold masses result in a dysphonia often called "hoarseness" and may not be recognized by parents or teachers as a disorder requiring professional attention; a combination of voice and/or surgical therapy may be used for pediatric voice disorders, depending on the type of pathology causing the dysphonia.

References

Altman, K. W. (2007). Vocal fold masses. Otolaryngologic Clinics of North America, 40, 1091–1108.

Andrea, M., Dias, Ó., Andrea, M., & Figueira, M. L. (2017). Functional voice disorders: The importance of the psychologist in clinical voice assessment. Journal of Voice, 31, 507.e13–507.e22.

Andreassen, M., Litts, J. K., & Randall, D. R. (2017). Emerging techniques in assessment and treatment of muscle tension dysphonia. Current Opinion in Otolaryngology and Head and Neck Surgery, 25, 447–452.

Aronson, A. (1990). Clinical voice disorders (3rd ed., p. 131). New York, NY: Thieme Medical.

Bainbridge, K. E., Roy, N., Losonczy, K. G., Hoffman, H. J., & Cohen, S. M. (2017). Voice disorders and associated risk markers among young adults in the United States. Laryngoscope, 127, 2093–2099.

Baken, R. J., & Orlikoff, R. F. (2000). Clinical measurement of speech and voice. San Diego, CA: Singular Publishing.

Bastian, R. W., & Thomas, J. P. (2016). Do talkativeness and vocal loudness correlate with laryngeal pathology? A study of the vocal overdoer/underdoer continuum. Journal of Voice, 30, 557–562.

Carding, P., Bos-Clark, M., Fu, S., Gillivan-Murphy, P., Jones, S. M., & Walton, C. (2017). Evaluating the efficacy of voice therapy for functional, organic and neurological voice disorders. Clinical Otolaryngology, 42, 201–217.

da Cunha Pereira, G., de Oliveira Lemos, I., Gadenz, C., & Cassol, M. (2018). Effects of voice therapy on muscle tension dysphonia: A systematic literature review. Journal of Voice, 32, 546–552.

De Bodt, M., Van den Steen, L., Mertens, F., Raes, J., Van Bel, L., Heylen, L., . . . van de Heyning, P. (2015). Characteristics of a dysphonic population referred for voice assessment and/or voice therapy. Folia Phoniatrica et Logopaedica, 67, 178–186.

Dietrich, M., Verdolini Abbott, K., Gartner-Schmidt, J., & Rosen, C. A. (2008). The frequency of perceived stress, anxiety, and depression in patients with common pathology affecting voice. Journal of Voice, 22, 472–488.

Eastwood, C., Madill, C., & McCabe, P. (2014). The behavioural treatment of muscle tension voice disorders: A systematic review. International Journal of Speech-Language Pathology, 17, 287–303.

Hintze, J. M., Ludlow, C. L., Bansberg, S. F., Adler, C. H., & Lott, D. G. (2017). Spasmodic dysphonia: A review. Part 1: Pathogenic factors. Otolaryngology–Head and Neck Surgery, 157, 551–557.

Hixon, T. J., Weismer, G., & Hoit, J. D. (2020). Preclinical speech science: Anatomy, physiology, acoustics, perception (3rd ed.). San Diego, CA: Plural Publishing.

Hosoya, M., Kobayashi, R., Ishii, T., Senarita, M., Kuroda, H., Misawa, H., . . . Tsunoda, K. (2018). Vocal hygiene education program reduces surgical interventions for benign vocal fold lesions: A randomized controlled trial. Laryngoscope, 128, 2593–2599.

Kelchner, L. N., Brehm, S. B., & Weinrich, B. D. (2014). Pediatric voice: A modern, collaborative approach to care. San Diego, CA: Plural Publishing.

Kempster, G. B., Gerratt, B. R., Verdolini Abbott, K., Barkmeier-Kraemer, J., & Hillman, R. E. (2009). Consensus auditory-perceptual evaluation of voice: Development of a standardized clinical protocol. American Journal of Speech-Language Pathology, 18, 124–132.

Kent, R. D. (1976). Anatomical and neuromuscular maturation of the speech mechanism: Evidence from acoustic studies. Journal of Speech and Hearing Research, 19, 421–447.

Kreiman, J., Gerratt, B. R., Kempster, G. B., Erman, A., & Berke, G. S. (1993). Perceptual evaluation of voice quality: Review, tutorial, and a framework for future research. Journal of Speech and Hearing Research, 36, 21–40.

Lee, J. M., Roy, N., & Dietrich, M. (2018). Personality, psychological factors, and behavioral tendencies in children with vocal nodules: A systematic review. Journal of Voice. https://doi.org/10.1016/j.jvoice.2018.07.016

Lee, S., Potamianos, A., & Narayanan, S. (1999). Acoustics of children's speech: Developmental changes of temporal and spectral parameters. Journal of the Acoustical Society of America, 105, 1455–1468.

Leonard, R. (2009). Voice therapy and vocal nodules in adults. Current Opinion in Otolaryngology and Head and Neck Surgery, 17, 453–457.

Ludlow, C. L., Domangue, R., Sharma, D., Jinnah, H. A., Perlmutter, J. S., Berke, G., . . . Stebbins, G. (2018). Consensus-based attributes for identifying patients with spasmodic dysphonia and other voice disorders. JAMA Otolaryngology–Head and Neck Surgery, 144, 657–665.

Martins, R. H., Pereira, E. R., Hidalgo, C. B., & Tavares, E. L. (2014). Voice disorders in teachers: A review. Journal of Voice, 28, 716–724.

Naunheim, M. R., & Carroll, T. C. (2017). Benign vocal fold lesions: Update on nomenclature, cause, diagnosis, and treatment. Current Opinion in Otolaryngology and Head and Neck Surgery, 25, 453–458.

Naunheim, M. R., Song, P. C., Franco, R. A., Alkire, B. C., & Shrime, M. G. (2017). Surgical management of bilateral vocal fold paralysis: A cost-effectiveness comparison of two treatments. Laryngoscope, 127, 691–697.

Nishio, M., & Niimi, S. (2008). Changes in speaking fundamental characteristics with aging. Folia Phoniatrica et Logopaedica, 60, 120–127.

Possamai, V., & Hartley, B. (2013). Voice disorders in children. Pediatric Clinics of North America, 60, 879–892.

Ramig, L. O., & Verdolini, K. (1998). Treatment efficacy: Voice disorders. Journal of Speech, Language, and Hearing Research, 41, S101–S116.

Rosen, D. C., Heuer, R. J., Levy, S. H., & Sataloff, R. T. (2003). Psychologic aspects of voice disorders. In J. S. Rubin, R. T. Sataloff, & G. S. Korovin (Eds.), Diagnosis and treatment of voice disorders (2nd ed., pp. 479–506). Clifton Park, NY: Delmar Learning.

Roy, N., Bless, D. M., & Heisey, D. (2000). Personality and voice disorders: A superfactor analysis. Journal of Speech, Language, and Hearing Research, 43, 749–768.

Roy, N., Merrill, R. M., Gray, S. D., & Smith, E. (2005). Voice disorders in the general population: Prevalence, risk factors, and occupational impact. Laryngoscope, 115, 1988–1995.

Roy, N., Peterson, E. A., Pierce, J. L., Smith, M. E., & Houtz, D. R. (2017). Manual laryngeal reposturing as a primary approach for mutational falsetto. Laryngoscope, 127, 645–650.

Samlan, R. A., & Story, B. H. (2017). Influence of left-right asymmetries on voice quality in simulated paramedian vocal fold paralysis. Journal of Speech, Language, and Hearing Research, 60, 306–321.

Sapienza, C., & Ruddy, B. H. (2017). Voice disorders (3rd ed.). San Diego, CA: Plural Publishing.

Schultz, P. (2011). Vocal fold cancer. European Annals of Otorhinolaryngology, Head and Neck Diseases, 128, 301–308.

Smillie, I., McManus, K., Cohen, W., Lawson, E., & Wynne, D. M. (2014). The paediatric voice clinic. Archives of Disease in Childhood, 99, 912–915.

Smith, M. E. (2013). Care of the child's voice: A pediatric otolaryngologist's perspective. Seminars in Speech and Language, 34, 63–70.

Syamal, N. M., & Benninger, M. S. (2016). Vocal fold paresis: A review of clinical presentation, differential diagnosis, and prognostic indicators. Current Opinion in Otolaryngology and Head and Neck Surgery, 24, 197–202.

Van Houtte, E., Van Lierde, K., & Claeys, S. (2011). Pathophysiology and treatment of muscle tension dysphonia: A review of the current knowledge. Journal of Voice, 25, 202–207.

Verdolini Abbott, K. (2013). Some guiding principles in emerging models of voice therapy for children. Seminars in Speech and Language, 34, 80–93.
19
Craniofacial Anomalies

Introduction

Craniofacial anomalies include a wide range of disorders with effects on speech production that are, in a general sense, predictable from the structural (anatomical) problems in the head and neck area. This chapter describes the origin and nature of craniofacial anomalies and relates the structural problems to speech production problems. In addition, the concept of syndromes is introduced. A craniofacial anomaly is often one component of a syndrome, in which there are multiple anomalies. The knowledge that craniofacial anomalies are often a syndrome component is important in health care settings, where the totality of a child's needs must be considered and integrated across the various specialists who treat the different problems.

Definition and Origins of Craniofacial Anomalies

A craniofacial anomaly is defined as any deviation from normal structure, form, or function in the head and neck area of an individual. The craniofacial anomalies that receive the most attention in this chapter are cleft lip and cleft palate. Cleft lip and cleft palate can occur independently of each other but often occur together. To appreciate the potential independence of cleft lip and cleft palate, basic aspects of embryological development of head and neck structures must be considered. For a more in-depth presentation of general embryology, with excellent chapters on head and neck embryology, see Schoenwolf, Bleyl, Brauer, and Francis-West (2014) and Moore, Persaud, and Torchia (2019).

Most cases of cleft lip and/or cleft palate result from errors in embryological development that produce incomplete or incorrectly formed structures. People who study such embryological errors are called dysmorphologists; dysmorphogenesis is the process of abnormal tissue development during embryological development.

Embryological Development of the Upper Lip and Associated Structures

Figure 19–1 shows a photo of an embryo 26 to 28 days after fertilization; an artist's drawing of the embryo is at the right. The view is from the side. The structures labeled "pharyngeal arches" are duplicated on the other side of the embryo (that is, the structures of the embryo are bilaterally symmetrical). Each of the pharyngeal arches, the four ridges in sequence along the side of the embryo, is separated from the adjacent arch by a deep groove. The pharyngeal arches are numbered from one (closest to the head) to four, as labeled in Figure 19–1. A fifth and sixth pharyngeal arch are located behind the fourth, but they lack the prominent ridge-like bulge of the first four and are not easily seen.


Figure 19–1.  Left, photo of an embryo 26 to 28 days after fertilization; right, artist's drawing of the embryo with the first four pharyngeal arches labeled.

The pharyngeal arches are the source of embryological tissue for the development of the majority of head and neck structures. The first pharyngeal arch is the source of tissue for the development of the lower lip, jaw, upper lip, and hard palate, as well as many structures of the ear.

A frontal view of the embryonic head tissue that develops into head and neck structures is shown at 5, 6, 8, and 10 weeks post-fertilization in Figure 19–2. The structures labeled "mandibular prominence" and "maxillary prominence" are generated from the first pharyngeal arch. In the center of the embryo, the small space just above the mandibular prominence (henceforth, "mandibular arch") is the primitive mouth, which is called the "stomodeum." Embryological development of the mandibular arch results in the lower jaw (the mandible). The development of the maxillary prominence results in the fully formed upper lip, parts of the nose, a small wedge of bone that will eventually contain the upper four front teeth, and the bones that form the roof of the mouth. Most of the muscular soft palate (velum) is formed from the 4th and 6th pharyngeal arches.

The formation of the upper lip and its associated structures proceeds as follows. Roughly 6 weeks after fertilization, the two maxillary prominences begin to "push" toward the center of the face. The nearly circular structures right next to the maxillary prominences (Figure 19–2, upper right panel), which could be mistaken for the primitive eyes but are actually the early version of the nose, are moved toward the center of the face by this push.

As the maxillary prominences and circular structures move toward the center of the face, they begin to fuse, meaning that tissues of the two structures knit together. Figure 19–2 shows this process as a sequence of drawings over time. The fusion is complete around 10 weeks and, when successful, creates an upper lip that is continuous from the right to the left corner of the mouth, a well-formed nose, and a small wedge of bone in the front of the mouth. Figure 19–3 (discussed further in the next paragraphs) shows on the right another view, looking up at the roof of the mouth, of the completed upper lip and wedge of bone. Keep in mind that the wedge of bone that will contain the upper four teeth is formed with the upper lip and soft tissue of the nose. It is not considered part of the hard palate, even though it becomes bone.

Figure 19–3 shows the formation of the hard palate in two views, one looking directly into the mouth (left column of images), and the other looking up at the roof of the mouth (right column of images). At 7 weeks, the two shelves of embryological tissue (called the palatine shelves; see Figure 19–3) that become the bony hard palate are widely separated; there is an open space between them.

Figure 19–2.  Embryological development of the upper lip and associated structures (nose, philtrum, and intermaxillary segment, the wedge of bone containing the upper four front teeth; see Figure 19–3) at 5, 6, 8, and 10 weeks post-gestation. Panels are labeled for the maxillary prominence, mandibular prominence, eye, stomodeum, and philtrum.

The top image in the left column shows why this is so. The palatine shelves are oriented at a downward angle and, in fact, are beneath the tongue tissue, which is high in the mouth because the mandible (lower jaw) is still very small. The embryological tongue prevents the palatine shelves from "snapping up" into the horizontal position, which is required for the shelves to meet and fuse, thus forming the hard palate. As indicated by the downward-pointing arrow (left column, top image), the tongue must be lowered to get out of the way of the palatine shelves. The right-hand image at 7 weeks, the view from above the tongue looking directly up at what becomes the roof of the mouth, shows the wide separation between the palatine shelves.

At 9 weeks, the tongue has lowered, and the palatine shelves have snapped up into the horizontal position (seen in both the left and right images of the middle row). The lowering of the tongue is made possible by growth of the mandible, which allows the tongue to drop down in the mouth and free the palatine shelves to move. The shelves are not yet touching each other, as shown in both of these middle-row images, but the opening between the shelves is smaller at 9 weeks than it was at 7 weeks.

The tissue of both shelves grows toward the midline to fuse and form a continuous hard palate that separates the oral cavity from the nasal cavity. This growth is indicated in the image in the right column by the short horizontal arrows pointing toward each other.

Figure 19–3.  Embryological development of the hard and soft palates at 7, 9, and 12 weeks post-gestation. Note at 7 weeks the palatine shelves trapped under the tongue; at 9 weeks the mandible has grown substantially, allowing the tongue to drop in the oral cavity so the palatine shelves can snap up into the horizontal position and grow to the midline, where they fuse together, front to back. Left column of images, looking directly into the mouth; right column of images, looking upward to the roof of the mouth. Labeled structures include the nasal chamber, nasal septum, primary palate, palatine shelves, tongue, incisive foramen, fused palatal shelves, oral cavity, and uvula.

The bottom images show the development of the structures at 12 weeks post-gestation, with the fused palatine shelves forming a continuous hard palate. The closure of the hard palate is seen looking up to the roof of the mouth (left image) and in the frontal view (right image). The image in the right column contains a vertical arrow pointing from the front of the mouth to the back. This is the direction of fusion of the hard and soft palates. Fusion occurs first at the front, immediately behind the yellow wedge of bone described previously, and continues systematically from front to back, like a closing zipper. This fusion pattern is important for understanding partial clefts of the palates, which are discussed later.

Embryological Errors and Clefting:  Clefts of the Lip

Cleft lips are the result of an error (dysmorphogenesis) in the embryological process of upper lip development. The errors are failures of the tissue moving from the two sides of the developing lip to meet and "knit" together to form a continuous lip. These embryological errors occur in varying degrees, and on one or both sides of the lip midline. A cleft on one side only is a unilateral cleft lip; a cleft on both sides is a bilateral cleft lip. Unilateral cleft lips are more common than bilateral cleft lips.

Unilateral cleft lips may be partial, sometimes no more than a "notch" in the upper lip, or a cleft extending halfway between the upper lip and the floor of the nostrils. A complete unilateral cleft lip extends into the base of the nasal cavity (Figure 19–4, left). Note how the cleft is to the left of the center of the mouth (see Box, "More Embryology").

Figure 19–4, right, shows bilateral (both sides), complete clefts of the upper lip. The mass of tissue in the center of the lip is the intermaxillary segment (the wedge of bone in Figure 19–3, in yellow), which, like the lip, is cleft on both sides and therefore not attached to the maxillary arch (pink in Figure 19–3).

Embryological Errors and Clefting:  Clefts of the Palate

Clefts of the palate are the result of an error in development of the palatal shelves, or of the mandible, or both.

More Embryology

Two questions that are often asked by students about orofacial embryology are as follows: (a) why are clefts of the lip off the midline (to the left or right of the center of the mouth), and (b) what is the meaning of a unilateral versus bilateral cleft of the palate? Both questions can be answered with reference to Figure 19–3. First, note in the right-hand column of images the boundaries between the yellow wedge of bone (called the intermaxillary segment) and the front part of the maxillary arch (the pink part). Remember that the embryological formation of the intermaxillary segment is part of the formation of the upper lip. The boundaries between the intermaxillary segment and the maxillary arches are off center; they are also the location of fusion of embryological tissue to make a complete upper lip. An embryological error of failure to fuse tissue occurs at these boundary points ​— ​hence, the lip clefts are to the side of the center of the mouth. Second, note in the left-hand column of images the nasal septum, which during embryological development of the palate descends and fuses with the nasal surface of the palate. What happens to the nasal septum when there is a cleft palate? Either the nasal septum is attached to one maxillary (palatine) shelf, or it is attached to neither. A unilateral cleft palate is when the septum is attached to one shelf; a bilateral cleft palate is when the septum is attached to neither shelf.

Figure 19–4.  Left, complete, unilateral cleft of the lip; right, complete bilateral clefts of the lip.

Like clefts of the upper lip, clefts of the palate occur in varying degrees, from a minimal, notch-like split at the back of the soft palate to a full split extending from the back of the soft palate to the bone wedge formed with the development of the upper lip. Figure 19–5 shows drawings of three degrees of cleft palate severity, ranging from the notch-like defect in the back of the soft palate (left), to a partial cleft of the soft and hard palates (middle), to a complete defect of the hard palate, extending to the wedge of bone formed with the upper lip (right). The middle image shows a partial cleft of the soft and hard palates; the dashed line is the approximate boundary between the hard and soft palates.

Clefts of the palate also result when the mandible fails to grow sufficiently to allow the tongue to lower in the oral cavity so that the palatine shelves are free to snap up into the horizontal position. As previously described, early in the embryological development of the hard palate, the palatine shelves are trapped beneath the developing tongue tissue. When the mandible fails to grow at a specific time during the embryological schedule of development, the two shelves remain trapped and lose their ability to fuse together. The cleft palate is a result not of a primary embryological error of palatal development, but of a primary error of mandible development that prevents the palatine shelves from lifting and fusing. The mandible may eventually grow sufficiently to allow the tongue to drop in the oral cavity, and the palatine shelves may elevate to the horizontal position. The problem is that the tissue growth required of the two palatine shelves to extend to the midline for fusion is, like much of the rest of embryological development, on a schedule. When that schedule is not met (as when the high tongue prevents the shelves from meeting their time for growth), the scheduled growth is cancelled.

Photos of partial and complete clefts of the palate are shown in Figure 19–6. A partial cleft is on the left and a complete cleft is on the right. The partial cleft on the left is called a submucous cleft. The mucous membrane that covers the muscles of the soft palate is intact, but the underlying muscle is cleft. Note the small "notch" (arrow) in the soft palate. A complete cleft of the hard and soft palates is seen on the right of Figure 19–6. Both photographs show clefts of the palate in the absence of clefts of the lips.

Cleft Lip With or Without a Cleft Palate; Cleft Palate Only (Isolated Cleft Palate)

As noted, cleft lips and palates occur as independent dysmorphologies. Children may be born with a cleft lip but a fully formed palate, or with an isolated cleft palate and a fully formed upper lip. Some scientists and clinicians believe, however, that a cleft lip with a cleft palate is a more severe form of cleft lip alone. In other words, clefting of the lip and palate together is viewed as one category of clefting, with cleft lip alone being a less severe form of the defect. This category is referred to as cleft lip with or without a cleft palate (abbreviated as CL/P).

Isolated cleft palate (abbreviated as CPO) is regarded as a category separate from CL/P. The distinction between the two categories is made because (a) sex ratios computed for babies born with clefts favor boys for CL/P but favor girls for isolated cleft palate; (b) the incidence (frequency of occurrence) of the two categories is different — approximately 1 in 700 babies are born with CL/P, but the incidence for isolated cleft palate is about 1 in 2,000; and (c) CL/P occurs more frequently in certain racial groups as compared to others (e.g., more frequently in Asian as compared to African American populations), but isolated cleft palate does not seem to vary with racial group (Shkoukani, Lawrence, Liebertz, & Svider, 2014).

Figure 19–5.  Left, a partial cleft of the soft palate; center, partial cleft of soft and hard palates, extending from
the back of the soft palate to the middle of the hard palate; right, complete cleft of the soft and hard palates. The
dashed, horizontal line is the approximate boundary between the hard and soft palates.

Figure 19–6.  Left, partial cleft of the soft palate (a submucous cleft), with a small notch indicated by the arrow; right, complete cleft of the hard and soft palates.

Epidemiology of Clefting

The epidemiology of clefting is complicated because it depends on where the data were collected. The prevalence of clefting at birth varies by country. European birth registries for the years 1993 to 1998 showed a substantially higher rate of CL/P in Sweden and Norway compared with Spain. Even within a single country, such as France, the rate of CPO for the same years was higher than the rate of CL/P.

An average global estimate of the prevalence of orofacial clefting (both CL/P and CPO) is about 1 in every 700 live births (Mossey, Little, Monger, Dixon, & Shaw, 2009). In the United States, the prevalence of live-birth clefting is lower, estimated at 1 in every 940 live births for CL/P, and 1 in every 1,574 for CPO (Parker et al., 2010).
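A quick calculation shows what these prevalence estimates imply for a birth cohort; the cohort size below is an arbitrary assumption chosen for illustration.

```python
# Expected numbers of affected newborns implied by the prevalence
# estimates cited in the text (Mossey et al., 2009; Parker et al., 2010).
births = 100_000               # assumed birth cohort, for illustration

global_clefting = births / 700   # ~143 babies with any orofacial cleft
us_clp = births / 940            # ~106 babies with CL/P (United States)
us_cpo = births / 1_574          # ~64 babies with CPO (United States)

for label, n in [("any cleft (global)", global_clefting),
                 ("CL/P (US)", us_clp),
                 ("CPO (US)", us_cpo)]:
    print(f"{label}: about {round(n)} per {births:,} live births")
```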
In a Norwegian study covering nearly 2,700 live births of children with clefting between the years 1967 and 1998, CL/P was roughly 1.5 times more likely than CPO. The same study reported that 42% of these children had cleft lip only, and 58% had a cleft lip and cleft palate (Harville, Wilcox, Lie, Vindenes, & Abyholm, 2005).

The prevalence of CL/P in different countries should be kept in mind when thinking about the place of clefting in the big picture of medical care. In the United States, CL/P is among the most common birth defects. For more information, see https://www.asha.org/PRPSpecificTopic.aspx?folderid=8589942918&section=Incidence_and_Prevalence

Speech Production in CL/P and CPO

As presented in Chapter 10, the velopharyngeal port (hereafter, VP port) is the passageway between the oral and nasal cavities. The VP port is opened and closed by muscular forces. These forces lift the soft palate to the posterior pharyngeal wall, as shown in Figure 19–7. The salmon-colored soft palate is shown in the open position, which allows airflow coming from the lungs and through the larynx to pass through the pharynx, into the nasal cavities, and through the nostrils to the atmosphere. In contrast, when the soft palate is lifted by muscular forces, it is pushed against the posterior pharyngeal wall (soft palate shown as the red structure with dashed outline). The muscular lift of the soft palate is accompanied by muscular activity of the pharyngeal walls. Pharyngeal muscles squeeze the sides of the soft palate.

Figure 19–7.  Vocal tract with the lips closed for the bilabial stop consonant /b/. With the VP port open (salmon-colored soft palate), the airflow leak into the nasal cavities prevents the pressure buildup in the vocal tract required for correct production of /b/. With the VP port closed (red soft palate), the vocal tract is completely sealed and pressure can be developed for /b/.

These muscular actions — lifting the soft palate and squeezing its sides when it makes contact with the back (posterior) wall of the pharynx — close off communication between the nasal cavities and the oropharyngeal cavities. The effect of squeezing the sides of the soft palate with contraction of the pharyngeal muscles is as relevant to VP closure as the contact of the soft palate with the posterior pharyngeal wall.

Opening of the velopharyngeal port from the closed position is largely due to relaxation of the lifting and squeezing muscles, which allows gravity to pull down on the soft palate. A small degree of muscular contraction may also contribute to opening the VP port. Overall, the VP port is closed and opened like the tightening and loosening of a sphincter.

The VP port is constantly opening and closing during speech production, opening for nasal sounds such as "m," "n," and the "ng" sound in words like "ring," and closing for vowels and obstruent consonants. In English, the VP port closes for vowels and other vocalic-like sounds (such as "r," "l," "w," and diphthongs like the sounds in "eye," "ay," and "oy") to prevent them from sounding too nasal. These vowel and vowel-like sounds can be produced with an open VP port, but they will sound displeasing because of excessive nasality resulting from the transmission of sound through the nasal cavities.

Closure of the VP port for obstruent consonants — stops, fricatives, and affricates — is required for the buildup of oral pressure in the vocal tract. Obstruents are speech sounds that have a tight or complete constriction in the vocal tract. When airflow encounters these constrictions, pressure behind them rises above atmospheric pressure, provided there is no leak to the atmosphere. These positive pressures are necessary for the correct production of obstruents, as discussed in Chapters 10 and 12. A potential leak through the VP port is prevented by its closure, so that the effort of increasing air pressure behind the obstruent constriction is not compromised by the escape of air through the nasal passageways.

The requirement that the VP port be closed for production of obstruents is illustrated in Figure 19–7 by closure of the lips for the bilabial stop consonant /b/. The lip closure seals the vocal tract at its front end, and closure of the VP port seals the vocal tract toward the back. As air flows into the closed volume of the vocal tract, pressure builds up, just as it should for a stop consonant. A completely sealed vocal tract cavity is necessary for all stops, fricatives, and affricates.

This review suggests two major problems when an individual does not have good control over the opening and closing of the VP port during speech production. First, the individual is likely to have intermittent or chronic hypernasality. Hypernasality denotes excessive nasality during speech, especially during vowels. Hypernasality is regarded by most listeners as aesthetically displeasing and has the potential to produce a muffled-sounding sequence of sounds, making speech relatively difficult to understand. Second, a speaker who has difficulty closing the VP port cannot produce obstruents correctly because of a partial or complete inability to develop a positive pressure in the vocal tract. An open VP port during an attempt to produce a stop consonant results in air leaking through the nasal cavities and into the atmosphere. This makes it nearly impossible to build up and maintain an oral pressure for the time interval during which the oral cavity is sealed by either the lips (as in "b" and "p") or by contact between the tongue and part of the palate ("t," "d," "k," "g"). Failure to develop a positive oral pressure and produce obstruents correctly is likely to have a major effect on speech intelligibility.
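The aerodynamic cost of an unclosed VP port can be sketched with a standard orifice-flow approximation; the gap area, pressure target, and physical constants below are illustrative assumptions, not values from this chapter.

```python
import math

# Minimal sketch of the airflow through a velopharyngeal gap, using a
# standard orifice-flow approximation: flow = k * area * sqrt(2 * P / rho).
# All values are illustrative assumptions, not data from the text.

rho = 1.2   # approximate density of air, kg/m^3
k = 0.65    # orifice discharge coefficient (dimensionless)

oral_pressure_cm_h2o = 5.0                 # assumed pressure target for a stop
pressure_pa = oral_pressure_cm_h2o * 98.1  # convert cm H2O to pascals

gap_area_m2 = 10.0 * 1e-6                  # assumed 10 mm^2 VP gap, in m^2

flow_m3_s = k * gap_area_m2 * math.sqrt(2 * pressure_pa / rho)
print(round(flow_m3_s * 1e6))  # ~186 mL of air per second leaking nasally

# Replacing such a leak continuously is costly, which is why oral pressure
# for stops and fricatives collapses when the VP port cannot close.
```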
Patients who have problems with control of the VP and behavioral treatment of the problem.
port have velopharyngeal insufficiency (typically abbre- Fortunately, speech-language pathologists and
viated VPI). VPI is a matter of degree; patients may surgeons have techniques for direct visualization of
have no control over the opening and closing of the VP the VP port. Two of these techniques are described
port, whereas others may have some, but not complete briefly here; the interested reader can consult Peterson-
control. It is not unusual for a person with a repaired Falzone, Trost-Cardamone, Karnell, and Hardin-Jones
cleft palate to have some remaining VPI, and therefore (2017), and Kummer (2018, 2020) for further informa-
hints of hypernasality and perhaps audible escape tion on these and other techniques.
of air through the nose when he produces obstruent Most patients with craniofacial anomalies and
consonants. VPI have x-ray studies performed to gain knowledge
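The kind of simple scale just described can be represented in a few lines of code; the listener ratings below are invented for illustration.

```python
from statistics import median

# Hypothetical ratings of one child's speech sample by five listeners,
# on a 1-5 scale (1 = no hypernasality, 5 = severe hypernasality).
ratings = [3, 4, 3, 2, 4]

# Ordinal scales are usually summarized with the median rather than the mean.
print(median(ratings))             # 3 -> moderate hypernasality
print(min(ratings), max(ratings))  # 2 4 -> spread hints at listener disagreement
```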
Perceptual analysis, however, cannot give reliable information on the magnitude of, or the specific reasons for, the VPI problem. For example, two patients may have the same-sounding hypernasality and consonant errors due to VPI but have different underlying reasons for the VPI. Recall the sphincter-like closure of the VP port, produced by several different muscles. Different muscles may be responsible for VPI in different speakers. Precise knowledge of which muscles are contributing to VPI may have direct implications for surgical and behavioral treatment of the problem.

Fortunately, speech-language pathologists and surgeons have techniques for direct visualization of the VP port. Two of these techniques are described briefly here; the interested reader can consult Peterson-Falzone, Trost-Cardamone, Karnell, and Hardin-Jones (2017) and Kummer (2018, 2020) for further information on these and other techniques.

Most patients with craniofacial anomalies and VPI have x-ray studies performed to gain knowledge of the function of the VP port. These studies focus the x-rays at the VP port from several different angles. A technique called videofluoroscopy generates x-ray motion pictures to evaluate movement of VP port structures during speech. An example of the benefit of these studies is the ability to determine whether VPI is due to inadequate movement of the soft palate, of the pharyngeal walls (due to contraction of the pharyngeal muscles, as previously described), or of both. This information, which cannot be derived from perceptual evaluation, is very informative for the combined efforts of an SLP and a craniofacial surgeon in determining future management of an ongoing problem with VPI.

Another way to visualize the VP port is by means of a nasoendoscope inserted through the nose and positioned just above the port. Advantages of this kind of examination are the ability to see the entire, circular perimeter of the port, and to see which parts of the circle are not moving in cases of VPI. Nasoendoscopy is usually performed using a topical anesthetic applied within the nasal cavities, which lessens the discomfort of the procedure. A skilled endoscopist can collect valuable information from this procedure without exposing the patient to radiation.

Advantages of this kind of examination are the ability to see the entire, circular perimeter of the port, and to see which parts of the circle are not moving in cases of VPI. Nasoendoscopy is usually performed using a topical anesthetic applied within the nasal cavities, which lessens the discomfort of the procedure. A skilled endoscopist can collect valuable information from this procedure without exposing the patient to radiation.

VPI and Hypernasality

As previously noted, the VP port opens and closes during speech. The three sounds of English called nasals (“m,” “n,” “ng”) are produced with an open VP port because they are meant to sound nasal. An open VP port during the production of vowels and other vocalic sounds, however, causes speech to sound hypernasal. This is because sound waves move through the nasal cavities, where they produce characteristic nasal resonances. The vowels of the person with VPI therefore have the characteristic resonances of the vocal tract — those usually associated with vowels and vocalic sounds — mixed with the nasal resonances. It is as if the vowels are “colored” by nasal acoustic energy, a coloring not present when the VP port is closed for vowel production.

The idea of closure of the VP port for obstruents and opening for nasals is a simplification of the behavior of the port during connected speech such as occurs during conversation. The size of the VP port is constantly changing during connected speech, from completely closed to completely open, as well as many sizes in between these two extremes. The complexity of VP port behavior during speech produced by a healthy adult can be observed at https://www.youtube.com/watch?v=-kHtGlhPs3Y .

Children who are born with a cleft palate and who have had surgery to repair the cleft around 12 months of age may still, in later childhood, be perceived as excessively hypernasal. Among recently studied 10-year-olds with repaired clefts and variable severity of hypernasality, children with greater severity had other developmental difficulties (such as delayed language or reading difficulties); most children with repaired clefts and “normal” nasality did not have such problems (Feragen, Auckner, Særvold, & Hide, 2017).

When compared to children who are not hypernasal, children who have a surgically repaired hard palate but remain hypernasal are perceived as less intelligent, less likely to make friends, and more likely to be teased (Watterson, Mancini, Brancamp, & Lewis, 2013). These are negative social consequences of hypernasality in cleft palate speech.

VPI, Consonant Articulation, and Speech Intelligibility

Speech-language pathologists are interested in the kinds of consonant errors made by individuals who seek therapeutic services for speech intelligibility problems. The thought is that if a set of errors is identified by means of careful testing, the nature of specific consonant errors, or a pattern revealed by several different consonant errors, can guide a therapy plan (Kummer, 2011).

Clinical observation and formal research studies have established the kinds of consonant obstruent errors made by individuals with clefts and VPI. These errors fall into one of two categories: obligatory errors and compensatory errors (Kummer, 2011).

Obligatory Errors

Obligatory obstruent errors are produced in the correct way except for the required closure of the VP port. For example, a child with a repaired cleft palate and VPI may produce the correct articulatory placement for a “d,” with the tongue tip properly placed at the front of the palate, immediately behind the upper front teeth. However, the VPI results in air leakage through the VP port, which prevents the proper buildup of oral pressure within the vocal tract. The resulting obstruent speech sound is weak, possibly sounding somewhat like a nasal of the same place of articulation (e.g., an “n” instead of the intended “d”). The sound may also be perceived as a more complex error because there is audible nasal emission of air during the attempt to produce the stop. In this case, the sound has stop-like characteristics but is not heard as correct due to the nasal emission. Audible nasal emission is the result of air rushing through the nasal cavities as a result of VPI.

Speech sound errors such as these are called obligatory because even with correct positioning of the articulators, other than the soft palate and pharyngeal walls, the sound is obligated to be produced as an error because of the VPI.

Compensatory Errors

Unlike obligatory errors, compensatory errors are not produced with correct placement of the articulators. Compensatory errors, as the category label suggests, are a response to VPI, specifically a way speakers “solve” the problem of air leakage through the VP port.

The lips are the front end of the vocal tract tube and the larynx is the back end; the VP port is like a valve midway between these two locations. The valve, when open, allows air coming through the glottis to flow through the VP port and into the nasal cavities.

Some speakers with VPI compensate for the leak at the VP port by forming an articulatory constriction for an obstruent consonant before (i.e., in back of) the leak. This is shown in Figure 19–8, where a compensatory articulation error (right) is contrasted with an obligatory articulation error (left). In both cases, the “target” sound is /d/, a voiced stop consonant. The left image shows the constriction in the vocal tract made by the tongue tip pressing against the palate just in back of the upper teeth; this is the correct place of articulation for this sound. The VP port is open, however, which prevents the required buildup of pressure inside the vocal tract. The image on the right shows a compensatory solution to this problem. The constriction for the stop is made in the pharynx, before the airflow reaches the open VP port.

The place of articulation for all English obstruents is in front of the VP port. Attempts to produce any of these sounds with tongue placement in the “correct” way (obligatory errors) cannot avoid a leak caused by VPI. Production of obstruents behind the VP port avoids a leak, allowing a buildup of air pressure behind the constriction to generate the desired popping noise of stop consonants and hissing noise of fricatives (see Chapters 10 and 12).

The compensatory errors produced by children (or adults) with VPI due to cleft palate are errors of place of articulation; manner of articulation is retained. The compensatory stop consonant shown in Figure 19–8 is produced with the characteristics of a stop (complete blockage of airflow in the vocal tract for a brief interval, released suddenly to make the popping noise), but at the wrong place of articulation.

Many children with repaired cleft palates who continue to have VPI make these kinds of errors. The error sounds are called pharyngeal stops, pharyngeal fricatives, and glottal stops. In pharyngeal stops, the airstream is completely blocked by placing the back of the tongue against the posterior pharyngeal wall and then releasing it suddenly. The articulation of pharyngeal fricatives is similar, except that the constriction formed between the back of the tongue and the posterior pharyngeal wall is not quite complete, allowing air to be forced through the narrow passageway to generate a hissing noise. Pharyngeal stops are used to replace the “correct” stops of English (most often “k” and “g”) and pharyngeal fricatives the “correct” fricatives (“f,” “v,” “th,” “s,” “z,” “sh,” and “zh”).


Figure 19–8.  Left, an obligatory error for /d/: the tongue makes the required constriction at the front of the hard palate to block airflow at the alveolar place of articulation (immediately behind the teeth), but the VP port is open due to VPI, preventing the buildup of pressure within the vocal tract. Right, a compensatory error for /d/: the tongue blocks air at a place of articulation behind the leak, in the pharynx, with the resulting speech sound retaining the manner of articulation (stop) but produced at the wrong place of articulation.

Note that the manner of articulation is preserved in these error patterns.1

Glottal stops are frequent compensatory errors in speakers with VPI. They are made by bringing the vocal folds together forcefully and, after a short interval, blowing them apart with a high tracheal pressure. The two vocal folds are like two articulators that form a complete constriction, and the pressure immediately below them, in the trachea, is like the positive oral pressure typical of stop consonants. Speakers with VPI typically substitute glottal stops for the “correct” stops of English, especially the voiceless stops /p/, /t/, and /k/ in words such as “puppy” and “light.”

Many speakers with VPI produce a mixture of obligatory and compensatory errors. When children begin producing compensatory errors early in their developmental history, the errors may become habitual and resistant to therapeutic modification. This is an argument for early surgery to correct a cleft palate, to establish a normal or near-normal VP mechanism, and to prevent compensatory errors before they begin and over time become fixed as part of the child’s sound system.

Obligatory and compensatory speech sound errors, or a mixture of both kinds, affects a speaker’s intelligibility. When there is substantial hypernasality in addition to these errors, the child’s ability to be intelligible, and thus to communicate, is further impaired (Kummer, 2011).

Clefting and Syndromes

A syndrome is a group of anomalies, including anatomical, physiological, and/or behavioral components, that are observed in an individual and have been observed previously in other individuals. Some of the components of a syndrome may seem unrelated, such as heart and specific psychiatric problems, but their presence in a group of individuals suggests that they are part of a pattern, perhaps with a single, underlying cause (Shprintzen & Bardach, 1995). Syndromes may also be defined as a collection of anomalies that occur together in individuals but without a single, underlying known cause.

The syndromes discussed here are all conditions present at birth or known to be present at birth. Many syndromes have a genetic cause, such as velocardiofacial syndrome and Treacher-Collins syndrome (see later discussion in this chapter). Environmental factors may also result in syndromes present at birth, such as fetal alcohol syndrome and syndromes associated with vitamin deficiencies.

More than 275 syndromes include some form of craniofacial anomaly (Leslie & Marazita, 2013). Many of these syndromes are rare; the following summaries are of more frequently occurring syndromes with CL/P or CPO as a component. The reader is encouraged to search the Internet for photographs of individuals with the syndromes described in the following sections.

Palatoplasty and Speech Sound Errors

Palatoplasty is the surgery performed to close a cleft palate. It is more than simply closing the hole, however. Muscles that lift and shape the soft palate for contact with the posterior pharynx are not attached correctly in a cleft palate; a critical part of the surgery for speech outcomes is the reconfiguration and correct attachment of these muscles to the tissue of the soft palate. Obligatory speech sound errors are likely to be eliminated by successful surgery. This is because the problem in obligatory errors is the inability to close the VP port — the rest of the articulatory characteristics are correct for the target sound. Compensatory errors are not “fixed” by successful palatoplasty, because the errors are due to more than the functioning of the VP port. The child with compensatory errors has, in a sense, learned a new system for producing obstruents — move the place of articulation behind the leak. It is almost as if the child has invented a new sound system to deal with the available speech structures. This is why palatoplasty is usually recommended as early as possible — perhaps 9 to 12 months of age, before the child is developing the speech sound system in the context of first words. Compensatory errors are less likely when VPI is significantly reduced or eliminated by surgery prior to first words.

1. Pharyngeal stops and fricatives are not part of the sound inventory of English but are present in the sound inventory of Arabic and Hebrew. It may be the case that these sounds are not made with a constriction between the tongue and pharynx, but rather between the epiglottis and pharynx (Ladefoged & Maddieson, 1996). It is equally possible that an epiglottal-pharyngeal articulation is the basis for some of the compensatory errors heard in speakers with VPI.

22q11.2 Deletion Syndrome (Velocardiofacial Syndrome)

Children with velocardiofacial syndrome (VCFS), also known as 22q11.2 deletion syndrome, often have clefts of the palate. The anomalies appearing most often in VCFS include an isolated cleft palate (hence, the “velo” part of the syndrome name), heart defects (hence, “cardio”), and a characteristic appearance of the face (“facial”), which has been described as pear shaped with a long nose having a broad root, a small jaw, and small ears. Many children with VCFS have learning disabilities, and approximately 10% may develop severe psychiatric disturbance around the time of puberty. VCFS is a genetic disorder, involving a deletion of genetic material on the 22nd chromosome pair. VCFS occurs in 1 in 4,000 births, and usually involves speech and language disorders (McDonald-McGinn & Sullivan, 2011). Like many of the genetic disorders discussed in previous chapters, the phenotype of VCFS varies widely, even with the same genotype. People with VCFS have characteristics of the syndrome that range from very mild to very severe.

Treacher-Collins Syndrome (Mandibulofacial Dysostosis or First Pharyngeal Arch Syndrome)

Treacher-Collins syndrome is also called first pharyngeal arch syndrome because its main characteristics reflect problems in the development of structures originating from this arch. As described earlier, the first pharyngeal arch generates all structures of the face, as well as the hard palate, the external ear, and parts of the middle ear. Typical facial differences in children born with Treacher-Collins syndrome include downward-slanting eyes, underdeveloped or missing cheek bones, a small jaw, underdeveloped or malformed ears, and an unusually shaped mouth. An individual with Treacher-Collins syndrome may have an unusually shaped palate or a cleft palate. In addition, the small bones in the middle ear that conduct sound (see Chapter 22) may be affected, producing some hearing loss.

Treacher-Collins syndrome is believed to result from a problem with chromosome 5, which is responsible for the embryological development of the first pharyngeal arch. The syndrome is fairly rare, being found in roughly one in every 10,000 births. Treacher-Collins syndrome is distinctive, however, for having its symptom set confined to the structural malformations described previously. In most cases, persons with Treacher-Collins syndrome have normal cognitive abilities and no other anomalies in distant structures or functions.

Stickler Syndrome

Stickler syndrome includes a range of anomalies, all of which — including cleft palate — can be traced to problems with genes that control the production of collagen. Collagen is a protein important for the creation of cartilage, connective tissue, and some bony structures that have cartilaginous origins during embryological development. Specifically, an individual with Stickler syndrome is likely to have a round, flattened face, a variety of eye problems including detachment of the retina and degeneration of the fluid inside the eyeball, problems with joints, poor growth of long bones resulting in short stature, hearing loss resulting from structural problems in the end organ of hearing (called the cochlea, analogous to the retina in the eye: see Chapter 22), an underdeveloped mandible, and cleft palate. When a cleft palate occurs in Stickler syndrome, it is probably a result of the underdeveloped jaw that prevented the palatine shelves from moving into the proper position for fusion (see earlier). Like VCFS and Treacher-Collins syndrome, Stickler syndrome has a genetic basis. The genes that fail to produce a normal pattern of collagen production are apparently located on the 1st, 6th, and 12th chromosomes. An estimate of the prevalence of Stickler syndrome is 1 in 7,500 to 1 in 9,000 births (Printzlau & Andersen, 2004).

Craniosynostosis (Apert and Crouzon Syndromes)

Craniosynostosis is a syndrome in which the bones of the skull do not fuse together at the proper time. Craniosynostosis includes Apert and Crouzon syndromes (as well as other nonsyndromic variants), which have slightly different characteristics (not discussed here). In a study conducted in Atlanta, craniosynostosis occurred in 4.3 in every 10,000 births between the years 1989 and 2003 (Boulet, Rasmussen, & Honein, 2008). A frequent consequence of the early fusion of skull bones is cleft palate. Craniosynostosis has a genetic basis, often associated with anomalies of the 10th chromosome.

Cleft Palate: Other Considerations

CL/P and CPO are complex health care problems. The focus in this chapter has been on speech production in children with these craniofacial anomalies, but many children with CL/P or CPO have additional issues that must be dealt with by a health care team. Children with CL/P and CPO have problems with dentition, multiple surgeries, socialization, language development, hearing, and nutrition (especially as young children).

The health care team is therefore likely to include dentists, maxillofacial and plastic surgeons, psychologists/social workers, speech-language pathologists, audiologists, and nutritionists.

Of particular interest are the possibilities of delayed language development and hearing loss in children with CL/P and CPO. In a review of language development in CL/P and CPO, Morgan and colleagues (2017) note that language disorders are not unusual in this group of children and are likely to be more severe in children with CPO. The greater severity of language disorders in children with CPO may be related to the greater likelihood of CPO being associated with a syndrome, compared with CL/P. Many syndromes such as VCFS in which CPO is a component may be characterized by language learning and cognitive delays (Hardin-Jones & Chapman, 2011). This may explain why children with CPO have more severe language disorders compared with children who have CL/P.

Hearing problems in children with CL/P and CPO are largely due to frequent middle ear infections (technically, otitis media with effusion [OME]: see Chapter 23). Frequent OMEs are a problem for nearly all children with clefts. The hearing loss is conductive, meaning that it is caused by interference with the middle ear mechanism, not the sensory organs of the inner ear. The mild-to-moderate hearing loss associated with OME is present during the infection but not when it resolves. The high percentage occurrence and recurrence of OME in children with clefts (75% or greater), compared to children with normal head and neck structures (around 20%), is not well understood but is thought to be related to the orofacial defects that are part of clefting (Flynn, Möller, Jönsson, & Lohmander, 2009). Not all children with recurrent OMEs have hearing loss when the middle ear is infected. As in the general population, the frequency of OMEs in children with clefts decreases with age.

OMEs are a health problem for children regardless of their other concerns. A specific concern with the high rate of OMEs in children with clefts is that the frequent conductive hearing loss, even though not severe, may interfere with language development and have lasting effects on children’s literacy skills. Happily, there is little evidence that this is the case (Roberts et al., 2004).

Chapter Summary

A craniofacial anomaly is defined as any deviation from normal structure, form, or function in the head and neck area of an individual; this chapter focuses on cleft lip and cleft palate.

Knowledge of embryological development of the head and neck is important to understanding how and where clefts of the upper lip and palate occur.

The embryological errors that result in cleft lip and cleft palate are independent; a child can be born with a cleft lip and cleft palate, a cleft lip only, or a cleft palate only.

Clefts of the upper lip almost always occur to the side of the center of the mouth; the clefts may be unilateral (one side only) or bilateral (both sides) and vary in severity from a very small notch in the lip to a complete cleft extending from the lip through the floor of the nasal cavity.

Clefts of the palate may arise from two types of embryological error: one in which the palatine shelves snap up into a horizontal position but fail to generate sufficient tissue to meet in the middle and fuse, the other in which the palatine shelves cannot snap up into the horizontal position because they are trapped beneath the tongue, which has not been lowered due to incomplete growth of the mandible.

Clefts of the palate range from a minor notch at the back of the soft palate to a complete cleft, splitting the palate from back to front.

Two general categories of clefts are recognized: cleft lip with or without a cleft palate (CL/P) and cleft palate only (CPO).

The incidence of clefting varies depending on country/region and other factors; worldwide the incidence is about 1 in every 700 births.

The velopharyngeal port (VP port), the passageway between the pharyngeal and nasal cavities, is opened by gravity and perhaps some muscular forces, and closed by complex muscular forces of the soft palate and pharynx.

When the VP port is open, air can flow from the pharyngeal cavity to the nasal cavities; during speech, an open VP port is consistent with the production of nasal sounds (such as /m/, /n/, /ŋ/ [final sound in “ring”]).

When the VP port is closed, air is prevented from flowing into the nasal cavities; if there is another “seal” in the vocal tract, such as the closure of the lips for a /b/, the vocal tract is a closed volume and the flow of air into it from the lungs and through the vibrating vocal folds causes the pressure in the vocal tract to rise above atmospheric pressure.

Children or adults who have difficulty producing closure of the VP port even after surgical repair of palatal clefts have some degree of velopharyngeal insufficiency (VPI), which is likely to result in hypernasality and obstruent errors (stops, fricatives, and affricates are speech sounds that require a buildup of pressure inside the vocal tract).

Obstruent speech sound errors that result from VPI are categorized as obligatory errors and compensatory errors.

Obligatory errors are those in which the speaker has the correct articulatory placement for the sound, but the leak through the VP port prevents the buildup of air pressure required for correct production of the sound.

Compensatory errors are those in which the speaker changes the place of articulation for the “target” sound to avoid the leak, and thus produces the correct manner of articulation (e.g., a stop manner or a fricative manner); compensatory errors are often made with a pharyngeal or glottal place of articulation.

Many syndromes (a collection of symptoms and/or anomalies that occur together and are seen repeatedly in a number of children) have cleft palate as one of their characteristics; it is more typical for these clefts to be isolated clefts of the palate, rather than cleft lip with or without a cleft palate.

Diagnosis of VPI is done by perceptual and instrumental methods, the latter including x-ray and endoscopic techniques that provide direct visualization of the VP port to determine why it is not closing correctly.

Clefting is a complex health care problem, requiring a team that may include speech-language pathologists, audiologists, surgeons, nutritionists, and psychologists.

References

Boulet, S. L., Rasmussen, S. A., & Honein, M. A. (2008). A population-based study of craniosynostosis in metropolitan Atlanta, 1989–2003. American Journal of Medical Genetics, 146, 984–991.

Feragen, K. B., Auckner, R., Særvold, T. K., & Hide, Ø. (2017). Speech, language, and reading skills in 10-year-old children with palatal clefts: The impact of additional conditions. Journal of Communication Disorders, 66, 1–12.

Flynn, T., Möller, C., Jönsson, R., & Lohmander, A. (2009). The high prevalence of otitis media with effusion in children with cleft lip and palate as compared to children without clefts. International Journal of Pediatric Otorhinolaryngology, 73, 1441–1446.

Hardin-Jones, M., & Chapman, K. L. (2011). Cognitive and language issues associated with cleft lip and palate. Seminars in Speech and Language, 32, 127–140.

Harville, E. W., Wilcox, A. J., Lie, R. T., Vindenes, H., & Abyholm, F. (2005). Cleft lip and palate versus cleft lip only: Are they distinct defects? American Journal of Epidemiology, 162, 448–453.

Kummer, A. K. (2011). Speech therapy for errors secondary to cleft palate and velopharyngeal dysfunction. Seminars in Speech and Language, 32, 191–198.

Kummer, A. K. (2018). A pediatrician’s guide to communication disorders secondary to cleft lip/palate. Pediatric Clinics of North America, 65, 31–46.

Kummer, A. K. (2020). Cleft palate and craniofacial conditions (4th ed.). Burlington, MA: Jones and Bartlett Learning.

Ladefoged, P., & Maddieson, I. (1996). The sounds of the world’s languages. Oxford, UK: Blackwell.

Leslie, E. J., & Marazita, M. L. (2013). Genetics of cleft lip and cleft palate. American Journal of Medical Genetics, Part C, 163, 246–258.

McDonald-McGinn, D. M., & Sullivan, K. E. (2011). Chromosome 22q11.2 deletion syndrome (DiGeorge syndrome/velocardiofacial syndrome). Medicine, 90, 1–18.

Moore, K. L., Persaud, T. V. N., & Torchia, M. G. (2019). The developing human (11th ed.). Philadelphia, PA: Saunders.

Morgan, A. R., Belluci, C. C., Coppersmith, J., Linde, S. B., Curtis, A., Albert, M., . . . Kapp-Simon, K. (2017). Language development in children with cleft palate with or without cleft lip adopted from non–English-speaking countries. American Journal of Speech-Language Pathology, 26, 342–354.

Mossey, P. A., Little, J., Monger, R. G., Dixon, M. J., & Shaw, W. C. (2009). Cleft lip and palate. The Lancet, 374, 21–27.

Parker, S. E., Mai, C. T., Canfield, M. A., Rickard, R., Wang, Y., Meyer, R. E., . . . Correa, A. (2010). Updated national birth prevalence estimates for selected birth defects in the United States, 2004–2006. Birth Defects Research Part A: Clinical and Molecular Teratology, 88, 1008–1016.

Peterson-Falzone, S. J., Trost-Cardamone, J. E., Karnell, M. P., & Hardin-Jones, M. A. (2017). The clinician’s guide to treating cleft palate speech (2nd ed.). St. Louis, MO: Elsevier.

Printzlau, A., & Andersen, M. (2004). Pierre Robin sequence in Denmark: A retrospective population-based epidemiological study. Cleft Palate and Craniofacial Journal, 41, 47–52.

Roberts, J., Hunter, L., Gravel, J., Rosenfeld, R., Berman, S., Haggard, M., . . . Wallace, I. (2004). Otitis media, hearing loss, and language learning: Controversies and current research. Journal of Developmental and Behavioral Pediatrics, 25, 110–122.

Schoenwolf, G. C., Bleyl, S. B., Brauer, P. R., & Francis-West, P. H. (2014). Larsen’s human embryology (5th ed.). New York, NY: Elsevier.

Shkoukani, M. A., Lawrence, L. A., Liebertz, D. J., & Svider, P. F. (2014). Cleft palate: A clinical review. Birth Defects Research (Part C), 102, 333–342.

Shprintzen, R. J., & Bardach, J. (1995). Cleft palate speech management. St. Louis, MO: Mosby.

Watterson, T., Mancini, M., Brancamp, T. U., & Lewis, K. E. (2013). Relationship between the perception of hypernasality and social judgments in school-aged children. Cleft Palate-Craniofacial Journal, 50, 498–502.
20
Swallowing

Introduction

The ease of eating and drinking is deceptive. These are complicated activities that require coordinated actions of the lips, mandible, tongue, velum, pharynx, larynx, esophagus, and other structures. Because eating and drinking engage many of the same structures and much of the same airway as those used for speaking and breathing, it is not uncommon for there to be competition between these activities or for tradeoffs to occur when trying to do them simultaneously. For example, chewing must stop to be able to speak clearly, and breathing must stop to be able to swallow safely.

The entire act of placing liquid or solid substance in the oral cavity, moving it backward to the pharynx, propelling it into the esophagus, and allowing it to make its way to the stomach is called deglutition. Although the word swallowing is sometimes used as a synonym for deglutition, swallowing actually includes only certain phases of deglutition. Nevertheless, to simplify the explanations that follow, the term “swallowing” is used in place of deglutition and is meant to include all phases of deglutition.

Swallowing disorders are common in hospital settings, and in fact account for a significant number of hospital deaths. The deaths are usually a result of pneumonia due to food and/or drink going into the lungs instead of the stomach. Food and drink in the lungs cause bacterial infections — serious cases of pneumonia.

Anatomy of Swallowing

Figure 20–1 shows the structures that participate in swallowing. These structures extend from the lips to the stomach. Structures such as the lips, jaw, tongue, soft palate, pharynx, and larynx are used in speech production as well as swallowing (see Chapter 10). The esophagus and stomach, critical structures for the process of swallowing, are not used in the speech production process.

Esophagus

The esophagus is a flexible tube, about 20 to 25 cm long in adults, which extends from the lower part of the pharynx to the stomach. The esophagus begins below the base of the larynx and runs behind the trachea and lungs. It runs through the diaphragm (see Figure 20–1) and enters the abdominal cavity, where it connects to the stomach. It is composed of a combination of skeletal (voluntary) muscle (the top part of the tube) and smooth muscle (the bottom part of the tube). Smooth muscle is not under “willful” control (like, e.g., muscles of the arm), but contracts in reaction to various stimuli.

The upper end of the esophagus is normally closed. During swallows, it is opened by muscular action (see later in this chapter).


Figure 20–1.  Structures of the swallowing mechanism.

Stomach

The stomach is a large, saclike structure made up of smooth muscle, mucosa, and other tissue. It is on the left side of the abdominal cavity, against the undersurface of the diaphragm. The stomach is connected to the lower esophagus on one end, and to the small intestine at the other end. After a typical meal, the stomach holds about a liter of solid and/or liquid substance. Gastric juices in the stomach break up ingested substances so that they can be absorbed into the body through the stomach lining.

The Act of Swallowing

Although many of the structures that participate in swallowing are the same as those that are used for speaking, the forces and movements for the two activities are very different. In general, the forces are greater and many of the movements are slower during swallowing than during speech production.

There are four phases of swallowing. These phases, illustrated in Figure 20–2, are the oral preparatory phase, oral transport phase, pharyngeal phase, and esophageal phase.

The phases are used to describe the movement of a bolus through the oral, pharyngeal, and esophageal regions of the anatomical structures of the swallowing process. Bolus is the word used to refer to the volume of liquid or the mass of solid substance being swallowed. The physiological events associated with each of these phases are described in the following text and summarized in Table 20–1.

Figure 20–2.  Images of the oral preparatory, oral transport, pharyngeal, and esophageal phases of swallowing. Table 20–1 provides a summary of these phases.

Table 20–1.  Summary of the Actions Associated With the Four Phases of Swallowing

Oral preparatory:  This phase begins as the liquid or solid substance comes in contact with the oral opening and ends with the bolus held in the oral cavity with the back of the tongue elevated to contact the velum and create an impenetrable wall. This phase can be as short as 1 second when ingesting liquid and as long as 20 seconds when chewing (preparing) a solid food.

Oral transport:  During this phase the bolus is transported back through the oral cavity to the pharynx. To do so, the tongue elevates in progressively more posterior regions to push the bolus back toward the pharynx, the velum begins to elevate, and the upper esophageal sphincter begins to relax. This phase lasts less than 1 second.

Pharyngeal:  During this phase, the bolus usually divides to run through the right and left sides of the bottom of the tongue and is transported through the pharynx to the upper esophageal sphincter. This phase is “triggered” automatically once the bolus passes the back of the oral cavity and is associated with numerous and rapid events: the velopharynx closes, the tongue pushes the bolus backward, the pharynx constricts segmentally, the hyoid bone and larynx move upward and forward, the larynx is closed, and the upper esophageal sphincter opens. This phase lasts less than 1 second.

Esophageal:  This phase begins when the bolus enters the upper esophageal sphincter. At the same time, the lower esophageal sphincter relaxes. The bolus is moved through the esophagus by peristaltic contractions. This phase ends when the bolus enters the stomach and can last from 8 to 20 seconds.

In Figure 20–2, the green areas show the location and shape of the bolus for each phase; the time sequence of the phases is from left to right.

Oral Preparatory Phase

The oral preparatory phase is depicted in the first panel of Figure 20–2. This phase begins as a solid or liquid substance makes contact with the structures at the front of the oral cavity. The jaw is lowered and the lips part in anticipation of the swallow (Shune, Moon, & Goodman, 2016). What happens next depends on the nature of the substance to be swallowed.

Liquid Swallows

If the substance is liquid, the jaw elevates and the lips close. This creates a tight closure at the front of the mouth to contain the bolus. The bolus is contained in the front region of the oral cavity by actions of the tongue and other structures, and held there momentarily (for about 1 second). The front of the tongue depresses and the sides of the tongue elevate to form a cup for the bolus (Dodds et al., 1989). The back of the tongue elevates to make contact with the soft palate to form a back wall that separates the oral from the pharyngeal cavities and helps ensure that no substance leaks through into the throat, and possibly into the pulmonary airways.

During the oral preparatory phase, the velopharyngeal port is open so that breathing can continue with air flowing to and from the lungs through the nasal passageways. Many people stop breathing momentarily at this point in the swallow (this is called the apneic interval) or even before the glass or straw reaches the lips (Martin, Logemann, Shaker, & Dodds, 1994; Martin-Harris, Brodsky, Price, Michel, & Walters, 2003; Martin-Harris, Michel, & Castell, 2005). The stoppage of breathing reduces the risk of aspiration, which is the invasion of food or drink below the vocal folds and into the lungs.

Solid Swallows

The events of the oral preparatory phase are different when the substance to be swallowed is solid rather than liquid, primarily because solid substances need to be chewed into smaller pieces and mixed with saliva before being transported toward the esophagus. Saliva is an important ingredient in this process because it moistens the solid substance to facilitate its transport. Saliva also introduces enzymes that begin to break down the substance for digestion.

Actions of the mandible (and teeth), lips, tongue, and cheeks grind and manipulate the solid substance into a well-formed bolus and position it on the front surface of the tongue. The lips may close (though this is not necessary) while the mandible moves to grind the bolus. During chewing, the mandible moves up and down, forward and backward, and side to side. The soft palate makes contact with the back part of the tongue to seal off the oral from the pharyngeal cavity and prevent the bolus from moving into the pharynx and larynx. The velopharyngeal port is open during preparation of the bolus, and breathing may either continue or may stop momentarily (McFarland & Lund, 1995; Palmer & Hiiemae, 2003). The oral preparatory phase may last from as short as 3 seconds, when chewing a soft cookie, to as long as 20 seconds, when chewing a tough piece of steak.

At the end of the oral preparatory phase, the substance in the oral cavity is ready to be consumed. Usually the bolus is immediately transported back toward the pharynx (oral transport phase, see the next section).

Oral Transport Phase

The oral transport phase is shown in the second panel of Figure 20–2. From the ready position (the oral preparatory phase), the bolus is transported back through the oral cavity. This is done by using the tongue tip to squeeze the bolus against the hard palate; then progressively more posterior regions of the tongue elevate and squeeze the bolus against the palate, moving the bolus back toward the pharynx. At the same time as the bolus is being moved in the direction of the palate, the velopharyngeal port begins to close as the top of the esophagus begins to open. The oral transport phase is short, lasting less than a second (Cook et al., 1994; Tracy et al., 1989).

Pharyngeal Phase

The pharyngeal phase of the swallow is “triggered” when the bolus approaches the boundary between the back of the oral cavity and the pharynx. During this phase, depicted in the third panel of Figure 20–2, several events occur rapidly and nearly simultaneously to move the bolus quickly through the pharynx while protecting the airway. “Protecting the airway” means not allowing any part of the bolus to travel through the vocal folds and into the trachea, and then deeper into the lungs.

This pharyngeal phase is under “automatic” neural control, so that once triggered, it proceeds as a relatively fixed set of events that cannot be altered voluntarily (except in certain respects that are not covered here).

These events occur within about half a second (Cook et al., 1994; Tracy et al., 1989) and include velopharyngeal closure, elevation of the hyoid bone and larynx, laryngeal closure, pharyngeal constriction, and opening of the upper entrance to the esophagus.

During the pharyngeal phase, the velopharynx closes like a sphincter valve by elevation of the velum and constriction of the pharyngeal walls. This closure is forceful (more forceful than for speech production) so as to prohibit substances from passing through the nasopharynx into the nose.

The hyoid bone and larynx (Figure 20–3) move upward and forward as a result of contraction of muscles that attach to the hyoid bone. These muscles have their origins on the jaw and in the floor of the mouth. As the hyoid bone is pulled upward and forward, the larynx is pulled along with it by its muscular and nonmuscular connections to the hyoid bone. Elevation of the larynx also causes the pharynx to shorten.

Closure of the larynx for swallowing forms a seal at the entrance of the trachea to protect the pulmonary airways. Closure occurs at multiple levels, which include the vocal folds, the false vocal folds (located immediately above the true vocal folds), and the epiglottis. Upward and forward movement of the larynx contributes to airway protection by tucking the larynx against the root of the tongue and moving the trachea away from the pathway of the bolus to the esophagus.

As the tongue propels the bolus into the pharynx, the muscles of the pharynx contract from top to bottom. The tongue root moves backward and the pharyngeal walls constrict to squeeze the bolus toward the esophagus. The top-to-bottom contraction of pharyngeal muscles is like a directional “squeeze” toward the esophagus.


Figure 20–3.  Skeletal framework of the larynx and associated structures; note especially the hyoid bone and cartilages of the larynx.

This is conceptually similar to the front-to-back squeezing by the tongue against the palate in the oral transport phase, described previously.

As all of these events are taking place, the upper border of the esophagus is opening to allow the bolus “in.” Two sets of actions appear to contribute to its opening: (a) stretching of the upper esophageal sphincter by forward and upward movement of the hyoid bone and larynx, respectively, and (b) relaxing of a muscle that forms a ring around the top of the esophagus (Omari et al., 2016).1 As suggested by the concept of a muscle relaxing to accomplish a task (in this case, swallowing), this muscle is usually in a state of contraction, holding the top of the esophagus closed.

1. This ring of muscle is the same tissue that can be used as an alternate sound source when a patient has had his larynx removed because of cancer, as described in Chapter 18.

Esophageal Phase

The esophageal phase, the initial part of which is illustrated in the right-most panel of Figure 20–2, begins when the bolus enters the upper esophagus and ends when it passes into the stomach through the lower opening of the esophagus (seen in Figure 20–1). This phase may last anywhere from 8 to 20 seconds (Dodds, Hogan, Reid, Stewart, & Arndorfer, 1973). The bolus is propelled through the esophagus by peristaltic actions (alternating waves of contraction and relaxation) of the esophageal walls. Peristaltic contraction raises pressure behind the bolus and relaxation lowers pressure in front of the bolus, creating the pressure differential needed to propel it toward the stomach. The esophageal phase of swallowing, like the pharyngeal phase, is “automatic” in the sense of not being under voluntary control.

Figure 20–4.  Eating, in which one part of the bolus has been moved into the lower pharynx, while another part is still in the oral preparatory phase of swallowing.

Overlap of Phases

Although the phases of swallowing are described as though they are discrete and occur one after the other, in fact, they overlap substantially. When a person is eating a solid substance, for example, preparation of part of the bolus in the oral cavity may continue while another part of the bolus moves into the pharyngeal area, as illustrated in Figure 20–4. The two boluses, originally a single bolus, are shown in Figure 20–4 as the brown-speckled material. The partial bolus in the lower pharynx may remain above the esophagus as long as 10 seconds before it merges with the rest of the bolus and the pharyngeal transport phase of the swallow is triggered (Hiiemae & Palmer, 1999).

Breathing and Swallowing

Protection of the pulmonary airways during swallowing depends, in large part, on the coordination of breathing and swallowing. Without such coordination, inspiration might occur at the same time a substance is being transported through the pharynx, and that substance might be “sucked” through the larynx into the pulmonary airways (aspiration). This is avoided, as reviewed earlier, by closing the larynx, an action that stops breathing for a brief period during the swallow. The risk of aspiration appears to be further reduced by timing the swallow to occur during the expiratory phase of the breathing cycle. During single swallows (swallowing a single bolus), the most common pattern is expiration-swallow-expiration; that is, expiration begins, the swallow occurs (accompanied by breath-holding), and then expiration continues (Martin et al., 1994; Martin-Harris, 2006; Nishino, Yonezawa, & Honda, 1985; Perlman, Ettema, & Barkmeier, 2000; Selley, Flack, Ellis, & Brooks, 1989; Smith, Wolkove, Colacone, & Kreisman, 1989).

Nervous System Control of Swallowing

A person with swallowing problems is said to have dysphagia. Dysphagia has many causes, including neurological disease. People with Parkinson’s disease, multiple sclerosis, and amyotrophic lateral sclerosis (ALS), as well as people recovering from stroke, may have dysphagia as a prominent component of their challenges. Thus, it is important to have some idea of how the nervous system controls the swallowing process.

The neural control of swallowing is complex and not completely understood. Nevertheless, studies of humans and animals have offered important insights into how swallowing is controlled by the nervous system. Some of the important features of that control are discussed in the next sections.

Role of the Peripheral Nervous System

Nearly all the structures involved in swallowing are the same as those involved in speech production (the most notable exceptions being the esophagus and stomach). Those structures that participate in both swallowing and speech production are innervated by the spinal nerves and cranial nerves as described in Chapter 2. The cranial nerves are involved in swallowing through their innervation of the lips, mandible, tongue, velum, pharynx, and larynx; spinal nerves are primarily involved in breathing and its cessation as they relate to swallowing. Diseases or trauma that affect the cranial nerves serving these structures, such as damage to a laryngeal nerve that results in unilateral vocal fold paralysis (Chapter 18) or damage to the nerves that control the contraction of pharyngeal muscles, are likely to affect swallowing ability.

The voluntary and smooth muscles of the esophagus are also innervated by nerve branches that arise from a single cranial nerve. This nerve is called cranial nerve X, or the vagus nerve. It is noteworthy that the vagus nerve and its branches also innervate muscles of the larynx. Because the larynx plays a critical role in protection of the airway in swallowing, and the esophagus has a muscle that relaxes to open its top to allow entry of food and drink, and the muscles of the esophagus push the bolus down into the stomach, damage to this one nerve has the potential to create major swallowing problems. The coordinated phases of swallowing, described previously, can be significantly disrupted by damage to this single cranial nerve (Corbin-Lewis & Liss, 2015).

Role of the Central Nervous System

Although swallowing and speech production are executed using many of the same peripheral nerves, central nervous system control of these two activities is quite different. This means that a given structure, such as the tongue, is under one form of neural control during swallowing and under another form of neural control during speech production. Because of this, it is possible to have central nervous system damage that impairs the function of a neural structure for speech production but not swallowing, and vice versa. There are two major regions within the central nervous system that are responsible for the control of swallowing. One is in the brainstem, and the other is in cortical and subcortical areas.

The brainstem center is located primarily in the medulla, the part of the brainstem that is contiguous with the uppermost part of the spinal cord. The brainstem center has primary control over the more automatic phases of swallowing (pharyngeal and esophageal phases).

Many cortical and subcortical regions contribute to the generation and shaping of swallowing behaviors. Activity in cortical and subcortical areas of the cerebral hemispheres has a strong influence over the control of the more voluntary phases of swallowing (oral preparatory phase, including chewing, and oral transport phase). Cortical damage due to stroke (and possibly other diseases that affect structures in the cerebral hemispheres), however, may also affect the automatic phases of swallowing (pharyngeal phase and esophageal phase).

Sensory input that travels from structures of the head and neck to the central nervous system, via cranial nerves, is also an important component of swallowing. Many forms of sensory input play a role in swallowing, including (among others) bolus size, texture, temperature, taste, and unpleasant stimuli (e.g., something that tastes bad or has a texture that is sufficiently different from the range of familiar bolus textures, and so forth).

Sphenopalatineganglioneuralgia

This sounds like something you wouldn’t want to meet in the dark. But it comes from something really good. As a child (or even as an adult) you may have said the phrase, “I scream, you scream, we all scream for ice cream.” Scream has a meaning of anticipation in this context, but it can also have a meaning of hurting. You know the feeling. You take a bite of ice cream and momentarily hold it against the roof of your mouth before you swallow it. Then suddenly you get an intense, stabbing pain in your forehead. What’s up? The pain is caused as your hard palate warms up after you made it cold. Cold causes vasoconstriction (reduction in blood vessel diameter) in the region, which is followed by rapid vasodilation (increase in blood vessel diameter). It is the rapid vasodilation that hurts and gets your attention. The technical term for this pain is “sphenopalatineganglioneuralgia.” The common term (and the one more easily pronounced) is “brain freeze.” Fortunately, the pain lasts only a few seconds.

Variables That Influence Swallowing

A number of variables influence swallowing, including characteristics of the bolus, the swallowing mode, and developmental and aging effects. The sections below discuss each of these.

Bolus Characteristics

Although the act of swallowing occurs generally as described near the beginning of this chapter, the precise nature of the swallow is determined, in part, by what exactly is being swallowed. Bolus consistency and texture, volume, and taste are variables that have been found to influence the act of swallowing.

Consistency and Texture

One of the most important contrasts that determines swallowing behavior is the difference between liquids and solids. Whereas a liquid bolus is usually held briefly in the front of the oral cavity before being propelled to the pharynx, a solid bolus may be moved to the pharynx and left there for several seconds while the remainder of the bolus continues to be chewed (Hiiemae & Palmer, 1999; Palmer, Rudin, Lara, & Crompton, 1992; see Figure 20–5, later in the chapter). Although something similar can also happen with liquids (Linden, Tippett, Johnston, Siebens, & French, 1989), it is much less common, except in cases where a combined liquid-and-solid bolus is chewed and swallowed (Saitoh et al., 2007), such as what might occur during mealtime eating (Dua, Ren, Bardan, Xie, & Shaker, 1997). Even when not combined, the consistency of liquids and the textures of solid food influence swallowing behavior.

Liquid substances can be characterized according to consistency, ranging from as thin as water to as thick as pudding. Differences in consistency have been shown to influence swallowing. Thick liquids or puree consistencies tend to take longer to swallow than thin liquids (Chi-Fishman & Sonies, 2002; Im, Kim, Oommen, Kim, & Ko, 2012). Tongue forces in the oral preparatory and oral transport phases are higher when swallowing thick substances compared to thin liquids (Chi-Fishman & Sonies, 2002; Steele & van Lieshout, 2004). As might be predicted, it is more difficult to maintain a cohesive (single) bolus when swallowing thinner liquids as compared to thicker liquids. As a result, laryngeal penetration (where part of the bolus moves into the opening to the larynx but remains above the vocal folds; Robbins, Hamilton, Lof, & Kempster, 1992) is more common when swallowing thin liquids than when swallowing thicker substances (Daggett, Logemann, Rademaker, & Pauloski, 2006; Steele et al., 2015).

The textures of solid substances can also influence the swallow (Steele et al., 2015). For example, the harder and drier the substance, the greater the number of chewing cycles (Engelen, Fontijn-Tekamp, & van der Bilt, 2005), the longer the duration of the initial transport of the bolus from the anterior oral cavity to the region where the oral cavity meets the pharynx (Mikushi, Seki, Brodsky, Matsuo, & Palmer, 2014), and the greater the number of times the tongue squeezes the bolus back toward the pharynx (Hiraoka et al., 2017).

Volume

It seems intuitive that the volume (size) of the bolus might affect the swallow, and most studies indicate that, in fact, it does (Chi-Fishman & Sonies, 2002; Cook et al., 1989; Kahrilas & Logemann, 1993; Logemann et al., 2000; Logemann, Pauloski, Rademaker, & Kahrilas, 2002; Perlman, Palmer, McCulloch, & VanDaele, 1999; Perlman, Schultz, & VanDaele, 1993; Tasko, Kent, & Westbury, 2002).

When a person is swallowing a larger bolus compared to a smaller bolus, tongue movements are generally larger and faster, hyoid bone movements begin earlier and are more extensive, pharyngeal wall movements and laryngeal movements are larger, and the upper esophageal sphincter relaxes and opens earlier and stays open longer (Cock, Jones, Hammer, Omari, & McCulloch, 2017; Kahrilas & Logemann, 1993).

Despite the success of the adjustments made to accommodate a larger bolus, there tends to be a greater frequency of laryngeal penetration as bolus size increases, at least for liquid boluses. For example, part of the bolus penetrates the opening into the larynx more than twice as often when swallowing a 10-mL bolus than when swallowing a 1-mL bolus (Daggett et al., 2006). Nevertheless, when laryngeal penetration occurs in healthy individuals, the substance is almost always pushed away from the larynx and transported to the esophagus without being aspirated (going below the vocal folds).

Sword Throats

Sword swallowing is an ancient art that continues to be practiced. There is even a Sword Swallowers Association International with both professional and amateur members from all over the world. The practice and ill effects of sword swallowing were discussed in an article in the prestigious British Medical Journal (Witcombe & Meyer, 2006). Major complications from sword swallowing are more likely when the swallower is distracted or when swallowing unusual swords. Complications can include — little wonder — perforation of the pharynx or esophagus, gastrointestinal bleeding, pneumothorax (collapsed lung), and chest pain. Novice sword swallowers must learn to desensitize the gag reflex, align the upper esophageal sphincter with the neck hyperextended, open the upper esophageal sphincter, and control retching as the blade is moved down the pipe. All in all, it does not sound like fun; it also makes for a very long bolus.

Development

Many important anatomical changes influence swallowing during the period from infancy through childhood. Among these changes are the following (Arvedson & Brodsky, 2002): (a) the infant’s tongue goes from nearly filling the oral cavity to filling only the floor of the oral cavity; (b) the infant’s oral cavity goes from being edentulous (lacking teeth) to having a full set of “baby” teeth; (c) the cheeks of the infant have fatty pads (sometimes called sucking pads) that eventually disappear, to be replaced with muscle; (d) the infant goes from having essentially no oropharynx to having a distinct one as the larynx descends; and (e) the infant’s larynx goes from being one-third adult size, with relatively large arytenoid cartilages and a high position within the neck, to the adult configuration and position.

Swallowing (of amniotic fluid) begins well before birth, as early as 12.5 weeks’ gestation (Humphrey, 1970). Interestingly, although many of the components of swallowing are in place before birth, velopharyngeal closure during swallowing is not (Miller, Sonies, & Macedonia, 2003). Immediately after birth, the velopharynx closes for swallowing and the infant exhibits a suckling pattern characterized by forward and backward (horizontal) movements of the tongue (Bosma, 1986; Bosma, Truby, & Lind, 1965). These tongue movements are accompanied by large vertical movements of the mandible and serve to draw liquid into the oral cavity. Around the age of 6 months, this suckling pattern converts to a sucking pattern that is characterized by raising and lowering (vertical) movements of the tongue, firm approximation of the lips, and less pronounced vertical movements of the mandible. Sucking is stronger than suckling and allows the infant to pull thicker substances into the oral cavity and to begin the ingestion of soft food (Arvedson & Brodsky, 2002).

During the first few months of life, the infant relies on breastfeeding (or nipple feeding from a bottle) for all nutritional intake. This form of feeding consists of suck-swallow or suck-swallow-breathe sequences, typically repeated several times (8 to 12 times) and followed by a rest period (several seconds). During this period, several oral reflexes that aid in early feeding are active. These disappear around 6 months of age, with the exception of the gag reflex, which remains active throughout childhood and adulthood. Knowledge of the neural interactions among feeding, swallowing, and airway protection is essential for providing quality care for infants with impairments in any of these functions (Jadcherla, 2017).

By about 6 months of age, infants are ready to begin eating solid foods and being fed by spoon. Foods such as crackers and soft fruits and vegetables are introduced during the next few months. The basic patterns for chewing are in place by 9 months and continue to develop over the next few years of life (Green et al., 1997; Steeve, Moore, Green, Reilly, & Ruark McMurtrey, 2008). By 2 to 3 years of age, the child is able to eat regular table food.
286 Introduction to Communication Sciences and Disorders:  The Scientific Basis of Clinical Practice


Age

As with most physiological functions, swallowing changes with age
across adulthood. The most prominent age-related change is that
swallowing becomes slower, particularly after age 60 years (Leonard
& McKenzie, 2006; Logemann et al., 2002; Robbins et al., 1992;
Sonies, Parent, Morrish, & Baum, 1988).

An outcome of the age-related slowing of the swallow (combined with
age-related reductions in sensory function; for example, see
Malandraki, Perlman, Karampinos, & Sutton, 2011) is that the
frequency of laryngeal penetration increases with age (Daniels
et al., 2004; Robbins et al., 1992). Laryngeal penetration occurs
in people over 50 years about twice as often as it occurs in adults
under 50 years, and more frequently when swallowing liquids than
when swallowing solids (Daggett et al., 2006). Although this
appears to be a dangerous situation and a possible precursor to
aspiration, in healthy individuals the substance is moved out of
the space immediately above the vocal folds to be rejoined with the
rest of the bolus (Daggett et al., 2006). Nevertheless, the risk of
aspiration may be higher in older adults, compared to younger
adults, because of their greater tendency to inspire immediately
after swallowing (Martin-Harris, Brodsky, et al., 2005).

Measurement and Analysis of Swallowing

Measurement and analysis of swallowing are not only critical to
research endeavors but have also become essential to clinical
practice and to the diagnosis and management of dysphagia (a term
meaning swallowing disorders, pronounced dis-FAY-juh). Measurement
of swallowing using instruments is especially important when
considering that as many as half of the clients who aspirate do so
"silently," without any signs of coughing or other signs of visible
or audible struggle (Logemann, 1998). In such cases, aspiration can
only be detected through instrumental examination (as previously
defined, aspiration is the invasion of food or drink below the
vocal folds and into the lungs).

There are many approaches to measuring and analyzing swallowing.
Two are highlighted here — videofluoroscopy and endoscopy.

Videofluoroscopy

Videofluoroscopy is an x-ray technique used to image the movements
of speech production. Videofluoroscopy has been used, for example,
to image movements associated with the opening and closing of the
velopharyngeal port (Chapter 18). Videofluoroscopy is also used
routinely in clinical settings to evaluate swallowing in clients
with suspected dysphagia. When used for this purpose, the substance
to be swallowed is mixed with barium sulfate. Barium is a contrast
material that allows the bolus to be tracked visually as it travels
through the oral, pharyngeal, and esophageal regions.

The videofluoroscopic swallow examination, sometimes called a
modified barium swallow (MBS) study, was first described by
Logemann, Boshes, Blonsky, and Fisher (1977). The adjective
"modified" is used to differentiate this study from a barium
swallow study, which is conducted by a gastroenterologist to
evaluate esophageal structure and function.

A videofluoroscopic examination is usually conducted with the
client seated in a specially designed chair. The examination is
performed in a radiology laboratory, with a radiologist (or
radiology technician) running the x-ray equipment and a
speech-language pathologist directing the swallowing protocol. The
examination protocol typically consists of the swallowing of a
series of liquid and solid substances (mixed with barium or
accompanied by ingestion of a barium capsule to provide contrast)
that vary in volume and consistency or texture.

Figure 20–5 is an example of a videofluoroscopic image of the oral
preparatory phase of swallowing. The single frame was extracted
from a moving image (i.e., a movie) recorded during a swallow.

Both timing and spatial measurements can be made from the
videofluoroscopic images. Temporal (timing) measurements can be
made to determine the time from the beginning of bolus movement
from the oral cavity to its arrival at the upper esophageal
sphincter. Spatial measurements may be used to determine the extent
of velar elevation or the extent of the upward and forward movement
of the hyoid bone. There are also measures of the amount of
penetration (entry of food or liquid to the laryngeal area) and/or
aspiration (Rosenbek, Robbins, Roecker, Coyle, & Wood, 1996).
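
The arithmetic behind these measurements is simple, and a brief
sketch may make it concrete. The following Python code computes a
transit time from video frame numbers and a hyoid displacement from
marked positions; every number and variable name in it is a
hypothetical value invented for illustration, not data from any
study cited here.

# Temporal measurement: time from the frame at which the bolus begins
# to leave the oral cavity to the frame at which it reaches the upper
# esophageal sphincter. Frame numbers and frame rate are hypothetical.
frames_per_second = 30
bolus_leaves_oral_cavity = 42     # frame number (hypothetical)
bolus_reaches_sphincter = 66      # frame number (hypothetical)
transit_time = (bolus_reaches_sphincter - bolus_leaves_oral_cavity) / frames_per_second
print(f"Transit time: {transit_time:.2f} sec")        # 0.80 sec

# Spatial measurement: upward and forward hyoid bone movement, from
# (x, y) positions in millimeters marked on two frames.
rest = (0.0, 0.0)                 # hypothetical resting position
peak = (8.0, 11.0)                # hypothetical position at maximum displacement
displacement = ((peak[0] - rest[0]) ** 2 + (peak[1] - rest[1]) ** 2) ** 0.5
print(f"Hyoid displacement: {displacement:.1f} mm")   # 13.6 mm
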
It is generally agreed that videofluoroscopy provides the most
comprehensive evaluation of swallowing, and for many it is
considered the "gold standard" of measurement. It has several
advantages over other measurement approaches, including the
following: (a) it provides relatively clear views of nearly all the
important structures involved in swallowing and their movements,
with the exception of the vocal folds; (b) it is possible to
visualize barium-laced substances through the oral preparatory,
oral transport, pharyngeal, and esophageal phases; (c) it is
possible to view swallowing events from at least two different
perspectives (from the side, and from the front); and (d) it is
possible to identify penetration and aspiration events. The major
disadvantages of videofluoroscopy are that it requires exposure to
radiation, it must be coordinated with radiology, and it cannot be
conducted at bedside.

Figure 20–5.  Videofluoroscopic image showing the oral preparatory
phase of a swallow. The large, dark area in the oral region is the
bolus. The thin, dark line that runs along the tongue to the
epiglottis indicates that there may be some trace bolus residue
from a previous swallow or that there has been some premature
spillage during the oral preparatory phase. Modified and reproduced
with permission from "Dynamic swallow studies: measurement
techniques," by R. Leonard and S. McKenzie in Dysphagia assessment
and treatment planning: A team approach (2nd ed., p. 273). Edited
by R. Leonard and K. Kendall, 2008, San Diego, CA: Plural
Publishing, Inc. Copyright 2008 by Plural Publishing, Inc.

Endoscopy

Another way to visualize swallowing is with endoscopy (Langmore,
Schatz, & Olson, 1988), also mentioned in Chapter 19. This approach
requires the use of a flexible fiberoptic endoscope, like the one
used for visualizing the larynx and velopharynx. To view the
swallowing apparatus, the endoscope is inserted through one of the
nares (following the administration of topical anesthesia and
decongestant), routed through the velopharyngeal port, and guided
until its tip is positioned in the laryngopharynx (the bottom part
of the pharynx, just above the opening into the larynx). This
approach is called flexible endoscopic evaluation of swallowing
(FEES). No x-rays are used in endoscopy, and therefore no
barium-infused boluses are required for evaluation of swallowing
disorders. A FEES station is shown in Figure 20–6.

The examination usually includes a preliminary viewing of the
structures included in the process of swallowing, such as (among
others) the velopharyngeal region, pharyngeal walls, back part of
the tongue, entrance to the larynx, and the vocal folds.
Abnormalities in structure or color are noted and are used to help
interpret abnormal swallow behaviors. The evaluation protocol is
similar to that used for videofluoroscopic examination, using
liquid and solid substances of different consistencies, textures,
and volumes.

Endoscopy offers certain advantages over other approaches to
evaluating swallowing. For example, the equipment is portable so
that the examination can be done at bedside in a hospital; there is
no exposure to x-rays and no need to use barium products; and it is
possible to see structural and color abnormalities. In addition,
the procedure can often be performed by a speech-language
pathologist without the direct oversight of a physician or the aid
of other health care professionals. The speech-language pathologist
can also observe the client eat an entire meal at the client's
usual pace.

A disadvantage of endoscopy for evaluation of swallowing is that
there are some clients who cannot tolerate the procedure, including
those with structural abnormalities such as a deviated nasal
septum, certain movement disorders, bleeding disorders, and certain
cardiac conditions. Furthermore, it is sometimes difficult to
detect penetration and aspiration with endoscopy, although there
are techniques to get around this.

Client Self-Report

An important form of measurement, especially in clinical settings,
is the client self-report. The client self-report can reveal
symptoms (e.g., pain during swallowing, lump in the throat,
difficulty swallowing certain foods) that indicate the need to
perform instrumental evaluation of swallowing using measurement
procedures such as those just described. One way to glean
information about the client's perspective of the swallowing
problem is by using an unstructured interview. An alternative or
complementary approach is to use a more formal, symptom-specific
assessment tool. There are several to choose from, one of which is
called the Eating Assessment Tool-10 (EAT-10; Belafsky et al.,
2008). The EAT-10 contains several statements, such as "My
swallowing problem has caused me to lose
weight" and "I become short of breath when I eat," that the client
rates on a scale ranging from "No problem" to "Severe problem."
Research has shown a good correspondence between the symptoms
reported on the EAT-10 and the identification of dysphagia-related
signs obtained from measures in patients with head and neck cancer
(Arrese, Carrau, & Plowman, 2017) and in amyotrophic lateral
sclerosis (Plowman et al., 2015).

Figure 20–6.  Fiberoptic endoscope used for evaluation of
swallowing. Reproduced with permission provided courtesy of
KayPENTAX, Montvale, NJ.
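
Scoring a tool like the EAT-10 amounts to summing the client's
ratings. The short Python sketch below assumes the scoring
convention reported by Belafsky et al. (2008), in which each of the
10 statements is rated from 0 ("no problem") to 4 ("severe
problem") and the ratings are summed; the ratings themselves are
invented for illustration.

# Hypothetical ratings for the 10 EAT-10 statements, each rated
# 0 (no problem) to 4 (severe problem), per Belafsky et al. (2008).
ratings = [2, 1, 0, 3, 1, 0, 2, 1, 3, 0]
assert len(ratings) == 10 and all(0 <= r <= 4 for r in ratings)

total = sum(ratings)
print(f"EAT-10 total: {total} of a possible 40")   # 13
# Higher totals indicate more self-reported swallowing difficulty
# and, as noted above, may signal the need for instrumental testing.
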

Health Care Team for Individuals With Swallowing Disorders

Evaluation and management of dysphagia (swallowing disorder) are a
large part of the clinical practice of speech-language pathology.
Although there are cases of functional dysphagia (wherein there is
no known physical reason for the dysphagia) (Baumann & Katz, 2016),
dysphagia usually has an identifiable structural, neurogenic,
and/or systemic cause. Structural causes include (among others)
tumors, diverticula (abnormal pouches in the wall of a structure),
deformation caused by surgical removal of tissue or trauma (e.g.,
as in surgical removal of tissue in treatment of head and neck
cancers), congenital malformations, and tracheostomy (a surgically
created opening at the front of the neck). Dysphagia can also have
neurogenic causes such as stroke, degenerative diseases (such as
Parkinson's disease), and traumatic brain injury.

Evaluation and management of swallowing disorders often require a
team of health care professionals including a speech-language
pathologist, radiologist, gastroenterologist, otolaryngologist,
dietitian, occupational therapist, and others, depending on the
nature of the swallowing disorder.

The speech-language pathologist is responsible for the evaluation
and behavioral management of oropharyngeal dysphagia (swallowing
disorders involving the oral preparatory, oral transport, and
pharyngeal phases). Usually the speech-language pathologist is
asked by a physician to evaluate a client with a potential
swallowing disorder. The speech-language pathologist
may begin by performing a bedside swallowing evaluation, which
includes a case history interview, a physical examination of the
swallowing structures, and visual, auditory, and tactile
observation of the client during swallowing of water and possibly
other substances. If a problem is suspected, the speech-language
pathologist will perform a videofluoroscopic swallowing examination
in collaboration with a radiologist. During the swallow study, the
speech-language pathologist may screen for esophageal problems, and
if any are noted, a gastroenterologist is notified. Alternatively,
a fiberoptic endoscopic evaluation of swallowing may be conducted,
a procedure that can usually be performed by the speech-language
pathologist independently. Behavioral management might include the
teaching of postural strategies to improve swallowing, diet
(consistency) recommendations, therapeutic exercises (to improve
strength and coordination of swallow-related structures), and
counseling regarding the swallowing disorder.

The radiologist has a limited, but critical, role in the evaluation
of swallowing. Specifically, the radiologist is responsible for the
instrumental aspects of videofluoroscopic swallow (modified barium
swallow) studies and barium swallow studies and, in some instances,
may help in their interpretation.

GERD and LPR

Your stomach is rich with chemicals that have about the same
acidity as the battery acid in your car. That's right, that's the
same battery acid that will burn a hole in your clothes if you
splash some of it on you. GERD, an acronym for gastroesophageal
reflux disease, is a chronic condition in which acid from the
stomach backs up into the esophagus when the lower esophageal
sphincter (the valve that separates the esophagus and stomach)
fails to do its job properly. Although a certain amount of reflux
(backflow) from the stomach into the esophagus is considered
normal, too much can cause heartburn and the need to see a
gastroenterologist (GI doctor). When stomach acid travels all the
way through the esophagus and spills onto the larynx it is called
laryngopharyngeal reflux, or LPR. LPR can irritate and erode
laryngeal tissue. LPR can cause a hoarse voice, chronic cough,
frequent throat clearing, and other problems that may lead to the
need to seek help from an otolaryngologist (ENT doctor). Some
helpful hints for avoiding GERD and LPR: do not stuff yourself
before you go to bed, lay off foods that make it worse, and sleep
with your body inclined so that your head is higher than your feet.

Chapter Summary

Eating and drinking involve intricately coordinated actions of the
lips, mandible, tongue, velum, pharynx, larynx, esophagus, and
other structures.

The term swallowing is a synonym for deglutition and is used as
such in this chapter, although swallowing technically involves only
part of the deglutition process.

Many of the important anatomical and physiological components of
the speech production apparatus, discussed in Chapter 10 of this
text, are also important anatomical and physiological components of
the swallowing apparatus.

The stomach is a liter-sized sac whose upper end connects to the
esophagus and whose lower end connects to the small intestine
through the pyloric sphincter.

The forces and movements associated with the act of swallowing can
be categorized in four phases that include an oral preparatory
phase, oral transport phase, pharyngeal phase, and esophageal phase.

The oral preparatory phase involves taking liquid or solid
substances in through the oral vestibule and manipulating them
within the oral cavity to prepare the bolus (liquid volume or lump
of solid) for passage.

The oral transport phase involves moving the bolus (or a part of
it) through the oral cavity toward the pharynx by rearward
propulsion due largely to a squeezing action of the tongue, against
the palate, that moves from front to back of the oral cavity.

The pharyngeal phase is usually "triggered" when the bolus passes
the junction of the oral cavity and pharynx, and consists of a
combination of compressive actions that force the bolus downward
toward the esophagus while at the same time protecting the lower
airways.

The esophageal phase begins when the bolus enters the esophagus and
continues as the bolus is moved toward the stomach by a series of
peristaltic waves of muscular contraction and relaxation that
progress down the muscular tube.

Although it is convenient to describe swallowing as four discrete
phases, the reality is that there is substantial overlap among the
phases.

Structures within the brainstem exert neural control over the
automatic aspects of swallowing (i.e., the pharyngeal and
esophageal phases), and structures
within the cerebral hemispheres exert neural control over the
voluntary aspects of swallowing (i.e., the oral preparatory and
oral transport phases).

Characteristics of the bolus can influence the swallowing pattern,
including bolus consistency and texture, volume, and taste.

The development of swallowing from infancy is rapid and complex and
moves through different sucking and chewing patterns toward
adult-like eating and drinking behaviors, and carries with it
important developmental processes related to social and emotional
development.

The effect of aging on the swallow is an overall slowing and a
subtle deterioration in the spatial and temporal coordination among
certain structures of the swallowing apparatus.

Several ways to measure and analyze swallowing are available and
include videofluoroscopy (which uses x-ray to image all phases of
swallowing) and endoscopy (which uses a flexible endoscope to image
the pharyngeal and laryngeal regions); client self-report (which
provides insight into the client's experiences with swallowing) is
also a useful measure that helps determine if further instrumental
testing is warranted.

Some of the more important health care professionals who work with
clients with dysphagia (impaired swallowing) include
speech-language pathologists, radiologists, gastroenterologists,
otolaryngologists, dietitians, and occupational therapists.

References

Arrese, L., Carrau, R., & Plowman, E. (2017). Relationship between
the Eating Assessment Tool-10 and objective clinical ratings of
swallowing function in individuals with head and neck cancer.
Dysphagia, 32, 83–89.
Arvedson, J., & Brodsky, L. (2002). Pediatric swallowing and
feeding: Assessment and management (2nd ed.). Clifton Park, NY:
Thomson Learning (Singular Publishing Group).
Baumann, A., & Katz, P. (2016). Functional disorders of swallowing.
Handbook of Clinical Neurology, 139, 483–488.
Belafsky, P., Mouadeb, D., Rees, C., Pryor, J., Postma, G., Allen,
J., & Leonard, R. (2008). Validity and reliability of the Eating
Assessment Tool (EAT-10). Annals of Otology, Rhinology and
Laryngology, 117, 919–924.
Bosma, J. (1986). Development of feeding. Clinical Nutrition, 5,
210–218.
Bosma, J., Truby, H., & Lind, J. (1965). Cry motions of the newborn
infant. Acta Paediatrica Scandinavica, 163, 63–91.
Chi-Fishman, G., & Sonies, B. (2002). Effects of systematic bolus
viscosity and volume changes on hyoid movement kinematics.
Dysphagia, 17, 278–287.
Cock, C., Jones, C., Hammer, M., Omari, T., & McCulloch, T. (2017).
Modulation of upper esophageal sphincter (UES) relaxation and
opening during volume swallowing. Dysphagia, 32, 216–224.
Cook, I., Dodds, W., Dantas, R., Kern, M., Massey, B., Shaker, R.,
& Hogan, W. (1989). Timing of videofluoroscopic, manometric events,
and bolus transit during the oral and pharyngeal phases of
swallowing. Dysphagia, 4, 8–15.
Cook, I., Weltman, M., Wallace, K., Shaw, D., McKay, E., Smart, R.,
& Butler, S. (1994). Influence of aging on oral-pharyngeal bolus
transit and clearance during swallowing: Scintigraphic study.
American Journal of Physiology, 266, G972–G977.
Corbin-Lewis, K., & Liss, J. (2015). Clinical anatomy and
physiology of the swallow mechanism. Independence, KY: Cengage
Learning.
Daggett, A., Logemann, J., Rademaker, A., & Pauloski, B. (2006).
Laryngeal penetration during deglutition in normal subjects of
various ages. Dysphagia, 21, 270–274.
Daniels, S., Corey, D., Hadskey, L., Legendre, C., Priestly, D.,
Rosenbek, J., & Foundas, A. (2004). Mechanism of sequential
swallowing during straw drinking in healthy young and older adults.
Journal of Speech, Language, and Hearing Research, 47, 33–45.
Dodds, W., Hogan, W., Reid, D., Stewart, E., & Arndorfer, R.
(1973). A comparison between primary esophageal peristalsis
following wet and dry swallows. Journal of Applied Physiology, 35,
851–857.
Dodds, W., Taylor, A., Stewart, E., Kern, M., Logemann, J., & Cook,
I. (1989). Tipper and dipper types of oral swallows. American
Journal of Roentgenology, 153, 1197–1199.
Dua, K., Ren, J., Bardan, E., Xie, P., & Shaker, R. (1997).
Coordination of deglutitive glottal function and pharyngeal bolus
transit during normal eating. Gastroenterology, 112, 73–83.
Engelen, L., Fontijn-Tekamp, A., & van der Bilt, A. (2005). The
influence of product and oral characteristics on swallowing.
Archives of Oral Biology, 50, 739–746.
Green, J., Moore, C., Ruark, J., Rodda, P., Morvee, W., &
VanWitzenburg, M. (1997). Development of chewing in children from
12 to 48 months: Longitudinal study of EMG patterns. Journal of
Neurophysiology, 77, 2704–2716.
Hiiemae, K., & Palmer, J. (1999). Food transport and bolus
formation during complete feeding sequences on foods of different
initial consistency. Dysphagia, 14, 31–42.
Hiraoka, T., Palmer, J., Brodsky, M., Yoda, M., Inokuchi, H., &
Tsubahara, A. (2017). Food transit duration is associated with the
number of stage II transport cycles when eating solid food.
Archives of Oral Biology, 81, 186–191.
Humphrey, T. (1970). Reflex activity in the oral and facial area of
the human fetus. In J. Bosma (Ed.), Second symposium on oral
sensation and perception (pp. 195–233). Springfield, IL: Charles C.
Thomas.
Im, I., Kim, Y., Oommen, E., Kim, H., & Ko, M. (2012). The effects
of bolus consistency in pharyngeal transit duration during normal
swallowing. Annals of Rehabilitation Medicine, 36, 220–225.
Inamoto, Y., Saitoh, E., Okada, S., Kagaya, H., Shibata, S., Ota,
K., . . . Palmer, J. (2013). The effect of bolus viscosity on
laryngeal closure in swallowing: Kinematic analysis using 320-row
area detector CT. Dysphagia, 28, 33–42.
Jadcherla, S. (2017). Advances with neonatal aerodigestive science
in the pursuit of safe swallowing in infants: Invited review.
Dysphagia, 32, 15–26.
Kahrilas, P., & Logemann, J. (1993). Volume accommodation during
swallowing. Dysphagia, 8, 259–265.
Langmore, S., Schatz, K., & Olson, N. (1988). Fiberoptic endoscopic
evaluation of swallowing safety: A new procedure. Dysphagia, 2,
216–219.
Leonard, R., & McKenzie, S. (2006). Hyoid-bolus transit latencies
in normal swallow. Dysphagia, 21, 183–190.
Linden, P., Tippett, D., Johnston, J., Siebens, A., & French, J.
(1989). Bolus position at swallow onset in normal adults:
Preliminary observations. Dysphagia, 4, 146–150.
Logemann, J. (1998). Evaluation and treatment of swallowing
disorders (2nd ed.). Austin, TX: Pro-Ed.
Logemann, J., Boshes, B., Blonsky, E., & Fisher, H. (1977). Speech
and swallowing evaluation in the differential diagnosis of
neurologic disease. Neurologia, Neurocirugia, and Psiquiatria,
18(2–3 Suppl.), 71–78.
Logemann, J., Pauloski, B., Rademaker, A., Colangelo, L., Kahrilas,
P., & Smith, C. (2000). Temporal and biomechanical characteristics
of oropharyngeal swallow in younger and older men. Journal of
Speech, Language, and Hearing Research, 43, 1264–1274.
Logemann, J., Pauloski, B., Rademaker, A., & Kahrilas, P. (2002).
Oropharyngeal swallow in younger and older women:
Videofluoroscopic analysis. Journal of Speech, Language, and
Hearing Research, 45, 434–445.
Malandraki, G., Perlman, A., Karampinos, D., & Sutton, B. (2011).
Reduced somatosensory activations in swallowing with age. Human
Brain Mapping, 32, 730–743.
Martin, B., Logemann, J., Shaker, R., & Dodds, W. (1994).
Coordination between respiration and swallowing: Respiratory phase
relationships and temporal integration. Journal of Applied
Physiology, 76, 714–723.
Martin-Harris, B. (May 16, 2006). Coordination of respiration and
swallowing. GI Motility Online. doi:10.1038/gimo10
Martin-Harris, B., Brodsky, M., Michel, Y., Ford, C., Walters, B.,
& Heffner, J. (2005). Breathing and swallowing dynamics across the
adult lifespan. Archives of Otolaryngology-Head and Neck Surgery,
131, 762–770.
Martin-Harris, B., Brodsky, M., Price, C., Michel, Y., & Walters,
B. (2003). Temporal coordination of pharyngeal and laryngeal
dynamics with breathing during swallowing: Single liquid swallows.
Journal of Applied Physiology, 94, 1735–1743.
Martin-Harris, B., Michel, Y., & Castell, D. (2005). Physiologic
model of oropharyngeal swallowing revisited. Otolaryngology-Head
and Neck Surgery, 133, 234–240.
McFarland, D., & Lund, J. (1995). Modification of mastication and
respiration during swallowing in the adult human. Journal of
Neurophysiology, 74, 1509–1517.
Mendell, D., & Logemann, J. (2007). Temporal sequence of swallow
events during the oropharyngeal swallow. Journal of Speech,
Language, and Hearing Research, 50, 1256–1271.
Mikushi, S., Seki, S., Brodsky, M., Matsuo, K., & Palmer, J.
(2014). Stage I intraoral food transport: Effects of food
consistency and initial bolus size. Archives of Oral Biology, 59,
379–385.
Miller, J., Sonies, B., & Macedonia, C. (2003). Emergence of
oropharyngeal, laryngeal and swallowing activity in the developing
fetal upper aerodigestive tract: An ultrasound evaluation. Early
Human Development, 71, 61–87.
Nishikubo, K., Mise, K., Ameya, M., Hirose, K., Kobayashi, T., &
Hyodo, M. (2015). Quantitative evaluation of age-related alteration
of swallowing function: Videofluoroscopic and manometric studies.
Auris Nasus Larynx, 42, 134–138.
Nishino, T., Yonezawa, T., & Honda, Y. (1985). Effects of
swallowing on the pattern of continuous respiration in human
adults. American Review of Respiratory Disease, 132, 1219–1222.
Omari, T., Jones, C., Hammer, M., Cock, C., Dinning, P., Wiklendt,
L., . . . McCulloch, T. (2016). Predicting the activation states of
the muscles governing upper esophageal sphincter relaxation and
opening. American Journal of Physiology — Gastrointestinal and
Liver Physiology, 310, G359–G366.
Palmer, J., & Hiiemae, K. (2003). Eating and breathing:
Interactions between respiration and feeding on solid food.
Dysphagia, 18, 169–178.
Palmer, J., Rudin, N., Lara, G., & Crompton, A. (1992).
Coordination of mastication and swallowing. Dysphagia, 7, 187–200.
Perlman, A., Ettema, S., & Barkmeier, J. (2000). Respiratory and
acoustic signals associated with bolus passage during swallowing.
Dysphagia, 15, 89–94.
Perlman, A., Palmer, P., McCulloch, T., & VanDaele, D. (1999).
Electromyographic activity from human laryngeal, pharyngeal, and
submental muscles during swallowing. Journal of Applied Physiology,
86, 1663–1669.
Perlman, A., Schultz, J., & VanDaele, D. (1993). Effects of age,
gender, bolus volume, and bolus viscosity on oropharyngeal pressure
during swallowing. Journal of Applied Physiology, 75, 33–37.
Plowman, E., Tabor, L., Robison, R., Gaziano, J., Dion, C., Watts,
S., . . . Gooch, C. (2015). Discriminant ability of the Eating
Assessment Tool-10 to detect aspiration in individuals with
amyotrophic lateral sclerosis. Neurogastroenterology and Motility,
28, 85–90.
Robbins, J., Hamilton, J., Lof, G., & Kempster, G. (1992).
Oropharyngeal swallowing in normal adults of different ages.
Gastroenterology, 103, 823–829.
Rosenbek, J., Robbins, J., Roecker, E., Coyle, J., & Wood, J.
(1996). A penetration-aspiration scale. Dysphagia, 11, 93–98.
Saitoh, E., Shibata, S., Matsuo, K., Baba, M., Fujii, W., & Palmer,
J. (2007). Chewing and food consistency: Effects on bolus transport
and swallow initiation. Dysphagia, 22, 100–107.
Selley, W., Flack, F., Ellis, R., & Brooks, W. (1989). Respiratory
patterns associated with swallowing: Part I. The normal adult
pattern and changes with age. Age and Ageing, 18, 168–172.
Shune, S., Moon, J., & Goodman, S. (2016). The effects of age and
preoral sensorimotor cues on anticipatory mouth
movement during swallowing. Journal of Speech, Language, and
Hearing Research, 59, 195–205.
Smith, J., Wolkove, N., Colacone, A., & Kreisman, H. (1989).
Coordination of eating, drinking, and breathing in adults. Chest,
96, 578–582.
Sonies, B., Parent, L., Morrish, K., & Baum, B. (1988). Durational
aspects of the oral-pharyngeal phase of swallow in normal adults.
Dysphagia, 3, 1–10.
Steele, C., Alsanei, W., Ayanikalath, S., Barbon, C., Chen, J.,
Cichero, J., . . . Wang, H. (2015). The influence of food texture
and liquid consistency modification on swallowing physiology and
function: A systematic review. Dysphagia, 30, 2–26.
Steele, C., & van Lieshout, P. (2004). Influence of bolus
consistency on lingual behaviors in sequential swallowing.
Dysphagia, 19, 192–206.
Steeve, R., Moore, C., Green, J., Reilly, K., & Ruark McMurtrey, J.
(2008). Babbling, chewing, and sucking: Oromandibular coordination
at nine months. Journal of Speech, Language, and Hearing Research,
51, 1390–1404.
Tasko, S., Kent, R., & Westbury, J. (2002). Variability in tongue
movement kinematics during normal liquid swallow. Dysphagia, 17,
126–138.
Tracy, J., Logemann, J., Kahrilas, P., Jacob, P., Kobara, M., &
Krugler, C. (1989). Preliminary observations on the effects of age
on oropharyngeal deglutition. Dysphagia, 4, 90–94.
Ulysal, H., Kizilay, F., Ünal, A., Güngor, H., & Ertekin, C.
(2013). The interaction between breathing and swallowing in healthy
individuals. Journal of Electromyography and Kinesiology, 23,
659–663.
Witcombe, B., & Meyer, D. (2006). Sword swallowing and its side
effects. British Medical Journal, 333, 1285–1287.
21
Hearing Science I:
Acoustics and Psychoacoustics

Introduction

Acoustics is the science of sound. The study of sound is relevant
to Communication Sciences and Disorders because speech is produced
as an acoustic signal, and hearing uses acoustic signals as "data."
The term "acoustic signal" in this text refers to a disturbance in
air pressure, created by a vibrating source. We are specifically
interested in acoustic signals that fall within the frequency range
of human hearing. Hixon, Weismer, and Hoit (2020) and Kramer and
Brown (2019) present more detailed information on basic acoustics,
and the Internet is an endless source of outstanding websites on
acoustics, many of which include animations of sound wave events.

This chapter covers the transmission of sound in air. Sound is a
disturbance in air pressures but can also be a disturbance in the
molecules of fluid or of solids. This disturbance is the bunching
up and spreading apart of molecules in response to an external
source. The changing density of air molecules results in pressure
variations. Sounds, in fact, are pressure waves.

When the author was a child, his family made regular summer trips
to a lake in New Jersey, where he and his brothers would amuse
themselves by each taking two rocks, submerging underwater at
opposite ends of the lake, and sending Morse code signals by
banging the rocks together. The impact of the rocks created a
pressure disturbance in the water — a sound wave — that was
transmitted rapidly and heard clearly underwater, at a substantial
distance across the lake. A more dramatic example is putting an ear
to a railroad track and hearing an approaching train even though it
is more than a mile away. The train wheels vibrate the steel track,
setting up a pressure disturbance in steel track molecules. This
disturbance, or pressure wave, is transmitted down the rails to the
person whose ear is on the track.

The difference in speed of sound wave transmission is determined by
the relative densities of the molecules in the different media. All
other things being equal, the denser (more highly packed) the
molecules, the faster is the speed of sound transmission. Air has
less dense molecules than water, which has less dense molecules
than steel. Among these three sound-conducting media, therefore,
steel conducts sound waves at the greatest speed and over the
greatest distances.
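
To attach rough numbers to these examples: the approximate speeds
of sound in air, fresh water, and steel (about 343, 1,480, and
5,900 m/sec, respectively) are standard physical values, not
figures given in this chapter. A minimal Python sketch comparing
travel times over one mile:

# Approximate speeds of sound in m/sec; standard physical values,
# assumed here rather than taken from the text.
speeds = {"air": 343.0, "water": 1480.0, "steel": 5900.0}

one_mile = 1609.0   # meters
for medium, speed in speeds.items():
    print(f"{medium:>5}: {one_mile / speed:.2f} sec to travel one mile")
# air:   4.69 sec
# water: 1.09 sec
# steel: 0.27 sec, which is why the rail carries the train's sound
# to the listener long before the air does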


Inherent Forces, Constant Motion

Air molecules can, in theory, sustain indefinitely an oscillation
like the one shown in Figure 21–1. Air molecules, like springs, are
elastic. They have a rest position (Figure 21–1, time 1), and when
they are stretched away from that position (Figure 21–1, time 2),
they generate a recoil force to get back to the rest position. Air
molecules also have mass, which means they demonstrate the force of
inertia. When, after being stretched, the molecule recoils back to
the rest position, it is moving quickly and cannot "stop on a dime"
at the rest position ("A body in motion tends to stay in motion,"
courtesy of Sir Isaac Newton); the molecule moves through the rest
position (Figure 21–1, time 3, same as the initial rest position)
and is again stretched away from it (Figure 21–1, time 4), building
up recoil force to spring back to the rest position (Figure 21–1,
time 5). The forces of elasticity and inertia are intrinsic to the
air molecule, meaning the molecule itself, or more precisely its
motions, produce the forces and keep the molecule moving without
the assistance of any "external" forces other than the initial
stretch. Start the molecule moving, and in theory, it will
oscillate forever under its own power. In the real world, motions
of air molecules are opposed by external forces, the most common
one being friction. Friction is a force that opposes the movement
of molecules and generates heat as a product of this opposition.
When molecules in motion rub against each other or against surfaces
of containers, walls, or human tissue, they generate heat. The
intrinsic forces of elasticity and inertia are degraded, or run
down, by the generation of heat; when the forces of friction
overcome the forces of elasticity and inertia, the motion stops.

Figure 21–1.  Motions of a single air molecule at five consecutive
instants in time after the molecule is "bumped" by some unknown
force. The arrow at time 1 indicates the force moving the molecule
to the right, and all subsequent arrows indicate the direction of
the resulting motion of the molecule. The motions at different
times are shown on separate rows for clarity, but the path of the
motion really occurs in a single dimension, back and forth along
the same path.

Oscillation

The motion depicted in Figure 21–1 is called an oscillation. The
back-and-forth motion repeats itself over time. The repetition of
this motion over time allows it to be described according to its
period (and its inverse, frequency) and amplitude. In fact, such
oscillations are referred to as periodic vibrations because of
their repetitive nature.

Imagine that we timed the motion of the molecule in Figure 21–1
from the original position shown at time 1 through one complete
back-and-forth oscillation to the position shown at time 5. The
time taken for one complete cycle of this back-and-forth motion is
called
the period of vibration and is symbolized by the letter T. T is
expressed in seconds (sec) or milliseconds (msec; 1 msec = 0.001
sec). An oscillation with T = 0.001 sec means that the motion from
time 1 to time 5 is completed in 1/1,000th of a second.

Periodic motion can also be described in terms of frequency.
Frequency (symbolized as F) is the number of complete cycles of
oscillation that occur in 1 sec. Frequency and period are the
inverse of each other — if you know one, you know the other. For
example, if the complete cycle of molecule movement shown in
Figure 21–1 has a duration of 0.001 sec (T = 1 msec), then
F = 1/0.001 = 1,000 cycles per second (cps), or 1000 hertz (Hz).
The human ear responds to a large range of frequencies, from a low
frequency of about 20 Hz (T = 1/F = 1/20 = 0.05 sec, or 50 msec) to
a high frequency of about 20,000 Hz (T = 1/20,000 = 0.00005 sec, or
0.05 msec). Even in the case of the lowest frequency, the time
taken for one complete cycle, 0.05 sec, is merely a fraction of a
full second. Acoustic oscillations occur very rapidly.
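
Because period and frequency are simple reciprocals, the
calculations in this section can be verified in a few lines of
code. The Python sketch below reproduces the values just discussed,
plus the 0.008-sec period of the sine wave introduced next:

# Reciprocal relationship between period (T, in sec) and frequency (F, in Hz).
def frequency_from_period(T):
    """F = 1/T: number of complete cycles per second."""
    return 1.0 / T

def period_from_frequency(F):
    """T = 1/F: seconds per complete cycle."""
    return 1.0 / F

print(frequency_from_period(0.001))   # 1000.0 Hz, the example above
print(period_from_frequency(20))      # 0.05 sec (50 msec), low end of human hearing
print(period_from_frequency(20000))   # 5e-05 sec (0.05 msec), high end of human hearing
print(frequency_from_period(0.008))   # 125.0 Hz, the sine wave of Figure 21-2
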
Waveform

A waveform is the amplitude of vibration as a function of time, as
shown on the left side of Figure 21–2; amplitude is on the y-axis,
and time is on the x-axis.

The waveform in Figure 21–2 is called a sine wave. A sine wave is a
sound composed of a single period, and, therefore, only a single
frequency. In Figure 21–2, left, the period is marked as T, and the
value of the period is 0.008 sec, or 8 msec. T is the inverse of
frequency, so the frequency of this sine wave can be calculated:
F = 1/0.008 = 125 Hz.

Spectrum

The spectrum is the amplitude of vibration plotted as a function of
frequency; amplitude is on the y-axis, and frequency is on the
x-axis (Figure 21–2, right). This spectrum has a single frequency
component at 125 Hz, as expected from a sine wave with a period of
0.008 sec.

Waveform and Spectrum

The waveform and spectrum are two different ways to represent the
same event. A waveform is the time representation ("time domain" in
Figure 21–2) of an acoustic event; a spectrum is the frequency
representation ("frequency domain" in Figure 21–2) of the event.
The difference between the two is conveyed by the x-axis of the two
representations: time for the waveform, and frequency for the
spectrum. Both representations have amplitude on the y-axis.

Figure 21–2.  Left, molecule motion from Figure 21–1 replotted as a
sine wave. The graph is a waveform, with amplitude (displacement)
on the y-axis and time on the x-axis; right, spectrum of the
waveform in the left side of the figure. In the spectrum, amplitude
is on the y-axis, and frequency is on the x-axis.
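
The claim that a waveform and a spectrum are two representations of
the same event can be demonstrated numerically. The sketch below,
which assumes the NumPy library, synthesizes a 125-Hz sine wave as
a sampled waveform (the time domain) and then computes its spectrum
(the frequency domain) with the fast Fourier transform; the
sampling rate and duration are arbitrary choices, not values from
the text.

import numpy as np

fs = 8000                                # sampling rate (samples/sec); arbitrary
t = np.arange(0, 1.0, 1 / fs)            # 1 sec of time points
waveform = np.sin(2 * np.pi * 125 * t)   # 125-Hz sine wave (period = 0.008 sec)

# Spectrum: amplitude as a function of frequency.
amplitude = np.abs(np.fft.rfft(waveform)) / len(t)
freqs = np.fft.rfftfreq(len(t), 1 / fs)

# A sine wave should show a single frequency component:
print(freqs[np.argmax(amplitude)])       # 125.0
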

Complex Periodic Acoustic Events

A sound whose spectrum contains a single frequency component is
called a pure tone. Chapter 23 describes how pure tones are used as
one way to evaluate hearing.

Most sounds in nature, however, including speech, consist of many
different frequencies having many different amplitudes. Sound waves
that are made up of many different frequencies are called complex
acoustic events. They are like a collection of single frequencies
that are all added together. Even when an acoustic event consists
of many different frequencies, its waveform can still repeat itself
over time. That is to say, complex acoustic events, like sine
waves, can be periodic. Figure 21–3, left, shows a waveform (top)
and spectrum (bottom) for the vowel "ah". The frequency scale for
this spectrum is marked off in kilohertz (kHz), meaning "1" is
1000 Hz, "2" is 2000 Hz, and so forth.

Note first the shape of the waveform: it is much more complicated
compared with the shape of the sine wave in Figure 21–2. That is
because this is a complex periodic waveform, reflecting a sound
with many different frequencies and amplitudes. Second, the
waveform has a clearly repeating pattern — it is periodic. The
period (T) for one of the cycles is marked on the waveform. Third,
the spectrum contains many sharply defined frequency components
(the series of closely spaced amplitude peaks along the frequency
scale, or x-axis). These multiple-frequency components vary greatly
in their amplitude (the height of the peaks on the y-axis).

Complex Aperiodic Acoustic Events

Complex acoustic signals can also be aperiodic. Like complex,
periodic acoustic events, complex aperiodic signals contain more
than one frequency. Unlike complex periodic events, the waveforms
of complex aperiodic events do not repeat over time. That is why
they are called aperiodic acoustic events.
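
The periodic/aperiodic contrast can also be illustrated in code.
The sketch below (again assuming NumPy) builds a complex periodic
signal by adding sine waves at a 125-Hz fundamental and its
whole-number multiples, and a complex aperiodic signal from random
samples; the frequencies and amplitudes are illustrative
assumptions, not measurements of the "ah" vowel or ocean noise of
Figure 21–3.

import numpy as np

fs = 8000
t = np.arange(0, 1.0, 1 / fs)
rng = np.random.default_rng(0)

# Complex periodic: sine waves at 125 Hz and its multiples (harmonics),
# each given a different (here, decreasing) amplitude.
periodic = sum((1.0 / k) * np.sin(2 * np.pi * 125 * k * t) for k in range(1, 6))

# Complex aperiodic: random pressure values, so no repeating pattern.
aperiodic = rng.normal(size=len(t))

for name, signal in [("periodic", periodic), ("aperiodic", aperiodic)]:
    spectrum = np.abs(np.fft.rfft(signal))
    strong = np.sum(spectrum > 0.5 * spectrum.max())
    print(name, int(strong))
# The periodic signal concentrates its energy in a few sharply defined
# components; the aperiodic signal spreads energy across all frequencies.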

Figure 21–3.  Left, a waveform (top) of an "ah" vowel and its
spectrum (bottom). Note the repeating pattern in the waveform,
which allows measurement of a period, as marked; note also the
difference between the appearance of this waveform and the
appearance of the sine wave waveform in Figure 21–2; the more
complicated looking waveform in this figure indicates that it
reflects multiple frequency components, as shown in the spectrum.
This waveform and spectrum reflect a complex, periodic acoustic
event. Right, a waveform of a noise like the sound of the ocean;
there is no repeating pattern in this waveform, thus no period can
be measured. The spectrum shows energy as a function of frequency,
but not at specific "peaks" of frequency as in the spectrum on the
left side of this figure. The right-hand waveform and spectrum
reflect a complex, aperiodic acoustic event.
The right side of Figure 21–3 shows a waveform and spectrum for a
sound like the ocean. The waveform does not contain a repeating
pattern; therefore, a period cannot be measured. The spectrum does
not contain sharply defined peaks at specific frequencies. The
spectrum appears to be made up of continuous energy, rather than a
series of frequency components as seen in the spectrum of the
complex periodic event.

Resonance

Every vibrating object has a natural frequency of vibration, also
called a resonant frequency. In some cases, an object will have
multiple resonant frequencies. A resonant, or natural, frequency is
the frequency at which an object vibrates with maximal amplitude.
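
A standard physics formula for a mass-and-spring system driven by
an oscillating force shows why vibration amplitude peaks at the
resonant frequency. The Python sketch below is a generic
illustration with assumed values for mass, stiffness (elasticity),
and friction; it does not model any particular structure discussed
in this chapter.

import math

def amplitude(f, mass=0.01, stiffness=1000.0, friction=0.05, force=1.0):
    """Steady-state amplitude of a driven, damped mass-spring system."""
    w = 2 * math.pi * f                  # driving frequency (rad/sec)
    w0 = math.sqrt(stiffness / mass)     # natural (resonant) frequency (rad/sec)
    return force / math.sqrt(mass**2 * (w0**2 - w**2)**2 + (friction * w)**2)

resonant_hz = math.sqrt(1000.0 / 0.01) / (2 * math.pi)   # about 50.3 Hz
for f in [10, 25, resonant_hz, 75, 100]:
    print(round(f, 1), amplitude(f))
# The printed amplitude is largest when the driving frequency matches
# the resonant frequency, the same principle behind the shattered glass.
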
The phenomenon of resonance has central importance to an
understanding of both speech (see Chapter 11) and hearing
(Chapter 22). For example, the resonant frequencies of the vocal
tract are changed when the shape of the vocal tract is changed;
this is the basis for the acoustic difference between different
vowels (Chapter 11). As described later, in Chapter 22, the
resonant frequency of the external ear canal, which carries sound
energy from the external world to the eardrum, contributes to the
sensitivity of hearing for certain important frequencies in the
understanding of speech.

Resonance has interesting applications to other events which may be
more familiar in the popular imagination. For example, when a glass
shatters in response to a singer's high-pitched, strong note, you
are seeing a particularly dramatic example of resonance.
Physiological events in the singer's speech mechanism (see
Chapter 10) produce powerful sound waves, some of which have air
molecule oscillations at the resonant frequency of the glass. These
oscillating air molecules apply force to the glass, which responds
with forceful vibration of its own because its resonant frequency
has been "excited" by the same frequency of the air vibrations. The
forceful vibration of the glass eventually exceeds the elastic
limit¹ of the material, and the glass shatters.

¹ "Elastic limit" means the degree to which an object can be
stretched before there are permanent changes in the object's shape.
In the current example, the shape of the glass material that makes
up the wine glass is "changed" permanently — the glass
shatters — when the vibration amplitude exceeds the elastic limit
of the glass. Excessive stretch of a rubber band is another good
example.

Bridges can also be "excited" at their resonant frequency by
marching soldiers. Marching is a periodic event, and if the
marching frequency matches a bridge's resonant frequency, the
bridge may be set into forceful vibration and eventually collapse.
This explains why marching soldiers (at least in the past) break
ranks when crossing a bridge.

Tacoma Narrows Bridge

In 1940, the Tacoma Narrows bridge collapsed as a response to the
forces of nature. This suspension bridge, a mile-long span over the
Puget Sound, connected Tacoma, Washington, with Gig Harbor,
Washington. The bridge was set into motion by strong winds. (It had
a previous history of vibrating before it collapsed; travelers
likened crossing the bridge in a car to a roller coaster ride.)
Eventually, on the fateful day, the bridge began to twist
rhythmically, reaching such violent, periodic amplitudes that it
eventually fell apart completely. Fortunately, no one was injured,
as the rhythmic twisting grew over a period of hours, allowing
people to get off the bridge. Some have claimed that the collapse
occurred because the wind "excited" the bridge at its resonant
frequency; the explanation is probably more complicated than that,
but the resonant frequency of the bridge played a role in the
collapse. Google "Tacoma Narrows" and enjoy the still photos and
film clips of the bridge rolling and twisting in the wind like a
child's toy. Note the periodic twisting motions of the bridge.

The ability to "excite" objects at their resonant frequencies and
cause them to shatter or disintegrate also has positive
applications. A therapy for kidney stones, a condition in which a
hard mass is formed somewhere in the urinary tract (which includes
the kidneys), pulverizes the stones with very high-frequency sound
waves which match the resonant frequencies of the stones. The
stones shatter because they vibrate violently, much like the wine
glass shatters when set into vibration at its resonant frequency.

Psychoacoustics

Psychoacoustics is the science of the psychological response to
sound. Important psychological terms for an understanding of speech
and hearing are pitch, loudness, and quality. Localization, another
psychoacoustic phenomenon, is not discussed in this chapter.

The following discussion simplifies the relationships between
physical characteristics of sound (frequency, amplitude, and
complex acoustic events) and
the psychological responses to them. More detailed information on
psychoacoustics is presented in Hixon, Weismer, and Hoit (2020) and
Kramer and Brown (2019).

Pitch

Pitch is a term generally understood as the perceived "height" of a
tone. In lay conversation, pitch often refers to musical notes. The
left-hand keys of a piano or organ produce lower pitches than the
right-hand keys. Striking keys from left to right on a keyboard
produces increasingly higher-pitched notes. Not surprisingly, piano
strings struck when the keys to the left of the keyboard are
depressed have lower frequencies of vibration; to the right of the
keyboard, the struck strings have higher frequencies of vibration.
Look inside a piano and note how the strings become increasingly
thinner and shorter as you move from left to right. Thicker, longer
strings have a lower frequency of vibration as compared to thinner,
shorter strings. Thus, a relationship exists between frequency of
vibration and perceived pitch.

A general statement of the relationship between frequency of
vibration and pitch perception is that pitch increases with
frequency. The relationship is not simple, because equal changes in
frequency do not result in equal changes in pitch.

This relationship is illustrated in Figure 21–4, where a piano
keyboard standing on its side with the low notes at the bottom is
pictured next to a graph showing the relationship between changes
in frequency and changes in perceived pitch.
Figure 21–4.  Relationship between the perceptual variable pitch
(y-axis) and the physical variable frequency (x-axis). The pitch
axis is represented as a sequence of "A" notes from the lowest to
highest on the piano keyboard. The graph shows that the perception
of the pitch of adjacent octaves (e.g., A4–A5 and A5–A6), or of
more separated octaves (e.g., A2–A3, and A6–A7), which sound like
equivalent pitch ranges, is not matched by equivalent frequency
ranges. The frequency range for octaves increases with increasingly
higher octaves on the piano keyboard.
All "A" notes on the piano keyboard are highlighted, ranging from
the lowest-pitched A0 (frequency = 27.5 Hz) to the highest-pitched
A7 (frequency = 3520 Hz). The frequency of each "A" note can be
verified by the vertical, dashed lines dropped from each note to
the x-axis.

Most people, musicians and nonmusicians alike, know that
consecutive "A"s (or any other consecutive notes such as "B"s,
"C"s, and so forth) cover a range called an octave. Octaves are
frequency and perceptual ranges: for a listener, the perceptual
distance between A3 and A4 is the same as between A4 and A5. These
two pitch distances are psychologically equivalent. The
less-than-simple relationship between frequency and pitch is well
illustrated by this example of consecutive musical notes of the
same letter (i.e., consecutive pitches) and the frequencies of
those consecutive notes.

In Figure 21–4, the frequency difference between A3 and A4 is
220 Hz (A3 = 220 Hz, A4 = 440 Hz), and the frequency difference
between A4 and A5 is 440 Hz (A4 = 440 Hz, A5 = 880 Hz). The
frequency ranges of these consecutive A's are different, but the
pitch change associated with these two ranges is the same. Even in
a comparison of the small frequency difference between A2 and A3
(110 Hz) and the large frequency difference between A6 and A7
(1760 Hz), the pitch change for the two frequency differences is
the same (the two differences are associated with equivalent pitch
changes). Frequency and pitch do not change in a one-to-one
relationship.
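
The frequency values in Figure 21–4 follow a simple rule: each
octave doubles the frequency. The unequal frequency steps described
above can therefore be generated from the single A0 value of
27.5 Hz given in the text:

# Frequencies of the "A" notes: each octave doubles the frequency.
A0 = 27.5   # Hz, the lowest "A" on the piano keyboard
notes = {f"A{n}": A0 * 2**n for n in range(8)}
print(notes["A4"], notes["A7"])    # 440.0 3520.0, matching Figure 21-4

# Equal pitch steps (one octave each), but unequal frequency steps:
print(notes["A4"] - notes["A3"])   # 220.0 Hz
print(notes["A5"] - notes["A4"])   # 440.0 Hz
print(notes["A7"] - notes["A6"])   # 1760.0 Hz
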
Sine waves (pure tones), complex periodic events (like the acoustic
result of piano string vibration), and complex aperiodic events all
have pitches, even if pitches of the aperiodic events may be fuzzy.
The pitch of complex acoustic events is much more complicated
compared with the pitch of sinusoids. The pitch of complex acoustic
events is not pursued further in this chapter.

Loudness

"Loudness" refers to the perceived volume of a sound. Sounds can be
perceived as very soft, comfortably loud, and uncomfortably loud,
as well as all degrees of loudness between these three examples.

The perceptual phenomenon of loudness has a very complex
relationship to the physical intensity of the sound. The measure of
sound intensity can be obtained using a device called a sound level
meter, which gives the user a decibel (dB) value for the sound's
intensity. This is a measure of the physical energy of a sound. For
the purposes of this chapter, the dB scale is considered the
standard measurement scale for sound intensity, and higher values
are associated with greater perceived loudness. As in the case of
pitch, there is not a one-to-one relationship between decibel
values and perceived loudness.²

² To be precise, a sound level meter measures a quantity called
sound pressure level (SPL), which varies in response to sound
energy much like the term we are using, sound intensity. The
difference between the two measures is not relevant to the point
being made in the text. Almost any textbook on hearing science and
audiology describes the difference between sound intensity and SPL
and how one measure can be derived from the other.

The following decibel values provide general standards for the
meaning of numbers on the decibel scale. A sound intensity of
roughly 30 dB is measured when, in the dead of night, you place a
sound level meter in the middle of a desert, far from any city.
Thirty dB is obviously very quiet. When speaking in a normal voice
to a friend separated from you by 1 m, the intensity of your voice
measured at your listener's head is somewhere between 60 and 70 dB.
Sound intensity at a concert involving a band with gigantic
amplifiers — the kind of concert where you cannot hear what your
friend is saying, even though she is standing next to you — is
anywhere from 100 to 120 dB, measured in the middle of the room.

A simple example of the lack of a one-to-one relationship between a
physical measure of sound energy and a perceptual measure of
loudness is as follows. Assume you were doing an experiment in
which a participant turns a dial to adjust the loudness of a sound
presented at 50 dB. The dial controls the intensity (the physical
measure) of the sound, and the participant has access only to the
dial — there is no access to numbers labeled on the dial with
decibel values. You, as the experimenter, have that access. You ask
the participant to turn the dial to double the loudness of the
sound. Most participants turn the dial for a doubling of loudness
to about 60 dB; they do not turn the dial to 100 dB. The loudness
of a sound presented at 100 dB is many times the loudness of a
50-dB sound. As with frequency and pitch, the physical quantities
(the decibel values) are not related directly to the perceptual
values.
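
Two pieces of arithmetic lie behind this example. The decibel scale
compresses intensity ratios logarithmically (the standard
definition, dB = 10 times the base-10 logarithm of an intensity
ratio, is assumed here rather than derived in this chapter), and a
common psychoacoustic rule of thumb, consistent with the dial
experiment, is that each 10-dB step roughly doubles loudness. A
minimal Python sketch:

import math

def intensity_ratio_to_db(ratio):
    """Standard definition: dB = 10 * log10(I / I_reference)."""
    return 10 * math.log10(ratio)

print(intensity_ratio_to_db(100000))   # 50.0: a 100,000-fold intensity
                                       # ratio separates 50 dB from 100 dB

# Loudness grows far more slowly than intensity. If each 10-dB step
# roughly doubles loudness (an assumed rule of thumb), then:
print(2 ** ((100 - 50) / 10))          # 32.0: a 100-dB sound is roughly
                                       # 32 times as loud as a 50-dB sound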

Sound Quality

When someone says, "That's an interesting sound, listen to it and
tell me what you think," they are asking for a subjective
description. Psychoacousticians and speech scientists often refer
to the subjective impression of an acoustic event as the quality of
the sound. Quality, like pitch and loudness, can be scaled
perceptually with numbers. The nature of the scaling task,
Sound Quality

When someone says, “That's an interesting sound, listen to it and tell me what you think,” they are asking for a subjective description. Psychoacousticians and speech scientists often refer to the subjective impression of an acoustic event as the quality of the sound. Quality, like pitch and loudness, can be scaled perceptually with numbers. The nature of the scaling task, however, must be specified more precisely than simply, “Scale the quality of this sound.” A sound may have any number of different quality terms, such as rough, weak, shrill, boomy, piercing, thin, fat, slappy, light, and smooth, to name a few of the many, many terms people use to describe how an acoustic event “sounds.” Perceptual scaling of sound quality may involve assigning numbers to just one of these descriptions — for example, “Scale the boominess of the sounds you will hear.” Perceptual scaling of sound qualities tends to be less reliable than scaling of pitch or loudness, but it can be done and has been the object of a great deal of research in communication sciences and disorders.

In the earlier discussion of pitch and loudness, these perceptual terms were connected to the physical acoustical characteristics of frequency and intensity, respectively. What is the physical basis of quality judgments? What causes a person to label one acoustic event “shrill,” another “smooth”? This is a complicated question, but we can offer a general answer as well as some more specific hints about the relationship between sound quality and physical acoustics.

The general answer is that quality perception depends to a large degree on the frequency composition of an acoustic event — that is, the frequency components of the sound and their amplitudes. Sounds with a highly tonal quality, such as a piano note or a note sung by a trained singer, have mostly periodic frequency components and little aperiodic energy. Sounds with a noisy quality, like the hissing sound of a boiling tea kettle or a sustained “sh” sound as in the word “shop,” are composed mostly or completely of aperiodic energy. An extensive, probably unlimited number of sound qualities are possible, depending on the frequency components, their amplitudes, and the mix of periodic and aperiodic energy in the sound. And the physical aspects of sound — frequency, amplitude, periodic and aperiodic energy — can vary over time, lending another dimension to sound quality.

Take another look at the two spectra in Figure 21–3. They have obvious differences in frequency components and the mix of periodic and aperiodic energy, and these differences account for the vowel quality of the left-hand spectrum and the ocean-noise quality of the right-hand spectrum. A fascinating aspect of the relationship between the physical and perceptual aspects of complex sounds like the ones in Figure 21–3 is that a frequency component, or its amplitude, needs to be changed by only a very small amount for listeners to detect a quality change. Humans are very sensitive to shades of sound quality differences.

Chapter Summary

Acoustics is the science of sound; sound waves are pressure waves, initiated by a vibrating source.
The periodic motion of air molecules can be used as a model for sine waves, in which the molecules oscillate back and forth.
Sine waves are the most basic type of sound wave; sine waves have a single frequency.
Periodic motions of sine waves can be quantified (measured) in terms of the period of vibration (the time taken to complete one cycle of vibration), the inverse of which is frequency (the number of complete cycles per second), and in terms of amplitude, or the extent of displacement of the molecule during the vibratory motion.
Acoustic events can be represented by a waveform, which is a plot of amplitude (sound energy) as a function of time, or by a spectrum, which is a plot of amplitude as a function of frequency; a waveform and a spectrum are two ways to represent the same acoustic event.
Complex acoustic events are composed of many sine waves having different frequencies; complex acoustic events can be periodic or aperiodic, or a mix of periodic and aperiodic energy.
Resonance is the phenomenon in which an object vibrates with maximal amplitude at a specific frequency, or at multiple frequencies.
Psychoacoustics is the science of the psychological reaction to acoustic events. Terms such as pitch, loudness, and quality are perceptual terms.
Frequency is the primary physical (acoustic) measure related to pitch, and intensity the primary acoustic measure related to loudness.
The relationships between the physical event (e.g., frequency) and its perceptual correlate (e.g., pitch) are not one-to-one; equal changes in frequency are not associated with equal changes in pitch, and equal changes in intensity are not associated with equal changes in loudness.

References

Hixon, T. J., Weismer, G., & Hoit, J. D. (2020). Preclinical speech science: Anatomy, physiology, acoustics, perception (3rd ed.). San Diego, CA: Plural Publishing.
Kramer, S., & Brown, D. K. (2019). Audiology: Science to practice (3rd ed.). San Diego, CA: Plural Publishing.
22
Hearing Science II:
Anatomy and Physiology

Introduction

This chapter presents the anatomy and physiology of the auditory mechanism. Structures covered include the external ear canal, eardrum (tympanic membrane), ossicles (middle ear bones), cochlea (end organ of hearing), auditory nerve, and central auditory pathways. Abele and Wiggins (2015); Barin (2009); Goutman, Elgoyhen, and Gomez-Casati (2015); Hudspeth (2014); Lemmerling, Stambuk, Mancuso, Antonelli, and Kubilis (1997); Luers and Hüttenbrink (2016); and Olsen, Duifhuis, and Steele (2012) have been used as sources for the information presented in this chapter.

A solid understanding of the auditory mechanism is critical to anyone interested in pursuing a career in audiology, speech-language pathology, hearing or speech science, or any other career related to communication. Selected reasons for gaining this knowledge are the following: (a) the great majority of children learn speech and language via the auditory system, (b) an understanding of the normal structures and functions of the auditory mechanism allows an appreciation of the ways in which various diseases and conditions affect hearing, and (c) knowledge of the anatomy and physiology of the auditory mechanism is essential to understanding the design and purpose of formal tests of hearing. The knowledge is also essential to understanding devices such as cochlear implants and hearing aids, which are used to treat hearing loss. Information on diseases of the auditory system, the audiological tests used to evaluate their effect on hearing, and auditory devices to treat hearing loss is presented in Chapters 23 and 24.

Structures of peripheral auditory anatomy can be classified as belonging to one of three major divisions: the outer ear, the middle ear (tympanic cavity), and the inner ear. Anatomical structures of the auditory system are the same for both ears; descriptions of the structures of one ear apply equally to the structures of the other ear. “Peripheral auditory system” refers to those structures outside the central nervous system.

Temporal Bone

Much of the peripheral auditory mechanism is encased in the temporal bone of the skull. Figure 22–1 shows the complex shape of the temporal bone. Figure 22–1, top, is a view of the skull from the left side of the head. The perimeter of the temporal bone is outlined in red. The temporal bone includes the bony part of the ear canal — the opening inside the pinna (the structure often referred to in lay terms as “the ear”; see below).


Figure 22–1. Top, view of the temporal bone of the skull from the left side, outlined in red; bottom, view of the temporal bone from above, as part of the base of the skull, shaded pinkish.

Figure 22–1, bottom, shows the interior base of the skull as if the top half has been removed, with the view from above. The temporal bone extends toward the middle of the skull, forming part of the base of the skull. The three major structures of the peripheral auditory system — the bony ear canal, the middle ear, and the inner ear — are encased within the temporal bone.

Peripheral Anatomy of the Ear

Figure 22–2 is an artist's rendition of the peripheral anatomy of the ear. The structures are shown as if the head has been cut into front and back halves, with the front half removed. The peripheral auditory mechanism is separated into three major parts: the outer ear, the middle ear, and the inner ear. The outer ear plus middle ear are components of the conductive part of the auditory mechanism. The inner ear is the sensorineural part of the mechanism. Figure 22–3 is a schematic chart of the divisions of auditory anatomy. Both Figures 22–2 and 22–3 should be referred to frequently throughout the following sections.

Figure 22–2. Coronal-plane view (head cut in front and back halves, view from the front) showing structures of the peripheral auditory system.

Figure 22–3. The anatomical components of the outer, middle, and inner ear, and their relationship to the functional distinction between conductive versus sensorineural auditory components.

Outer Ear (Conductive Mechanism)

The outer ear includes the pinna (auricle) and the external auditory canal, also called the external auditory meatus. Part of the tympanic membrane (eardrum) — the sheet of tissue that terminates the external auditory meatus — is also considered a structure of the outer ear. The tympanic membrane is the boundary between the outer and middle ear.

Pinna (Auricle)

The pinna is composed of cartilage and fat tissue. In humans, the pinna collects and directs sound energy into the external auditory meatus and toward the eardrum. Careful examination of a human's pinna shows many creases, folds, and cavities. Anatomical characteristics of the pinna vary a good deal among individuals.

External Auditory Meatus (EAM; also called External Auditory Canal)

The entrance to the external auditory meatus is a small opening easily seen inside the pinna. The external auditory meatus (external ear canal) is a tube extending from this opening to the tympanic membrane; meatus is a Latin term meaning “opening” or “canal.”
In adults, the external auditory meatus is roughly 2.5 cm in length and 0.7 cm in diameter. These dimensions vary quite a bit across individuals.

The external auditory meatus is not a straight, level tube. Rather, it runs slightly “uphill” (see Figure 22–2) and, between the opening to the canal and the tympanic membrane, has a small bend or kink in the direction of the back of the head. (The bend cannot be seen in Figure 22–2 because the frontal view does not provide depth perception.) You may have noticed that your medical practitioner, when inserting an otoscope (ear scope) into your ear canal to examine the canal and view your tympanic membrane, gently pulls the pinna up and toward the back of the head. The practitioner does this to straighten out the natural “kink” in the tube for a more direct view of the tympanic membrane.

The outer part of the external auditory meatus is surrounded by cartilage. Near the bend in the external auditory meatus, the surrounding walls change from cartilage to bone. The remaining length of the ear canal, from bend to eardrum, is encased by part of the temporal bone. The external auditory meatus ends as a closed tube at the tympanic membrane.

The primary auditory role of the external auditory meatus is to conduct sound energy through the ear canal to the tympanic membrane. The sound energy is in the form of molecule-sized pressure waves (Hixon, Weismer, & Hoit, 2020). As it conducts sound energy, the external auditory meatus acts like a resonator, causing energy at certain frequencies to vibrate with greater amplitude than energy at other frequencies.

Getting a Boost

The resonant frequency of the external auditory meatus is approximately 3300 Hz. What does this mean? When sound waves travel through the canal, sound energy at and near 3300 Hz is amplified by the canal's resonance; frequencies much lower or higher than 3300 Hz are amplified less. The resonant frequency of the ear canal contributes in a significant way to hearing in humans. First, the human ear is most sensitive to sound energy at and near 3300 Hz. Second, human speech contains important sound energy over a range of frequencies around 3300 Hz, making the resonant frequency of the ear canal particularly useful for accurate perception of speech sounds.
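The 3300-Hz figure can be roughly reconstructed from the canal dimensions given earlier. Treating the ear canal as a uniform tube closed at one end (by the tympanic membrane), a standard quarter-wavelength approximation (an idealization, not a derivation given in this chapter) yields:

\[
f_{\text{res}} = \frac{c}{4L} = \frac{343\ \text{m/s}}{4 \times 0.025\ \text{m}} \approx 3430\ \text{Hz}
\]

With the speed of sound c of about 343 m/s and the average canal length L of about 2.5 cm, the estimate lands close to the approximately 3300 Hz resonance described in the box; the real canal is neither straight nor uniform, which accounts for the small difference.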

The external auditory meatus also serves a protective function. Glands in the cartilaginous walls of the external auditory meatus secrete cerumen (earwax), which presents a barrier to foreign objects (insects, for example) and may also block the movement of bacteria or fungal agents toward the tympanic membrane. The kinked tube of the external auditory meatus is also a barrier to foreign objects moving easily from the outer ear to the delicate tissues of the tympanic membrane.

Tympanic Membrane (Eardrum)

The tympanic membrane, or eardrum (Figure 22–4), is the boundary between the outer and middle ear. The circular perimeter of the tympanic membrane is linked to a bony foundation via a cartilage-ligamentous ring called the annulus. The annulus fits into a small circular, bony depression at the boundary between the outer and middle ear to fix the tympanic membrane in place.

Figure 22–4 is a photograph of the right tympanic membrane as seen through an otoscope. The otoscope has a viewing lens and a light source to illuminate the ear canal and the tympanic membrane.
As a result of the shallow conical shape of the tympanic membrane and its tilt, with the lower half further from the scope than the upper half, a “light reflex” (also called a “cone of light”) reflects off the normal tympanic membrane. The light reflex is directed forward, downward, and to the right. The light reflex is directed to the left when viewing the left tympanic membrane. Middle ear bones (ossicles) can be seen through the translucent membrane.

Figure 22–4. View of the right tympanic membrane as seen through an otoscope.

In the otoscopic view, the tympanic membrane looks more or less flat — not conical, not tilted along its top-to-bottom axis relative to the observer. The cone shape and tilt are seen in the artist's cross-sectional rendition of the tympanic membrane in Figure 22–2. The tympanic membrane has the shape of a flattened bowl with a conical base that points into the middle ear cavity.

The tympanic membrane is tiny but tough. The membrane has a diameter of roughly 8 to 10 mm and a surface area of about 55 mm². The thickness of the membrane is little more than one-tenth of a millimeter (0.0001 meters). The tympanic membrane is composed of three tissue layers. The middle layer is the one most sensitive to sound waves and, because of its tissue makeup, which is different from the two layers that sandwich it, is exceptionally strong.

Middle Ear

The middle ear is an air-filled cavity surrounded by bone. Its complexly shaped volume contains tiny, movable bones (ossicles), ligaments, two muscles, nerves, and a bony opening to a tube that leads to the top part of the throat.

Ossicles

The ossicles are the three smallest bones in the human body. They extend across the middle ear cavity from the tympanic membrane to the oval window of the cochlea. These bones are so tiny they can be placed on a penny and occupy no more than the bottom half of the coin.

The malleus (hammer) is the ossicle attached to the tympanic membrane; the stapes (stirrup) is the ossicle attached to the cochlea (part of the inner ear); and the incus (anvil) is the middle bone, linking the malleus to the stapes. The three connected ossicles are together called the ossicular chain.

The malleus has a handle-like part called the manubrium, which is attached at its lower end to the tympanic membrane. The lowest attachment point of the manubrium to the tympanic membrane is called the umbo and can be seen through the translucent tissue of the tympanic membrane when viewed through an otoscope (see Figure 22–4).

The top of the malleus has a knobby bump that fits into a cup-like depression at the top of the incus. This is where the connection between the two bones is made. A long, bony limb descends from the top of the incus and forms a hook at its bottom. This hook fits into the neck of the stapes (see Figure 22–2, pointer to the stapes, at its neck), connecting the two bones.

The stapes, the smallest of the ossicles, has a neck (mentioned in the previous paragraph) and two arches that project from it. The base of each arch is attached to the footplate of the stapes, which is oval shaped. The footplate of the stapes fits into an oval window cut into the bony casing of the inner ear. The footplate is held in that window by a fibrous ligament.

The ossicles function to transmit sound energy from the tympanic membrane to the footplate of the stapes, and eventually to the fluid-filled cochlea. This function can be explained as a series of events beginning with sound wave energy in the air (a pressure wave), followed by conversion of the pressure wave to mechanical energy in the form of vibration of the eardrum and ossicles, with the final mechanical energy taking the form of vibratory movement of the footplate of the stapes, in and out of the oval window. The vibratory movement of the stapes displaces fluid in the cochlea; thus, the mechanical energy is converted to fluid energy.

1. Sound vibrations in the air enter the ear canal and initiate vibration of the tympanic membrane.

2. Vibration of the tympanic membrane is transferred to the malleus, to which it is attached.
3. The sound vibration is transmitted via the connected ossicles to the footplate of the stapes.
4. The footplate of the stapes moves in and out of the oval window; behind the oval window is the fluid-filled cochlea, so the vibratory movement of the footplate of the stapes displaces cochlear fluid (see later in this chapter).
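A rough calculation suggests why this chain of events amplifies, rather than merely relays, the sound pressure. In standard textbook accounts (not worked out in this chapter), the dominant factor is the area ratio between the tympanic membrane (about 55 mm², as given earlier) and the much smaller stapes footplate; the footplate area of about 3.2 mm² used below is a typical published value, not one given in this text:

\[
\frac{A_{\text{eardrum}}}{A_{\text{footplate}}} \approx \frac{55\ \text{mm}^2}{3.2\ \text{mm}^2} \approx 17, \qquad 20\log_{10}(17) \approx 25\ \text{dB}
\]

Under these assumptions, force collected over the large eardrum is concentrated onto the small footplate, boosting pressure roughly 17-fold (on the order of 25 dB). This is the amplification “by means of their anatomy” mentioned in the “No Ossicles??” box later in this chapter.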
Ligaments and Muscles of the Middle Ear

The ossicles are anchored to the walls of the middle ear cavity by several ligaments and two muscles. Figure 22–5, a “zoom” view of the middle ear cavity, shows three of the ligaments (short, pinkish bands of tissue), each attached to an ossicle, with their other ends attached to one of the middle ear walls. In this chapter, we note the importance of these attachments as anchors for the ossicles, and their role in limiting ossicular movement when they vibrate in response to sound energy. Additional details about the middle ear ligaments are available in the publications cited in the first paragraph of the chapter.

The two muscles of the middle ear cavity are the tensor tympani muscle and the stapedius muscle (Figure 22–5). The tensor tympani muscle enters the middle ear canal and attaches to the malleus by means of a short tendon. The stapedius muscle is buried in a small bony canal in the back wall of the middle ear. It issues a tendon that enters the middle ear cavity and attaches to the neck of the stapes (Figure 22–5).

Contraction of the tensor tympani muscle pulls on the malleus, retracting the tympanic membrane into the middle ear cavity. Contraction of the stapedius muscle pulls on the footplate of the stapes (by means of the stapedius tendon), away from (but not out of) its “fit” into the oval window. Contraction of either muscle stiffens the ossicles, which, as discussed later, reduces the efficiency of sound energy transfer from the tympanic membrane to the footplate of the stapes.

There is some debate about the role of the tensor tympani muscle in hearing, which is not presented here. The stapedius muscle, however, is known to be a key component of the acoustic reflex. The acoustic reflex is the contraction of the stapedius muscle in response to very high levels (intensities) of sound energy (see Chapter 23 for diagnostic testing for the acoustic reflex).

Figure 22–5. Close-up (“zoom”) view of the middle ear cavity, showing ossicles, muscles, and ligaments.

High-intensity sound vibration has the potential to drive the footplate of the stapes too forcefully into the cochlear fluid, leading to excessive fluid displacements that can damage the delicate sensory organs of hearing. Recall that contraction of the stapedius muscle pulls the footplate of the stapes away from the oval window and in so doing stiffens the ossicular chain. A stiffer ossicular chain reduces sound transmission and prevents the footplate of the stapes from too-forceful displacement into the cochlear fluid. The acoustic reflex therefore protects the cochlea from extremely high sound levels.

The acoustic reflex is “wired” by a loop made up of the auditory nerve, structures in the brainstem, the facial nerve (cranial nerve VII, see Chapter 2), and the stapedius muscle. The reflex is fast: about a tenth of a second (0.1 sec) passes between the introduction into the ear of extremely high sound levels, transmission of the sound energy (in mechanical, fluid, and electrochemical forms) through the cochlea and auditory nerve to the brainstem, and a signal from the brainstem via the facial nerve to contract the stapedius muscle and exert its pull on the stapes.

Auditory (Eustachian) Tube

The auditory tube (also called the Eustachian tube, after the 16th-century Italian anatomist Bartolomeo Eustachi) is shown in Figure 22–2 as a bone-encased, open tube in the lower part of the middle ear cavity. The bony tube opening near the bottom of the middle ear cavity is also seen in the “zoom” image of Figure 22–5. The tube becomes cartilaginous as it extends downward and terminates in the upper part of the pharynx (throat). The auditory tube is normally closed at its pharyngeal ending; the tube is opened briefly during swallowing, chewing, and yawning. When the tube opens, it connects air in the middle ear cavity to air in the nasopharynx. Thus, intermittent opening of the pharyngeal part of the auditory tube serves to maintain middle ear pressure at normal values (i.e., at atmospheric pressure, the same pressure in the pharynx when the mouth and/or nostrils are open to atmosphere). Normal values of air pressure within the middle ear cavity are important to the health of the middle-ear structures.

No Ossicles??

What would happen if we did not have ossicles? We would not be able to hear as well. Your ossicle-less ear would have an eardrum and a cochlea, separated by an air-filled middle ear cavity. Vibrations of air molecules in your outer ear would be transmitted to the air in your middle ear and strike the membrane of the oval window, behind which resides the fluid (liquid) in your cochlea. Would the fluid in your cochlea be displaced by these sound waves? The answer is “yes,” but very ineffectively, because sound energy in air does not push against fluid very effectively. In fact, in addition to transmitting sound energy from the tympanic membrane to the oval window, the ossicles amplify the energy by means of their anatomy. We all need our ossicles.

Inner Ear (Sensorineural Mechanism)

The inner ear is encased in a complex, bony structure that is itself encased in the temporal bone. This structure is called the bony labyrinth, shown for the right ear in Figure 22–6. In this view, the semicircular canals are to the left, the vestibule in the middle, and the cochlea at the front. The vestibule joins the semicircular canals and cochlea and has the oval window as a notable landmark. All structures of the bony labyrinth are filled with fluid. The bony labyrinth is identical in the left ear.

There are two openings into the bony labyrinth. The upper opening is called the oval window, where the footplate of the stapes is attached. The lower opening is the round window, which is covered by a membrane similar in structure (although not identical) to the tympanic membrane. As described earlier, inward movement of the footplate of the stapes displaces fluid in the cochlea. The fluid travels like a wave through cochlear channels to the round window, which bulges when the wave arrives. The displacement of the cochlear fluid and its importance to auditory sensation are described later in this chapter.

Semicircular Canals

Three fluid-filled semicircular canals comprise the leftmost structure of the bony labyrinth in the right ear (see Figure 22–6). One canal is oriented vertically, one horizontally, and one from front to back. Any one of the canals is oriented at right angles to the other two.

The semicircular canals control balance. The fluid within them contains sensory organs called hair cells. Movement of the head displaces the fluid, which bends the hair cells and initiates a signal to the nervous system.

Figure 22–6. The bony labyrinth (otic capsule), as if viewed from the middle ear cavity of the right ear. The semicircular canals are to the left, the vestibule in the middle, and the cochlea is to the right.

The fluid displacement in the semicircular canals and the resulting bending of the hair cells send signals to the brain regarding the precise location of the head in space. Balance is an important outcome of these signals.

Vestibule

The vestibule contains the oval window in which the footplate of the stapes is fixed. Two structures within the vestibule, also fluid filled and with hair cells, detect motion of the head, which supplements the position detection signaled by the semicircular canals.

Cochlea

The cochlea (meaning snail) is made up of many smaller structures. Among the most important of these are the scalae (plural of scala, another name for duct), the basilar membrane, and the organ of Corti, which sits atop the membrane and includes hair cells. These structures are the basis for the transformation of sound waves into neural signals. The neural signals code properties of the sound and deliver this information from the cochlea via the auditory nerve (cranial nerve VIII) to the central nervous system.

The cochlea consists of three membranous, fluid-filled ducts that are coiled in a snail-shell spiral. The spiral of the membranes, like its bony casing, includes two-and-one-half turns from its base to its tip (apex). At the tip of the cochlea, the two outside ducts are connected by a small opening. This connection explains why the fluid displacement at the oval window (pushing into the scala vestibuli) is transmitted to the scala tympani and results, after a very brief delay, in a bulging of the round window (the termination of the scala tympani).

Figure 22–7 shows the coiled cochlea cut in several cross sections; the three ducts are visible at each cut. The footplate of the stapes fits into the oval window at the duct called the scala vestibuli. Movement of the footplate pushes into the fluid in the scala vestibuli, displacing it in the direction of the tip of the spiral. The direction of this fluid displacement is indicated by the blue arrow in Figure 22–7. The fluid displacement travels through the opening at the tip of the cochlea, from the scala vestibuli to the scala tympani (red arrow, Figure 22–7). The fluid wave travels through the scala tympani to its termination at the round window, which bulges when it is pushed by the arriving fluid wave.

Each of the cross-sectional cuts in Figure 22–7 shows a membrane separating the scala tympani from the third duct, called the cochlear duct (also called the scala media). The membrane is called the basilar membrane; sitting on top of the membrane is the organ of Corti. The organ of Corti contains hair cells similar to those in the semicircular canals of the vestibular system. The hair cells in the organ of Corti are bent when the fluid displacement in the scala vestibuli and scala tympani creates a sound-induced fluid wave pattern that displaces the basilar membrane in precise ways.

Figure 22–7. The cochlear spiral cut in several cross-sections. The cross-section cuts through multiple turns of the cochlea. Each section of the cochlea contains three scalae: the scala vestibuli, the scala media (cochlear duct), and the scala tympani.

Basilar Membrane and Organ of Corti (Sensory Mechanism)

An artist's rendition of the basilar membrane and organ of Corti, at one “slice” through a turn in the cochlea, is shown in Figure 22–8. The organ of Corti sits atop the basilar membrane. The basilar membrane, organ of Corti, and a membrane above the hair cells (labeled “tectorial membrane”) form the sensory end organ of hearing.

To get an idea of how much this image has been magnified relative to the size of the actual organs, consider that the nearly vertical structures labeled “inner hair cells” and “outer hair cells” are roughly 30 micrometers (0.000030 meters, around 1/1000th of an inch) in length and 10 micrometers in diameter.

Note the single hair cell (inner hair cell) to the left of the image and the row of three hair cells (outer hair cells) to the right. These hair cells run the length of the cochlea, from base to tip. Note also the nerve fibers (shown in yellow) connected to the inner and outer hair cells. Each hair cell in the organ of Corti, from base to tip of the basilar membrane, is attached to a nerve fiber that becomes part of the auditory nerve. The auditory nerve carries information from the cochlea to the brain, and in some cases from the brain to the cochlea (see discussion later in this chapter).

Coding of Frequency Along the Basilar Membrane. The hair cells within the organ of Corti are critical to auditory sensation, much like the rods and cones of the retina, the end organ of vision, are critical to visual sensation. Here we focus on the role of the basilar membrane and inner hair cells in coding the frequency of incoming sound waves. The outer hair cells are critical to hearing as well. They control the sensitivity of the inner hair cells, allowing the detection of very soft sounds and increasing the precision of frequency analysis by the inner hair cells. The outer hair cells are not discussed further in this chapter.

Figure 22–8. The basilar membrane and the organ of Corti, at one cut along the cochlear spiral.

The basilar membrane is displaced (“deformed”) by the fluid wave within the cochlea. The precise location of maximum displacement along the basilar membrane depends on the frequency (or frequencies) of the incoming sound wave. This is because frequency analysis is arranged systematically along the basilar membrane and the hair cells in the organ of Corti. The systematic arrangement of frequency along the basilar membrane is called tonotopic representation.

Tonotopic representation is illustrated in Figures 22–9 and 22–10. In Figure 22–9, the snail shell–like cochlea (top) is “unrolled” (bottom) to better explain tonotopic representation.

The basilar membrane is shown in Figure 22–9, bottom, by the flat pink strip running from the base to the tip of the cochlea. The arrows show the path of fluid displacement from the scala vestibuli, through the narrow opening at the tip, and then through the scala tympani in the direction of the round window.

The basilar membrane is narrow at its base (the left end of the membrane in Figure 22–9) and becomes increasingly wider as it extends to the tip of the cochlea. The narrow base of the basilar membrane is very stiff, and the wide tip end of the membrane is relatively floppy. The hair cells at the base (narrow part) of the membrane are sensitive to the highest frequencies humans can hear (about 20,000 Hz). Moving from the base toward the tip of the basilar membrane, the hair cells are sensitive to increasingly lower frequencies, until at the tip they are sensitive to the lowest frequencies humans can hear (about 20 Hz).

Now we can make a more precise statement about tonotopic representation: frequency is represented tonotopically along the basilar membrane such that the location of a hair cell along the membrane determines its frequency sensitivity. Hair cells at the base of the basilar membrane are sensitive to the highest frequencies, and hair cells at the tip of the basilar membrane are sensitive to the lowest frequencies.
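The tonotopic map can even be stated as an equation. Greenwood's (1990) frequency-position function is a published description of the human cochlear map; it is not a formula presented in this chapter, and the constants below are Greenwood's human values, assumed here for illustration. A sketch in Python:

    # Sketch of Greenwood's (1990) frequency-position function for the human
    # cochlea (an assumption; this chapter does not give the constants).
    def greenwood_hz(x):
        """Best frequency in Hz at relative position x: 0.0 = apex (tip), 1.0 = base."""
        return 165.4 * (10 ** (2.1 * x) - 0.88)

    print(round(greenwood_hz(0.0)))  # ~20 Hz at the apex: the lowest audible frequencies
    print(round(greenwood_hz(0.5)))  # ~1700 Hz midway along the basilar membrane
    print(round(greenwood_hz(1.0)))  # ~20700 Hz at the base: the highest audible frequencies

The endpoints recover the roughly 20 Hz and 20,000 Hz limits described above, and the exponential form shows that equal distances along the membrane correspond to equal frequency ratios rather than equal frequency differences.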
Displacement of the Basilar Membrane and Frequency Analysis. The tonotopic arrangement of the hair cells along the basilar membrane raises the question of how different locations along the membrane are stimulated when vibratory motions of the conductive mechanism are transferred to the cochlear fluid.

We owe the understanding of frequency analysis in the cochlea to experiments performed by Georg von Békésy (1899–1972), a Hungarian physicist and engineer who in 1961 won the Nobel Prize for this work.

Figure 22–9. Top, the cochlea as if rolled out to form a straight structure (bottom). In the bottom image, the scala vestibuli is the top duct, the scala tympani the bottom duct, and the basilar membrane is shown as the pinkish partition in the middle, narrow and stiff at the base and wide and floppy at the apex.

Von Békésy observed that when the footplate of the stapes vibrated in response to sound energy, it pushed into the cochlea and created a fluid wave that traveled through the scala vestibuli and scala tympani. The fluid wave displaced the basilar membrane, and the location of its maximum displacement depended on the frequency of the incoming sound wave. Low frequencies resulted in a basilar membrane displacement that built up gradually and reached its peak near the tip of the membrane. In contrast, high-frequency sound energy produced displacement of the basilar membrane that built to a peak a short distance from the oval window (that is, near the base). In his world-famous 1928 paper, von Békésy described this fluid movement in the cochlea as a traveling wave.

Figure 22–10 is a schematic summary of how different frequencies of incoming sound waves result in different locations of maximum displacement along the basilar membrane. Three unrolled cochleae (plural of cochlea) are shown, with three schematic “blips” representing maximum displacement of the basilar membrane for high (top), mid (middle), and low (bottom) frequencies. These are the wave patterns expected for single frequencies. Traveling wave patterns for sound waves made up of many different frequencies result in more complex patterns of displacement along the basilar membrane.

How does displacement of the basilar membrane result in a frequency signal that is sent to the brain? The traveling wave that has maximum displacement of the basilar membrane at a specific location causes bending of the hair cells at the same location.

Figure 22–10. Three unrolled cochleae, showing the location of maximum displacement along the basilar membrane for high (top), mid (middle), and low (bottom) frequencies.

When bent, the membranes of the hair cells change their sensitivity to certain molecules, which in turn causes their attached nerve fibers (see Figure 22–8) to “fire” and send a signal to the brain via the auditory nerve (see Figure 22–2 for the auditory nerve emerging from the cochlea). The nerve fibers attached to the individual hair cells have the same tonotopic arrangement as the basilar membrane: the fluid displacement that causes a traveling wave to crest at a specific location, which depends on the frequency of the incoming sound wave, results in the firing of a nerve fiber that is maximally sensitive to that frequency.

Auditory Nerve and Auditory Pathways (Neural Mechanism)

The auditory nerve in the peripheral auditory system, and the auditory pathways within the central nervous system, comprise the neural component of the hearing mechanism.

Auditory Nerve. Individual nerve fibers emerging from the base of the inner hair cells are gathered together and form a significant part of the auditory nerve, which is part of cranial nerve VIII. The auditory nerve travels through the internal auditory meatus, a narrow, short tunnel in the temporal bone. The nerve emerges from the tunnel and enters the central nervous system at the lower levels of the brainstem.

The internal auditory meatus also contains the fibers of the other part of cranial nerve VIII — the vestibular nerve — as well as fibers of cranial nerve VII (the facial nerve). Figure 22–2 shows the cochlear and vestibular components of cranial nerve VIII and the fibers of cranial nerve VII. The close proximity of the facial nerve to the auditory nerve is significant because the facial nerve may be affected by a disease of the auditory nerve, and the combination of auditory problems and facial weakness may have diagnostic significance.

Auditory Pathways. The auditory pathways are structures in the nervous system that carry auditory impulses from the auditory nerve to the cortex, the highest level of the central nervous system. The auditory nerve also includes fibers carrying information from the central nervous system to the cochlea; these fibers innervate the outer hair cells. The focus in this section is on the pathways from the auditory nerve to the cortex of the cerebral hemispheres.

When electrical impulses are transmitted in the auditory nervous system, they travel along nerves (or tracts, as they are called in the central nervous system) and make connections (synapses) in clusters of cell bodies. These cell bodies issue another tract aimed at a different cluster of cell bodies along the pathway to the auditory cortex. The auditory pathways are more or less dedicated to transmitting information from the auditory nerve all the way to the auditory cortex. The pathway terminates in the primary auditory cortex, located on the upper lip of the temporal lobe. Like the hair cells and auditory nerve, cells within the auditory cortex are tonotopically arranged.

Tests of Hearing and Auditory Anatomy and Physiology
Tests of hearing are designed based on knowledge of the auditory system.
For example, in pure-tone audiometry, tones of a single frequency are
used to estimate the response of the hair cells at different locations along
the basilar membrane. Because of the tonotopic arrangement of the hair
cells from base to apex of the cochlea, single-frequency tones allow a tester
to assess the health of the hair cells very precisely at different locations
throughout the cochlea. There are also techniques for assessing the stiff-
ness of the conductive mechanism (outer and middle ear). These stiffness
evaluations, all performed at the entrance to the ear canal with minimal
discomfort to the person being tested, are used to diagnose many auditory
disorders ranging from middle ear infections, which are very common
in childhood, to possible diseases of the auditory nerve which may be
reflected in poorly functioning acoustic reflexes. One more example is
the use of electrodes placed on the scalp to measure the amplitudes and
timing of electrical activity of cell groups within the auditory pathways of
the central nervous system, as brain analysis of an acoustic signal makes
its way from the auditory nerve to the auditory cortex. Clearly, audiologi-
cal tests reflect an intimate knowledge of the structure and function of
the auditory system. Chapter 23 presents details of these tests and their
interpretation.

Chapter Summary

Knowledge of the structure and function of the auditory mechanism is critical to those who plan a career in communication sciences and disorders, whether the career goal is to understand (a) how children learn language, (b) how diseases affect the normal mechanism, and (c) how formal evaluations of hearing are designed and interpreted.
Most of the auditory mechanism is housed within the temporal bone, a complex bone of the skull.
The peripheral auditory mechanism can be subdivided into the conductive mechanism, comprising the outer and middle ear, and the sensorineural mechanism, comprising the inner ear and auditory nerve.
The external auditory meatus (or external auditory canal) is a canal approximately 2.5 cm long and 0.7 cm in diameter that extends from an opening in the auricle to the tympanic membrane, in which cerumen (earwax) is produced and through which sound waves are directed to the tympanic membrane.
The external auditory meatus has a resonant frequency of roughly 3300 Hz, which explains in part the very acute sensitivity of the human auditory mechanism in this frequency region.
The tympanic membrane (or eardrum) is a small (about 55 mm² area), three-layered structure located at the internal end of the outer ear; the middle tissue layer is sensitive to the very small pressure variations associated with sound waves.
The middle ear is an air-filled cavity located between the outer ear and inner ear, and contains three small ossicles (bones), several ligaments, and two muscles.
The ossicles are stabilized by ligaments that attach to the walls of the middle ear, and by the tissue of two muscles, the tensor tympani muscle that attaches to the malleus and the stapedius muscle that attaches to the stapes; contraction of either muscle stiffens the ossicular chain.
The stapedius muscle is an important component of the acoustic reflex; the muscle contracts in response to high-level sound energy, and in so doing prevents the footplate of the stapes from excessive displacement force into the cochlear fluid.
The auditory tube (or Eustachian tube), a 3.8-cm (1.5-inch) tube that runs from the middle ear to the nasopharynx, is bony and open at the middle ear and cartilaginous and flexible toward the top part of the pharynx (the nasopharynx), where it is usually closed but opens occasionally to equalize the pressure in the middle ear.
The inner ear is housed within the bony labyrinth of the temporal bone and contains the semicircular canals, vestibule, and cochlea, all structures that communicate with the central nervous system via cranial nerve VIII (auditory-vestibular nerve).
Three semicircular canals, each oriented at right angles to the other two, contain hair cells that bend when the head moves and cause the vestibular part of cranial nerve VIII to fire and send information about head position and orientation to the brain.
The vestibule contains the oval window as well as structures that contain hair cells that send signals to the brain about the relative position and acceleration of the head.
The cochlea is the spiral-shaped end organ of hearing that converts sound into neural signals and contains many important structures, including the scalae, basilar membrane and organ of Corti, and hair cells.
Within the cochlea are three membranous, fluid-filled ducts called the scala vestibuli, scala media (or cochlear duct), and scala tympani, the first and last of which are connected at the top of the cochlear spiral.
The scala media is separated from the scala tympani by the basilar membrane.
On top of the basilar membrane sits the organ of Corti, which contains a row of inner hair cells and three rows of outer hair cells.
Movement of fluid in the cochlea (caused by sound waves transmitted through the outer and middle ear, and the movement of the stapes into the oval window) deforms inner hair cells and causes them to send a signal to an attached nerve fiber, which makes the nerve fiber “fire.”
Hair cells are arranged tonotopically along the basilar membrane, ranging from those that respond best to the highest frequency at its narrow base (20,000 Hz) to those that respond best to the lowest frequency at its wide apex (20 Hz).
As discovered by Georg von Békésy, hair cells are stimulated by traveling waves transmitted through the cochlear fluid, the highest amplitude of which is frequency dependent, with high-frequency sounds creating waves that peak near the base of the basilar membrane and low-frequency sounds creating waves that peak near the apex of the basilar membrane.
The auditory pathways consist of the auditory nerve, and the tracts and clusters of cell bodies that carry auditory signals from the brainstem to the cortex.

References

Abele, T. A., & Wiggins, R. H., III. (2015). Imaging of the temporal bone. Radiological Clinics of North America, 53, 15–36.

Barin, K. (2009). Clinical neurophysiology of the vestibular system. In J. Katz, L. Medwetzky, R. Burkard, & L. Hood (Eds.), Handbook of clinical audiology (6th ed., pp. 431–466). Baltimore, MD: Lippincott, Williams, & Wilkins.
Békésy, G. (1928). Zur Theorie des Hörens; die Schwingungsform der Basilarmembran. Physik. Zeits., 29, 793–810.
Goutman, J. D., Elgoyhen, A. B., & Gomez-Casati, M. E. (2015). Cochlear hair cells: The sound-sensing machines. FEBS Letters, 589, 3354–3361.
Hixon, T. J., Weismer, G., & Hoit, J. D. (2020). Preclinical speech science: Anatomy, physiology, acoustics, perception (3rd ed.). San Diego, CA: Plural Publishing.
Hudspeth, A. J. (2014). Integrating the active process of hair cells with cochlear function. Nature Reviews Neuroscience, 15, 600–614.
Lemmerling, M. J., Stambuk, H. E., Mancuso, A. A., Antonelli, P. J., & Kubilis, P. S. (1997). CT of the normal suspensory ligaments of the ossicles in the middle ear. AJNR American Journal of Neuroradiology, 18, 471–477.
Luers, J. C., & Hüttenbrink, K.-B. (2016). Surgical anatomy and pathology of the middle ear. Journal of Anatomy, 228, 338–353.
Olsen, E. S., Duifhuis, H., & Steele, C. R. (2012). Von Békésy and cochlear mechanics. Hearing Research, 293, 31–43.
23
Diseases of the Auditory System
and Diagnostic Audiology

Introduction

This chapter presents an overview of the diseases of the auditory and vestibular systems, and how they are diagnosed. The chapter has been written specifically to discuss the various tests used to evaluate hearing and balance disorders. Subsequent chapters discuss what can be done to rehabilitate auditory disorders. With your new knowledge of the anatomy and physiology of the auditory and vestibular system from the last chapter, we can start to understand the tests needed to determine which parts of the auditory system are functional. Further information about the senses of hearing and balance can be gathered from the outstanding texts by Kramer and Brown (2019) and Jacobson and Shepard (2016). Then we evaluate the different tests and their outcomes for each of the main types of hearing disorders — conductive, sensorineural, and mixed losses.

With a current population in the United States of approximately 327 million people, the National Institute of Deafness and Other Communication Disorders (NIDCD) reports that approximately 15% of American adults over the age of 18, or 37.5 million people, have some trouble hearing, which is about the same as the total population in the state of California. It is estimated that 90% to 95% of these individuals can be helped with hearing aids, which indicates that there are just under 30 million people for whom hearing aids would be of some benefit (NIDCD, 2016). As you can see in Table 23–1, hearing loss is a problem in the general population. We should know that individuals with hearing loss will also have problems understanding speech. But in addition to this obvious problem, we also know that hearing loss is associated with other serious negative consequences such as academic difficulties, problems in the workplace, and psychosocial issues such as social isolation, depression, anxiety, loneliness, and lessened self-efficacy (Mueller, Ricketts, & Bentler, 2014). As audiologists, we need to identify these individuals and determine the type, degree, and configuration of their hearing loss so that we can assist them in developing effective communication.
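The two NIDCD figures fit together once the 15% is applied to adults rather than to the full population. Back-calculating (an inference, not a number given in the text), the adult base implied by the statistic is about 250 million:

\[
0.15 \times 250\ \text{million adults} \approx 37.5\ \text{million people}
\]

This is why 37.5 million is far less than 15% of the total 327 million residents: the prevalence figure counts only adults over the age of 18.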
Hearing Evaluation

According to the American Academy of Audiology, an audiologist provides services in the audiologic identification, assessment, diagnosis, and treatment of persons with impairment of auditory and vestibular function while in their roles as clinician, therapist, teacher, consultant, researcher, and administrator.


They identify, assess, diagnose, and treat individuals with impairment of either peripheral or central auditory and/or vestibular function (AAA, 2004). Audiologists (or clinicians, as they are often called) are trained as both diagnosticians and habilitation/rehabilitation experts for the auditory system. They spend much of their time determining the type, degree, and configuration of any hearing loss they detect and determining what can be done to rehabilitate communication problems associated with the loss. In most cases, this makes audiologists heavily reliant on technology. Therefore, audiologists need to be technologically savvy to be competent in their job.

The assessment of an individual's hearing includes the administration and interpretation of behavioral, psychoacoustic, and electrophysiologic measures of the peripheral and central auditory systems. The assessment of an individual's vestibular system includes the administration and interpretation of behavioral and electrophysiologic tests of equilibrium. Both of these types of assessments are accomplished using standardized testing procedures and instrumentation in order to diagnose any abnormality in the patient's hearing and/or vestibular systems (AAA, 2004).

Table 23–1. Quick Statistics About Hearing

Children
• Two to three out of every 1,000 children are born with a hearing loss in one or both ears
• More than 90% of deaf children are born to hearing parents
• Five out of six children have an ear infection by the time they are 3 years old
• At least 1.4 million children (18 or younger) have hearing problems

Adults
• Three in 10 people over age 60 years have hearing loss
• One in six baby boomers (ages 41–59 years), or 14.6%, have a hearing problem
• One in 14 Generation Xers (ages 29–40 years), or 7.4%, already have hearing loss
• 10% of adults (about 25 million) have experienced tinnitus

Treatment
• Only about 16% of adults with hearing loss use hearing aids
• 58,000 cochlear implants have been implanted in adults and 38,000 in children

Source: From “Quick Statistics About Hearing,” National Institute on Deafness and Other Communication Disorders (NIDCD), 2016. Retrieved from https://www.nidcd.nih.gov

Case History

It is important that before you start your assessment, you obtain some information about the patient. Through this information gathering, referred to as a case history, you can learn important information and gain great clinical insight about the patient's primary complaints and symptoms. Through the case history, you will obtain answers to your questions about the extent of any hearing and communication problems, when the problem began, whether it has worsened, if it came on suddenly or gradually, and if the patient has associated dizziness and/or tinnitus (ringing in the ears). Additional topics you may explore in the case history include how family members perceive the patient's problem, any associated circumstances or activities that brought on the conditions, what medications they are taking, a family history of hearing problems, results of previous hearing tests, and any previous use of hearing aids. Based on the patient's answers to these questions and/or information from other sources, additional questions may be appropriate. With this information, you can begin to develop a clinical impression of the patient and his or her problem, which will help guide you to the next steps.

Otoscopy

After completing a case history, it is important to visually inspect the ear canal and tympanic membrane before attempting any audiologic assessment, especially ones that require placing an ear insert or ear probe into the canal. This technique is called otoscopy and requires you to place the speculum of the otoscope into the patient's ear canal. Figure 23–1 shows a photo of a standard otoscope with its handle, neck, head, and specula. The otoscope has a light source and magnifies the view down the ear canal. This ability to peer into a patient's ear canal can determine the status of the outer and middle ear by assessing the color, shape, and general appearance of the structures to see if they are normal.

As shown in Figure 23–2, the clinician holds the otoscope with a pencil grip and uses proper bracing technique (i.e., other fingers are placed against the head) to support the insertion of the specula into the patient's ear. The opposite hand is used to grip the auricle (pinna), gently but firmly pulling up and back, which will straighten out the canal and provide better visualization.

Figure 23–1. A standard otoscope used in visualizing the ear canal and tympanic membrane. Disposable specula are used to prevent the spread of germs from one patient to another.

Figure 23–2. When viewing the tympanic membrane through a standard otoscope, it is important to use proper bracing technique, where one hand pulls up and back on the auricle and the other holds the otoscope pencil-style, using the other fingers to brace the otoscope against the patient. Courtesy of AudProf.com.

Using this bracing technique to hold the otoscope will avoid injury to the patient's ear canal, as it will move with the patient should he or she suddenly move the head. As you insert the speculum (plural, specula) into the ear canal, you can look through the otoscope to view the tympanic membrane at the far end of the canal (Figure 23–3). As you examine the ear canal and tympanic membrane, not only are you looking for excess cerumen (earwax) and foreign objects but also for diseases and disorders of the outer and middle ear.

Immittance

Immittance audiometry describes the sound energy that is transferred through the outer and middle ear systems.1 If we apply a known sound (Chapter 21) to the ear, the acoustic and mechanical properties of the outer and middle ears (Chapter 22) provide opposition to the energy flow, which is referred to as impedance. A high impedance system (i.e., a middle ear that is filled with fluid or has an ear infection) will have a greater opposition to the flow of energy than a low impedance system (i.e., a normal, healthy middle ear). The reciprocal of impedance is called admittance, which is the measure of how much of the applied energy flows through the middle ear system, so that a high admittance system (i.e., a normal, healthy middle ear) has a greater flow of energy. These concepts can be applied to the evaluation of the conductive part of the hearing mechanism (Chapter 22) by making different types of measurements, including tympanometry and acoustic equivalent volume of the ear canal.

Figure 23–4 shows the basic components of an admittance instrument. To obtain a measure of admittance, the probe must include an air pressure pump, which allows the system to vary the pressure within the ear canal. The system also has a speaker, which produces an 85 dB sound pressure level (SPL) pure tone (usually at 226 Hz), called the probe tone.

1
Immittance is the overall term that includes both admittance and impedance. Admittance is the amount of energy that moves through the
middle ear system while impedance is the reciprocal of admittance. In other words Admittance values (Y) and impedance values (Z) are
related, Y = 1/Z or Z = 1/Y. Audiologists commonly use admittance when measuring middle ear function (Kramer & Brown, 2019).
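The reciprocal relation in the footnote can be made concrete with a minimal Python sketch; the impedance values below are hypothetical, chosen only to illustrate the direction of the relationship:

```python
# Minimal sketch of the immittance relation described above: Y = 1/Z.
# The numeric values are hypothetical and for illustration only.

def admittance(impedance):
    """Admittance (Y) is the reciprocal of impedance (Z)."""
    return 1.0 / impedance

normal_ear_z = 0.8     # hypothetical low impedance (normal, healthy middle ear)
fluid_filled_z = 5.0   # hypothetical high impedance (middle ear filled with fluid)

# A low impedance system has high admittance (energy flows through easily),
# while a high impedance system has low admittance.
print(admittance(normal_ear_z))    # 1.25 -> high admittance
print(admittance(fluid_filled_z))  # 0.2  -> low admittance
```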
Figure 23–4 shows the basic components of an admittance instrument. To obtain a measure of admittance, the probe must include an air pressure pump, which allows the system to vary the pressure within the ear canal. The system also has a speaker, which produces an 85 dB sound pressure level (SPL) pure tone (usually at 226 Hz), called the probe tone. The tone is presented to the ear through a probe assembly placed at the entrance to the ear canal. A microphone, also part of the probe assembly, is used to monitor the level of the probe tone in the ear canal. For infants younger than 6 months, conventional tympanometry with a 226 Hz probe tone is not a valid measure, and a higher probe-tone frequency (1000 Hz) is recommended.

Figure 23–4.  The key components of an admittance instrument or tympanometer. The air pressure pump is used to apply air pressure during tympanometry, the speaker sends the probe tone down the ear canal toward the tympanic membrane, and the microphone measures the intensity of the tone as it is reflected back from the tympanic membrane. (Probe components shown: air pressure pump, tone generator, microphone; EAM = external auditory meatus.)

Tympanometry

Tympanometry is one of the most commonly used tests in the basic audiometric test battery. It measures how the admittance changes as a function of applied air pressure and how this function is affected by different conditions of the middle ear. The results are displayed on a graph (Figure 23–5) called a tympanogram, where admittance is on the y-axis (in mmhos, a unit of admittance) and the pressure range is on the x-axis (in decaPascals, or daPa, units of pressure).

Figure 23–5.  A tympanogram where the admittance is plotted across ear canal pressure. Note the effect of pressure on the tympanic membrane, where the positive pressure is drawn in red and the negative pressure in blue. Compare the position of the TM (on the left side) with the movement of the TM on the tympanogram (on the right side). In this example, the TPP is 0 daPa.

In conducting this test, the first step is to place the probe in the ear and obtain an airtight seal with an appropriately sized rubber probe tip. This allows the air pressure to be manipulated by the air pressure pump. The canal is pressurized to +200 daPa while the probe tone is presented to the ear, as shown in Figure 23–4. The pressurization of the ear canal forces the tympanic membrane in the direction of the middle ear cavity (positive pressure), reducing the tympanic membrane's ability to vibrate (low admittance). The admittance is recorded at +200 daPa and plotted on the graph (see Figure 23–5). The pressure is then swept continuously from +200 to −200 daPa, and the admittance is recorded along the way. In a normal ear, maximum admittance
is found at 0 daPa, where the air pressure is equal on either side of the tympanic membrane, allowing it to vibrate most effectively (high admittance). As the applied air pressure becomes negative, the admittance again decreases because the tympanic membrane does not vibrate as efficiently when the eardrum is pulled out with negative pressure. The pressure at which admittance is maximum is referred to as the tympanometric peak pressure (TPP).

Acoustic Equivalent Volume of the Ear Canal

Ear canal volume is an important measure, as it provides information about the outer and middle ear. The acoustic equivalent volume (Vea) is a measure of the physical volume of the ear canal, as estimated from the admittance obtained at +200 daPa. It can provide diagnostic information about the condition of the tympanic membrane and/or ear canal. For a normal ear canal and tympanic membrane, the admittance at +200 daPa should be within the normal range of ear canal volumes. If the tympanic membrane has a perforation or a pressure equalization (PE) tube inserted, then the Vea will be larger than the normal range. This is because the volume estimate includes not only the volume of the ear canal but also the volume of the middle ear (and potentially the eustachian tube). If the Vea is smaller than the expected normal range, it may be an indication that the external ear canal is obstructed. For either of these abnormal Vea conditions, the tympanogram will not show any changes in admittance as the applied air pressure is varied and appears as a flat line.

Types of Tympanograms.  Tympanograms can be categorized based on the five types described by Jerger (1970). Although Jerger's classification scheme is widely used, it is often more useful to describe the actual characteristics of the tympanogram, such as "a flat tympanogram" or "a normal-shaped tympanogram with the peak admittance occurring at −150 daPa." The description of a normal tympanogram varies with a number of factors, including age. The different types of tympanograms are briefly described next.

An abnormal tympanogram is a good indication of some middle ear involvement that affects the admittance characteristics of the middle ear. However, the tympanogram is not a predictor of the amount (if any) of conductive hearing loss.

Normal admittance (Type A) tympanogram has a characteristic peak shape with normal compliance and
the tympanometric peak pressure within the normal range (Figure 23–6). A normal Type A tympanogram occurs in normally functioning middle ears, and this patient will not have a conductive hearing loss.

Flat (Type B) tympanogram does not have the characteristic peak shape (i.e., no TPP) seen for Type A but instead appears relatively flat across the pressure range (Figure 23–7B). A flat tympanogram occurs with an ear infection (fluid in the middle ear), a hole (perforation) in the tympanic membrane (i.e., larger than normal Vea), impacted cerumen, or when the probe is pushed against the ear canal wall (reduced Vea). Patients with a flat tympanogram will most likely have a conductive hearing loss.

Figure 23–6.  Type A or normal admittance tympanogram. (Admittance plotted against pressure from −400 to +200 daPa; the peak marks the pressure at which peak compliance occurs, measured in decaPascals.)

Figure 23–7.  A. Type C or negative pressure tympanogram (retracted eardrum). B. Type B or flat tympanogram (no mobility).


Negative pressure (Type C) tympanogram has a characteristic peak with the same shape as a Type A; however, the TPP is shifted to a more negative pressure (Figure 23–7A). A negative pressure tympanogram indicates that the pressure in the middle ear space is not equal to the atmospheric pressure. When the TPP is outside the normal range and a negative pressure tympanogram persists for an extended period of time, fluid can build up in the ear, at which point the tympanogram will change to flat (Type B). A patient with a negative pressure tympanogram usually does not have a conductive hearing loss.

Reduced admittance (Type As) tympanogram has a characteristic peak shape with the TPP in the normal range, as for Type A; however, the admittance is lower than the lower end of the normal range (Figure 23–8). This type of tympanogram is sometimes referred to as "shallow." A reduced tympanogram suggests reduced movement of the tympanic membrane. A patient with a reduced tympanogram will have a conductive loss on the audiogram.

High admittance (Type Ad) tympanograms have a characteristic peak shape with the TPP in the normal range, as for Type A; however, the admittance is higher than the upper end of the normal range (Figure 23–8). A high admittance tympanogram suggests a highly mobile tympanic membrane, which may be seen in some cases of disarticulation of the ossicular chain or in cases of thinned tympanic membranes resulting from previous middle ear infections. These high admittance (Type Ad) tympanograms are suggestive of a disarticulation (break) of the ossicular chain, and the patient will usually have a conductive hearing loss.

Figure 23–8.  Type Ad or hypermobile tympanogram, shown in comparison with a normal ear (Type A) and a stiff ear (Type As). (Curves labeled: hyperflaccid [Type Ad], normal ear [Type A], stiff ear [Type As]; admittance plotted against pressure in daPa.)
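As a study aid (not a clinical algorithm), the five types just described can be sketched as a simple decision rule in Python. The normal-range cutoffs below are hypothetical placeholders, since actual norms vary with age, probe tone, and instrument:

```python
# Sketch of Jerger-type tympanogram classification from two numbers:
# peak admittance (mmhos) and tympanometric peak pressure (TPP, daPa).
# The cutoffs are hypothetical placeholders, not clinical norms.

ADMIT_LOW, ADMIT_HIGH = 0.3, 1.5   # hypothetical normal admittance range (mmhos)
TPP_LOW = -100                     # hypothetical lower limit of normal TPP (daPa)

def classify_tympanogram(peak_admittance, tpp):
    if peak_admittance is None or tpp is None:
        return "Type B (flat: no measurable peak or TPP)"
    if tpp < TPP_LOW:
        return "Type C (peak shifted to negative pressure)"
    if peak_admittance < ADMIT_LOW:
        return "Type As (reduced/shallow admittance, normal TPP)"
    if peak_admittance > ADMIT_HIGH:
        return "Type Ad (high admittance, normal TPP)"
    return "Type A (normal admittance and TPP)"

print(classify_tympanogram(0.9, -10))    # Type A
print(classify_tympanogram(None, None))  # Type B
print(classify_tympanogram(0.8, -250))   # Type C
```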


Acoustic Reflex Threshold

In this section, we discuss the contraction of the stapedius muscles (Chapter 22) by the ear's involuntary middle ear reflex in response to a loud sound. The acoustic reflex threshold (ART) test, as it is known, uses the same immittance instrument and is usually performed immediately after obtaining a tympanogram. The acoustic reflex is a bilateral response: when a loud tone is delivered to one ear, the stapedius muscle contracts in both ears due to firing of the seventh cranial (facial) nerve. This contraction of the stapedius muscle changes the transmission efficiency of the sound energy as it travels through the ossicular chain, decreasing the admittance of the probe tone. Figure 23–9 provides a simplified diagram of the acoustic reflex pathway, illustrating the main ART pathways and key structures. The ART is defined as the lowest intensity level (in 5 dB steps) of a reflex-eliciting tone that produces a repeatable acoustic reflex. The test takes into account pathologies that occur in the outer and middle ear as well as abnormalities of the cochlea, eighth cranial nerve, lower brainstem, and/or the seventh cranial nerve, as these can also influence the ability to record an acoustic reflex.

Figure 23–9.  A simplified diagram of the acoustic reflex pathway showing the ipsilateral reflex arcs for the right ear in red and for the left ear in blue. (Key structures shown for each side: middle ear, inner ear, eighth cranial [VIII] nerve, cochlear nucleus [CN], superior olivary complex [SOC] and seventh cranial [VII] nerve nucleus, and VII nerve.)

The goal is to monitor any change in the admittance of the probe tone that occurs when the stapedius muscle contracts in response to a loud tone presented to the ear. The probe tone is a hum-like sound (226 Hz) that plays constantly in the ear; the clinician then presents a tone to elicit the reflex. If the acoustic reflex is triggered, the stapedius muscle fires, and there is an abrupt reduction in the admittance of the probe tone. In an individual with normal hearing, the acoustic reflex should occur between 75 and 95 decibels hearing level (dB HL) (Wiley, Oviatt, & Block, 1987), as shown in Figure 23–10. In this example, you can see that at the beginning, the level of the reflex-eliciting tone is below the stapedius reflex threshold, and there is no measurable change in admittance. As the reflex-eliciting tone is increased in intensity (level), it eventually becomes loud enough to cause the stapedius to contract and the admittance to decrease. As the intensity of the reflex-eliciting tone increases above the ART, there is a range in which the stapedius contraction strengthens and the size (amplitude) of the downward deflection of the acoustic reflex increases with increasing dB HL.

Figure 23–10.  An acoustic reflex measure illustrating different levels of the reflex-eliciting tone (80, 85, 90, and 95 dB). The acoustic reflex threshold is defined as the lowest level of the reflex-eliciting tone that produces a downward deflection (reduced admittance) ≥0.02 mL. In this example, the acoustic reflex threshold is 90 dB HL. Reproduced with permission from Audiology: Science to Practice (3rd ed., p. 232) by S. Kramer and D. K. Brown, 2019, San Diego, CA: Plural Publishing, Inc. Copyright 2019 by Plural Publishing, Inc.
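The threshold search in Figure 23–10 can be mimicked in a short Python sketch; the admittance deflections below are hypothetical example data patterned on the figure:

```python
# Sketch of finding the acoustic reflex threshold (ART): the lowest level
# of the reflex-eliciting tone (5 dB steps) that produces a repeatable
# downward admittance deflection of at least 0.02 mL.

REFLEX_CRITERION_ML = 0.02  # criterion from the Figure 23-10 definition

# Hypothetical data: level (dB HL) -> deflections (mL) from two presentations.
measurements = {
    80: (0.00, 0.01),
    85: (0.01, 0.01),
    90: (0.04, 0.03),   # first level with repeatable deflections >= 0.02 mL
    95: (0.08, 0.07),
}

def acoustic_reflex_threshold(measurements):
    for level in sorted(measurements):
        if all(d >= REFLEX_CRITERION_ML for d in measurements[level]):
            return level
    return None  # no reflex observed at the levels tested

print(acoustic_reflex_threshold(measurements), "dB HL")  # 90 dB HL
```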
Audiometric Testing

Audiometric testing has been the mainstay of the audiologist's armament of diagnostic tests since audiology's modern beginnings in the mid-1900s (Jerger, 2009). The standard battery of audiometric testing includes both pure-tone audiometry and speech audiometry. When assessing a cooperative patient for a hearing concern, pure-tone audiometry is part of the basic audiologic assessment. It should be pointed out that audiologists use a test battery approach, not counting on one single test to determine a person's ability to hear, because the auditory system is complex and we have the ability to assess many of its parts independently. Using the cross-check principle2 along with the battery of tests, we are able to assess a person's hearing and determine the type, degree, and configuration of the hearing loss with confidence.

Pure-Tone Audiometry

Pure-tone audiometry is the heart of the standard test battery and involves finding the lowest intensity across the frequency range that a person is just able to hear. The lowest intensity for a particular tone that a person can reliably respond to at least 50% of the time is called his or her threshold for that frequency. In pure-tone audiometry, thresholds are obtained in a quiet environment for a range of frequencies between 250 and 8000 Hz. This range is important because it contains the frequencies that are most relevant for speech sounds.

The pattern of thresholds across the frequency spectrum is often characteristic of certain types of hearing loss. For example, high frequencies are more affected than low frequencies for persons with hearing loss due to noise exposure. Pure-tone thresholds can be used to describe how a patient's hearing loss might relate to his or her inability to hear different speech sounds. Speech sounds such as the consonants "f," "s," or "th" have higher-frequency components than vowels such as "a" or "o," which suggests that a person with a high-frequency hearing loss will have more trouble hearing these consonants (and others as well) compared with hearing the vowels. This simple test of determining a person's threshold may be quite easy to accomplish with a cooperative adult but may require considerable skill and experience to recognize and adapt to different patient response abilities and patterns when testing an 8-month-old child or an elderly patient with dementia. It is the audiologist's responsibility to incorporate and integrate the pure-tone results with other test findings and make appropriate interpretations, impressions, and recommendations for management of the hearing loss.

Pure-tone audiometric thresholds are used by audiologists to (a) describe the amount of the patient's hearing loss, (b) determine which parts of the auditory system are involved, (c) determine if a medical referral is needed, and (d) predict how the patient's hearing loss may relate to his or her ability to listen and communicate. To measure a pure-tone threshold, the audiologist uses an instrument called an audiometer to produce the variety of stimuli needed for the test. The audiometer is used to create pure tones (Chapter 21) from 125 to 8000 Hz (in octave or half-octave steps) at a variety of intensities through a variety of transducers such as an earphone, an ear insert, a bone vibrator, or speakers. It can also produce different noises, such as speech noise or narrow-band noise, which are used as maskers to keep the nontest ear busy while determining the threshold for the test ear. Pure-tone audiometry is first completed using earphones or ear inserts, which means that the sound is transmitted down the ear canal and through the middle ear. This pathway is referred to as air conduction, as the initial sound travels through air before it is converted to mechanical energy by the tympanic membrane and passed along to the cochlea. The other pathway is referred to as bone conduction and requires the use of a bone oscillator, which vibrates the skull in order to transmit the sound directly to the cochlea, bypassing the outer and middle ear. Differences in thresholds between air and bone conduction result in an air-bone gap, and gaps greater than 20 dB are considered to indicate a conductive hearing loss (discussed later in this chapter).

The next step is to take the threshold and plot it on an audiogram to record the results. However, we must first understand the audiogram, the graphical record on which we record a patient's thresholds. When trying to understand how well a patient hears, we must place these thresholds on the audiogram according to frequency and intensity. The audiogram, as shown in Figure 23–11, is a description of a person's hearing with frequency (in Hz) on the x-axis and threshold (in hearing level, dB HL)3 on the y-axis. The legend or audiogram key indicates the various symbols that can be used to describe the results. Once the thresholds are plotted on the audiogram, we can calculate the pure-tone average (PTA), which is simply the average of the thresholds for 500, 1000, and 2000 Hz. The degree or amount of loss is based on the person's threshold. Once you have determined the patient's threshold for a particular frequency, you plot it on the audiogram according to the frequency and intensity of the stimulus.

2 The cross-check principle, first suggested by Dr. James Jerger (1976), is the checking of results of a single test by the results of another independent test. With this principle, we can compare results from a number of tests to determine if the outcome is supported. For example, we can compare the results of the pure-tone audiogram with the results of tympanometry.

3 dB hearing level (HL) is a scale used on the audiogram where 0 dB HL at any frequency represents the lowest level for normal hearing.
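Because the PTA is used repeatedly below, here is the computation as a minimal Python sketch; the thresholds are hypothetical:

```python
# Minimal sketch: the pure-tone average (PTA) is the mean of the
# thresholds at 500, 1000, and 2000 Hz.

def pure_tone_average(thresholds):
    """thresholds: dict mapping frequency (Hz) -> threshold (dB HL)."""
    return sum(thresholds[f] for f in (500, 1000, 2000)) / 3

# Hypothetical right-ear air conduction thresholds:
right_ear = {250: 10, 500: 15, 1000: 20, 2000: 25, 4000: 40, 8000: 45}
print(pure_tone_average(right_ear))  # 20.0 dB HL
```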
Figure 23–11.  An audiogram for plotting a person's thresholds. The symbols shown in the legend are the key to comparing the air and bone conduction thresholds. (Axes: frequency from 250 to 8000 Hz; hearing level from −10 to 120 dB HL. Key symbols: AC unmasked O [right ear] / X [left ear], AC masked, BC unmasked <, BC masked [ ], no response, sound-field S.) Reproduced with permission from Audiology: Science to Practice (3rd ed., p. 130) by S. Kramer and D. K. Brown, 2019, San Diego, CA: Plural Publishing, Inc. Copyright 2019 by Plural Publishing, Inc.

Speech Audiometry

Speech audiometry is a method used in the clinic to evaluate how well a patient can hear and understand specific types of speech stimuli. Speech tests provide a formal way to determine the patient's ability to recognize speech. Speech audiometry can also contribute to the diagnosis of different hearing disorders. For example, a patient with a cochlear problem can have a predictable relation between the shape of the audiogram and his or her understanding of speech; however, an eighth cranial nerve problem can result in significantly poorer speech recognition than would be suggested by the audiogram.

Results of speech tests are used to compare with and validate pure-tone thresholds (i.e., cross-check), compare speech recognition ability between the two ears, and/or monitor changes across time. Speech audiometry can also help determine whether a patient is an appropriate candidate for a hearing aid or cochlear implant, or compare a patient's performance with different amplification devices. Speech audiometry includes two basic types of speech tests: (a) the establishment of a speech threshold and (b) a measure of speech recognition ability performed at a level above the patient's threshold (suprathreshold). These tests are completed using recorded speech so that there can be consistency across visits and across clinics.

Speech Reception Threshold.  One of the goals of speech audiometry is to determine a person's speech reception threshold (SRT): the lowest intensity at which that person can produce a response to speech stimuli. Speech stimuli are generally more familiar than pure tones. Speech reception thresholds are measured using the same dB HL scale as the pure-tone thresholds, and the degrees of hearing loss for speech can be described by the same categories used for degree of hearing loss in pure-tone audiometry. The SRT is conducted using spondee words, which are two-syllable compound words (e.g., baseball). The first step when conducting an SRT is to familiarize the patient with the list of words, and any words that are not correctly identified are omitted from the test (Tillman & Jerger, 1959).

The SRT level is recorded on the audiogram, where it is compared with the pure-tone average (PTA); this comparison is used as a cross-check with the pure-tone thresholds. The speech threshold is usually equal to or slightly better than the PTA, but it should be within 10 dB HL of the PTA. If there is more than a 10 dB difference between the speech threshold and the PTA, the reason for the difference should be explored.
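The SRT/PTA cross-check just described amounts to a one-line comparison; a minimal sketch:

```python
# Sketch of the SRT/PTA cross-check: the speech reception threshold
# should agree with (be within 10 dB of) the pure-tone average.

def srt_pta_agree(srt_db_hl, pta_db_hl, tolerance_db=10):
    return abs(srt_db_hl - pta_db_hl) <= tolerance_db

print(srt_pta_agree(15, 20))  # True: good agreement
print(srt_pta_agree(35, 20))  # False: the difference should be explored
```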
Word Recognition Score.  Speech audiometry also includes an evaluation of how well a patient can recognize speech at one or more levels above his or her SRT or PTA. This test is performed using phonetically or phonemically balanced lists of single-syllable (consonant-vowel-consonant) words, called PB (phonetically balanced) words. The PB word lists were constructed to approximate the frequency of occurrence of different speech sounds or phonemes in the English language. For example, the sounds "t," "n," and "d" occur in English more frequently than the sounds "k," "f," and "z"; the phonetic composition of the PB words reflects these differences.

One example of these lists is the set of NU-6 (phonemically balanced) word lists developed by researchers at Northwestern University (Tillman & Carhart, 1966). These words are used to acquire a word recognition score for a patient's right and left ears in quiet. The words are presented to the patient at a comfortable listening level, approximately 30 or 40 dB above the patient's PTA.

The word recognition score (WRS) is obtained by having the patient repeat back the words from the list. The WRS is calculated as the ratio of the number of words the patient was able to correctly repeat back to the tester over the number of words given; the ratio is expressed as a percentage correct. After calculating the percentage score for both ears, the scores are recorded on the audiogram. Additionally, the audiologist will interpret the score based on the categories in Table 23–2 for use in describing the result in the report. The WRS can be used as an estimate of how a patient's hearing loss will affect his or her speech understanding at a normal conversational level. It can also be used for differential diagnosis or to suggest the need for further evaluation of an eighth cranial nerve disorder.

Speech in Noise.  A primary complaint of many people with hearing loss is that they have difficulty understanding what is being said in the presence of background noise (Beck et al., 2018). However, the speech measures described earlier are usually presented in quiet. There are speech tests, however, that are designed to assess how well a patient can recognize speech in the presence of different levels of background noise. These measures of speech recognition are more representative of real-world listening situations. The tests present speech materials at different levels relative to a noise presented in the same ear, a relation referred to as the speech-to-noise ratio (SNR). For example, if the speech material is presented at 50 dB HL and the noise is presented in the same ear at 45 dB HL, this would be considered a +5 SNR. There are a number of commonly used speech-in-noise tests, including the Speech Perception in Noise (SPIN) test, the Quick Speech in Noise (QuickSIN) test, and the Bamford-Kowal-Bench Speech in Noise (BKB-SIN) test. Each test has its own administration procedure and method of scoring. The results can then be compared to the speech-in-quiet test results and the pure-tone audiogram.

Table 23–2.  Categories Used to Describe Results From Word Recognition Score (WRS) Testing

WRS (% Correct)    Degree of Impairment    Word Recognition Ability
100–90             None                    Excellent/normal
89–75              Slight                  Good
74–60              Moderate                Fair
59–50              Poor                    Poor
<50                Very poor               Very poor

Source:  Reproduced with permission from Audiology: Science to Practice (3rd ed., p. 168) by S. Kramer and D. K. Brown, 2019, San Diego, CA: Plural Publishing, Inc. Copyright 2019 by Plural Publishing, Inc.
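Scoring a word list and attaching a Table 23–2 category can be sketched in a few lines; the 19-of-25 result below is a hypothetical example:

```python
# Sketch of scoring word recognition: the WRS is the number of words
# correctly repeated divided by the number of words presented, expressed
# as a percentage, then described using the Table 23-2 categories.

def word_recognition_score(num_correct, num_presented):
    return 100.0 * num_correct / num_presented

def describe_wrs(wrs_percent):
    if wrs_percent >= 90:
        return "Excellent/normal"
    if wrs_percent >= 75:
        return "Good"
    if wrs_percent >= 60:
        return "Fair"
    if wrs_percent >= 50:
        return "Poor"
    return "Very poor"

wrs = word_recognition_score(19, 25)   # hypothetical: 19 of 25 NU-6 words
print(f"{wrs:.0f}% -> {describe_wrs(wrs)}")  # 76% -> Good
```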
Physiological Responses

Otoacoustic Emissions

It is difficult to think that your ear produces a sound when it hears a sound, but that is just what happens with an otoacoustic emission. In 1978, Kemp first described the phenomenon of the otoacoustic emission (OAE) after measuring low-intensity acoustic signals in the ear canal with a sensitive microphone. The emissions originate in a normally functioning cochlea. Specifically, the emissions have their origin in the movement of the outer hair cells (OHCs) that enhance the vibrations on the basilar membrane. These emissions travel outward along the basilar membrane, through the middle ear, and finally produce an OAE in the ear canal. OAEs are soft signals that can only be recorded with a very sensitive microphone placed in the ear canal and coupled to a computer that uses signal averaging to enhance these low-level OAEs and reduce any unwanted background noise (Glattke & Robinette, 2007; Lonsbury-Martin, Martin, & Whitehead, 2007).

The inner ear uses an active mechanism delivered through the OHCs to generate the OAE. This process is operational only at low intensity levels (≤65 dB SPL), and a mild cochlear hearing loss due to a loss of OHC function is sufficient to eliminate the emissions. Therefore, the presence of an OAE suggests normal cochlear function in that ear. It also provides evidence of normal middle ear function. OAEs have two main clinical purposes: newborn hearing screening and, diagnostically, as a test of cochlear function (Dhar & Hall, 2018).

OAEs are a very popular test not only because they are easy to measure and can be performed quickly, but also because they are objective and can be a powerful source of information. The function of the OHCs can be determined in just seconds by simply placing a small probe in the ear canal. Audiologists include OAE testing as part of the basic audiologic evaluation of all patients, but especially for young children (Blankenship et al., 2018). One of the main reasons is that OAEs are easily administered and can provide an excellent cross-check to verify audiometric information. Diagnostically, the presence of robust OAEs can provide strong evidence of normal cochlear (OHC) function (see Chapter 22). They can be used to determine if a sensorineural hearing loss is due to a problem in the cochlea or in the neural pathway. If a sensorineural hearing loss identified with behavioral audiometry is accompanied by normal OAEs, this suggests that the cochlea is functioning normally and the problem lies in the eighth cranial nerve or central auditory pathway. However, it is important to remember that OAEs are not a test of hearing (i.e., of a patient's degree of hearing loss); they can only determine whether the OHCs are functioning normally or not.

In the presence of normal middle ear function, the absence of an OAE can mean that the OHCs are not functioning. However, a problem with the outer or middle ear can interfere with both the inward and outward transmission of the cochlear-generated OAEs and cause a decrease in the emission. Think about "garbage in, garbage out": if the inward-traveling stimulus passes through an abnormal middle ear system, the sound energy that reaches the cochlea is reduced. This effectively reduces any energy created by the OHCs, and that energy is further reduced as it transmits back out through the middle ear, resulting in an absent OAE (Hof, Anteunis, Chenault, & van Dijk, 2005). Therefore, if OAEs are present, the OHCs are functioning; if they are absent, there is either a transmission problem (conductive loss) or a cochlear (OHC) problem, and further testing is needed.

There are two main types of OAEs commonly used in clinical settings: distortion product otoacoustic emissions (DPOAEs) and transient evoked otoacoustic emissions (TEOAEs). Fundamentally they are very similar; the main difference is the type of stimulus used to evoke them. DPOAEs use two tones (f1 and f2, where f2 > f1) to produce an additional tone or harmonic in the cochlea, and TEOAEs use a series of broadband clicks to stimulate most of the basilar membrane and evoke the emission. For both types of OAEs, after the stimuli are sent into the ear, the emission that is generated and the background noise measured in the ear canal are displayed on a graph where a signal-to-noise (SNR) comparison can be made. An emission can only be considered present if the SNR is greater than 6 dB; otherwise, the emission is absent. Failure to have an OAE that is at least 6 dB above the noise for any of the frequency regions would indicate that the OHCs from those frequency regions are not functioning normally or that the measurement is contaminated by a high noise level (Dhar & Hall, 2018; Kimberley, Brown, & Allen, 1997). Results from two patients are presented: one with a present DPOAE at all frequencies in one ear and absent responses in the other ear (Figure 23–12), and another with a present TEOAE at all frequencies in one ear and absent results in the other ear (Figure 23–13).
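The presence criterion just described (emission more than 6 dB above the noise floor) can be sketched as follows; the frequency-by-frequency levels are hypothetical example data:

```python
# Sketch of the OAE presence criterion: an emission is considered present
# at a frequency only if it exceeds the noise floor by more than 6 dB.

SNR_CRITERION_DB = 6.0

def oae_present(emission_db_spl, noise_floor_db_spl):
    return (emission_db_spl - noise_floor_db_spl) > SNR_CRITERION_DB

# Hypothetical data: frequency (Hz) -> (emission level, noise floor) in dB SPL
dpoae_data = {1000: (12.0, -2.0), 2000: (8.0, -4.0), 4000: (-6.0, -8.0)}

for freq in sorted(dpoae_data):
    emission, noise = dpoae_data[freq]
    status = "present" if oae_present(emission, noise) else "absent"
    print(f"{freq} Hz: {status}")
# 1000 Hz: present; 2000 Hz: present; 4000 Hz: absent (SNR only 2 dB)
```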
Figure 23–12.  DPOAE responses. A. Results from an ear with normal hearing; amplitudes of the DPOAEs are larger (>6 dB) than the noise floor across the frequency range. B. Results from an ear with hearing loss; amplitudes of both the response and noise floor are low (<6 dB), indicating no emissions from the cochlea. Courtesy of AudProf.com.

Figure 23–13.  TEOAE responses. Results from an ear with normal hearing on the left; amplitudes of the TEOAEs are larger (>6 dB) than the noise floor across the frequency range. Results from an ear with hearing loss on the right; amplitudes of both the response and noise floor are low (<6 dB), indicating no emissions from the cochlea. Background noise is usually larger in the low frequencies. Courtesy of AudProf.com.

Auditory Brainstem Response

The auditory brainstem response (ABR) is a neuroelectric response to an auditory stimulus sent into the ear. The stimulus is a broadband click that excites the entire cochlea to send the neural activity along the auditory (eighth cranial) nerve toward the brainstem. This is a very quick phenomenon and typically happens within 10 ms after the initial stimulation. The electrical activity is recorded through electrodes attached to the scalp. This response is very small (<1 μV) in relation to other electrical activity in the brain, but it is possible to remove the other activity and focus on this small auditory response. By repeating the stimulus and averaging the responses, we can decrease the random background electrical activity and enhance the neural response. To detect this small ABR signal within the background noise, the patient also needs to be very still or asleep (natural or sedated). Fortunately, the ABR is unaffected by level of attention, sleep state, or drugs, and can be reliably recorded across all ages, including premature infants.
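The averaging idea is easy to demonstrate with simulated numbers: a fixed response much smaller than the random background emerges once enough sweeps are averaged. This simulation is purely illustrative and is not an ABR recording:

```python
# Illustration of signal averaging: a small fixed response buried in large
# random background activity emerges as repeated sweeps are averaged.
import random

SWEEP_LEN = 100  # samples per sweep (standing in for a ~10 ms window)
response = [0.5 if 30 <= i <= 35 else 0.0 for i in range(SWEEP_LEN)]  # fixed "wave"

def record_sweep():
    # each sweep = the same tiny response + much larger random background
    return [r + random.gauss(0.0, 5.0) for r in response]

def average(sweeps):
    return [sum(s[i] for s in sweeps) / len(sweeps) for i in range(SWEEP_LEN)]

avg = average([record_sweep() for _ in range(2000)])
# The random background shrinks toward zero with averaging, leaving the
# response: the largest averaged value sits in the sample 30-35 region.
peak = max(avg)
print(round(peak, 2), avg.index(peak))
```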
The ABR is characterized by a series of five positive peaks, as shown in Figure 23–14. The first peak, Wave I, is a reflection of the synchronous discharge of neurons in the distal (more peripheral) part of the auditory portion of the eighth cranial nerve as it leaves the cochlea. Subsequent waves are generated by the synchronous neural activity in the proximal part of the eighth cranial nerve (Wave II), the cochlear nucleus (Wave III), the superior olivary complex (Wave IV), and the lateral lemniscus and input to the inferior colliculus (Wave V) (Møller, 1994). Although the generators for the different ABR waves are found in the eighth cranial nerve and brainstem, the response latencies and thresholds are affected by peripheral (cochlear) hearing loss.

Figure 23–14.  A normal auditory brainstem response (ABR) waveform to a click at a relatively high intensity level showing the waves (positive peaks) labeled I–V. The latency (time) of each wave is called the absolute latency, which is relative to the onset of the stimulus (at 0 ms). An example of absolute latency is shown for Wave V, which is usually the most prominent wave. The latency difference between any two waves is called the interpeak latency difference. Reproduced with permission from Audiology: Science to Practice (3rd ed., p. 253) by S. Kramer and D. K. Brown, 2019, San Diego, CA: Plural Publishing, Inc. Copyright 2019 by Plural Publishing, Inc.

Neurodiagnostic ABR.  Soon after the ABR was developed in the 1970s, clinicians saw its utility in identifying possible eighth cranial nerve pathologies. The ABR was seen as a less costly and noninvasive method when compared to a computed tomography (CT) scan.
Any patient who had an asymmetric sensorineural hearing loss (a hearing loss that is greater in one ear than the other) was routinely referred for a neurodiagnostic ABR. When using the ABR to evaluate eighth cranial nerve or brainstem function, measurements are made at a relatively high stimulus level to maximize the likelihood of measuring the absolute and interpeak latencies for Waves I, III, and V, as described in Figure 23–14. Resulting waveforms for a patient with an acoustic neuroma (tumor on the eighth cranial nerve) may show a different pattern than those of a patient with normal hearing (compare Figures 23–14 and 23–15). Figure 23–15 shows the ABR from three different patients, illustrating the types of abnormal responses that can occur with an eighth cranial nerve pathology.

Figure 23–15.  Three different auditory brainstem response (ABR) waveforms illustrating abnormal recordings associated with eighth nerve disorders. In the top waveform, where the tumor affects only the proximal (more central) part of the eighth cranial nerve, the ABR on the affected side shows a normal Wave I and abnormalities in the later waves. In the middle waveform, the results for a patient with an eighth cranial nerve pathology show interpeak latency intervals (I–III and/or I–V) that are delayed relative to normal. The bottom waveform shows the results from a patient with no discernible waves. Reproduced with permission from Audiology: Science to Practice (3rd ed., p. 256) by S. Kramer and D. K. Brown, 2019, San Diego, CA: Plural Publishing, Inc. Copyright 2019 by Plural Publishing, Inc.

Threshold ABR.  The ABR can be used as a screening test at a single preset intensity. Often a stimulus level of 30 or 35 dB nHL is used to determine a pass or fail. Those who fail the ABR screening are referred for follow-up evaluations that will include a threshold ABR if the child is younger than 6 months. As with OAEs, the ABR is not a test of hearing, and only inferences can be made about any potential degree of hearing loss. However, because the ABR is a relatively quick, objective measure of hearing sensitivity, especially in newborns, young infants, or other difficult-to-test populations, ABR for threshold estimation has far surpassed the neurodiagnostic applications of ABR.

Currently, the greatest use of the ABR is for determining thresholds in infants who did not pass their newborn hearing screening. Similar to determining hearing thresholds in an adult, we can estimate a child's hearing by determining the lowest level at which a repeatable Wave V response can be obtained at a number of frequencies. Figure 23–16 shows characteristic changes in the ABR waveform, specifically Wave V, for different frequencies presented at relatively low intensity levels within the range of normal hearing for neonates. Frequency-specific ABRs can be obtained down to levels of 10 to 20 dB nHL for the frequencies 500, 1000, 2000, and 4000 Hz (Elsayed et al., 2015). Since this technique is not the same as determining thresholds through audiometry, there is an additional step. Once you have determined the lowest level at which Wave V can be identified, then
specific correction values need to be added for each frequency (Small & Stapells, 2017). These new estimated thresholds can then be compared to behavioral thresholds for determination of hearing loss and fitting of amplification if needed.

Figure 23–16.  Results from a threshold ABR on a newborn; the tone-burst ABR waveforms are recorded near threshold ("threshold ABR"). Results show Wave V at each frequency from 500 to 4000 Hz at 30 dB and at either 10 or 20 dB nHL, which are considered normal thresholds for this neonate, indicating normal auditory function at least through the brainstem (and most likely normal hearing). Reproduced with permission from Audiology: Science to Practice (3rd ed., p. 257) by S. Kramer and D. K. Brown, 2019, San Diego, CA: Plural Publishing, Inc. Copyright 2019 by Plural Publishing, Inc.
Vestibular Assessment

A component of an audiologist's scope of practice is the identification and rehabilitation of problems of the vestibular or balance system. Anatomically, the vestibular system is part of the inner ear and is important for maintaining one's balance (see Chapter 22). The integration of neural activity from the vestibular, visual, somatosensory, and cerebellar systems allows you to maintain an upright posture, perform coordinated complex movements, and maintain a visual target while moving (Semaan, Wick, & Megerian, 2015). A problem with one or more of these systems results in a patient feeling dizzy (i.e., unsteady, lightheaded) and/or experiencing a spinning sensation, called vertigo. Vertigo is a symptom of a peripheral vestibular disorder, whereas other dizziness or balance problems may be related to central nervous system involvement, other medical conditions, and/or psychological factors. It is estimated that 35% of adults (Agrawal, Ward, & Minor, 2013) and 5.3% of children (Li, Hoffman, Ward, Cohen, & Rine, 2016) in the United States have some dizziness and/or balance problems.

Audiologists are involved in both screening and/or diagnostic assessment and management of vestibular problems. As with hearing assessments, audiologists utilize a test battery approach to assess a patient's vestibular system. There are two main vestibular pathways: the vestibulo-ocular reflex (VOR) and the vestibulospinal reflex (VSR). In the VOR, the system connects the vestibular neural pathway in the brainstem to neurons that reflexively control the muscles of the eyes to automatically maintain a clear visual image during head movement (Wright & Schwade, 2000). The VSR has direct connections from the vestibular nuclei in the brainstem to motor neurons that reflexively control the relevant skeletal muscles, which is important in controlling the body when unexpected changes in position occur relative to gravity. Tests have been developed to assess the various components of the balance system, the most common of which are described next.

There are four main tests that the audiologist uses to assess a patient's vestibular system: videonystagmography (VNG), rotary chair testing, posturography, and vestibular-evoked myogenic potentials (VEMPs). VNG is a balance test that uses an infrared video system with cameras built into goggles that track eye movements in response to various subtests. The VNG objectively evaluates a person's nystagmus (i.e., rapid eye movement), watching and measuring the eye movement as illustrated in Figure 23–17. Rotary chair testing uses a motorized chair in a darkened room; the chair revolves in calibrated oscillations at different frequencies, like an amusement ride. As shown in Figure 23–18, the patient is seated in the chair, which spins him or her around at different
speeds and with rapid accelerations and decelerations while eye movements are measured with the same type of goggles as in VNG. The results of these tests are compared to those of individuals with a normally functioning system to evaluate differences in eye movement (VOR) gain, phase, and symmetry.

Figure 23–17.  Conducting a videonystagmography assessment utilizing goggles with a small infrared camera built into the lenses to track the person's eye movements. Courtesy of AudProf.com.

Figure 23–18.  A rotary chair system used to determine if the vestibular or the neurological system is involved in the balance disorder. Image courtesy of Interacoustics A/S © 2019.

Posturography assesses a patient's functional balance while changing vestibular, visual, and somatosensory inputs. As shown in Figure 23–19, the patient stands on a platform and tries to stay stable while the platform is moving. During the test, the patient's postural stability/control and ability to correct for any change are monitored and evaluated. Finally, there are VEMPs, which measure myogenic (muscle) potentials elicited by a high-intensity, low-frequency tone and recorded from around the area of the neck (cervical) or the ocular muscles (Figure 23–20). Similar to the ABR, the electrical activity is recorded and averaged to reveal a waveform with a negative and a positive peak. The VEMP waveforms from the right and left sides are compared to assess a patient's vestibulocollic reflex (VCR) and vestibulo-ocular reflex (VOR) (Jacobson et al., 2011). Together these tests provide a wealth of information to assess a patient's balance system and create a plan for treatment or rehabilitation.

Figure 23–19.  A computerized dynamic posturography system with an immersive virtual environment for testing a person's balance. Image courtesy of Bertec Corporation © 2019.

Audiometric Results

Audiologic test results provide the clinician with information about a patient's hearing acuity and sense of balance. This information will help provide a differential diagnosis of any pathologies in either hearing or balance. Determining if there is a hearing loss and, if so, whether it is conductive, mixed, sensorineural, unilateral, or bilateral will all assist this quest. In addition to the audiogram, information from speech tests, immittance tests, otoacoustic emissions, auditory-evoked potentials, and/or vestibular testing may also be useful in reaching a proper diagnosis and course of treatment. For example, audiologic test results may help localize the disorder to a possible perforation of
the tympanic membrane or suggest that there may be pathology of the eighth cranial nerve. It is important to keep in mind that a medical diagnosis can only be determined by a physician, who conducts a thorough medical examination and may order lab work, imaging studies, or other diagnostic tests. All medically related auditory disorders must be referred by hearing health care professionals to a physician for evaluation and ongoing care. In cases where an adult has a sensorineural hearing loss with no apparent medical or neural involvement, the audiologist may provide appropriate services without the need for medical evaluation; however, any hearing loss in a child should be referred to a physician for evaluation.

Figure 23–20.  Electrode placement used to record a VEMP on a patient, with an electrode placed on the forehead and on the sternocleidomastoid muscle in the neck. Courtesy of AudProf.com.

Type, Degree, and Configuration of Loss

Since we know that there are a significant number of people in the country with a hearing loss and each individual's hearing loss will be different, how can we describe each individual's loss? The easiest method is to describe the hearing loss based on the pure-tone audiometric thresholds. Specifically, we need to identify the type of hearing loss, the degree (or amount) of hearing loss, and the configuration (or shape) of the loss. After you have determined a patient's thresholds and placed them on the audiogram, you can establish the type of hearing loss. To describe the type of hearing loss from the audiogram, you need to determine whether the hearing loss involves the conductive and/or sensorineural portions of the auditory pathways. The conductive portion of the auditory system refers to the outer and middle ear, and the sensorineural portion refers to the cochlea, eighth cranial nerve, and central pathways. The degree of loss refers to the amount of hearing loss that a person has; Table 23–3 lists the categories used to describe the degree of a person's hearing loss. The configuration of a person's hearing loss refers to the shape of the audiogram, showing how the thresholds compare across the frequency range. For example, a hearing loss where the thresholds are consistent across the frequency range would be described as flat. If the thresholds are better in the low frequencies and gradually become poorer, we would call that a sloping hearing loss; if the audiogram is flat across the frequency range except for a decrease in hearing at a particular frequency (usually 4000 Hz), we would call that a notch. Ears with a bilateral hearing loss have a loss in both ears, and a unilateral loss is a loss in only one ear.

Table 23–3.  Categories Used When Describing the Degree of Loss in Adults

dB HL Range         Descriptive Category
−10 to 20 dB HL     Normal
21 to 40 dB HL      Mild
41 to 55 dB HL      Moderate
56 to 70 dB HL      Moderately severe
71 to 90 dB HL      Severe
91+ dB HL           Profound

Source:  Adapted with permission from Audiology: Science to Practice (3rd ed., p. 139) by S. Kramer and D. K. Brown, 2019, San Diego, CA: Plural Publishing, Inc. Copyright 2019 by Plural Publishing, Inc.
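Mapping a threshold (or a PTA) to the Table 23–3 categories is a direct translation of the table; a minimal sketch:

```python
# Sketch mapping a threshold (or PTA) in dB HL to the Table 23-3
# descriptive categories for adults.

def degree_of_loss(db_hl):
    if db_hl <= 20:
        return "Normal"
    if db_hl <= 40:
        return "Mild"
    if db_hl <= 55:
        return "Moderate"
    if db_hl <= 70:
        return "Moderately severe"
    if db_hl <= 90:
        return "Severe"
    return "Profound"

for pta in (15, 35, 50, 65, 80, 95):   # hypothetical PTAs
    print(pta, "dB HL ->", degree_of_loss(pta))
```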
Normal Hearing

If a patient has air conduction thresholds across the frequency range at 20 dB HL or less, he or she is said to have normal hearing, as shown in Figure 23–21. However, normal is not always normal: since there are a significant number of components that make up the auditory system, it is possible for some issues not to affect thresholds but still cause hearing difficulties. It is estimated that 26 million American adults have hearing within normal limits but report difficulty hearing and/or have problems with hearing speech in noise (Beck et al., 2018).

Figure 23–21.  A normal audiogram. Note that the unmasked air and bone scores are less than 20 dB HL at all frequencies for both the right and left ears.

Conductive Hearing Loss

The conductive portion of the system is in the outer and middle ear. A hearing problem in the conductive parts of the ear would show an air-bone gap, that is, normal hearing by bone conduction and poorer hearing by air conduction. An example of an audiogram from a patient with a conductive hearing loss from otitis media with effusion can be seen in Figure 23–22, which shows that the patient has abnormal air conduction thresholds but normal bone conduction thresholds, with an air-bone gap of approximately 30 dB. The air-bone gap (or difference between the air conduction threshold and bone conduction threshold) is considered significant if it is greater than 10 dB.

Figure 23–22.  Conductive hearing loss for a patient's right and left ears as noted by the air-bone gap. This is an example of a patient with a conductive loss from otitis media with effusion, as discussed in Chapter 19 for children with craniofacial abnormalities.

Sensorineural Hearing Loss

A hearing problem in the sensorineural portions of the auditory pathway (see Chapter 22) would show essentially the same hearing loss when tested by air conduction as when tested by bone conduction. An example of a patient with a sensorineural hearing loss can be seen in Figure 23–23. This type of hearing loss indicates that there is a disorder of the cochlea or a neural disorder (eighth cranial nerve or central pathway).

Mixed Hearing Loss

A hearing loss that occurs in both the sensorineural and conductive parts of the auditory system (see Chapter 22) is called a mixed hearing loss. In this type of loss, you would have a hearing loss by air conduction and by bone conduction; however, there would also be an abnormal air-bone gap of greater than 10 dB. An example of an audiogram from a patient with a mixed hearing loss can be seen in Figure 23–24.
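Putting the last three subsections together, the type of loss at a given frequency can be sketched from the air and bone conduction thresholds, using the 20 dB HL normal-hearing cutoff and the >10 dB significant air-bone gap described above. This is a simplification of actual clinical judgment:

```python
# Sketch of classifying the type of loss at one frequency from the air
# conduction (AC) and bone conduction (BC) thresholds, using the 20 dB HL
# normal-hearing cutoff and the >10 dB significant air-bone gap.

NORMAL_CUTOFF_DB_HL = 20
SIGNIFICANT_GAP_DB = 10

def type_of_loss(ac_db_hl, bc_db_hl):
    air_bone_gap = ac_db_hl - bc_db_hl
    if ac_db_hl <= NORMAL_CUTOFF_DB_HL:
        return "Normal hearing"
    if air_bone_gap > SIGNIFICANT_GAP_DB:
        # a significant gap means conductive involvement; check whether BC
        # is also abnormal (mixed) or still normal (purely conductive)
        return "Mixed loss" if bc_db_hl > NORMAL_CUTOFF_DB_HL else "Conductive loss"
    return "Sensorineural loss"

print(type_of_loss(45, 10))  # Conductive loss (35 dB air-bone gap)
print(type_of_loss(50, 45))  # Sensorineural loss (no significant gap)
print(type_of_loss(60, 35))  # Mixed loss
```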
Figure 23–23.  Sensorineural hearing loss. Note the moderate degree of loss and that there is no difference between the air conduction and bone conduction thresholds (no air-bone gap).

Figure 23–24.  Mixed hearing loss. Note that the air conduction scores are poorer than the bone conduction thresholds but that they both show a hearing loss.

Hearing and Balance Disorders

Now that we have gone through all of the steps in assessing a patient's hearing and balance system (case history, otoscopy, pure-tone audiometry, speech audiometry, immittance, otoacoustic emissions, auditory brainstem response, and vestibular assessment), we can turn our thoughts to putting it all together and determining the disorder. The first part is to determine the location of the problem; remember that this plays a large part in the type of hearing loss, since conductive losses occur in the outer and middle ear and sensorineural losses occur in the cochlea or eighth nerve. Table 23–4 lists a number of disorders and describes their condition and location.

Patient Examples

The following are a few examples of patient outcomes. Try to work your way through them from the case history to the audiometric results. Identify the tests and their results and see how they all fit together (cross-check principle), then compare your outcome to the outcomes at the end of the case.

Case 1

Test Results

Case history:  A 6-year-old female came to your clinic because she did not pass her hearing screening at school. Her parents reported no history of hearing problems in the family, and they felt that she had no hearing concerns.

Otoscopy:  Ear canals appeared normal bilaterally.

Audiometric results:  Using Figure 23–25, calculate the PTA, determine the type and degree of loss, and confirm using the cross-check principle.
Table 23–4.  Hearing and Vestibular Disorders for Different Locations

Outer ear
  Atresia:  Congenital absence of an external auditory canal; conductive hearing loss.
  Exostosis:  Bony outgrowths in the external auditory canal caused by repetitive irritation from cold water; normal hearing.
  Impacted cerumen:  Abnormal buildup of earwax (cerumen) that completely blocks off the ear canal; conductive hearing loss.
  Stenosis:  Abnormal narrowing of the external auditory canal; normal hearing.

Middle ear
  Disarticulation:  Separation of the ossicular chain or break in one of the ossicles; conductive hearing loss.
  Otitis media (acute, with effusion, chronic):  Inflammation or accumulation of fluid in the middle ear; the fluid can be free of bacteria or infected with bacteria; conductive hearing loss.
  Otosclerosis:  Caused by outgrowth of the bony wall (otospongiosis) around the stapes footplate; commonly causes conductive hearing loss when the stapes is immobilized; less commonly, toxins may invade the cochlea and cause a sensorineural hearing loss.
  Perforation:  Hole in the tympanic membrane; depending on size and location, may cause conductive hearing loss.

Inner ear
  Ototoxicity:  Sensorineural hearing loss due to poisonous side effects from some therapeutic drugs or environmental toxins.
  Meningitis:  Inflammation of the meninges (membranous sheets covering the brain); may cause sensorineural hearing loss.
  Noise-induced hearing loss:  Sensorineural hearing loss resulting from excessive exposure to loud sounds.
  Presbycusis:  Sensorineural hearing loss related to aging.

Neural disorders
  Acoustic neuroma:  Benign tumor involving the eighth cranial nerve; sensorineural hearing loss.
  Auditory neuropathy spectrum disorder (ANSD):  Congenital hearing disorder in which the eighth nerve neurons do not fire with normal synchrony; characterized by the absence of an ABR and the presence of normal OAEs; normal hearing to sensorineural hearing loss.

Vestibular
  Benign paroxysmal positional vertigo (BPPV):  Caused when some calcium carbonate crystals of the otoconia within the macula of a utricle become dislodged and fall into one or more of the semicircular canals; most commonly found in the elderly; no hearing loss.
  Labyrinthitis:  Infection (viral) of the inner ear labyrinth and/or vestibular nerve that may cause temporary disruption of vestibular function, producing dizziness/vertigo, balance problems, nausea, and/or hearing and vision problems.
  Superior canal dehiscence:  A small opening in the bony labyrinth of the superior semicircular canal that acts as a third window of the inner ear, allowing the membranous portion of the superior semicircular canal to be displaced during sound or increased pressure; vestibular disorder.
  Vestibular schwannoma:  Benign tumor involving the vestibular portion of the eighth cranial nerve; vestibular disorder.
23  Diseases of the Auditory System and Diagnostic Audiology 339

Frequency (Hz)
250
2 500 1000 2000 4000 8000
-10 <
<O
<
0 <O < O <O
<
X< X X O
X
X
X
< <O Speech Audiometry
10
Decibels Hearing Level (dB HL)
Ear PTA SRT WRS dBHL
20 Right 5 100% 50 dBHL
Left 5 100% 50 dBHL
30
40 Ipsilateral Acoustic Reflex Thresholds
Ear 500 Hz 1000 Hz 2000 Hz
50 Right 85 dB 85 dB 85 dB
60 Left 85 dB 85 dB 85 dB

70
80
90
De

100
110
120

Figure 23–25.  Audiometric results for Case 1.

Case 2

Test Results

Case history:  An 8-year-old male was brought to your clinic complaining about a sore ear and that he was having a hard time hearing his teacher in school. His parents reported that prior to 2 weeks ago, he had not had any problems hearing.

Otoscopy:  Right ear was red and inflamed; left ear canal appeared normal.

Audiometric results:  Using Figure 23–26, calculate the PTA, determine the type and degree of loss, and confirm using the cross-check principle.

Outcome:  Pure-tone thresholds indicate hearing within normal limits for the left ear and a mild conductive loss in the right ear. Speech reception thresholds are in good agreement with pure-tone averages. Word recognition scores are excellent for both the right and left ears. Tympanometry results showed normal middle ear function in the left ear but abnormal middle ear function in the right ear. Acoustic reflex thresholds were present in the left ear but absent in the right ear. As expected, all tests suggest that this boy has hearing within normal limits in the left ear and a mild conductive loss in the right ear.

Case 3

Test Results

Case history:  A 59-year-old male came to your clinic complaining about not hearing as well as he used to and that his children complain that the TV is too loud at his house. He reported that he was in the artillery during the war and that he worked in a factory most of his adult life.

Otoscopy:  Ear canals appeared normal bilaterally.

Audiometric results:  Using Figure 23–27, calculate the PTA, determine the type and degree of loss, and confirm using the cross-check principle.
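For readers who want to work the case exercises programmatically, the sketch below illustrates the arithmetic involved. It is a minimal illustration with hypothetical threshold values; the helper names, the classification cut-offs (one commonly used scale, with cut-offs that vary slightly across texts), and the 6 dB agreement tolerance are our assumptions, not values taken from the cases.

```python
# A minimal sketch of the case-exercise arithmetic: the pure-tone average
# (PTA) is the mean of the air-conduction thresholds at 500, 1000, and
# 2000 Hz; the degree of loss is read from a conventional scale; and
# agreement between PTA and the speech reception threshold (SRT) serves
# as one application of the cross-check principle.

def pure_tone_average(thresholds_db_hl):
    """Mean of the air-conduction thresholds (dB HL) at 500/1000/2000 Hz."""
    return sum(thresholds_db_hl[f] for f in (500, 1000, 2000)) / 3

def degree_of_loss(pta):
    """One commonly used scale; exact cut-offs vary slightly across texts."""
    for limit, label in [(15, "normal"), (25, "slight"), (40, "mild"),
                         (55, "moderate"), (70, "moderately severe"),
                         (90, "severe")]:
        if pta <= limit:
            return label
    return "profound"

def srt_cross_check(pta, srt, tolerance_db=6):
    """Agreement within about 6 dB is commonly taken as good support."""
    return abs(pta - srt) <= tolerance_db

# Hypothetical right-ear thresholds (dB HL) for a mild conductive loss
right_ear = {250: 35, 500: 40, 1000: 40, 2000: 35, 4000: 30, 8000: 30}
pta = pure_tone_average(right_ear)            # (40 + 40 + 35) / 3 ~ 38.3
print(f"PTA = {pta:.1f} dB HL ({degree_of_loss(pta)})")
print("SRT agrees with PTA:", srt_cross_check(pta, srt=40))
```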
Figure 23–26.  Audiometric results for Case 2. [Audiogram plotting air- and bone-conduction thresholds in dB HL across 250 to 8000 Hz for both ears; the accompanying tables are reproduced below. NR = no response.]

Speech Audiometry
Ear      SRT (dB HL)    WRS
Right    45             100% at 80 dB HL
Left     10             100% at 50 dB HL

Ipsilateral Acoustic Reflex Thresholds
Ear      500 Hz    1000 Hz    2000 Hz
Right    NR        NR         NR
Left     85 dB     80 dB      90 dB

Figure 23–27.  Audiometric results for Case 3. [Audiogram plotting air- and bone-conduction thresholds in dB HL across 250 to 8000 Hz for both ears; the accompanying tables are reproduced below. NR = no response.]

Speech Audiometry
Ear      SRT (dB HL)    WRS
Right    20             72% at 80 dB HL
Left     20             76% at 80 dB HL

Ipsilateral Acoustic Reflex Thresholds
Ear      500 Hz    1000 Hz    2000 Hz
Right    95 dB     100 dB     NR
Left     95 dB     100 dB     NR

Outcome:  Pure-tone thresholds indicate hearing within normal limits to 1000 Hz sloping to a moderately severe sensorineural hearing loss at 8000 Hz bilaterally. Speech reception thresholds are in good agreement with pure-tone averages. Word recognition scores are good for both the right and left ears. Otoscopic results were unremarkable bilaterally. Tympanometry results showed normal middle ear function bilaterally. Acoustic reflex thresholds were present ipsilaterally for 500 and 1000 Hz but absent at 2000 Hz in both ears. These test results suggest that this gentleman has a high-frequency sensorineural hearing loss bilaterally.

Chapter Summary

Assessment of an individual's hearing includes the administration and interpretation of behavioral, psychoacoustic, and electrophysiologic measures of the peripheral and central auditory systems.

Assessment of an individual's vestibular system includes the administration and interpretation of behavioral and electrophysiologic tests of equilibrium.

Use of a case history allows the clinician to collect information about a patient's primary complaints and medical history to help develop a clinical impression about the patient's problem.

Prior to inserting anything into the ear, the clinician needs to visually inspect the ear canal using otoscopy.

Immittance audiometry assesses the function of the middle ear through a battery of tests. Tympanometry measures how the admittance changes as a function of applied air pressure and how that is affected by different conditions of the middle ear. Jerger types (A, B, C, As, Ad) are used to describe the results of the tympanometry measurement.

The acoustic equivalent volume (Vea) test allows for an estimation of the ear canal volume to determine if it is larger or smaller than normal.

Acoustic reflex thresholds measure the lowest-intensity tone that can elicit the reflex in which the stapedius muscle contracts to reduce the vibration of the stapes.

Pure-tone audiometry allows for development of an audiogram, which shows air and bone conduction thresholds for both ears. Pure-tone audiometric thresholds are used by audiologists to describe a patient's hearing loss, determine the parts of the auditory system that are involved, and predict how the patient's hearing loss may relate to his or her ability to listen and communicate.

Speech audiometry evaluates how well a patient can hear and understand speech by formalizing a way to determine the patient's ability to recognize speech. One test, the speech reception threshold (SRT), is used as a cross-check with the patient's pure-tone average. Word recognition score testing evaluates how well a patient can recognize speech at a suprathreshold level. These tests are performed in quiet, but some tests (e.g., the BKB-SIN test) can assess the patient's ability in the presence of background noise.

Otoacoustic emissions assess the function of the cochlea by measuring sounds produced by the outer hair cells.

Evoked electroencephalography (EEG) is used to record the auditory brainstem response (ABR), including the neurodiagnostic ABR. This technique measures neural activity in the auditory nerve and brainstem. The ABR test can also be used to estimate thresholds, especially in very young children.

Not only do audiologists assess a person's hearing, but they are also involved in evaluating the balance system. This can be accomplished using videonystagmography (VNG), rotary chair testing, posturography, and vestibular-evoked myogenic potentials (VEMP). Each of these tests assesses a different part of the vestibular system.

The results of all of these evaluations can assist the audiologist in determining the type, degree, and configuration of hearing loss (conductive, sensorineural, or mixed) and provide differential diagnosis of the patient's hearing and balance problems.

References

Agrawal, Y., Ward, B. K., & Minor, L. B. (2013). Vestibular dysfunction: Prevalence, impact and need for targeted treatment. Journal of Vestibular Research, 23(3), 113–117.
American Academy of Audiology (AAA). (2004). Scope of practice. Retrieved from https://www.audiology.org/publications-resources/document-library/scope-practice
Beck, D. L., Danhauer, J. L., Abrams, H. B., Atcherson, S. R., Brown, D. K., Chasin, M., & Wolfe, J. (2018). Audiologic considerations for people with normal hearing sensitivity yet hearing difficulty and/or speech-in-noise problems. Hearing Review, 25(10), 28–38.
Blankenship, C. M., Hunter, L. L., Keefe, D. H., Feeney, M. P., Brown, D. K., McCune, A., & Lin, L. (2018). Optimizing clinical interpretation of distortion product otoacoustic emissions in infants. Ear and Hearing, 39(6), 1075–1090.
Dhar, S., & Hall, J. W. (2018). Otoacoustic emissions (2nd ed.). San Diego, CA: Plural Publishing.
Elsayed, A., Hunter, L. L., Keefe, D. H., Feeney, M. P., Brown, D. K., Meinzen-Derr, J. K., . . . Schaid, L. G. (2015). Air and bone conduction tone-burst auditory brainstem thresholds using a Kalman filtering approach in non-sedated normal hearing newborns. Ear and Hearing, 36(4), 471–481.
Glattke, T. J., & Robinette, M. S. (2007). Transient evoked otoacoustic emissions in populations with normal hearing sensitivity. In M. S. Robinette & T. J. Glattke (Eds.), Otoacoustic emissions: Clinical applications (3rd ed., pp. 87–106). New York, NY: Thieme.
Hof, J. R., Anteunis, L. J. C., Chenault, M. N., & van Dijk, P. (2005). Otoacoustic emissions at compensated middle ear pressure in children. International Journal of Audiology, 44(6), 317–320.
Jacobson, G. P., McCaslin, D. L., Piker, E. G., Gruenwald, J., Grantham, S. L., & Tegel, L. (2011). Patterns of abnormality in cVEMP, oVEMP and caloric tests may provide topological information about vestibular impairment. Journal of the American Academy of Audiology, 22(9), 601–611.
Jacobson, G., & Shepard, N. (2016). Balance function assessment and management. San Diego, CA: Plural Publishing.
Jerger, J. (1970). Clinical experience with impedance audiometry. Archives of Otolaryngology, 92, 311–324.
Jerger, J. (2009). Audiology in the USA. San Diego, CA: Plural Publishing.
Jerger, J. F., & Hayes, D. (1976). The cross-check principle in pediatric audiometry. Archives of Otolaryngology, 102(10), 614–620.
Kemp, D. T. (1978). Stimulated acoustic emissions from within the human auditory system. Journal of the Acoustical Society of America, 64(5), 1386–1391.
Kimberley, B. P., Brown, D. K., & Allen, J. B. (1997). Distortion product emissions and sensorineural hearing loss. In M. S. Robinette & T. J. Glattke (Eds.), Otoacoustic emissions: Clinical applications (pp. 181–204). New York, NY: Thieme.
Kramer, S., & Brown, D. K. (2019). Audiology: Science to practice (3rd ed.). San Diego, CA: Plural Publishing.
Li, C. M., Hoffman, H. J., Ward, B. K., Cohen, H. S., & Rine, R. M. (2016). Epidemiology of dizziness and balance problems in children in the United States: A population based study. Journal of Pediatrics, 171, 240–247.
Lonsbury-Martin, B., Martin, G., & Whitehead, M. (2007). Distortion product otoacoustic emissions in populations with normal hearing sensitivity. In M. Robinette & T. Glattke (Eds.), Otoacoustic emissions: Clinical applications (3rd ed., pp. 107–130). New York, NY: Thieme.
Møller, A. R. (1994). Neural generators of auditory evoked potentials. In J. T. Jacobson (Ed.), Principles and applications in auditory evoked potentials (pp. 23–46). Boston, MA: Allyn & Bacon.
Mueller, H. G., Ricketts, T. A., & Bentler, R. (2014). Modern hearing aids: Pre-fitting testing and selection considerations. San Diego, CA: Plural Publishing.
National Institute on Deafness and Other Communication Disorders (NIDCD). (2016). Quick statistics about hearing. Retrieved from https://www.nidcd.nih.gov
Semaan, M. T., Wick, C. C., & Megerian, C. A. (2015). Vestibular physiology. In M. L. Pensak & D. I. Choo (Eds.), Clinical otology (4th ed., pp. 35–44). New York, NY: Thieme.
Small, S. A., & Stapells, D. R. (2017). Threshold assessment in infants using the frequency-specific ABR and ASSR. In R. Seewald & A. M. Tharpe (Eds.), Comprehensive handbook of pediatric audiology (2nd ed., pp. 505–550). San Diego, CA: Plural Publishing.
Tillman, T. W., & Carhart, R. (1966). An expanded test for speech discrimination utilizing CNC monosyllabic words. Northwestern University Auditory Test No. 6 [Technical report SAM-TR-66-55]. Brooks AFB, TX: USAF School of Aerospace Medicine.
Tillman, T. W., & Jerger, J. F. (1959). Some factors affecting the spondee threshold in normal hearing subjects. Journal of Speech and Hearing Research, 2, 141–146.
Wiley, T. L., Oviatt, D. L., & Block, M. G. (1987). Acoustic-immittance measures in normal ears. Journal of Speech and Hearing Research, 30(2), 161–170.
Wright, C. G., & Schwade, N. D. (Eds.). (2000). Anatomy and physiology of the vestibular system. New York, NY: Thieme.
24
Assistive Listening Devices

Introduction

This chapter presents an overview of the different types of listening devices available to maximize an individual's residual hearing for daily functioning. Students interested in detailed information about hearing aids and other implantable devices are encouraged to consult the outstanding text by Ricketts, Bentler, and Mueller (2019). In the current chapter, we review three kinds of listening devices: hearing aids, cochlear implants, and hearing assistive technology systems.

As we indicated in the previous chapter, there are approximately 37.5 million adults in the United States with hearing loss. It is estimated that just under 30 million people could be helped with hearing aids, but only about a third of those who could benefit have ever used them (National Institute on Deafness and Other Communication Disorders [NIDCD], 2016). The audiologist is responsible for evaluating, fitting, and verifying these individuals' amplification or assistive listening devices. The audiologist also determines whether the device is appropriate for their hearing problem and evaluates any benefit they might receive from the device. The audiologist is also a member of the auditory implant team (i.e., cochlear implants, middle ear implantable hearing aids, fully implantable hearing aids, bone-anchored hearing aids, and all other amplification devices) who determines audiologic candidacy based on hearing and communication information. The audiologist provides pre- and postsurgical assessment, counseling, and all aspects of audiologic treatment, including auditory training, rehabilitation, implant programming, and maintenance of the devices.

Hearing Aids

Steps in Selecting and Fitting a Hearing Aid

The key component in the treatment and rehabilitative process for a patient with a hearing loss is the provision of amplification, that is, the fitting of a hearing aid. Unlike providing the patient with glasses, it is not simply the provision of the device that allows the patient to have better communication. The audiologist has an important role in the comprehensive selection and fitting procedure, coupled with appropriate counseling before, during, and after the fitting of hearing aids. Patients will also require various types of auditory and acclimatization training in order for them to achieve maximal benefit. To make this happen, there is a general workflow to the treatment process that the audiologist will follow to assist the patient in making maximal gains in his or her rehabilitation. In this six-step procedure (Table 24–1) for the fitting of a hearing aid, the audiologist will guide the patient through the assessment, treatment planning, selection, verification, orientation, and validation steps for selecting and fitting a hearing aid (Mueller & Hall, 1998).

Table 24–1.  Typical Workflow for the Selection and Fitting of Hearing Aids

• Step 1.  Assessment:  Determine the extent and cause of hearing loss. Determine candidacy for
hearing aids based on pure-tone hearing loss, self-assessment inventories, and patient history.

• Step 2.  Treatment Planning:  Review the assessment results with the patient and/or family
members. Identify areas of difficulty and explore different amplification options.

• Step 3.  Selection:  Determine the type of fitting (style, etc.). Decide on electroacoustic
parameters and what special features are needed.

• Step 4.  Verification:  Determine that the hearing aids meet a set of standardized measures,
including electroacoustic performance and patient’s real-ear match to desired levels
(presumably based on validated prescriptive targets). The hearing aids also should have good
sound quality, be comfortable to wear, and have acceptable cosmetics.

• Step 5.  Orientation:  Counsel the patient on the use and care of the hearing aids. Discuss
hearing aid adjustments and realistic expectations. Determine the need for a more
comprehensive audiologic rehabilitation program.

• Step 6.  Validation:  Assess the effectiveness of the use of hearing aids in the patient’s everyday
environment, through the use of patient interview and/or formal self-assessment inventories of
benefit and satisfaction. Readdress areas of need regarding the fitting.

Source:   Reproduced with permission from Audiologists’ Desk Reference by H. G. Mueller and J. W. Hall, 1998, San
Diego, CA: Singular.

The first step is to assess the patient's hearing to determine the type, degree, and configuration of loss as we discussed in the previous chapter. Using the pure-tone audiogram, speech audiometry, tympanometry, and other tests as needed, we can determine patients' hearing loss and how it is affecting their communication ability. Patients must also believe that their hearing loss is causing a problem and that they are motivated and have the appropriate expectations for hearing aid use. Once this is complete and the patient is determined to be an appropriate candidate, the audiologist needs to develop a treatment plan. In this plan, we will review the results with the patient, identify any areas in which the patient is having difficulty, such as in noisy restaurants, on the phone, or at work, and explore the different amplification options available to him or her.

The third step is the selection of the hearing aid style, the electroacoustic parameters for the hearing aid based on the patient's hearing loss, and the ear or ears to be fit. In the United States, approximately 80% of patients are fit with two hearing aids; however, many will become unilateral hearing aid users (Mueller, Ricketts, & Bentler, 2014). Working with the patient, the audiologist determines the style of hearing aid and the features required. In addition to the standard features, some instruments will have optional features such as digital noise reduction, wireless Bluetooth connectivity, or directional microphones that the patient may desire.

The hearing aid settings or electroacoustic parameters are programmed into the digital hearing aid circuit through an interface with a computer. The audiologist uses a computer program or validated prescriptive fitting method (Mueller, Ricketts, & Bentler, 2017) to set the hearing aid to the desired gain and output targets based on the patient's hearing loss. There are two fitting methods that are commonly used, the National Acoustic Laboratories' (NAL) NonLinear method v.2 (NAL-NL2) and the Desired Sensation Level method v.5 (DSLv5.0). Most audiologists use the NAL method with adults and the DSL method with children. The goal of these fitting methods is to determine the optimal frequency-specific gain and output for soft, average, and loud speech inputs and to calculate the maximum power output so that the hearing aid cannot exceed the patient's loudness discomfort level.

Once the hearing aid selection has been made, the audiologist's focus can turn toward verification of the hearing aids, which is the fourth step. Verification is the determination by the audiologist that the hearing aids are comfortable to wear and have acceptable cosmetic appeal, have good sound quality, and meet a set of standardized measures including electroacoustic performance and that a patient's real-ear measures match the desired levels. After a hearing aid has been programmed for the patient's hearing loss using a prescriptive fitting method, it is important to verify that the output as measured in the ear canal, when actually being worn by the patient, is appropriate.
Verification of prescriptive targets is accomplished using probe-microphone real-ear measures, where a thin silicone tube is placed in the ear canal and is attached to a measurement microphone, which measures the aided ear canal levels for an external speech signal. This method, referred to as speech mapping, records the real-ear aided response (REAR), thus allowing the audiologist to reference the ear canal sound pressure level (SPL) so that he or she can visualize the portion of the speech signal that is audible for individual patients in their own ear and determine if it is equal to the patient's prescriptive targets.

As a part of Step 5, the audiologist provides an extensive hearing aid orientation. During this orientation, the patient receives hands-on instruction about the care and use of the new hearing aids. This portion of the fitting process cannot be underestimated, as the first few weeks of ownership are critical regarding hearing aid use and acceptance. It has been estimated that as high as 10% of people who own hearing aids do not use them, and many times they are not used simply because the patient becomes frustrated with the basic operation and handling of the instruments during those first few days (Mueller, 2019). The best hearing aid circuitry in the world does not help if the patient cannot properly fit the hearing aid in his or her ear or if the aids are uncomfortable to wear. Extensive counseling about anticipated benefits of the hearing aids is also required. Patients often believe (incorrectly) that simply putting on the hearing aid will immediately allow them to hear. However, providing them with more realistic expectations will assist in their acclimatization to the new hearing aids. This counseling can take the form of frequent clinic visits, follow-up phone calls, and group classes for new hearing aid users.

Following the verification of hearing aid performance and orientation to the new devices, the next step in the fitting process is to conduct some measure of validation. Hearing aid validation is the process of ensuring that the goals set forth in the treatment plan or in the communication needs assessment are met. This type of validation includes different measurements, which can be used to determine if the fitting was successful during real-world use by the patient, if the hearing aids have improved the patient's social and emotional well-being, and if this process improved his or her quality of life. This can be accomplished through the use of post-fitting self-assessment inventories. These self-assessment inventories consist of questionnaires completed by the patient and the patient's communication partners (i.e., spouse, significant other, or family member). Many different types of questionnaires are available, from informal questionnaires to standardized assessment tools. There are a large number of questionnaires because there are a variety of ways to measure success with hearing aids. Some examples of questionnaires are the International Outcome Inventory for Hearing Aids (IOI-HA), the Client Oriented Scale of Improvement (COSI), and the Abbreviated Profile of Hearing Aid Benefit (APHAB). These measures usually compare a pretest/pre-fitting or unaided measure to an aided or post-fitting measure of success and evaluate a difference score or benefit measurement. These questionnaires are easy to administer and score and address many of the different areas of hearing aid success. Since the purpose of the validation process is to assess the effectiveness of the hearing aids and the patient's subjective satisfaction with them, a low score would indicate the patient may need the hearing aids adjusted and/or require more counseling. Thus, a properly fitted hearing aid has the ability to change a person's life for the better, and the audiologist is the professional who is able to make it happen. These six steps, summarized in Table 24–1, allow the audiologist to assess, fit, verify, and validate the hearing aids and support the patient's rehabilitation of his or her hearing problem.
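The difference-score logic behind such inventories can be sketched as follows. The subscale names below echo the APHAB, but the scores, scale, and scoring are hypothetical stand-ins rather than the published questionnaire.

```python
# Sketch of the benefit (difference) score idea behind self-assessment
# inventories: compare an unaided (pre-fitting) score with an aided
# (post-fitting) score. All numbers here are invented for illustration.

def benefit_score(unaided_pct_problems, aided_pct_problems):
    """Positive benefit = fewer reported problems when aided."""
    return unaided_pct_problems - aided_pct_problems

# Hypothetical subscale scores (% of listening situations with problems)
subscales = {"ease of communication": (55, 20),
             "background noise":      (80, 45),
             "reverberation":         (70, 40)}
for name, (unaided, aided) in subscales.items():
    print(f"{name}: benefit = {benefit_score(unaided, aided):+d} points")
```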
Types of Hearing Aids

There are different types of hearing aids for different kinds of people and their problems. As summarized in Table 24–2, there are basically two types of hearing aids, either a behind-the-ear (BTE) or a custom hearing aid. Fundamentally, these hearing aids differ in size, application, and features. It is important not to confuse hearing aid type with the technology contained within the aid. Many manufacturers have different hearing aid models with the same digital processing chip across the various types they offer; therefore, different types of the same product model will tend to sound similar to the patient.

Table 24–2.  Hearing Aid Types (hearing aids become smaller and have less gain as you move from the left to the right side of the table). Each type differs in its connection to the ear.

Behind-the-Ear
• Standard:  custom earmold
• Receiver-in-the-canal (RIC):  thin electrical wire connected to a receiver with a non-custom eartip
• Receiver-in-the-aid (RITA):  thin/small-diameter (nearly invisible) tube with a non-custom eartip

Custom Hearing Aid
• In-the-Ear (ITE):  hearing aid housed in a custom Full Shell or Half Shell
• In-the-Canal (ITC):  only fills the lower one quarter of the concha
• Completely-in-the-Canal (CIC):  resides completely in the ear canal

The BTE type of hearing aid is worn behind the ear with an earhook, which rests on top of the auricle and helps keep it in place behind the ear. The amplified sound is delivered through the earhook and custom-made earmold that fits into the concha and ear canal. The earmold, which is shown in Figure 24–1, is used to secure the hearing aid to the ear. This type continues to be used for fitting individuals with severe to profound hearing losses, individuals with visual or dexterity issues, and young infants and children with hearing loss. Alternatively, the BTE can be connected to the ear with a one-piece thin tubing, or a wire that has the receiver at the end (see Figure 24–1). This combination allows for a smaller size, which has made it very popular.

Figure 24–1.  Behind-the-ear (BTE) hearing aid style. A. Standard BTE with earhook, tubing, and custom earmold. B. Mini-BTE hearing aids in a receiver-in-the-canal (RIC, thin wire with non-custom open eartip) and receiver-in-the-aid (RITA, slim tube) configuration.

of the hearing aid. Within the custom line of hearing aids, there are a number of different styles from which the patient can choose. The largest custom style is the in-the-ear (ITE) hearing aid, which uses a custom case that fills the concha and a portion of the ear canal. The development of the digital integrated circuit has allowed hearing aid sizes to be reduced to the point where all the hearing aid components can be fit into a customized case that fits into the ear canal. Examples of different custom styles are shown in Figure 24–2. The ITE or in-the-canal (ITC) hearing aid has all the hearing aid components built inside of a small customized casing (Figure 24–3).

Figure 24–2.  Examples of different styles of custom hearing aids. a. Mini Canal, b. ITC, c. Half Shell, and d. Full Shell.
Images courtesy of Sonova AG © 2019.

Figure 24–3.  External features of a full-shell ITE: microphone ports, user toggle switch, battery door, and volume control. Image courtesy of Sonova AG © 2019.

The ITC hearing aid only partially fills the lower approximate one quarter of the concha; thus, the ITC is even smaller than the half-shell ITE, but with the smaller size comes less amplifier gain. For this reason, the selection of an ITC may be most appropriate for those with some cosmetic concerns regarding hearing aids, and who have less severe hearing loss and relatively good manual dexterity to operate and insert/remove the aid.
Figure 24–4.  A CIC hearing aid that fits completely in the ear canal and has a removal string to allow the user to remove it. Labeled features include the earshell, microphone, battery door, removal string, and vent. Image courtesy of Starkey Hearing Technologies © 2019.

The last style is the completely-in-the-canal (CIC) hearing aid (Figure 24–4). It is the smallest of the custom-style hearing aids and is often selected by the patient who is looking for the "invisible" hearing aid. CICs typically do not offer enough amplification for patients with moderate or greater hearing losses.

Hearing Aid Components

The basic components of a digital hearing aid include a microphone, amplifier, receiver, battery, and volume control. As shown in Figure 24–5, these components work together to detect the incoming signal, amplify it, and then deliver it to the ear. The microphone or microphones are used to pick up the acoustic signal from the sound source and convert it to an electrical signal that is then sent to the amplifier. Most current hearing aids have two microphones, which allows for special directional processing. The electrical signal is then converted into digital information by the digital processor, through which the incoming signal can be manipulated and different processing algorithms can be applied.

The main component of the hearing aid circuitry is the amplifier, a computer chip that is used to increase the level and alter the frequency of the input signal. This chip also includes algorithms for all special signal processing features that the hearing aid may have. The digital signal is subsequently reconverted to an electrical signal that is sent to the receiver. The receiver is in fact a speaker that converts the amplified signal from the hearing aid into an acoustical signal that is delivered to the patient's ear. This is all powered by a battery, which comes in different sizes to match the size and power requirements of the hearing aid. The level of the output can be adjusted by the user through the volume control. Many hearing aids have a volume control that is either a wheel or button on the aid itself; however, on some aids, it may be a function on a remote-control device or controlled with a smartphone app.
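A highly simplified sketch of this microphone-to-receiver signal path is shown below. It is not any manufacturer's algorithm: the sample rate, band edges, and gains are invented, and a real aid would use efficient filter banks and compression rather than a one-shot FFT.

```python
# A toy sketch of the basic digital hearing aid signal path described above:
# microphone samples are taken into the digital domain, split into frequency
# bands, each band is amplified by its own gain, and the result is sent to
# the receiver. All parameters here are illustrative assumptions.
import numpy as np

fs = 16_000                                   # sample rate (Hz), assumed
t = np.arange(fs) / fs
mic = 0.01 * np.sin(2 * np.pi * 2000 * t)     # stand-in for a 2-kHz input

# Per-band gains in dB (hypothetical prescription: more gain at high freqs)
band_edges = [(0, 750), (750, 1500), (1500, 3000), (3000, 8000)]
band_gain_db = [5, 10, 20, 25]

spectrum = np.fft.rfft(mic)
freqs = np.fft.rfftfreq(len(mic), 1 / fs)
for (lo, hi), g_db in zip(band_edges, band_gain_db):
    mask = (freqs >= lo) & (freqs < hi)
    spectrum[mask] *= 10 ** (g_db / 20)       # apply this band's gain
output = np.fft.irfft(spectrum, n=len(mic))   # signal sent to the receiver
print(f"input rms={np.sqrt(np.mean(mic**2)):.4f}, "
      f"output rms={np.sqrt(np.mean(output**2)):.4f}")
```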
Figure 24–5.  Main components found in both standard BTE and RIC hearing aids: for the BTE, microphones, switch, earhook, receiver, processor, and battery; for the RIC, microphone, switch, processor, receiver, battery, and dome. Adapted with permission from Audiology: Science to Practice (3rd ed., p. 327) by S. Kramer and D. K. Brown, 2019, San Diego, CA: Plural Publishing, Inc. Copyright 2019 by Plural Publishing, Inc.

An additional component is the telecoil, an alternative input source for electrical signals. The telecoil converts an electromagnetic signal from a telephone, an assistive listening device, or a wide-area loop system to a signal that can be delivered directly to the amplifier.

Auditory Implantable Devices

Depending on the type of hearing loss, some patients will perform better with the use of a different type of hearing device. Rather than being fit with a conventional hearing aid, sometimes a patient needs to have a different treatment option depending on the individual patient characteristics. In these types of patients, the audiologist will work with an otologist (ear, nose, and throat surgeon) to manage the patient's case. There are three such specialized auditory devices: the bone-anchored implant (BAI), the middle ear implant (MEI), and the cochlear implant (CI). Each of these devices fits a specific type of hearing loss or meets the need of a certain type of patient.

The BAI can be used with patients who have a conductive or mixed hearing loss or for those with single-sided deafness (unilateral hearing loss). The MEI is typically used with patients who have an intact middle ear system but have a mild to severe sensorineural hearing loss. A CI is a different type of hearing device, where the acoustic signals are converted to electrical impulses and delivered to electrodes that are implanted into the cochlea. The CI is inserted into the cochlea and requires an intact eighth cranial nerve to be able to function.
Bone-Anchored Implant

A BAI consists of a hearing device that is surgically implanted into the mastoid area behind the ear. Candidates for this technology usually have a bilateral conductive or mixed hearing loss, but some patients with a unilateral hearing loss also receive benefit. The BAI is similar to a bone conduction oscillator used in testing a patient's hearing (see Chapter 23) because it can transmit a signal to the cochlea and bypass the outer and middle ears. By bypassing the conductive parts of the ear, the acoustic signals are picked up by the BAI's microphone and transduced into vibrations that stimulate the skull, thus creating the normal vibratory energy in the normal hearing cochlea.

In order for the BAI to function, it must transmit the auditory signal to the cochlea via bone conduction. The device is permanently attached to the skull through a process called osseointegration. Osseointegration is the connection between living bone and the surface of a load-carrying implant; in other words, it is the method by which the implant adheres to the bone. The implant is made of titanium, which allows for a strong and stable implant-to-tissue interaction, giving it a structural and functional contact between the implant and the bone (Kuhn & Perez, 2015). As shown in Figure 24–6, the surgeon implants an abutment or magnet made with a titanium screw into the skull behind the pinna. First, a hole is drilled into the skull, and then the surgeon threads the screw into the bone. Over a period of time, usually 4 to 6 weeks, the bone creates a fibrous layer around the screw, and the bone begins to weave itself into the implant, creating a solid integration of the implant with the bone.

BAIs can be either percutaneous or transcutaneous. A percutaneous device, as shown in Figure 24–6, has an abutment with a screw attached that is osseointegrated into the skull and a post attached to the abutment that penetrates through the skin. The fixture or abutment that penetrates the skin is used to connect to the hearing device and acts as a breakaway point if something happens to hit the device. The hearing device is attached to the abutment and contains the microphone, battery, and other components similar to a regular hearing aid. The advantages of percutaneous devices include a threshold improvement in the mid to high frequencies and a lower force level when compared to transcutaneous BAIs. Disadvantages are that patients must be diligent about cleaning around the abutment or risk causing skin infections because of poor hygiene, for this is an open wound that they will have for life (Hodgetts, 2017).

Figure 24–6.  A percutaneous bone-anchored implant, with the bone-conduction and air-conduction pathways indicated. Note the device's penetration through the skin where it makes direct connection to the bone, allowing for sound to be transferred to the cochlea via bone conduction.
Transcutaneous BAIs have a magnet that is surgically attached to the skull under the skin and a retention magnet on the outside of the skin; therefore, they do not penetrate through the skin. The transcutaneous magnet is osseointegrated to the skull under the skin, and the hearing device attaches to it magnetically because the magnets are attracted to each other across the skin (Reinfeldt, Hakansson, Taghavi, & Eeg-Olofsson, 2015). An example of a transcutaneous device is shown in Figure 24–7. The sound processor is located outside the skin and contains the microphone, battery, and vibrating source to transmit the signal to the magnet under the skin. Similar to the percutaneous devices, the device will detach from the scalp without trauma or injury to the user. An advantage of transcutaneous devices includes an improvement in thresholds, especially in the low frequencies. However, the disadvantages are that as the vibrations travel through the skin, they lose energy when compared to percutaneous devices, and patients may experience minor complications because of the pressure on the skin between the two magnets (Hodgetts, 2017; Kohan, 2015).

Candidates for a BAI must be over 5 years of age (Kuhn & Perez, 2015) and fall into one of two categories (Table 24–3): (a) have a conductive or mixed hearing loss or (b) have a unilateral hearing loss. Specific requirements are set by each manufacturer based on the audiometric criteria reported to the U.S. Food and Drug Administration (FDA). However, candidacy requirements are similar for both those patients with conductive/mixed hearing loss and those with single-sided deafness. Patients with a conductive hearing loss must have at least a 30 dB HL conductive component, and those with a mixed hearing loss must have at least a 30 dB HL conductive component and a mild to moderate sensorineural component. Currently, patients with single-sided deafness (SSD) must have normal hearing in the better hearing ear (a PTA <20 dB HL by air conduction) and a profound sensorineural hearing loss (a PTA >90 dB HL) in the poor ear (Zeitler, Snapp, Telischi, & Angeli, 2012).
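The single-sided-deafness criteria just quoted lend themselves to a toy screening check, sketched below. Real candidacy decisions involve much more than two pure-tone averages, and the function name and example values are our own.

```python
# A toy screening sketch of the SSD criteria cited above (normal better ear,
# PTA < 20 dB HL; profound poor ear, PTA > 90 dB HL). Illustration only.
def ssd_bai_candidate(better_ear_pta, poor_ear_pta):
    return better_ear_pta < 20 and poor_ear_pta > 90

print(ssd_bai_candidate(better_ear_pta=10, poor_ear_pta=105))  # True
print(ssd_bai_candidate(better_ear_pta=30, poor_ear_pta=105))  # False
```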

Figure 24–7.  An example of a transcutaneous device with an implanted magnet and an external hearing device, the Sophono Alpha 2 MPO. Note the four holes in the implant portion through which the titanium screws are placed to attach the implant to the skull. Courtesy of Medtronic, Jacksonville, FL.
Table 24–3.  Candidacy Considerations for a Bone-Anchored Implant

Conductive/Mixed Loss
• Atresia (bilateral)
• Abnormal ear canal structure
• Chronic ear drainage
• Stapedectomy
• Failed middle ear surgery
• Tumors of the conductive pathway
• Unable to use air conduction hearing aid
• Hypersensitivity to earmolds
• Severe dermatitis of the ear canal
• Severe otitis externa

Unilateral Hearing Loss
• Atresia (unilateral)
• Acquired unilateral conductive/mixed hearing loss
• Congenital or acquired profound unilateral loss
• Autoimmune inner ear disease
• Ménière's disease

Source:  Reproduced with permission from Audiology: Science to Practice (3rd ed., p. 349) by S. Kramer and D. K. Brown, 2019, San Diego, CA: Plural Publishing, Inc. Copyright 2019 by Plural Publishing, Inc.

Middle Ear Implant

A MEI is a type of hearing aid that differs from a conventional hearing aid because it is surgically placed into the middle ear space by an ear, nose, and throat (ENT) surgeon. This specialized type of hearing device is used with patients who are considered unsuccessful hearing aid users or whose lifestyle would be hampered by the use of a regular hearing aid. In some cases, these patients are medically unable to wear conventional hearing aids; therefore, a MEI may become an option.

These devices are implanted in the middle ear and coupled to the ossicles, essentially bypassing the ear canal. This leaves the ear canal unoccluded, which is potentially more cosmetically appealing to patients (Achar, 2013). Although MEIs are a viable option for patients, they have only a small share of the U.S. market.

The MEI is a device that is categorized either by its implant status (partially or fully implanted) or by the type of transducer used in the device (Channer, Eshraghi, & Lui, 2011). MEIs are surgically implanted in the middle ear and, depending on the device, can be attached to the ossicular chain, oval window, or round window (see Chapter 22) and used as a method of treating SNHL (Bassim & Fayad, 2010; Gifford, 2017). Most transducers used in MEIs are piezoelectric or electromagnetic transduction mechanisms (Kuhn & Perez, 2015). The piezoelectric implementation uses an external microphone placed by the ear or under the skin in the ear canal to detect sounds around the listener. From this, an electrically encoded signal is sent to an implanted middle ear crystal placed on the ossicular chain. The crystal moves in response to stimulation from the sound transduction and propagates this electrical activity into the cochlea, where it is processed by the remaining hair cells and eighth cranial nerve fibers. Piezoelectric devices are smaller and use less power than electromagnetic devices. The electromagnetic implementation is larger than the piezoelectric version; therefore, it is only partially implantable. Most of the electromagnetic implants have an external coil that is connected to a microphone and an amplifier. The amplified signal is sent to an implanted magnet located at or near the incus-stapedial joint of the ossicular chain, and the magnet oscillates as electromagnetic waves move past it; thus, the electromagnetic energy is converted to mechanical energy (ossicular movement). The electromagnetic devices have greater gain and output, but they are offset by a greater power consumption and larger size compared to the piezoelectric transducers (Channer et al., 2011).

Candidacy for a MEI is set by each manufacturer according to the audiometric criteria they reported to the FDA during their clinical trial. Typically, MEIs are designed for patients with a stable SNHL in the moderate to severe range (≤65 to 70 dB HL), and word recognition scores greater than 40% to 60% (Achar, 2013; Kuhn & Perez, 2015). Candidates must also be medically stable and not have any preexisting medical conditions.

Examples of currently available devices include the Esteem, Maxum, and Vibrant Soundbridge. The Esteem is a two-channel analog device, which uses a different approach because it does not have an external processor or microphone. As shown in Figure 24–8, the Esteem device is totally implanted within the temporal bone and the middle ear space. The entire device, including the battery, is housed in the ear; a piezoelectric transducer or sensor connects to the incus and moves with the motion of the tympanic membrane. It functions by receiving the mechanical vibrations from the incus and converting them to electrical signals, which are sent to a sound processor implanted in the temporal bone. The sound processor filters and amplifies the signal and sends the processed signal to the driver (another piezoelectric transducer) coupled to the stapes footplate.
Figure 24–8.  An example of a middle ear implant — the Esteem. (1) An electronic micro-
phone is not used, instead the external ear funnels the sound waves down the ear canal to the
tympanic membrane causing it to vibrate and act as the microphone. (2) The sensor is con-
nected to the incus and receives the mechanical vibrations, converting them to an electrical
signal and forwarding them to the processor. (3) The processor implanted in the temporal
bone amplifies the signal. (4) The driver is coupled to the stapes footplate and converts the
enhanced signal back to mechanical energy for transfer into the cochlea. Courtesy of Envoy
Medical Corporation, White Bear Lake, MN.

The electrical signal is converted back to mechanical energy at the driver, which moves the stapes in and out of the oval window, transmitting the signal to the cochlea.

MEIs have slightly better pure-tone thresholds, word recognition scores in quiet, and speech-in-noise performance when compared to conventional hearing aids (McRackan, Clinkscales, Ahlstrom, Nguyen, & Dubno, 2018; Truy, Philibert, Vesson, Labaassi, & Collet, 2008). Results have also shown an overall improvement in patient satisfaction.
More importantly, patients have reported additional benefits with MEI including reduced problems with feedback and other distortions of the sound (Kuhn & Perez, 2015). For this select group of patients, the MEI serves a useful purpose and allows many who would otherwise not be successful with amplification to gain auditory benefit.

Cochlear Implant

A CI is a biomedical device that bypasses the middle ear and most of the cochlea, and electrically stimulates neurons of the eighth cranial nerve. The stimulation from a CI provides an input that is different from normal hearing. Therefore, it can provide benefit for individuals with severe to profound hearing loss, who have some functional auditory nerve fibers but minimal or absent hair cells in the cochlea (see Chapter 22), meaning that a conventional hearing aid would not help. Much like a hearing aid, a CI can provide reasonably good detection and correct identification of environmental and speech sounds.

The actual success and the functional level obtained with a CI depend on many factors, including the time of onset of the hearing loss, length of deafness prior to implantation, residual function of auditory neurons, appropriate programming of the implant, and consistent use of the implant. Some limitations to CIs include the risk of undergoing the surgery, the loss of remaining functional hair cells from implanting the electrode array, the expense of undergoing the surgery, and the need to travel to and from a CI center for remapping and/or programming and checks. Fortunately, the cost associated with cochlear implants, including the surgery, device, and programming visits, is covered by most medical insurances.

A CI can be divided into external and internal components as illustrated in Figure 24–9. The external components consist of a microphone, speech processor, and external transmitter (headpiece). As with all amplification devices, the microphone changes the acoustic energy into electrical energy. The signal is sent to the speech processor for processing and analyzing, and then bandpass filtered, creating frequency bands of electric stimulation. The external transmitter receives these signals and sends them to the internal implant via radio frequency. The implant consists of a magnet, receiving coil, and an electrode array. As illustrated, the external transmitter is attached to the implant via a magnet. The signal from the external transmitter is sent to the receiving coil and then to the electrode array, thus bypassing the ear canal, middle ear, and cochlear hair cells; the electrical impulses stimulate the eighth cranial nerve.

After the sound is converted to an electrical signal, it is changed via a speech coding strategy. These strategies convert the sounds into patterns of electrical signals and deliver them to the cochlea in order to convey frequency, intensity, and duration information. The internal receiver coil picks up the carrier signal from the transmitter and sends the electrical code to the various channels of the electrode array. As shown in Figure 24–10, the electrode array is attached to the implant and surgically inserted into the cochlea, where the nerve fibers are activated by the electrical signals from the internal component.
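A highly simplified sketch of the band-splitting idea behind CI speech coding is shown below, loosely in the spirit of continuous interleaved sampling (CIS) style strategies. Real strategies are far more complex; the band edges, the crude envelope estimate, and the units here are illustrative assumptions.

```python
# Toy sketch of CI speech coding's band-splitting idea: each bandpass
# channel's envelope sets the stimulation level of one electrode, low
# frequencies toward the apex and high frequencies toward the base.
import numpy as np

fs = 16_000
t = np.arange(fs // 10) / fs                       # 100 ms of signal
sound = np.sin(2*np.pi*300*t) + 0.5*np.sin(2*np.pi*2500*t)

# Hypothetical analysis bands, low (apical) to high (basal) electrodes
bands = [(100, 500), (500, 1200), (1200, 3000), (3000, 7000)]

spectrum = np.fft.rfft(sound)
freqs = np.fft.rfftfreq(len(sound), 1/fs)
for n, (lo, hi) in enumerate(bands, start=1):
    band = np.where((freqs >= lo) & (freqs < hi), spectrum, 0)
    envelope = np.abs(np.fft.irfft(band, n=len(sound)))   # crude envelope
    print(f"electrode {n} ({lo}-{hi} Hz): stimulation level ~ {envelope.mean():.3f}")
```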

Figure 24–9.  Components of a cochlear implant. A. External device with the microphone, volume control, battery, processor, and headpiece coil that attaches to the internal magnet. B. Implant with the magnet and the electrode array, which is inserted into the cochlea. Images courtesy of Advanced Bionics LLC, © 2017.
ment of the patient’s ability to cope physically with


surgery. The audiology evaluation includes air and
bone conduction thresholds, tympanometry, acoustic
reflex thresholds, and word recognition scores.
The CI is programmed by the audiologist through
a process called CI mapping 3 to 6 weeks following the
surgery. During this process, the electrical thresholds
for soft inputs and the upper limits of comfort thresh-
olds are established. It is common for the audiologist
to turn off any electrodes where the stimulation causes
pain, facial twitches, and/or no improvement in hear-
ing sensitivity. The patient will return multiple times
within the first year for additional sessions to fine-tune
the CI. Most adults with postlingually acquired hearing
loss1 show significant improvements in speech recogni-
tion abilities after implantation. The speech recognition
abilities appear to improve and plateau for most indi-
viduals within the first 3 months but may continue to
improve up to 1 year postimplant. Their outcome per-
formance is negatively correlated with their duration
of deafness and age (Sharma, Dorman, & Kral, 2005).
After implantation, children will require significant
rehabilitation to obtain maximum outcomes; however,
Figure 24–10.  Cochlear implant with placement of they can develop spoken language skills (Fitzpatrick,
external and internal CI components. (1) The microphone Crawford, Ni, & Durieux-Smith, 2011). Individuals and
on the sound processor picks up and converts the sound their families will need counseling regarding realistic
into digital information. (2) This information is transferred expectations, additional training with communication
through the coil to the implant located under the skin. strategies, speech-language therapy, academic support,
(3) The implant sends electrical signals down the elec- continued instruction, and additional fine-tuning of the
trode into the cochlea. (4) The auditory nerve fibers in the cochlear implant.
cochlea pick up the electrical signals and conduct them up
the auditory pathway to the auditory cortex. Image cour-
tesy of Cochlear Americas, © 2017.
Chapter Summary

the nerve fibers are activated by the electrical signals Hearing Aids
from the internal component.
The candidacy requirements for a CI are as fol- A main component of the treatment and rehabilita-
lows: (a) the patient must have a severe to profound tive process that we know as audiology is the fitting
bilateral sensorineural hearing loss and (b) receive little of hearing aids. Audiologists have an important role in
or no benefit from hearing aids. Additionally, an evalu- the comprehensive selection and fitting of the device.
ation of the patient is required by a team of medical Six steps in selecting and fitting a hearing aid
professionals for consideration of a cochlear implant. include the following:
The CI team consists of the ENT surgeon, audiologist,
a speech-language pathologist, psychologist, social 1. Assessment — determine the type and degree of
worker, and educator of the deaf. The entire CI team loss
participates in the evaluation and makes a final deci- 2. Treatment Planning — review results, identify
sion on whether a patient is appropriate to receive a areas of difficulty, and explore amplification
cochlear implant. A medical evaluation includes a com- options
puted tomography (CT) scan to determine the presence 3. Selection — determine type of fitting and decide
of a cochlea and auditory nerve, as well as an assess- on electroacoustic parameters

1
 ostlingually acquired hearing loss is where a person’s hearing loss is acquired after they have developed language at age 5 or greater. Refer
P
to Chapter 7 for further details.
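The threshold and comfort levels established during the CI mapping process described above (often called T and C levels) can be sketched as follows. The clinical unit and all numbers are hypothetical, and manufacturers' fitting software works quite differently in detail.

```python
# Sketch of the idea behind CI mapping: each electrode gets a threshold (T)
# and upper comfort (C) level, and incoming band energy is mapped into that
# electrical range. Units and values are illustrative assumptions only.
def map_to_electrical_range(band_level, t_level, c_level):
    """Linearly map a normalized band level (0..1) into [T, C]."""
    band_level = min(max(band_level, 0.0), 1.0)   # clamp to valid range
    return t_level + band_level * (c_level - t_level)

# Hypothetical per-electrode T/C levels set during a mapping session
t_levels = [100, 110, 105, 120]
c_levels = [190, 200, 185, 210]
band_levels = [0.2, 0.8, 0.5, 1.3]                # 1.3 gets clamped to 1.0
for i, (bl, tl, cl) in enumerate(zip(band_levels, t_levels, c_levels), 1):
    print(f"electrode {i}: stimulation = {map_to_electrical_range(bl, tl, cl):.0f}")
```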
Chapter Summary

Hearing Aids

A main component of the treatment and rehabilitative process that we know as audiology is the fitting of hearing aids. Audiologists have an important role in the comprehensive selection and fitting of the device.

Six steps in selecting and fitting a hearing aid include the following:

1. Assessment — determine the type and degree of loss
2. Treatment Planning — review results, identify areas of difficulty, and explore amplification options
3. Selection — determine type of fitting and decide on electroacoustic parameters
4. Verification — determine that aids meet electroacoustic performance and that real-ear results match desired levels
5. Orientation — counsel patient on hearing aid use and care and give realistic expectations
6. Validation — assess effectiveness of hearing aids and patient's subjective satisfaction

Types of hearing aids include BTE, RIC, ITE, ITC, and CIC.

Hearing aid components include a microphone, amplifier, receiver, battery, and telecoil.

Fitting strategies are used to program a hearing aid to the desired gain and output for the patient. The NAL-NL2 program is used mostly for adults and the DSLv5.0 is used mainly with children.

After a hearing aid has been programmed using a prescriptive fitting method, it is important to verify that the ear canal output is appropriate for the patient. Probe-microphone real-ear measures (REAR) are used to verify prescriptive targets.

Validation of hearing aid benefit is accomplished using a self-assessment inventory or questionnaire (i.e., International Outcome Inventory for Hearing Aids [IOI-HA], Client Oriented Scale of Improvement [COSI], and Abbreviated Profile of Hearing Aid Benefit [APHAB]) completed by the patient.

Auditory Implantable Devices

Depending on the type of hearing loss, some patients will perform better with the use of a different type of hearing device.

Three specialized auditory devices include the BAI, the MEI, and the CI.

BAIs are used with patients who have a conductive or mixed hearing loss or for those with single-sided deafness (unilateral hearing loss). There are two types: percutaneous (penetrate through the skin) or transcutaneous (across the skin).

MEIs are a specialized type of hearing device because they are surgically placed in the middle ear. They are used with patients who have an intact middle ear system but have a mild to severe-to-profound SNHL. They usually have a lifestyle that would hamper the use of conventional hearing aids. MEIs have slightly better thresholds, word recognition scores in quiet, and speech-in-noise performance compared to conventional hearing aids, and patients reported benefits such as reduced problems with feedback.

CIs are used with individuals who have a severe to profound SNHL and receive limited benefit from conventional hearing aids. They bypass the middle ear and the sensory hair cells and electrically stimulate the eighth nerve.

The processor transduces acoustic signals into electrical signals along the electrode array to stimulate the auditory nerve. After implantation, individuals will need to undergo cochlear implant mapping. This process allows the audiologist to identify the patient's electrical thresholds for soft inputs and establish the upper limit of comfort thresholds.

Adults with postlingually acquired hearing loss show significant improvement in speech recognition abilities, and children will require significant rehabilitation to obtain maximum outcomes; however, they can develop spoken language skills.

References

Achar, P. (2013). Hearing rehabilitation with middle ear implants: An overview. Surgeon, 11(3), 165–168. https://doi.org/10.1016/j.surge.2013.02.002
Bassim, M. K., & Fayad, J. N. (2010). Implantable middle ear hearing devices: A review. Seminars in Hearing, 31(1), 28–36.
Channer, G. A., Eshraghi, A. A., & Lui, X. Z. (2011). Middle ear implant: Historical and futuristic perspective. Journal of Otology, 6(2), 10–18.
Fitzpatrick, E. M., Crawford, L., Ni, A., & Durieux-Smith, A. (2011). A descriptive analysis of language and speech skills in 4- to 5-year-old children with hearing loss. Ear and Hearing, 32(5), 605–616.
Gifford, R. H. (2017). The future of auditory implants. In A. M. Tharpe & R. Seewald (Eds.), Comprehensive handbook of pediatric audiology (2nd ed., pp. 793–812). San Diego, CA: Plural Publishing.
Hodgetts, B. (2017). Other hearing devices: Bone conduction. In A. M. Tharpe & R. Seewald (Eds.), Comprehensive handbook of pediatric audiology (2nd ed., pp. 781–792). San Diego, CA: Plural Publishing.
Kohan, D. (2015, February). Implantable auditory devices. Paper presented at the Ultimate Colorado Midwinter Meeting, Vail, CO.
Kuhn, J. J., & Perez, A. J. (2015). Implantable hearing devices. In M. L. Pensak & D. I. Choo (Eds.), Clinical otology (4th ed., pp. 402–420). New York, NY: Thieme.
McRackan, T. R., Clinkscales, W. B., Ahlstrom, J. B., Nguyen, S. A., & Dubno, J. R. (2018). Factors associated with benefit of active middle ear implants compared to conventional hearing aids. Laryngoscope, 128, 2133–2138.
Mueller, H. G. (2019). Hearing aids. In S. Kramer & D. K. Brown, Audiology: Science to practice (3rd ed., pp. 319–346). San Diego, CA: Plural Publishing.
Mueller, H. G., & Hall, J. W., III. (1998). Audiologists' desk reference (Vol. II). San Diego, CA: Singular Publishing Group.
Mueller, H. G., Ricketts, T. A., & Bentler, R. (2014). Modern hearing aids: Pre-fitting testing and selection considerations. San Diego, CA: Plural Publishing.
Mueller, H. G., Ricketts, T. A., & Bentler, R. (2017). Speech mapping and probe microphone measurements. San Diego, CA: Plural Publishing.
National Institute on Deafness and Other Communication Disorders (NIDCD). (2016). Quick statistics about hearing. Retrieved from https://www.nidcd.nih.gov
Reinfeldt, S., Hakansson, B., Taghavi, H., & Eeg-Olofsson, M. (2015). New developments in bone-conduction hearing implants: A review. Medical Devices (Auckl), 8, 79–93.
Ricketts, T. A., Bentler, R., & Mueller, H. G. (2019). Essentials of modern hearing aids: Selection, fitting, and verification. San Diego, CA: Plural Publishing.
Sharma, A., Dorman, M. F., & Kral, A. (2005). The influence of a sensitive period on central auditory development in children with unilateral and bilateral cochlear implants. Hearing Research, 203(1–2), 133–143.
Truy, E., Philibert, B., Vesson, J. F., Labaassi, S., & Collet, L. (2008). Vibrant Soundbridge versus conventional hearing aid in sensorineural high-frequency hearing loss: A prospective study. Otology and Neurotology, 29, 684–687.
Zeitler, D. M., Snapp, H. A., Telischi, F. F., & Angeli, S. I. (2012). Bone-anchored implantation for single-sided deafness in patients with less than profound hearing loss. Otolaryngology-Head and Neck Surgery, 147, 105–111.
25
Aural Habilitation and
Rehabilitation

Introduction

This chapter presents an overview of what can be done to habilitate or rehabilitate individuals with hearing loss. Now that you have learned about the anatomy and physiology of the auditory and vestibular systems (Chapter 22), the clinical tests used to identify and diagnose hearing loss (Chapter 23), how to improve an individual's auditory function with the use of hearing aids or auditory implants (Chapter 24), and the information relevant to the development of speech and language (Chapters 5, 6, and 13), we move forward and provide information on the audiologic treatment services for individuals with hearing loss and their family unit according to their needs. Individuals (mainly adults) who have had hearing and lost it will require different interventions than those (mainly children) who have not been able to hear. Infants and children with hearing loss and their families will require services that can include clinical treatment, home intervention, family support, and case management.

Audiologists not only provide treatment to individuals (children or adults) with hearing loss but also counsel patients and their families concerning hearing loss, the use of amplification, and strategies for improving speech recognition. Additionally, audiologists counsel patients regarding the effect hearing loss has on the communication process and on their psychosocial status in personal and social situations.

When working on the communication skills of individuals with hearing loss, we must utilize an intervention that is aimed at minimizing and alleviating the communication difficulties associated with the hearing loss (Tye-Murray, 2020). The first step in this process is to assess the person's hearing acuity and communication ability; in doing so, we can determine the level of intervention he or she will require. The extent of the problem is associated with the person's communication ability at the time of the loss. Infants and young children who are born with or acquire hearing loss soon after birth have not learned to listen or communicate. Older children and adults who have had hearing and then subsequently lost it will require different strategies and techniques.

Children who have not learned communication skills will need to undergo aural habilitation, which is an intervention for children who have not developed listening, speech, and language skills (American Speech-Language-Hearing Association [ASHA], 2019). Aural rehabilitation, in contrast, focuses on restoring a skill that has been lost. Although aural rehabilitation has been around since the beginning of audiology, it has evolved over time to keep pace with our knowledge and technology, but it continues to be an integral part of what we do as audiologists.
Aural rehabilitation is the intervention for individuals who have had a reduction in function, activity, participation, and quality of life caused by their hearing loss, but whose difficulties can be overcome by sensory management, perceptual training, and counseling (Boothroyd, 2007). In sensory management, we use hearing aids, cochlear implants, or other hearing assistive technology systems (e.g., a frequency-modulated [FM] system) to make sounds as audible as possible. Perceptual training is the enhancement of auditory or auditory-visual perceptual skills (e.g., our ability to perceive spoken language) through auditory training (e.g., retraining our listening ability). Counseling is the informal and formal discussion of the practical, social, and emotional consequences of hearing loss.

Aural Habilitation

It is very fortunate that newborn hearing screening is being completed in every state in the nation.1 Through this endeavor, over 6000 infants (1.7 per 1000 live births) are identified with hearing loss annually (Centers for Disease Control and Prevention [CDC], 2016). Using the 1, 3, 6 rule, infants must have their hearing screened by 1 month, the diagnosis must be completed and amplification fitted by 3 months, and they must be enrolled in intervention by 6 months (Joint Committee on Infant Hearing [JCIH], 2007). This early identification and intervention has a significant positive impact on the child's development, leading to a normal range of language (comprehension and expression) and social development between 1 and 3 years of age (Yoshinaga-Itano, Sedey, Coulter, & Mehl, 1998).
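The screening statistics just cited fit together arithmetically. The quick check below makes the relationships concrete; note that the annual-births figure is inferred from the reported numbers rather than stated in the text, and the 98% and 1.7% figures come from the footnote to this section.

```python
# A quick check of the newborn hearing screening numbers cited above
# (CDC, 2016). The annual-births figure is back-calculated from the
# reported values, not taken from the text.

diagnosed_per_1000 = 1.7          # infants diagnosed with hearing loss
n_diagnosed = 6_337               # reported annual diagnoses (footnote)

annual_births = n_diagnosed / (diagnosed_per_1000 / 1000)
print(f"Implied annual US live births: {annual_births:,.0f}")  # ~3.7 million

screened = annual_births * 0.98   # 98% of newborns are screened
referred = screened * 0.017       # 1.7% do not pass the screening
print(f"Screened: {screened:,.0f}; referred for diagnosis: {referred:,.0f}")
print(f"Share of referrals confirmed: {n_diagnosed / referred:.0%}")  # ~10%
```

Only about 1 in 10 referred infants is ultimately diagnosed with hearing loss, which is why the footnote observes that most infants who do not pass the screening go on to show normal hearing.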
Through the use of aural habilitation, whose goal is to develop normal listening skills, a child with hearing loss can acquire language as naturally as possible. The habilitation process emphasizes the child's usable hearing through an appropriate amplification device, whether hearing aids or a cochlear implant (CI). Note that children can receive a CI as early as 1 year of age if they are found not to benefit from a hearing aid.

Assessment of Communication Needs in Children

The first step in the aural habilitation process is to determine the child's type and degree of hearing loss (see Chapter 23) in each ear and to fit the child with a hearing aid (see Chapter 24). In this way, we can establish exactly what the child hears. The audiologist will need to encourage the child's use of his or her hearing aids by helping the parent/caregiver become familiar with the child's hearing loss and hearing aids and by guiding the parent/caregiver in achieving full-time use of the amplification device by the child.

Since these children are very young, they may not be able to be assessed easily. Audiologists can use subjective information provided by parents about auditory behaviors in everyday life to assess how the child is performing and complement it with objective test results from a battery of assessments (Bagatto, Moodie, Seewald, Bartlett, & Scollie, 2011). A number of questionnaires or scales have been developed to collect parent feedback. These questionnaires have been standardized, and some even provide normative data so that a child's results can be compared to those of his or her peers. Examples of these questionnaires include the Infant-Toddler Meaningful Auditory Integration Scale (IT-MAIS), the LittlEARS questionnaire, and the Parents' Evaluation of Aural/Oral Performance of Children (PEACH).

The Infant-Toddler Meaningful Auditory Integration Scale (IT-MAIS) (Zimmerman-Phillips, Robbins, & Osberger, 2000) is a structured interview designed to assess the child's spontaneous responses to sound in his or her everyday environment. The assessment is based on information provided by the child's parent(s) in response to 10 probes (Figure 25–1). These probes assess three main areas: (a) vocalization behavior, (b) alerting to sounds, and (c) deriving meaning from sound. Through this scale, the clinician can determine how a child is accessing and using sound in his or her everyday environment, identify potential issues for the child regarding his or her hearing in everyday situations, and determine goals for intervention.

The LittlEARS® Auditory Questionnaire (LEAQ) is a parental questionnaire that assesses auditory behavior in children up to 24 months of age (Coninx, Weichbold, Tsiakpini, Autrique, Bescond, Tamas, et al., 2009). This language-independent tool can be used to document a child's progress with his or her current amplification, provide evidence of the need for a cochlear implant, or demonstrate the need for follow-up in other developmental areas. This brief questionnaire uses a series of 35 yes/no questions whose results can be compared to a set of expected values based on parent-observed reactions that the child has to acoustic stimuli. Results from the child can be compared to data for normal-hearing children and used to develop his or her aural habilitation plan or to assess improvement by evaluating change on the LEAQ scale after intervention.

1 The CDC (2016) indicates that 98% of all babies born in the US are screened; 1.7% of those infants do not pass their screening, but most go on to have normal hearing as identified through the diagnostic process. The end result is that 1.7 per 1000 (n = 6,337) live births will have a hearing loss diagnosed by 3 months of age.
IT-MAIS
Infant-Toddler Meaningful Auditory Integration Scale

1. Is the child's vocal behavior affected while wearing his/her sensory aid (hearing aid or cochlear implant)?

The benefits of auditory input are often apparent first in the speech production skills of very young children. The frequency and quality of vocalizations may change when the device is put on, turned off, or not working properly.

1. Ask the parent: Describe _____'s vocalizations when you first put his/her device on each day. Have the parent explain how and if the child's vocalizations change when the sensory aid is first turned on and auditory input is experienced at the start of each day.

2. Ask the parent: If you forget to put the device on _____, or if the device is not working properly, do you and/or others notice that _____'s vocalizations are different in any way (e.g., quality, frequency of occurrence)?

3. Or ask: Does the child test the device by vocalizing when the device is first turned on?

_____ 0 = Never. No difference in the child's vocalizations with the device turned on versus the device turned off.

_____ 1 = Rarely. Slight increase in the frequency of the child's vocalizations (approximately 25%) with the device on (or similar decrease with the device off).

_____ 2 = Occasionally. Child vocalizes throughout the day, and there are increases in vocalizations (approximately 50%) with the device turned on (or similar decrease with the device turned off).

_____ 3 = Frequently. Child vocalizes throughout the day, and there are noticeable increases in vocalizations (approximately 75%) with the device on (or a similar decrease with the device off). Parents may report that individuals outside the home notice a change in the frequency of the child's vocalizations with or without the device.

_____ 4 = Always. Child's vocalizations increase 100% with the device on compared to the frequency of occurrence with the device turned off.

Parent Report:

Figure 25–1.  An example of a question from the Infant-Toddler Meaningful Auditory Integration Scale (IT-MAIS) (Zimmerman-Phillips, Robbins, & Osberger, 2000). This is one of the 10 probe questions from the parent-report scale that is administered in an interview format. The questions are designed to elicit a dialogue between the examiner and parent about their child's spontaneous responses to sound in his or her everyday environment. Utilization of this question format discourages the clinician from leading the parent to provide desired responses and from only providing yes/no answers. It is often used with children who have a severe to profound loss, in order to compare ability both pre- and post-cochlear implantation. Source: © 2013 Advanced Bionics AG and affiliates. All rights reserved.
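Figure 25–1 shows the 0-to-4 rating attached to each of the 10 IT-MAIS probes. The sketch below illustrates how such ratings might be tallied; the percentage summary is a convenient convention used here purely for illustration, not the published scoring rule for the scale.

```python
# Illustrative summary of IT-MAIS-style ratings: 10 probes, each rated
# 0 (never) through 4 (always) by the examiner from the parent interview.
# Consult the published scale for official scoring procedures.

def summarize_itmais(ratings):
    """Return the total score as a percentage of the 40-point maximum."""
    assert len(ratings) == 10, "IT-MAIS has 10 probe questions"
    assert all(0 <= r <= 4 for r in ratings), "each probe is rated 0-4"
    return 100 * sum(ratings) / 40

# Hypothetical example: ratings a child might earn some months
# post-fitting, summarized as a single percentage.
print(summarize_itmais([3, 2, 4, 3, 2, 1, 2, 3, 2, 2]))  # 60.0
```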


The Parents' Evaluation of Aural/Oral Performance of Children (PEACH) scale is used to assess functional performance of children under 5 years of age with hearing loss ranging from mild to profound (Ching & Hill, 2007). The PEACH is a validated questionnaire that asks parents to reflect on their child's listening behavior and rate each of the 12 items, which correspond to different hearing and communication scenarios. The child's results can be compared to those of normal-hearing children and provide information about his or her performance in everyday situations. They can also be used to monitor the child's progress, evaluate the effectiveness of the child's hearing aid and/or cochlear implant, or assist in tailoring an intervention plan by identifying a child's specific area of difficulty.

Along with the objective information gleaned from the audiological evaluation (i.e., audiogram, tympanogram, hearing aid evaluation) and the subjective information gathered directly from the parents, the audiologist can synthesize these data and form a picture of the child's auditory ability. The audiologist is then able to compare the child's functional outcomes to those of peers with normal hearing, develop a set of treatment goals, and monitor progress toward those goals.

Pediatric Intervention

The main objective of aural habilitation is to enhance the listening skills of children with hearing loss by having them progress through an auditory skills hierarchy. Obtaining this objective will enable the child to function to his or her maximum auditory potential. However, there are factors that can affect the outcomes of a child's auditory development (Fitzpatrick, 2010). These factors include issues involving the child, his or her family, and the environment.

Environmental factors are related to the family's access to care, the knowledge of providers, and the type of intervention available in the child's location. Geography can play a role in the type of services that are available; not every service is available in every state, and even within states, there are often disparities between services provided in urban and rural communities. These disparities can create delays in identification, amplification, and intervention.

Family factors, which revolve around the characteristics and functioning of the family, can also contribute to poorer outcomes. These factors include the makeup of the family, cultural and linguistic practices (e.g., languages spoken in the home), religious beliefs, education level of the parents, previous experiences with hearing loss, engagement with the process, and socioeconomic status.

The factor that has the greatest impact on a child's auditory development is the child himself or herself, including the child's age of diagnosis, hearing age, type and degree of hearing loss, developmental level, and additional disabilities. The age of diagnosis is an important factor, as there is a large difference in terms of language development between a child who has lost his or her hearing prelingually and a child who has lost his or her hearing postlingually. Children who have developed language and then lost their hearing only require rehabilitation of their auditory skills, as their language development has already been completed, whereas a prelingual child needs habilitation to develop not only his or her auditory skills but also language skills. Hearing age is the amount of time that the child has spent hearing, which can be different from the child's chronological age. For example, a child with a profound hearing loss who was not identified until 1 year of age has a hearing age of zero because he or she has yet to develop auditory skills. Furthermore, the type and degree of hearing loss (see Chapter 23) will also affect the child's auditory skill development: a child with a conductive loss (e.g., a child with a CL/P, see Chapter 19) will have more auditory awareness as long as the sound is loud enough to reach the cochlea. In contrast, a child with a sensorineural loss (see Chapter 7) will have less auditory awareness because even if the sound is loud enough, the system is still not fully functional, so sound may not make it from the peripheral system to the auditory cortex.
Approximately 30% to 40% of children with hearing loss have one or more additional disabilities (e.g., blindness, cerebral palsy) (Gallaudet Research Institute, 2009). Further studies have shown that as many as 32% of children with hearing loss also have a developmental disability (Fitzpatrick & Doucet, 2013). In addition to having a hearing loss, then, a substantial number of these children will also have a developmental disability, which further adds to the difficulty of learning auditory skills. Others will show limited or no ability to acquire spoken language as their primary mode of communication.

Although there are factors that can affect the development of a child's auditory skills, there are some components that, when present, should lead to a successful intervention program. According to the Joint Committee on Infant Hearing (JCIH), these should include (a) using a family-centered approach, (b) providing every family with unbiased information on all options regarding approaches to communication, (c) monitoring development at 6-month intervals using norm-referenced instruments, (d) providing services in a natural environment such as the family's home, (e) being sensitive to cultural and language differences, and (f) providing accommodations (JCIH, 2007).
In the early stages of intervention, the goal is to enable the child to function at his or her maximum auditory potential. However, at this stage the primary focus is on parent support and education, rather than on "therapy" with the child. Figure 25–2 illustrates that parents are at the center of family-centered intervention. The parents need to receive support and coaching from the clinician so that they, in turn, can implement the program with their child.

Coaching Model of Intervention

There are two main approaches used with early intervention programs: one emphasizes a direct therapy model of intervention, and the other utilizes a coaching model. A direct therapy approach is one in which the professional interacts with the child while the parent observes and receives guidance from the clinician (Tye-Murray, 2020). Direct intervention focuses on the child and the professional–child interaction, whereas the coaching model requires that the parent implement the program while the clinician observes the parent–child interactions and provides the parent with support. Coaching is a family-centered approach, where the parent is supported and given hands-on practice through which the parent develops a sense of self-efficacy. The coaching approach is usually completed in the child's home during a home visit. Due to the ongoing interactions that occur between family members and the child, training the family will have substantially more of an effect in the long term. In this model, the professional's goal is to help turn everyday activities in the home environment into rich learning experiences for the child.
Figure 25–2.  Components of a family-centered intervention model. The parent sits at the center, receiving support and coaching from the clinician across four components: developing auditory skills; information about hearing and technology; developing consistent hearing device use in their child; and creating optimal listening and learning environments. The role of the audiologist is to provide information to support parents and to train them through coaching. When parents feel empowered, children have better outcomes.

Tye-Murray’s Coaching Principles


Dr. Tye-Murray, a noted expert in Aural n The intervention program must be
Rehabilitation, has developed a set of principles for driven by the family’s needs and desires.
use in coaching parents of children with hearing loss The professional must learn to adapt to
(Tye-Murray, 2020). Remember that the emphasis of the family’s values, culture, ethnicity,
habilitation process is on the function of communi- socioeconomic status, and background
cation and on linking learning within routine inter- information.
actions between the parent and child. These routines n The professional must learn to utilize the
provide families with a naturally occurring and sup- child’s toys and the family’s resources rather
portive framework to promote language and con- than what they would bring so that learn-
versational abilities. These principles can be used to ing strategies can be incorporated into the
guide the interaction between the professional and families everyday activities.
the parent, and they include: n Parents must learn to attend to their child’s
attentional lead and to stimulate conversation
n When it comes to the child, the parents are the based on the child’s focus.
experts because it is the parents, not the profes-
sional, who understand the child’s personality, Having conversational strategies embedded
preferences, and routines, and the emphasis into play and caregiving routines by the parent
must be placed on the parent–child relation- individualizes the intervention to the family where
ship not the professional–child relationship. it uniquely combines their personal and cultural
n Both the parents and the professional values, ecological constraints, and resources. These
must work together as members of a team, principles can be used to strengthen the intervention
giving them a balanced partnership in any for the parent, child and professional in order for the
decision-making. child to have the best outcome.

Components of a Family-Centered Intervention

The role of the audiologist is to provide parents with the information necessary to make informed decisions for their child and to support the decision made by the family. A family-centered approach reflects the entire family unit, as it is not just the child who is being treated but the family unit with all of its individual members, some of whom may have unique needs and challenges (Head & Abbeduto, 2007). When parents feel empowered, children have better outcomes.

Figure 25–2 shows that in the early stage of aural habilitation, the parents are the center of the family-centered intervention and that it is the role of the audiologist to support and coach them. In this model, there are four components in which the audiologist must support the parent: providing information about hearing and technology; developing consistent hearing device use in their child; providing the parents with information on the development of auditory skills; and creating optimal listening and learning environments.

The audiologist will educate the parents about hearing and their child's hearing loss, providing information on hearing testing, amplification, and cochlear implant technology. Another component is the development of consistent hearing device use in the child. It is anticipated that, through the hearing aid evaluation, the amplification device the family chooses will allow the child to hear optimally. The next step is to keep the technology on the child; consistent use is important to ensuring access to sound, but it is difficult to accomplish without support from the audiologist (Moeller, Hoover, Peterson, & Stelmachowicz, 2009).

Information on auditory development will also be provided by the audiologist. Parents need to understand the auditory developmental milestones and how to develop auditory skills in their child. Table 25–1 lists some typical auditory development milestones and provides an example of information that parents need to acquire.

Parents will need to participate in creating the optimum listening and learning environments for their child. To do so, parents need to be taught specific listening and language facilitation techniques by the audiologist. In turn, parents can use this information to stimulate audition and language in the home and become aware of listening and language learning opportunities in everyday activities. Therefore, by using these four components, the audiologist can support and coach the parents to develop the child's listening skills and spoken language in an enhanced family, play, and learning environment.
Table 25–1.  Auditory Behaviors Exhibited by Children With Normal Hearing in the First Year of Life

Auditory Behavior | Child's Age (months) | Example of Behavior
Quiets to sound | 1 | Baby will stop or quiet when he or she hears a sound (e.g., while sucking on a bottle, baby will stop sucking when he or she hears a sound)
Turns head to sound | 3–6 | Baby will stop and turn head when he or she hears a novel sound (e.g., baby playing on floor will stop and turn head when mother starts the blender on the counter)
Localizes sound | 6–9 | Baby turns head to visualize where the sound is coming from (e.g., baby sitting in highchair turns toward sibling as sibling walks up to right side)
Turns to locate a sound from behind | 9–12 | When facing away, baby will turn around and acknowledge the sound (e.g., baby playing on floor, dog barks behind him or her, baby will stop and turn around to locate the dog)

Note.  Parents of children with a hearing loss will need to acquire this as a part of their general knowledge about auditory development.
Source:  Adapted from Pediatric Audiologic Rehabilitation: From Infancy to Adolescence, by E. Fitzpatrick and S. P. Doucet, 2013, New York, NY: Thieme Medical Publishers.

Auditory Training in Aural Habilitation

Auditory training is the development of listening skills using auditory information, through which children learn to listen and to make use of any residual hearing they may have. Children who use hearing aids benefit from auditory training because they learn to use the input from their devices to understand their auditory world. Similarly, children who use cochlear implants can learn through auditory training to interpret the electrical signals and associate them with particular speech sounds. Auditory training can be conceptualized in terms of the hierarchy of auditory development. This hierarchy was first developed by Erber (1982), who described the building blocks for auditory skills development (described in Table 25–2). Table 25–3 shows an example of how the hierarchy is utilized with young children.

Table 25–2.  Erber's Hierarchy of Auditory Awareness

Detection: Knowledge of presence/absence of sound (may be shown by turning, smiling, stilling, searching, or vocalizing).
Discrimination: Telling that two sounds are different (e.g., quiet/loud, short/long); the child not only detects but also shows awareness through searching, smiling, or head turning.
Identification: Saying it/repeating it (the child begins to understand and add meaning to sounds).
Comprehension: Understanding it (the child begins to understand connected speech).

Table 25–3.  Auditory Training Activities Appropriate for Each Stage of Auditory Skill Development

Sound awareness (Detection): Play peek-a-boo. Play musical chairs. March to the beat of a drum. Push the toy car whenever the clinician says "vrrrrm."
Discrimination: Play a game with toy animals ("The cow says 'moo'"; "The sheep says 'baaa'"). Respond to commands ("Clap your hands"; "Jump!"). Play a Same or Different game ("car car"; "car star") using pairs of picture cards where the child has to point to the correct illustration. Repeat what you hear ("Mama"; "Papa").
Identification: Play the game Candy Land and listen for the names of the colors. Play with sets of postcards or stickers ("Show me the cat"). Play Go Fish with cards ("Give me your sevens"; "Give me your twos").
Comprehension: Listen to a read-aloud story and then answer questions about the plot and characters. Play I Spy ("I spy a red sweater"; "I spy some blue jeans"). Play 20 Questions ("Is it bigger than a chair?"; "Is it alive?").

Source:  Reproduced with permission from Foundations of Aural Rehabilitation: Children, Adults, and Their Family Members (5th ed., p. 400) by N. Tye-Murray, 2020, San Diego, CA: Plural Publishing, Inc. Copyright 2020 by Plural Publishing, Inc.
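Because Erber's levels are strictly ordered, a clinician always targets the lowest level the child has not yet demonstrated. The sketch below makes that ordering explicit; the data structure and function are illustrative conveniences, not part of any published protocol.

```python
# Erber's (1982) four-level hierarchy of auditory skill development,
# in order from most basic to most advanced. The lookup function is an
# illustrative convenience, not a published clinical procedure.

ERBER_HIERARCHY = ["detection", "discrimination", "identification",
                   "comprehension"]

def current_target(mastered):
    """Return the lowest unmastered level, i.e., the next training goal."""
    for level in ERBER_HIERARCHY:
        if level not in mastered:
            return level
    return None  # all four levels demonstrated

# A child who detects and discriminates sound should next work on
# identification (e.g., the Candy Land color game in Table 25-3).
print(current_target({"detection", "discrimination"}))  # identification
```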
Communication Options

One decision that parents will need to make is that of communication mode.2 As described previously, the role of the audiologist is to provide parents with sufficient information to make an informed decision and to support the decision they make. This decision can be very difficult for parents and brings with it much emotion and distress because of personal biases brought forth by the many people and professions that the parents come in contact with.

2 Communication mode is the method by which a sender shares information with a receiver. The method can include speech, writing, symbols, or hand gestures.
Choices for communication mode vary across a continuum from auditory to visual, as shown in Figure 25–3. Families have to decide if they want their child to be auditory or manual. Someone who is auditory will rely on his or her auditory skills to understand what is being said and his or her verbal skills to communicate with others, whereas someone who is manual will utilize a sign system.

American Sign Language (ASL) is a manual system of communication used by many in the Deaf community. ASL is its own language, with its own vocabulary, grammar, and social rules for use that are different from English. Handshapes, palm orientation, location of the hands, and facial expression are important components of ASL. An individual who uses ASL is a fully visual communicator (see Figure 25–3). Total communication is a visual system that incorporates more English-based systems of signing (i.e., Signing Exact English) with spoken English, mime, facial expression, and gestures. This system has the intention of providing any and all strategies necessary to bridge ASL and English, and it supports the development of communication and language. Cued speech is a visual communication system in which phonemically based hand gestures supplement speechreading. The mouth movements of speech are combined with "cues" to make the sounds (phonemes) of spoken language appear distinct from one another. The "cues" consist of eight handshapes, used to distinguish the consonant phonemes, and four locations near the mouth, used to distinguish the vowel phonemes; when combined, a handshape and a location form a syllable.

Auditory-oral and auditory-verbal systems are similar to one another in that both use spoken language. The auditory-oral approach encourages the use of speechreading (i.e., watching the movement of the mouth, face, and body to assist in the understanding of what the speaker is saying). Auditory-verbal focuses on using the child's listening ability to learn spoken language. These listening skills require consistent use of amplification to access conversational speech and utilize a systematic approach to increasing focused, sustained auditory attention. As shown in Figure 25–3, auditory-verbal communicators are fully auditory communicators. The method of communication is a personal/familial decision. Families will evaluate the different methods across the continuum and make their own decision, and the clinician is there to provide information and to support their decision.
Figure 25–3.  Families will need to choose a communication mode for their child; these modes extend across a continuum from a fully auditory communicator to a fully visual communicator: auditory-verbal, auditory-oral, cued speech, total communication, and American Sign Language (ASL). The more auditory the communication mode, the further to the left on the continuum; the more visual the mode, the further to the right.

Outcome Measures for Children

The final stage in the aural habilitation process is to look at the outcomes of the intervention; at this stage, it is important to assess the amount of improvement over preintervention levels and to determine if the child is on track with his or her normal-hearing peers. This assessment of a child's ability and comparison of results with those of peers is important not only for measuring progress but also for determining further intervention goals. To measure outcomes, the same tools from the original assessment will be used. As the intervention has emphasized the child's usable hearing through an appropriate amplification device, the outcome should reflect this improvement. Most children who are identified early and receive optimal auditory and language stimulation can develop spoken language skills commensurate with those of age-matched peers. However, children with hearing loss continue to benefit from intervention during the preschool years.

Aural Rehabilitation

Aural rehabilitation (AR) is an intervention aimed at restoring or optimizing a person's participation in activities and relationships from which he or she has withdrawn because of hearing loss (Tye-Murray, 2020). Unlike habilitation, discussed in the previous section, it focuses on restoring a skill that has been lost. The objectives of AR are to alleviate the difficulties related to the hearing loss and minimize its consequences. AR also seeks to work with and assist families or communication partners of those with hearing loss in adjusting to the impact of hearing loss. The outcome of AR is to enhance the conversational fluency and reduce the hearing-related disability of an individual with hearing loss (Tye-Murray, 2020). Unlike aural habilitation, which uses a family-centered approach, AR uses a patient-centered approach in which the patient is seen as an individual who is involved in his or her own care and the decisions about the appropriate intervention (Grenness, Hickson, Laplante-Lévesque, & Davidson, 2014).

Assessment of Communication Needs in Adults

As with children, the first step in an AR program is to evaluate the individual's overall hearing health. This will require an audiological assessment, including a case history and an audiometric evaluation (see Chapter 23), and a hearing aid evaluation, including self-assessment inventories (see Chapter 24). However, other factors can influence an individual's communication ability and should be evaluated, especially in older adults. These include the individual's social circumstances, emotional variables, living arrangements, physical variables, reduced vision, arthritis, dementia, and other cognitive variables. As a part of this assessment, we also need to evaluate the individual's activity limitations, participation restrictions, and other contextual factors. These assess how smoothly conversations unfold, the time spent repairing conversational breakdowns, the amount of shared speaking time, and the conversational time spent in silence.

Assessment Methods

Most of the testing performed by audiologists involves measuring the patient's response to a sound in some fashion. However, some information can only be collected directly from the patient.
This is most often accomplished through an interview, usually in the form of a case history, where the clinician utilizes patient-centered tools to determine personal characteristics of the patient. The audiologist will utilize self-assessment tools with both the patient and his or her communication partners (usually a spouse). These tools take the form of questionnaires, many of which have been clinically validated. Others are subjective impressions, where patients self-report information about their hearing and communication. These questionnaires use either open-ended or closed-ended questions, both of which have advantages and disadvantages. Examples of highly respected questionnaires include the Hearing Handicap Inventory for Adults and Elderly (HHIA and HHIE), the International Outcome Inventory of Hearing Aids (IOI-HA), the Satisfaction with Amplification in Daily Life (SADL), the Self-Assessment of Communication (SAC), and the Significant Other Assessment of Communication (SOAC).

Another method of obtaining information from the patient is the daily log: a procedure in which the patient monitors his or her behavior and self-reports the outcome. A log is completed by the patient over several consecutive days. Once completed, these responses provide a general index of the patient's daily use of communication strategies, conversational fluency, and communication difficulties.

Adult Intervention

The main component in any AR program is to make sure that the patient is receiving optimal auditory input and maximizing his or her residual hearing for daily functioning. Following the audiological evaluation, the audiologist will perform a hearing aid evaluation using the steps laid out in Chapter 24. Depending on the type and degree of hearing loss, the patient will be fit with an appropriate style of hearing aid or a cochlear implant. There are times, however, when hearing aids or cochlear implants are not sufficient. Usually these situations are the result of the distance between the sound source and the listener, the background noise around the listener, or the reverberant nature of the area. In these cases, the listener may need to utilize additional assistance from a hearing assistive technology system (HATS).

HATS is a general term for listening, alerting, and/or signaling devices that facilitate patients' communication with other people and the environment or enhance personal safety through the use of auditory, visual, or tactile modalities. There are two types of HATS: assistive listening devices (ALDs) and assistive alerting devices (AADs). ALDs are designed to facilitate the reception of speech, to improve signal-to-noise ratios, or to enhance communication with others, usually via amplification. An example of this technology is an amplified telephone or an FM system that allows a teacher's voice to have a constant level no matter where the student is in the room. The FM system also assists with reducing background noise by amplifying the teacher's voice over the other noise in the room.
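The benefit of an FM system can be expressed as a signal-to-noise ratio (SNR): the level of the talker's voice minus the level of the background noise, in decibels. The classroom levels below are assumed values chosen for illustration only.

```python
# Signal-to-noise ratio (SNR) in dB is simply speech level minus noise
# level. The classroom levels used here are illustrative assumptions.

def snr_db(speech_db_spl, noise_db_spl):
    return speech_db_spl - noise_db_spl

noise = 60.0                # assumed classroom noise, dB SPL
teacher_at_back_row = 58.0  # assumed teacher level after distance loss
teacher_via_fm = 72.0       # assumed level with an FM mic near the mouth,
                            # delivered to the listener at a constant level

print(snr_db(teacher_at_back_row, noise))  # -2.0 dB: speech below the noise
print(snr_db(teacher_via_fm, noise))       # +12.0 dB: speech well above it
```

Because the FM microphone stays near the talker's mouth, the speech level at the listener's ear no longer depends on where the student sits, which is exactly the "constant level" property described above.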
AADs are designed to react to alarm situations and notify users that a situation is occurring. An example is a light that flashes when the telephone rings or a strobe light that flashes when the smoke detector is activated. You may find it easy to get up for class in the morning, but how would you do it if you had a hearing loss? Waking up could be difficult, since hearing aids and cochlear implants are not worn to bed. However, there are AADs that can assist with this potential problem; one method to wake a person is an alarm clock capable of shaking the bed, vibrating under a pillow, or flashing a strobe light.

Auditory Training in Aural Rehabilitation

Auditory training is another major component of AR. It is a process designed to maximally utilize an individual's residual hearing by enhancing his or her ability to interpret auditory experiences (Sweetow & Sabes, 2009). In auditory training, the focus is on rehabilitating (retraining) skills that were previously in place and providing training on "new" skills to accommodate for an individual's hearing loss. Through this process, the clinician must teach the individual to take full advantage of all of the auditory cues available. As in aural habilitation, the objectives for an auditory training program in AR include teaching the person to use his or her amplification and take advantage of its features, maximizing the perception of speech through the use of auditory and other available cues, and developing the individual's ability to recognize speech using the auditory signal and to interpret his or her auditory experiences (Tye-Murray, 2020).

There are two main methods used in auditory training: analytic and synthetic. The analytic approach breaks down speech into the smallest components or features of the acoustic cue. Using this bottom-up process with small speech segments (phonemes or syllables), this method focuses on how hearing influences communication, following the four levels from Erber's hierarchy (see Table 25–2). In comparison, the synthetic approach uses a top-down process to focus on the overall meaning of the discourse. With an emphasis on auditory comprehension, it uses the segments of speech (i.e., words, phrases, sentences), as well as syntactic and contextual cues, to allow the user to fill in the missing gaps of information in a real-world situation.
Home-based, computer-delivered auditory training programs have increased in the past few years (Lawrence et al., 2018). Results suggest that these programs have a positive effect on speech perception and cognition in people with hearing loss. This improvement is due to the targeted auditory training on working memory, attention, and communication (Pizarek, Shafiro, & McCarthy, 2013). Computer programs such as Angel Sound, Computer-Assisted Speech Training (CAST), and Listening and Communication Enhancement Training (LACE) are widely used. The most well-known program is LACE, a home-based, self-paced adaptive auditory training program designed to improve listening and communication skills. This training is composed of 10 to 20 half-hour sessions taken over the course of a 1-month period. LACE provides exercises for difficult listening situations and can adapt for cognitive characteristics of the aging process that interfere with effective communication. It combines listening training (analytic) with repair strategies (synthetic) and gives the patient feedback regarding his or her performance.

Communication Strategies

When working with individuals with hearing loss, we can train them to use strategies to enhance communication. A breakdown in communication can be caused by a number of factors; however, there are two broad strategies that can be used to facilitate communication: anticipatory strategies and repair strategies. Anticipatory strategies focus on good communication habits and control of the environment. These are things the listener can do to better prepare for a successful communication encounter. In this strategy, the listener tries to anticipate any difficulty and prevent it from occurring. For example, after walking into a room, the listener notices a band playing on the left side and bright sunshine streaming through large windows on the right side. For an optimum communication setting, the listener will want to have any conversations on the right side of the room with his or her back toward the window so that he or she can minimize background noise and optimize the speechreading environment (see the section on "Speechreading," next).

In contrast, repair strategies are tactics used by a listener with hearing loss when the message presented by a communication partner is not understood. In other words, they are a way to fix the communication breakdown after it has occurred. Repair strategies use specific and effective tactics to help repair the communication breakdown while reducing the stress from any miscommunication. Table 25–4 shows a number of common communication problems and their recommended repair strategies.
Table 25–4.  Common Communication Problems and Recommended Repair Strategies

Common Communication Problem (communication partner) | Recommended Repair Strategy (listener)
Listener only understood part of the message | 1. Repeat the part you understood (e.g., "You flew to Paris?"). 2. Ask for the part you did not understand to be repeated.
Listener is unable to see the communication partner's mouth when he or she is speaking | Ask the partner to please put his or her hand down away from the mouth when speaking.
Communication partner is speaking too quickly | Ask the partner to please speak a little more slowly.
Communication partner's speech was too soft | Ask the partner to please speak a little more loudly.
Communication partner's sentence was too long | Ask the partner to please make the sentence shorter.
Communication partner's speech was not clear | Ask the partner to please speak a little more clearly.
The sentence was too complicated | Ask the partner to please rephrase it in a different way.
Speechreading

Humans use both auditory and visual information to understand communication. We combine auditory signals with visual cues, including facial expressions and gestures, and other available cues (i.e., setting and context) to interpret the meaning of communication. In using this additional information (visual and contextual), we can enhance understanding of communication in noisy situations (e.g., a noisy airplane or subway) when a message cannot be understood without seeing the speaker's face. The development of speechreading skills can augment communication when an individual has hearing loss. In other words, if a normal-hearing individual uses speechreading in normal conversation, we can expect that individuals with a hearing loss will also need to speechread to have effective communication. The hearing aid does not eliminate the need to speechread but rather requires the user to combine listening and speechreading to have optimum communication.

Speechreading sounds like a simple thing: just watch what a person is saying as if you are reading the movement of his or her lips. However, in reality, it is not that easy, as 60% of sounds (e.g., /g, k, n/) are not visible or easily seen. In addition, the rate of conversational speech is 150 to 250 words per minute, or four to seven syllables per second, which would be difficult and taxing to follow in a long conversation. The level of difficulty is also increased because sounds belong to viseme groups and words are homophenous. Visemes are groups of speech sounds that look the same on the lips, such as /b, m, p/, and homophenes are words that look the same on the lips (without sound), such as bat and pat (Tye-Murray, 2020).
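The viseme problem can be made concrete with a toy lookup table: two words are homophenes when their phoneme sequences differ only within a shared viseme group. Only the /b, m, p/ grouping comes from the text; the remaining mappings below are illustrative assumptions.

```python
# Toy homophene check: words whose phoneme sequences differ only within
# a viseme group look the same on the lips. Only the /b, m, p/ group is
# taken from the text; the other entries are illustrative placeholders.

VISEME_GROUP = {"b": "bilabial", "m": "bilabial", "p": "bilabial",
                "ae": "open-vowel", "t": "alveolar"}

def look_alike(word1, word2):
    """True if the two phoneme sequences map to the same viseme sequence."""
    def to_visemes(word):
        return [VISEME_GROUP[ph] for ph in word]
    return to_visemes(word1) == to_visemes(word2)

# "bat" and "pat" differ only in /b/ versus /p/, which share a viseme,
# so they are homophenes: indistinguishable by sight alone.
print(look_alike(["b", "ae", "t"], ["p", "ae", "t"]))  # True
```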
There are four factors that can affect the speechreading situation: the talker, the message, the speechreading environment/communication situation, and the speechreader. As shown in Table 25–5, each factor has a number of components that can affect communication. However, there are methods that can be employed to help improve or at least minimize the problem. Within speechreading, it is not sufficient to simply watch the visemes and repeat them; rather, speechreaders must also learn what will assist them in gathering additional information and how to make that change happen. For example, if the talker is speaking rapidly and this is hampering good communication, the speechreader can learn that it is appropriate to ask the speaker to slow down. Also, if the talker is obscuring his or her mouth with a paper, the speechreader can learn to self-advocate and ask the speaker to keep his or her mouth visible. Issues with the speechreader and the environment are easily repaired because the speechreader can manipulate them (e.g., move closer to the speaker).
Table 25–5.  Individual Components for Each of the Four Factors That Can Influence Speechreading

Talker: Facial expressions; diction; body language; speech rate; familiarity to the speechreader; accent; facial characteristics; speech prosody (intonation, stress, and rhythm); objects in or over the mouth.
Message: Length; syntactic complexity; frequency of word usage; shared homophenes; context.
Environment: Viewing angle; distance; background noise; room acoustics; distractions.
Speechreader: Lipreading skill; residual hearing; use of appropriate amplification; stress profile; attentiveness; fatigue; motivation to understand; language skills; uncorrected vision impairment.

Source:  Reproduced with permission from Foundations of Aural Rehabilitation: Children, Adults, and Their Family Members (5th ed., p. 132) by N. Tye-Murray, 2020, San Diego, CA: Plural Publishing, Inc. Copyright 2020 by Plural Publishing, Inc.
Outcome Measures for Adults

The assessment of outcome measures in AR includes four key domains: performance, benefit, usage, and satisfaction with the device. These domains are measured by a variety of tests, both on the amplification device itself and in combination with the user. Verification techniques are primarily used to determine if the hearing aids meet a particular standard and are performing appropriately. The use of electroacoustic analysis, including measures of the sound output (e.g., intensity) and frequency response, will determine if the sound produced by the hearing aids is appropriate for the hearing loss. As you may recall from Chapter 24, performing only verification measures provides information regarding how much gain the individual is receiving but does not provide information about whether this amount of gain is actually benefiting them. After setting the prescriptive target for the amplification, we need to verify that the prescriptive targets have been met and that the fitting is appropriate for the user. This will determine whether the hearing aid is working as desired on the user's ear. The next step is validation: through the use of functional gain, speech perception, questionnaires, or interviewing, we can determine how much benefit the user is receiving from the hearing aids. Both verification and validation of the fitting and function of the hearing aids are a part of the AR process. These measures are described in Chapter 24, and the reader is referred to those sections for further information about the procedures.

Outcomes can also be determined for an individual's speechreading ability and use of communication strategies. A number of speechreading tests are available, such as the Craig Speechreading Test and the NTID Speechreading Test. These tests can be used pre- and post-evaluation of speechreading instruction to determine if sufficient improvement is being obtained. Communication strategies can also be assessed through the use of questionnaires and log books, which will assist with determining if the individual's goals have been reached.

Group Aural Rehabilitation

AR can take place individually, but it may also be offered as a group activity. Group AR works well with adults, since most adults with hearing loss are postlingually deafened and thus require retraining of skills that were lost; this is less complicated and requires less individualization than facilitating the acquisition of language. There are many benefits to individuals being a part of a group AR program, and these benefits impact both the user and the clinician. Individuals, especially adults, will benefit from the group interaction, the peer support, and the improved self-sufficiency and emotional well-being gained in an AR program. The clinician benefits from being able to counsel multiple users at a time, thus increasing the number of people who can be assisted. Furthermore, group AR allows the clinician to help his or her patients use their new hearing aids, which will lead to a reduction in the number of hearing aids being returned. However, probably the biggest benefit is in quality of life; people who have participated in group AR report an improvement in quality-of-life indicators compared with those who do not attend group programs (Abrams, Chisolm, & McArdle, 2002). Group AR has also been shown to improve a user's hearing aid satisfaction, help a user be more adept at employing communication strategies, reduce the user's self-perceived hearing handicap, help a user be more relaxed but also more assertive in communicative situations, and improve emotional well-being and coping behaviors. These improvements lead to a user's increased self-sufficiency and self-efficacy, improved ability to address psychosocial influences on behavior, improved socialization (the user knows he or she is not the only one with a hearing loss), and an improved relationship with the clinic and audiologist.

When developing a group AR program, you need to consider the group, the number of sessions, and the content. In determining the group composition, consider the age and experience of the users and their communication partners. It is important to include users' daily communication partners (i.e., spouses, children, or caregivers), as they will also benefit from the program (Preminger, 2003). The size of the group is also important; it should not be larger than 10 or 12, or it will be too large to accomplish any meaningful work or interaction. The number of sessions can be as few as two or as many as 8 or 10. Different participants may only be able to attend a few sessions, so different types of programs may be required.

Not only do we need to consider the content and the group makeup, but the facility is important as well. For a successful program, you need to ensure that the group members can hear and understand the content. The meeting should be held in a quiet room with good lighting and a single table that participants can sit around. It should also have ALDs with microphones so all participants can hear and be heard by the other members. Other assistive services such as closed captioning may be required depending on the content delivery. The moderator needs to provide ground rules for communication and address emotional reactions (i.e., anxiety, frustration, fear, or shame). In some circumstances, it will be beneficial if the group members determine the content for the sessions.

The content of an AR program should provide information on problem management and managing the emotional response to hearing difficulties (Oestreich, 2018). Problem management will focus on communication strategies training; identifying and solving problems with communication (i.e., speechreading or other strategies); self-advocacy; and providing informational lectures. These lectures can include information on hearing aids, ALDs, and such less-obvious topics as traveling with a hearing loss (i.e., communicating in a car or your Americans with Disabilities Act [ADA] rights in a hotel).
372 Introduction to Communication Sciences and Disorders:  The Scientific Basis of Clinical Practice

hotel). To target management of the emotional response Children identified early and who receive optimal
to hearing difficulties, participants can work on psy- auditory and language stimulation develop spoken
chosocial exercises and/or stress reduction training. language skills commensurate with age-matched peers.

Aural Rehabilitation
Chapter Summary
AR is a patient-centered care intervention aimed at
Aural Habilitation restoring a hearing-impaired individual’s participa-
tion in activities. It also strives to alleviate difficulties
The 1, 3, 6 rule is where infants must have their hearing related to hearing loss; works with families or com-
screened by 1 month, the diagnosis must be completed munication partners to adjust its impact; enhances the
and amplification fitted by 3 months, and the child conversational fluency and reduces the hearing-related
must be enrolled in intervention by 6 months. disability of an individual with hearing loss.
Aural habilitation is an intervention for persons Hearing aids or cochlear implants may not be
who have not developed listening, speech, and lan- sufficient for communication; the listener may need
guage skills. Activities are designed to teach the par- additional assistance from HATS. Either an ALD (to
ent how to help their child develop auditory skills and facilitate the reception of speech, to improve signal-
create an optimal listening environment for their child. to-noise ratios and communication with others) or an
After determining type and degree of hearing loss AAD (to react to alarm situations and notify users that a
in each ear, fitting the hearing aid, and establishing situation is occurring) can assist with various situations.
exactly what the child hears, audiologists can gather Home-based computer-delivered programs can
additional information about auditory behaviors in be used to improve working memory, attention, and
everyday life from parents through the use of question- communication. Examples include Angel Sound, the
naires that will complement other objective test results. Computer-Assisted Speech Training (CAST), and the
Examples: Parents’ Evaluation of Aural/Oral Perfor- Listening and Communication Enhancement Training
mance of Children (PEACH), Meaningful Auditory (LACE).
Integration Scale (MAIS), and LittlEARS questionnaire. Strategies that help facilitate communication

Factors that negatively affect the outcomes of a child’s auditory development include the environment, the family, and the child. Factors that positively affect the development of a child’s auditory skills include the use of a family-centered approach, providing unbiased information on all options, monitoring auditory development at 6-month intervals, providing services in a natural environment, and being sensitive to cultural and language differences. In the early stages, the focus is on parent support and education rather than on “therapy” with the child.

Auditory training is the development of listening skills using auditory information. Children learn to use the information from their amplification device to understand the auditory input. This occurs as they work through a hierarchy of auditory development that includes detection, discrimination, identification, and comprehension.
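
Because the hierarchy is strictly ordered, the next training target is simply the lowest level not yet mastered. The short sketch below assumes only the four Erber (1982) levels named above; the function name and data representation are illustrative.

```python
from typing import Optional, Set

# Erber's (1982) auditory skill hierarchy, from most basic to most advanced.
AUDITORY_HIERARCHY = ("detection", "discrimination", "identification", "comprehension")

def next_training_target(mastered: Set[str]) -> Optional[str]:
    """Return the lowest level the child has not yet mastered (None if all four are)."""
    for level in AUDITORY_HIERARCHY:
        if level not in mastered:
            return level
    return None

# Example: a child who can detect and discriminate sound works on identification next.
print(next_training_target({"detection", "discrimination"}))  # -> identification
```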

Parents will need to make a choice about the communication mode their child will use. Audiologists provide parents with information to make an informed decision and support the parents’ decision. Choices for communication mode vary across a continuum from auditory to visual and include auditory-verbal, auditory-oral, cued speech, total communication, and American Sign Language.

Children identified early who receive optimal auditory and language stimulation develop spoken language skills commensurate with those of age-matched peers.

Aural Rehabilitation

AR is a patient-centered care intervention aimed at restoring a hearing-impaired individual’s participation in activities. It also strives to alleviate difficulties related to hearing loss, works with families or communication partners to adjust its impact, enhances conversational fluency, and reduces the hearing-related disability of an individual with hearing loss.

Hearing aids or cochlear implants may not be sufficient for communication; the listener may need additional assistance from HATS. Either an ALD (to facilitate the reception of speech, improve signal-to-noise ratios, and support communication with others) or an AAD (to react to alarm situations and notify users that a situation is occurring) can assist in various situations.
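
The signal-to-noise ratio (SNR) mentioned above is a decibel quantity; the sketch below shows the arithmetic with illustrative levels (an ALD such as a remote microphone improves SNR by capturing speech close to the talker’s mouth). All numeric values are assumptions for the example, not measured data.

```python
import math

def snr_db(signal_power: float, noise_power: float) -> float:
    """Signal-to-noise ratio in decibels, from linear power units."""
    return 10 * math.log10(signal_power / noise_power)

print(round(snr_db(10.0, 1.0)))     # a 10:1 power ratio is +10 dB

# When levels are already in dB SPL, SNR is just the difference.
speech_at_ear = 65.0   # dB SPL, talker across a noisy room (illustrative)
room_noise = 60.0      # dB SPL
print(speech_at_ear - room_noise)   # +5 dB SNR: a difficult listening condition

# A remote-microphone ALD delivers speech picked up at the talker's mouth
# directly to the listener, raising the effective speech level.
speech_via_ald = 80.0  # dB SPL (illustrative)
print(speech_via_ald - room_noise)  # +20 dB SNR: a much easier condition
```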

Home-based, computer-delivered programs can be used to improve working memory, attention, and communication. Examples include Angel Sound, Computer-Assisted Speech Training (CAST), and Listening and Communication Enhancement (LACE) training.

Strategies that help facilitate communication include anticipatory strategies (focusing on good communication habits and on controlling the environment to prepare for the communication encounter) and repair strategies (fixing a communication breakdown after it has occurred).

Speechreading involves the use of visual information, auditory information, facial expressions, gestures, and any other available cues to interpret communication. It is difficult because most sounds are not visible and conversational speech is rapid. Four factors that affect the speechreading situation are the talker, the message, the speechreading environment/communication situation, and the speechreader.

Group AR benefits both the user and the clinician. Individuals benefit through peer support as well as by improved self-sufficiency, emotional well-being, and improved quality of life. Clinicians benefit from being able to counsel multiple users at a time.

Group AR programs should include the participants and their communication partners, have fewer than 12 participants, and consist of no more than 10 sessions. The programs need to occur in an appropriate room with appropriate HATS. The content needs to include information on problem management and on managing the emotional response to hearing difficulties.

References

Abrams, H., Chisolm, T. H., & McArdle, R. (2002). A cost-utility analysis of adult group audiologic rehabilitation: Are the benefits worth the cost? Journal of Rehabilitation Research and Development, 39(5), 549–558.
American Speech-Language-Hearing Association (ASHA). (2019). Child audiologic (hearing) habilitation. Retrieved from https://www.asha.org/public/hearing/Child-Audiologic-Habilitation/
Bagatto, M. P., Moodie, S. T., Seewald, R. C., Bartlett, D. J., & Scollie, S. D. (2011). A critical review of audiological outcome measures for infants and children. Trends in Amplification, 15(1), 23–33. doi:10.1177/1084713811412056
Boothroyd, A. (2007). Adult aural rehabilitation: What is it and does it work? Trends in Amplification, 11, 63–71.
Centers for Disease Control and Prevention (CDC). (2016). 2016 hearing screening summary. Retrieved from https://www.cdc.gov/ncbddd/hearingloss/2016-data/01-data-summary.html
Ching, T. Y. C., & Hill, M. (2007). The Parents’ Evaluation of Aural/Oral Performance of Children (PEACH) Scale: Normative data. Journal of the American Academy of Audiology, 18, 220–235.
Coninx, F., Weichbold, V., Tsiakpini, L., Autrique, E., Bescond, G., Tamas, L., . . . Brachmaier, J. (2009). Validation of the LittlEARS Auditory Questionnaire in children with normal hearing. International Journal of Pediatric Otorhinolaryngology, 73(12), 1761–1768. doi:10.1016/j.ijporl.2009.09.036
Erber, N. (1982). Auditory training. Washington, DC: A. G. Bell Association.
Fitzpatrick, E. (2010). A framework for research and practice in infant hearing. Canadian Journal of Speech-Language Pathology and Audiology, 34(1), 25–32.
Fitzpatrick, E., & Doucet, S. P. (2013). Pediatric audiologic rehabilitation: From infancy to adolescence. New York, NY: Thieme Medical.
Gallaudet Research Institute. (2009). Regional and national summary report of data from the 2007–2008 Annual Survey of Deaf and Hard of Hearing Children and Youth. Washington, DC: Gallaudet University.
Grenness, C., Hickson, L., Laplante-Lévesque, A., & Davidson, B. (2014). Patient-centered care: A review for rehabilitative audiologists. International Journal of Audiology, 53, S60–S67.
Head, L. S., & Abbeduto, L. (2007). Recognizing the role of parents in developmental outcomes: A systems approach to evaluating the child with developmental disabilities. Developmental Disabilities Research Reviews, 13(4), 293–301.
Joint Committee on Infant Hearing (JCIH). (2007). Position statement: Principles and guidelines for early hearing detection and intervention programs. Pediatrics, 120, 898–921.
Lawrence, B. J., Jayakody, D. M. P., Henshaw, H., Ferguson, M. A., Eikelboom, R. H., Loftus, A. M., & Friedland, P. L. (2018). Auditory and cognitive training for cognition in adults with hearing loss: A systematic review and meta-analysis. Trends in Hearing, 22, 2331216518792096. doi:10.1177/2331216518792096
Moeller, M. P., Hoover, B., Peterson, B., & Stelmachowicz, P. (2009). Consistency of hearing aid use in infants with early-identified hearing loss. American Journal of Audiology, 18(1), 14–23.
Oestreich, K. (2018). Adult group aural rehabilitation: Implementing a successful program. Audiology Today, 30(6), 44–53.
Pizarek, R., Shafiro, V., & McCarthy, P. (2013). Effect of computerized auditory training on speech perception of adults with hearing impairment. Perspectives on Aural Rehabilitation and Its Instrumentation, 20(3), 91. doi:10.1044/arri20.3.91
Preminger, J. (2003). Should significant others be encouraged to join adult group audiologic rehabilitation classes? Journal of the American Academy of Audiology, 14, 545–555.
Sweetow, R. W., & Sabes, J. H. (2009). Auditory training. In J. J. Montano & J. B. Spitzer (Eds.), Adult audiologic rehabilitation (pp. 267–283). San Diego, CA: Plural Publishing.
Tye-Murray, N. (2020). Foundations of aural rehabilitation: Children, adults, and their families (5th ed.). San Diego, CA: Plural Publishing.
Yoshinaga-Itano, C., Sedey, A. L., Coulter, D. K., & Mehl, A. L. (1998). Language of early- and later-identified children with hearing loss. Pediatrics, 102, 1161–1171.
Zimmerman-Phillips, S., Robbins, A. M., & Osberger, M. J. (2000). Assessing cochlear implant benefit in very young children. Annals of Otology, Rhinology, and Laryngology, Supplement, 185, 42–43.
Index
Note:  Page numbers in bold reference non-text material.

22q11.2 Deletion syndrome, 273 AD (Alzheimer’s disease), 33, 126


Adduction, of vocal folds, 141
Admittance, 319
A Adults
AAA (American Academy of Audiology), 2, apraxia of speech, 23, 191, 193, 203–204
317 childhood compared with, 212–214
AAS (Adult apraxia of Speech), 121–122, terms, 192
203–204 language disorders
in adults, 121–122, 191, 193 aphasia, 114–125
compared with childhood apraxia of cerebral hemispheres and, 111–114
speech, 212–214 motor speech disorders
AAVE (African American Vernacular English), classification system, 191–203
48 described, 191
Abbreviated Profile of Hearing Aid Benefit dysarthria, 193
(APHAB), 345 Advanced stuttering, 233–234
ABR (Auditory Brainstem Response), Affricatives, 182, 185
328–332 development of, 188
Accent African American Vernacular English (AAVE),
described, 50 48
dialect and, 53 Age, swallowing and, 286
foreign, 53–54 Agents, defined, 73
language, 50–51 Air conduction, pure tone audiometry and,
Acoustic Theory of Speech Production, 154 325
Acoustics Air molecules, 293
described, 293 Airflow, lung pressure and, 132–133
events Allophones, 179–180
aperiodic, 296 Alpert syndrome, 273
periodic, 296 ALS (Amyotrophic lateral sclerosis), 201
phonetics, 169 Alzheimer’s disease (AD), 33, 126
reflex, 307 American Academy of Audiology (AAA), 2,
threshold, 323–324 317
signal American Academy of Speech Correction, 1
described, 293 American English
from talker, 3 consonants, compared to other languages,
speech, 154–158 175–176
assistive listening devices and, 159–160 vowels, compared to other languages, 172
sound filter/source, 154 American Sign Language (ASL), 93
Acquired stuttering, 238–239 American Society for the Study of Disorders
Actions, defined, 73 of Speech, 1


American Speech and Hearing Association (ASHA), 2 Audiologist, preparation for, 12–13
American Speech Correction Association, 2 Audiometric testing, 324–327
Amyotrophic lateral sclerosis (ALS), 201 pure tone, 325
The Angel’s Share, 52 results of, 334–335
Anomic aphasia, 119 Auditory
Anvil, 305 anatomy
Aperiodic, acoustic events, 296 temporal bone, 301–302
APHAB (Abbreviated Profile of Hearing Aid Benefit), tests of, 313
345 cortex, primary, 24
Aphasia, 114–125 implantable devices, 349–355
classification of, 114–125 nerve, 313
anomic, 119 pathways, 27–28, 313
apraxia of speech, 121–122 theories, of speech perception, 162–166
Broca’s, 116–117 training, aural habilitation, 365, 368–369
conduction, 117–118 tube, 307
global, 119–120 Aural habilitation, 360–367
primary progressive, 120–121 auditory training in, 365
Wernicke’s, 115–116, 117 communication needs
described, 114 of adults, 367–368
stroke and, 122–123 of children, 360–362
traumatic brain injury and, 124 communication options, 365–366
Applied science, described, 5 intervention
Apraxia, defined, 211 coaching model of, 363
Apraxia of speech, 23, 203–204 family-centered, 364–365
in adults, 121–122, 191, 193 pediatric, 362–363
childhood, 211–216 outcome measures, for children, 367
Arcuate fasciculus, 30–31 Aural rehabilitation, 367–372
Articulation adult intervention, 368
manner of, 174–175 auditory training in, 368–369
place of, 174 communication strategies, 369
Articulatory phonetics, 169 group, 371–372
ASD (Autism Spectrum Disorder) outcome measures, for adults, 370–371
defined, 89 speechreading, 369–370
language characteristics of, 89–91 Auricle, 303
morphology of, 90 Autism, incidence of, 89
phonology, 90 Autism Spectrum Disorder (ASD)
pragmatics and, 91 defined, 89
syntax and, 90 language characteristics of, 89–91
vocabulary and, 90–91 phonology, 90
ASHA (American Speech and Hearing Association), 2 pragmatics and, 91
ASL (American Sign Language), 93 syntax and, 90
Aspiration, 282 vocabulary and, 90–91
Assistive listening devices Automatic speech recognition, 159
auditory implantable devices, 349–355 Axon, 18
bone-anchored implant, 350–351
cochlear implant, 354–355
hearing aids, 343–349
B
rear-ear measures, 344–345 Babbling, 62
selecting/fitting, 343–349 described, 61–63
validation of, 345 BAI (Bone anchored implant), 349
middle ear implant, 351–354 Balance, disorders, 337
speech acoustics and, 159–160 Basal ganglia, 24
Ataxic described, 200
cerebral palsy, 221 lesions, 201
dysarthria, 197, 201, 224 neurological diseases and, 25
Spanish and, 198 Parkinson’s disease and, 145

Basal nuclei, 24 Cartilages, laryngeal, 138


Basilar membrane, 308–312 CAS (Childhood apraxia of speech), 121–122, 211–216
Beginning stuttering, 233 characteristics of
Behind-the-ear (BTE) hearing aid, 345 general, 214
Bench-to-bedside, 5 speech, 214, 215
Benedict, Ruth, 46 compared with adult apraxia of speech, 212–214
Bilateral cleft lips, 265 developmental delays and, 215
Bilingualism, 54–55 genetics and, 215–216
Biological theories prevalence of, 214
of stuttering, 235–238 CCC-A (Certificate of Clinical Competence in
anatomical differences, 235 Audiology), 2, 12–13
physiological (functional) differences, 236–237 CCC-SLP (Clinical Competence in Speech Language
Bolus, swallowing and, 284–286 Pathology), 12–13
Bone-anchored implant (BAI), 350–351 CDC (Centers for Disease Control and Prevention), 124
Bone conduction, pure tone audiometry and, 325 on autism, 89
Bony labyrinth, 307 Cell body, 18
Borderline stuttering, 232–233 Centers for Disease Control and Prevention (CDC), 124
Botulinum toxin, 256 on autism, 89
Bound morphemes, 40 Central nervous system (CNS)
Brain described, 17–18
adult language disorders and, 111–114 swallowing and, 283
connections between regions of, 112 Central sulcus, 22
dominant hemisphere, 29 Cerebellum, 27
frontal lobe, 22–23 Cerebral
hemispheres, 22 hemispheres
left, 23 adult language disorders and, 111–112
language and, 38 pediatric speech disorders and, 220–224
MRI of, 31–32 palsy
occipital lobe, 23 ataxic type of, 221
parietal lobe, 24 childhood speech disorders and, 220–224
pathology in dementia, 126–127 dyskinetic type of, 220–221
perisylvian speech and, 112–114 mixed type of, 221
structures of, 26 spastic type of, 220
temporal lobe, 23–24 Certificate of Clinical Competence in Audiology
tissues of, 18 (CCC-A), 2, 12–13
traumatic injury to, 124–125 Childhood apraxia of speech (CAS), 121–122, 211–216
language impairment in, 125–126 characteristics of, 215
tumors, dysarthria in, 225–226 compared with adult apraxia of speech, 212–214
view of developmental delays and, 215
from above, 22 genetics and, 215–216
left hemisphere, 23 prevalence of, 214
Brainstem, 26–27 Childhood speech disorders
Breathing, swallowing and, 282–283 apraxia of speech, 211–216
Broad phonetic transcription, 177 brain tumors and, 225–226
Broca’s aphasia, 116–117 cerebral palsy and, 220–224
Broca’s area, 23, 28–30 speech delay
adult language disorders and, 111 phonetic/phonological, 209–210
Bronx cheers, 61 quantitative measures of, 208–209
Brown, Roger, semantic categories, 73–74 speech sounds and, 208–209
BTE (Behind-the-ear) hearing aid, 345 traumatic brain injury and, 224–225
treatment options/considerations, 226–227
Childhood voice disorders, prevalence/treatment of,
C 257
Camp, Pauline, 2, 6, 13 Children, communication needs of, 360–362
Cancer, of larynx, 256–257 Children who stutter (CWS), 229
Canonical babbling, 63, 65 Children with velocardiofacial syndrome (VCFS), 273

Chomsky, Noam, 57–58 phonetic symbols and, 174–176


language acquisition device (LAD) and, 43 manner of articulation, 174–175
Chromosomes, defined, 101 place of articulation, 174
CI (Cochlear implant), 349, 354–355 voicing, 175
CIC (Completely in-the-canal hearing aid), 348 Constraints, phonotactic, 180
Cleft Conversation
lip/palate (CL/P), 266–269 conversational repairs, 79–80
speech production in, 267–269 pragmatics and, 78
lips, 264–265 topic management, 79
palate turn-taking in, 79
isolated, 266–267 Conversational repairs, 79–80
with/without cleft lips, 266–267 Corpus callosum, 22
Clefting Cortex
epidemiology of, 267 hidden, 24
syndromes and, 272–273 insular, 24
Client Oriented Scale of Improvement (COSI), 345 perisylvian, 31
Clinical Competence in Speech Language Pathology primary
(CCC-SLP), 12–13 auditory, 24
Clinical Competence in Speech of Audiology (CCC-A), motor, 22
12–13 sensory, 24
Clinical work, review of, 7 Corticobulbar tract, 197
CL/P (Cleft lip/palate), 266–269, 272, 273–274 COSI (Client Oriented Scale of Improvement), 345
speech production in, 267–269 CPO (Isolated cleft palate), 266–267, 272, 273–274
Cluster reduction, 187 Cranial nerves, 27
CNS (Central nervous system), described, 17–18 Craniofacial anomalies
Coaching model of intervention, 363 defined, 261
Tye-Murray principles, 364 described, 261
Coarticulation, 149–150 upper lip/associated structures, embryological
Cochlea, 24, 305, 308 development of, 261–267
Cochlear implant (CI), 349, 354–355 Craniosynostosis syndrome, 273
Code switching, 52 Crouzone syndrome, 273
Cognitive-linguistic, described, 3 CT (Computed tomography), 32
Cognitive processes, language and, 42–44 Culture, defined, 45
Communication Curriculum, undergraduate, 10, 11
cultural factors influencing, 47 CWS (Children who stutter), 229
difference versus disorder, 47–48
Communication Sciences and Disorders
activities of, 2
D
as interdisciplinary field, 3–4 Davis, Barbara, 61–63
levels of evidence, 8–10 DBS (Deep brain stimulation), 25
required areas of knowledge, 4 Deacon, Terrence, 46–47
Compensatory errors, 270–272 Deafness genes, 92
Completely in-the-canal (CIC) hearing aid, 348 Deep brain stimulation (DBS), 25
Comprehension Deglutition, 277
language and Dementia, 33, 126–128
0 to 3 months, 60–61 language disorders in, 127–128
8 to 12 months, 65–66 Dendrites, 20
left hemisphere and, 112 Developmental chronologies, language, 58–66
Computed tomography (CT), 32 Developmental stuttering
Conduction aphasia, 117–118 diagnosis of, 231
Conductive hearing loss, 336 natural history of, 231–232
Consonants Diacritic symbols, 176, 178
American English compared to other languages, Diagnostic and Statistical Manual of Mental Disorders
175–176 on Autism Spectrum Disorder, 89
articulation, 270–272 on childhood-onset disorder, 229
imprecise, 193 criteria for, intellectual disability, 99

Dialect auditory nerve/pathways, 312


accent and, 53 basilar membrane, 308–312
code switching, 52 cochlea, 308
defined, 51 semicircular canals, 307–308
language and, 51–52 vestibule, 308
Dictionary of American Regional English, 46 middle, 305–307
Diffusion tensor imaging, 32–33 ligaments/muscles of, 306–307
Digital hearing aids, components of, 348 Eardrum, 304–305
Diphthongs, 147 Earmolds, hearing aids, 345
development of, 182, 188 Ebonics, 48
Discourse EBP (Evidence-based practice), 7–10
pragmatics and, 78 Embryological development, craniofacial anomalies,
turn-taking in, 79 261–267
Dominant hemisphere, brain, 29 Endoscopy, swallowing and, 287
Dopamine English, American, vowels compared to other languages,
basal ganglia and, 25 172
substantia nigra and, 199 Epidemiology
Dorsal stream, 30–31 clefting, 267
Down syndrome (DS), 100–104 Down syndrome, 101
epidemiology, 101 fragile X syndrome, 106
language and, 102–104 hearing impairment, 92
Dysarthria, 125, 193 voice disorders, 243–244
ataxic, 197, 201, 224 Equal-appearing interval scale, 193
Spanish and, 198 Erber’s hierarchy, 368
in brain tumors, 225–226 Esophageal phase, of swallowing, 282
dyskinetic, 222–223 Esophagus, swallowing and, 277
flaccid, 195–196, 201 Eustachian tube, 307
hyperkinetic, 200–201 Evaluation, hearing, 317–337
hypokinetic, 198, 201 acoustic reflex threshold, 323–324
mixed, 201 audiometric testing, 324–327
spastic, 196–197, 201, 221–222 case history, 318
subtypes, 193 immittance audiometer, 319–320
unilateral upper motor neuron, 201 otoscopy, 318–319
Dysfluencies patient examples, 337–341
stuttering, 232 physiological responses, 327–332
described, 232 speech audiometry, 326–327
Dyskinetic speech in noise, 327
cerebral palsy, 220–221 speech reception threshold, 326–327
dysarthria, 222–223 tympanometry, 320–323
Dysmorphogenesis, 264 word recognition score, 327
Dysphonia Evidence-based practice (EBP), 7–10
muscle tension, 252–254 Evidence, levels of, 8–10
treatment of, 253–254 Executive function, brain, 22
spasmodic, 200–201 Expression, language and, 0 to 3 months, 58–60
Dystonia External auditory
hyperkinetic dysarthria and, 200 canal, 303–304
oromandibular, 200 meatus, 303–304
Extrinsic muscles, 139
E
EAM (External auditory meatus), 303–304
F
Ear F (Frequency), 295
anatomy of, 303–313 F0 (Fundamental frequency), 245
outer, 303–305 intonation and, 245
canal, acoustic equivalent volume of the, 321 phonation and, 142–143
inner, 307–313 Face validity, 7–8

Family-centered intervention, 364–365 Genotype, defined, 101


Fant, Gunner, 154 GERT (Gastroesophageal reflux disease), 289
Fast mapping, of spoken words, 71 Gesture, 66
FEES (flexible endoscopic evaluation of swallowing), 287 language and, 66
First pharyngeal arch syndrome, 273 GFTA (Goldman-Fristoe Test of Articulation), 208
Flaccid dysarthria, 201 Glia, 18
Imprecise, 195–196 defined, 18
Flanagan, James, 154 Glial cells, 18
Flat (Type B) tympanogram, 322 Glides, 147
Flexible endoscopic evaluation of swallowing (FEES), development of, 182
287 Global aphasia, 119–120
Fluency, recovering of, 234 Glottal stops, 272
Fluency stuttering disorders, 229–239 Glottis, 140
acquired, 238–239 Goldman-Fristoe Test of Articulation (GFTA), 208
advanced, 233–234 Grammatical
beginning, 233 morphemes, SLI/DLD and, 87
borderline, 232–233 morphology, 76–77
causes of, 234–238 development of, 72–74
developmental, 231–232 Gray matter, 18
dysfluencies, 231–232 Gross anatomy, described, 21–22
genetic studies, 231 Group aural rehabilitation, 371–372
incidence of, 229–230 Gyri, 22
natural history of, 230 Gyrus, 24
prevalence of, 229–230
treatment considerations, 239
fMRI (Functional magnetic imaging), 32–33
H
Foreign accent, 53–54 Habilitation, aural, 360–367
Formants, 156–158 Hammer, 305
Fragile X syndrome (FXS), 104–109 Hard palate, 264
Frame movement, 62 cleft, 266
Frame-then-content theory, 62 HATS (Hearing assistive technology systems), 368
Free morphemes, 40 Hearing
Frequency (F), 295 aids, 343–349
Fricatives, 147, 182, 185 components of, 348–349
development of, 188 in-the-canal (ITC), 346–348
stopping, 187 in-the-ear (ITE), 346
Friction, defined, 293 rear-ear measures, 344–345
Frontal lobe, 22–23 selecting/fitting, 343–345
Frontotemporal dementia (FTD), 126 types of, 345–348
FTD (Frontotemporal dementia), 126 validation of, 345
Functional magnetic imaging (fMRI ), 32–33 disorders, 337
Functional theory, of stuttering, 236–237 evaluation, 317–337
Functional voice disorders, 252–254 acoustic reflex threshold, 323–324
Fundamental frequency (F0), 245 audiometric testing, 324–327
intonation and, 245 case history, 318
phonation and, 142–143 immittance audiometer, 319–320
FXS (Fragile X syndrome), 104–109 otoscopy, 318–319
patient examples, 337–341
physiological responses, 327–332
G speech audiometry, 326–327
Gastroesophageal reflux disease (GERT), 289 speech in noise, 327
Genetics speech reception threshold, 326–327
childhood apraxia of speech and, 215–216 tympanometry, 320–323
defined, 101 word recognition score, 327
SLI/DLD and, 88 impairment
speech delay and, 211 adult language disorders and, 111–114

epidemiology of, 92 semicircular canals, 307–308


language characteristics in, 92–93 vestibule, 308
language delay and, 92–94 Insertions, muscle attachments, 139
speech/language development and, 93 Insula, 24
loss, 335–337 Insular cortex, 24
conductive, 336 Intellectual disability (ID), 99–100
mixed, 336 Intensity
sensorineural, 336 phonation and, 143–145
normal, 335–336 voice disorders and, 246
tests of, 313 International Clinical Linguistics and Phonetics
Hearing assistive technology system (HATS), 368 Association (ICPLA), 170
Hearing Handicap Inventory for Adults and Elderly International Outcome Inventory of Hearing Aids
(HHIA and HHIE), 368 (IOI-HA), 345, 368
HHIA (Hearing Handicap Inventory for Adults and International Phonetic Alphabet (IPA), 169–170
Elderly), 368 Intonation, pitch and, 65
HHIE (Hearing Handicap Inventory for Adults and Intrinsic muscles, 139
Elderly), 368 IOI-HA (International Outcome Inventory of Hearing
Hidden cortex, 24 Aids), 345, 368
Hierarchical, defined, 42 IPA (International Phonetic Alphabet), 169–170
High admittance (Type Ad) tympanograms, 323 Isolated cleft palate (CPO), 266–267
Hirano, Minoru, 140–141 ITC (In-the-canal) hearing aid, 346–348
Huntington’s disease ITE (In-the-ear) hearing aid, 346
basal ganglia and, 25
hyperkinetic dysarthria and, 200, 201
Hypercontraction, dystonia and, 200
J
Hyperkinetic dysarthria, 200–201 Jargon, 65
Huntington’s disease and, 200, 201
Hypernasality, 269
velopharyngeal insufficiency and, 270
K
Hyperreflexia, 197 Karyotype, defined, 101
Hypofunction, 249
Hypokinetic dysarthria, 198, 201
L
Labov, William, 50–51
I LAD (Language acquisition device), 43
ICPLA (International Clinical Linguistics and Phonetics Language
Association), 170 accent, 50–51
ID (Intellectual disability), 99–100 bilingualism and, 54–55
IDEA (Individuals with Disabilities Education ACT), 13 brain and, 38
Impairment code switching, 52
hearing cognitive processes and, 42–44
adult language disorders and, 111–114 components of, 38–42, 53
epidemiology of, 92 content, 40–42
language characteristics in, 92–93 morphology, 40
language delay and, 92–94 phonology, 38–40
speech/language development and, 93 syntax, 40
Impedance, 319 as a conventional system, 36
Imprecise consonants, 193 delay, hearing impairment and, 92–94
In-the-canal (ITC) hearing aid, 346–348 described, 35–36
In-the-ear (ITE) hearing aid, 346 developmental chronologies, 58–66
Incus, 305 dialect, 51–52
Individuals with Disabilities Education ACT (IDEA), 13 difference versus disorder, 48–50
Inner ear, 307–313 as a dynamic system, 36–37
auditory nerve/pathways, 312 expression, left hemisphere and, 112
basilar membrane, 308–312 form, 38
cochlea, 308 functions, lateralization of, 112

Language  (continued) Laryngeal


as a generative system, 37 cartilages, 138
impairment, traumatic brain injury and, 125–126 hyperfunction, 249
mental representations and, 37–38 hypofunction, 249
multilingualism and, 54–55 Laryngopharyngeal reflux (LPR), 289
preverbal development of, 66 Larynx
semantics and, 40 cancer of, 256–257
social use of, 42, 125–126 laryngeal
structural components of, 125 muscles/membranes, 139
Language acquisition device (LAD), 43 cartilages, cartilages, 138
Language development, 69–70 phonation and, 141–145
0 to 3 months speech and, 138–145
expression (production), 58–60 upper airway and, 145–150
perception/comprehension, 60–61 velopharyngeal mechanism, 147–150
3 to 8 months vocal tract and, 146–147
perception/comprehension, 63–65 vocal folds and, 139–141
production, 61–63 Lateral sulcus, 22
8 to 12 months Lateralization of function, 28–29
perception/comprehension, 65–66 Learning theories, of stuttering, 235
production, 65 Left hemisphere, 113
gesture and, 66 language expression/comprehension and, 112
grammatical morphology, 72–74, 76–77 view of, 23
hearing impairment and, 93 Lesions, vocal fold, benign, 251
linguistic sophistication, 74 Levels of evidence, 8–10
mean length of utterance, 74–76 Lewy body dementia, 126
multiword utterances, 72–74 Lexical stress pattern, 180
school years, 77–81 Linguistic
complex sentences, 81 sophistication, 74
conversation repairs, 79–80 wordplay, 78
linguistic wordplay, 78 Lips
metalinguistic skills, 77–78 cleft, 264–265
narrative, 80–81 with/without cleft palate, 266–267
pragmatics, 78 palate, 265–266
sample transcript, 81–82 upper airway and, embryological development of,
topic management, 79 261–267
turn-taking, 78–79 roundings, vowels and, 172
stages of, 70 Liquids, 147, 182–184
12 to 18 months, 71 development of, 185
18 to 24 months, 71–72 swallowing, 280
36 months, 72 Locatives, defined, 73
Language disorders Loudness
adult, cerebral hemispheres and, 111–114 psychoacoustics and, 299
adult language disorders and, aphasia, 114–125 voice disorders and, 246
dementia and, 33, 126–128 LPR (Laryngopharyngeal reflux), 289
pediatric Lung pressure
autism spectrum disorder, 89–92 airflow and, 132–133
Down syndrome, 100–104 voice loudness and, 137
fragile X syndrome, 104–109
hearing impairment and, 92–94
SLI/DLD, 85–89
M
speech delay, 206–211 MacNeilage, Peter, 61–63
Language use, 42, 78 Magnetic resonance imaging (MRI), brain activity, 31–32
Down syndrome, 104 functional (fMRI), 32–33
fragile X syndrome, 108–109 Malleus, 305
hearing impairment and, 94 Mandibular arch, 262

Mandibulofacial dysostosis, 273 vocal tract, 146


Mayo Clinic System for Classification of Dysarthria, Mutation, defined, 101
193–195 Myelin, 21
Mean length of utterance (MLU), 74–76 described, 18
Medulla, 26
Mehler, Jacques, 61
MEI (Middle ear implant), 349, 351–354
N
Membranes Narrative, 80–81
basilar, 308–312 phonetic transcription, 177
laryngeal, 139 Nasals, 147
tympanic, 304–305 development of, 182, 188
Metalinguistic skills, 77–78 National Acoustic Laboratories, 344
Midbrain, 26 National Institute of Deafness and Other
Middle ear, 305–307 Communication Disorders (NIDCD), 317
implant (MEI), 349, 351–354 National Institutes of Health (NIH), 5
ligaments/muscles of, 306–307 Natural frequency of vibration, 297
Mixed Negative pressure (Type C) tympanogram, 323
dysarthria, 201 Nervous system
hearing loss, 336 adult language disorders and, 111–114
type of cerebral palsy, 221 cells, 18
MLU (Mean length of utterance), 74–76 described, 17–18
Molecules, air, 293 gross anatomy of, 21–22
Monogenic, defined, 101 swallowing and, 283
Morphemes, 40 Neural mechanism, 313
SLI/DLD and, 87 Neuroanatomy, described, 21–22
Morphology Neurodiagnostic ABR, 330–331
ASD and, 90 Neurodiverse, 89
Down syndrome, 103 Neurogenic stuttering, 238–239
fragile X syndrome, 107–108 Neuron, described, 18–21
hearing impairment and, 94 Neurotransmitter, 20
Motor cortex, primary, 22 Neurotypical, 89
Motor speech disorders in adults NIDCD (National Institute of Deafness and Other
apraxia of speech, 191, 193 Communication Disorders), 5, 317
classification system for, 191–203 NIH (National Institutes of Health), 5
described, 191 Nodules, vocal tract, 250–251
dysarthria, 193 Noise, defined, 145
terms, 192 Nonsignaling cells, 18
Motor theory of speech perception, 160–166 Normal admittance (Type A) tympanogram, 321–322
MRI (Magnetic resonance imaging), brain activity, 31–32 Normal hearing, 335–336
functional (fMRI), 32–33 Nuclei, subcortical, 24–26
MS (Multiple sclerosis), 201
myelin and, 18
MTD (Muscle tension dysphonia), 252–254
O
treatment of, 253–254 OAE (Otoacoustic emissions), 327–328
Multilingualism, 54–55 Objects, defined, 73
Multiple repetitions, stuttering, 231 Obligatory errors, 270
Multiple sclerosis (MS), 201 Occipital lobe, 23
myelin and, 18 Oller, D. Kimbrough, 58, 63
Multiword utterances, 72–74 Oral
Muscle tension dysphonia (MTD), 252–254 preparatory phase, of swallowing, 280
treatment of, 253–254 transport phase, of swallowing, 280
Muscles Organ of Corti, 308–312
extrinsic, 139 Organic voice disorders, 252
intrinsic, 139 Origins, muscle attachments, 139
laryngeal, 139 Oromandibular dystonia, 200

Oscillation, 294–295 Pharyngeal


Ossicles, 305–306 arches, 262
Otoacoustic emissions (OAE), 327–328 phase of swallowing, 280–282
Otoscopy, 318–319 Phenotype, defined, 101
Outer ear, anatomy of, 303–305 Phonation
Overextension, language and, 71 characteristics of, 142–145
intensity and, 143–145
Phonemes, 38–39
P defined, 179
Palate, cleft, with/without cleft palate, 266–267 Phonetic
Palatine shelves, 263 development, 181
Palatoplasty, 272 inventory, 38–39
Paralinguistic, 66 speech delay, 209–210
described, 66 symbols, 153
function, 137 consonants and, 174–176
Parietal lobe, 24 vowels and, 170–173
Parkinson’s disease (PD), 145 transcription, 69, 159, 166, 169, 176
basal ganglia and, 25 broad, 176, 177
Pathways, auditory, 27–28 clinical implications of, 176–177
Patterns of Culture, 46 Phonetically balanced (PB) words, 327
PB (phonetically balanced) words, 327 Phonological
PCC (Percentage of consonants correct), 207–208 development, 181, 186–188
PD (Parkinson’s disease), 145 work learning and, 188
basal ganglia and, 25 learning, 186
Pediatric speech delay, 209–210
intervention, 362–363 Phonology, 38–40
language disorders ASD and, 90
autism spectrum disorder, 89–92 defined, 141
Down syndrome, 100–104 Down syndrome, 102–103
fragile X syndrome, 104–109 fragile X syndrome, 106–107
hearing impairment, 92–94 hearing impairment and, 93
SLI/DLD, 85–89 larynx and, 141–145
speech delay, 203–211 SLI/DLD and, 86–87
voice disorders, 257 Phonotactic
Penfield, Wilder, 30 constraints, 180
People who stutter (PWS), 229 rules, 39–40
Percentage of consonants correct (PCC), 207–208 Phonotrauma, 249–252
Perception treatment of, 251–252
language and Physiological differences theory, of stuttering, 236–237
0 to 3 months, 60–61 Pinna, 303
3 to 8 months, 63–65 Pitch, 245
8 to 12 months, 65–66 intonation and, 65
speech, 160–166 phonation and, 142–143
auditory theories of, 162–166 psychoacoustics and, 298–299
Perceptual phonetics, 169 PNS (Peripheral nervous system), described, 17–18
Periodic Polygenic, defined, 101
acoustic events, 296 Polyps, vocal fold, 251
motion, 295 Pons, 26
vibrations, 294 Pragmatics, 42, 78
Peripheral nervous system (PNS) Down syndrome, 104
described, 17–18 fragile X syndrome, 108–109
swallowing and, 283 hearing impairment and, 94
Perisylvian traumatic brain injury and, 125–126
cortex, 31 Praxis, defined, 211
speech, brain and, 112–114 Preverbal
Persistent speech sound errors, 209–211 language development, 66, 71

speech and language development, 57 Semicircular canals, 307–308


Primary Sensorimotor skills, 60–61
auditory cortex, 24 Sensorineural hearing loss, 336
motor cortex, 22 Sensory cortex, primary motor, 24
progressive aphasia, 120–121 Sentences, complex, 81
sensory cortex, 24 Significant Other Assessment of Communication
Production, language and (SOAC), 368
3 to 8 months, 61–63 SLDs (Stuttering-like dysfluencies), 231
8 to 12 months, 65–66 SLI/DLD (Specific language impairment/developmental
Protowords, 65 language disorder), 85–89
Psychoacoustics, 297–298 cause of, 88
loudness and, 299 genetics and, 88
pitch, 298–299 language characteristics of children with, 86–88
sound quality and, 299–300 grammatical morphemes, 87
Psychogenic theories, of stuttering, 234 phonology, 86–87
Puberphonia, 254–255 syntax, 87
Public Law 94-142, 13 vocabulary, 87–88
Pure tone audiometry, 325 summary of, 88
PWS (People who stutter), 229 SLP (Speech-language pathologist), 10, 145, 185
consonant errors and, 270
end goal of, 226
Q preparation for, 10–12
Quality, voice, 145 swallowing disorders and, 288–289
Quasi-resonant nuclei, 58 SOAC (Significant Other Assessment of
Communication), 368
Social communication disorder, 92
R Soft palate, 264
Randomized control trials (RCTs), 8–9 cleft, 266
RCTs (Randomized control trials), 8–9 Solids, swallowing and, 280
REAR (Rear-ear aided response), 345 Somatotropic representation, 23
Rear-ear aided response (REAR), 345 Sound
Rear-ear measures, 344–345 filter, 154
Reduced admittance (Type As) tympanogram, 323 production, vocal tract growth and, 59–60
Reduplicative babbling, 61 quality, psychoacoustics and, 299–300
Research Grants for Translating Basic Research into Clinical source of, 154
Tools, 5–6 vowel, 155–156
Residual speech sound errors, 209–210 Spanish, ataxic dysarthria and, 198
Resonance, 297 Spasmodic dysphonia (SD), 200–201, 256
Resonant frequency, 59, 297 Spastic
vowels and, 156–158 cerebral palsy, 220
Respiratory system dysarthria, 196–197, 201, 221–222
speech and, 131–137 Specialization of function, 28
vegetative breathing and, 133–134 Species-specific, defined, 160
Respirometer, 133 Specific language impairment/developmental language
Rest breathing, 133–134 disorder (SLI/DLD), 85–89
Rhotics, 147 cause of, 88
genetics and, 88
language characteristics of children with, 86–88
S grammatical morphemes, 87
SAC (Self-Assessment of Communication), 368 syntax, 87
SADL (Satisfaction with Amplification in Daily Life), 368 vocabulary, 87–88
Sample size, 48 obtaining a Clinical Competence in Speech-Language
Satisfaction with Amplification in Daily Life (SADL), 368 Pathology, 12
SD (Spasmodic dysphonia), 200–201, 256 phonology, 86–87
Self-Assessment of Communication (SAC), 368 preparation for, 10–12
Semantics, 40 summary of, 88

Spectrograms, of formant frequencies, 156–158 end goal of, 226


Spectrum, waveform and, 295–297 preparation for, 10–12
Speech swallowing disorders and, 288–289
acoustics, 154–158 Speechreading, 369–370
assistive listening devices and, 159–160 Sphenopalatineganglioneuralgia, 284
sound filter/source, 154 Spinal cord, 27
acts, 42 Spirogram, 133
audiometry, 326–327 Spirometer, 133
breathing, 134–137 Standardized tests, 48, 86
abdominal muscles and, 136–137 language difference versus disorder, 48–50
goal of, 134, 136 Stapedius muscle, 306–307
intelligibility and, 137 Stapes, 305
voice loudness and, 137 Stevens, Kenneth, 154
delay, 206–211 Stickler syndrome, 273
disorders in childhood, 208–210, 224–227 Stinchfield, Sara Mae, 1
apraxia of speech, 211–216 Stirrup, 305
brain tumors and, 225–226 Stomach, swallowing and, 278
cerebral palsy and, 220–224 Stopping the fricatives, 187
diagnosis of, 207–208 Stops, 147
genetics and, 211 development of, 182, 188
phonetic/phonological, 209–210 Stuttering, 229–239
quantitative measures of, 208–209 acquired, 238–239
sounds and, 208–209 advanced, 233–234
traumatic brain injury and, 224–225 beginning, 233
treatment options/considerations, 226–227 biological theories of, 235–238
development, hearing impairment and, 93 anatomical differences, 235
in noise, 327 physiological (functional) differences, 236–237
intelligibility, 165–166 borderline, 232–233
speech breathing and, 137 children, 229
lateralization of, 112 causes of, 234–238
mechanism developmental
larynx, 138–145 diagnosis of, 231
respiratory system, 131–137 natural history of, 231–232
upper airway, 145–150 dysfluencies
motor control described, 231
maturation of, 184–185 typical, 232
stuttering and, 237–238 functional theory of, 236–237
MRI and, 31–32 genetic studies, 231
perception, 160–166 incidence of, 229–230
auditory theories of, 162–166 learning theories, 235
motor theory of, 160–166 natural history of, 230
processing, maturation of, 185–186 prevalence of, 229–230
reception threshold, 326–327 psychogenic theories of, causes of, 234
recognition, 159 speech motor control and, 237–238
science, described, 131 treatment considerations, 239
sounds, 206–207 Stuttering-like dysfluencies (SLDs), 231
development of, 181–183, 186–188 Subcortical nuclei, 24–26
errors, residual/Persistent, 209–210 Substantia nigra
explanations for mastery 183–186 dopamine and, 199
influences, 163, 165 hypokinetic dysarthria and, 201
top-down Sulci, 22
vocal tract valving and, 149 Swallowing
synthesis, 159 act of, 278–282
teachers, 1 esophageal phase, 282
Speech-language pathologist (SLP), 10, 145, 185 oral preparatory phase, 280
consonant errors and, 270 pharyngeal phase, 280–282

analysis of, 286–289 language impairment in, 125–126


anatomy of, 277–278 nature of brain injury in, 124–125
esophagus, 277 Treacher-Collins syndrome, 273
stomach and, 278 Tube model, vocal tract, 158
breathing and, 282–283 Turn-taking, 78–79
client self-report, 287–288 Tympanic membrane, 304–305
expiration-swallow-expiration, 282–283 Tympanograms, types of, 321–323
health care team for, 288–289 Tympanometry, 320–323
measurement of, 286–289 Type A tympanogram, 321–322
nervous system control of, 283 Type Ad tympanograms, 323
variables influencing, 284–286 Type As tympanogram, 323
bolus, 284–286 Type B tympanogram, 322
Sword throats, 285 Type C tympanogram, 323
Sylvian fissure, 22, 112–113
Synapse, 21
Syndromes, clefting, 272–273
U
Syntax Underextension, language and, 71
ASD and, 90 Undergraduate curriculum, 10, 11
Down syndrome, 103 Unilateral
fragile X syndrome, 107–108 cleft lips, 265
hearing impairment and, 94 upper motor neuron dysarthria, 201
SLI/DLD and, 87 vocal fold paralysis, 254–255
Synthesis, speech, 159 Unstressed syllable deletion, 187
Upper airway
larynx and, 145–150
T velopharyngeal mechanism, 147–150
Tacoma Narrows Bridge, 297 vocal tract and, 146–147
TBI (Traumatic brain injury) Upper lip, embryological development of, 261–267
aphasia and, 124 Upper motor neuron disease (UUMN), dysarthria, 197,
childhood speech disorders and, 224–225 201
dysarthria in, 225 UUMN (Upper motor neuron disease), dysarthria, 197,
language impairment in, 125–126 201
nature of brain injury in, 124–125
Temporal
bone, 301–302
V
lobe, 23–24 Van Riper, Charles, 1, 2
Tensor tympani muscle, 306 Variegated babbling, 65
Terminal buttons, 20 VC (Vital capacity), 134
Thalamus, 24 VCFS (Children with velocardiofacial syndrome), 273
Thordardottir, Elin, 72, 74 Vegetative breathing, 133–134
Threshold ABR, 331–332 Velocardiofacial syndrome, 273
Tongue Velopharyngeal insufficiency (VPI), 269–270
advancement, tongue advancement and, 171–172 consonant articulation and, 270–272
embryological, 263 hypernasality and, 270
Top-down influences of speech sounds, 165 Velopharyngeal mechanism, 147–149
Tourette’s syndrome, basal ganglia and, 25 Velopharyngeal port (VP), 148, 268–269
Transcription VEMPs (Vestibular-evoked myogenic potentials),
phonetic, 69, 159, 166, 169, 176, 177 333–334
broad, 176 Ventral stream, 30–31
clinical implications of, 176–177 Vestibular assessment, 332–334
narrow, 177 Vestibular-evoked myogenic potentials (VEMPs),
Translational research, 5, 6 333–334
Traumatic brain injury (TBI) Vestibule, inner ear, 308
aphasia and, 124 Vestibulo-ocular reflex (VOR), 332
childhood speech disorders and, 224–225 Vestibulospinal reflex (VSR), 333
dysarthria in, 225 Vibration, natural frequency of, 297

Videofluoroscopy, swallowing and, 286–287 loudness and, 246


Videonystagmography (VNG), 333 muscle tension dysphonia, 252–254
Vihman, M.M., 59–60 neurological, 254–256
Visual Analog Scale, 165–166 organic, 252
Vital capacity (VC), 134 parameter measurement, 245
VNG (Videonystagmography), 333 phonotrauma, 249–252
Vocabulary quality and, 246–247
12 to 18 months, 71 spasmodic dysphonia, 256
18 to 24 months, 71–72 spectrum and, 246–247
36 months, 72 vocal folds and, unilateral paralysis of, 254–255
ASD and, 90–91 Voice loudness
Down syndrome, 103–104 lung pressure and, 137
fragile X syndrome, 108 speech breathing and, 137
SLI/DLD and, 87 Voicing, consonants, 175
Vocal VOR (Vestibulo-ocular reflex), 332
folds, 139–141 Vowels, 147
adduction of, 141 American English compared to other languages, 172
benign lesions, 251 development of, 182, 188
polyps, 251 lip rounding and, 172
unilateral paralysis of, 254–255 phonetic symbols and, 170–173
viewing, 244–245 tongue height and, 170–171
nodules, 250–251 resonant frequencies of, 156–158
tract sounds of, 155–156
muscles of, 146 tongue advancement and, 171–172
shape of, 146–147 VP (Velopharyngeal port), 148, 268–269
sound production and, 59–60 VPI (Velopharyngeal insufficiency), 269–270
tube model of, 158 consonant articulation and, 270–272
valving, 149 hypernasality and, 269–270
Vocalic production, 146–147 VSR (Vestibulospinal reflex), 333
Voice
perceptual evaluation of, 244
quality of, 145
W
Voice disorders Waveform, 295
childhood, 257 spectrum and, 295–297
classification of, 247–257 Wernicke, Carl, 29, 117
functional, 252–254 Wernicke’s aphasia, 115–116, 117
hypo-hyperfunction continuum, 247–252 Wernicke’s area, 28–30
neurological, 254–257 adult language disorders and, 111
described, 243 White matter, 18
diagnosis of, 244–247 Word recognition score, 327
case history, 244 Wordplay, linguistic, 78
epidemiology of, 243–244 Words
functional, 252–254 fast mapping, 71
hypo-hyperfunction continuum, 248–249 speech sound components, 77
intensity and, 246 Work learning, phonological development and, 188
