Neural Encoding of Vocalic Sounds in Newborns
By Sonia Arenillas-Alcón; Jordi Costa-Faidella, PhD; Teresa Ribas-Prats; María Dolores Gómez-Roig, MD, PhD; and Carles Escera, PhD

Spoken language is the most prevalent form of human communication, and the similar development of speech perception pathways across individuals around the globe suggests a universal biological basis for language acquisition.1–3 However, the functional maturity of innate human speech perception abilities is not yet well established. Is the newborn brain ready to encode the sounds of language in all their complexity at birth? Or do the underlying neural mechanisms need to be stimulated during the first months of life to mature? If so, what kinds of speech sound information can newborns process in an adult-like manner at birth, and what kinds can they not? The answers to these questions may provide relevant information to guide appropriate early interventions to alleviate future language impairments.

Our research group set out to shed some light on these questions by analyzing a specific brain response derived from EEG recordings during passive listening to speech sounds: the frequency-following response (FFR). The FFR provides a window into the neural encoding of several characteristics of language sounds: their fundamental frequency, which reflects the neural tracking of voice pitch, and their temporal fine structure, which characterizes vowel identity through formant profiles.4 An accurate encoding of these two speech sound characteristics is essential for future language acquisition.1,5–7 Knowing the typical development of these speech perceptual skills and their maturity state at birth would allow us to detect perceptual impairments at a very early stage, facilitating appropriate interventions or stimulation that would benefit from the massive neural plasticity during the first years of life.8–13

Newborns exhibit a mature neural encoding of voice pitch but an immature neural encoding of vowel

However, reality poses some challenges. To date, only responses reflecting voice pitch encoding have been recorded optimally in newborns.14–19 Although behavioral studies seem to indicate that babies are able to discriminate between different phonemes on the basis of temporal fine-structure differences,20 no available tool has so far revealed the underlying neural encoding mechanisms.

Pursuing the idea of developing a powerful tool to carry out longitudinal assessments of speech neural encoding in babies, we devised a novel stimulus whose internal structure allowed us to simultaneously assess voice pitch tracking and formant structure encoding through the analysis of the FFR in

From left: Ms. Arenillas-Alcón is a psychologist specialized in neuropsychology. She is part of the Brainlab-Cognitive Neuroscience Research Group at the Institute of Neurosciences of the University of Barcelona (UBNeuro) in Spain. Her scientific interests focus on studying speech sound encoding mechanisms in neonates. Dr. Costa-Faidella is a lecturer of cognitive neuroscience in the department of clinical psychology and psychobiology at the University of Barcelona and a member of the Brainlab team at UBNeuro. Ms. Ribas-Prats is a psychologist and an expert in special education and neurosciences. She is also a member of the Brainlab team at UBNeuro. Dr. Gómez-Roig is the head of the department of obstetrics and gynecology at SJD Barcelona Children’s Hospital and the coordinator of general obstetrics at SJD Barcelona Children’s Hospital’s Centre for Maternal Fetal and Neonatal Medicine. Dr. Escera is a professor of cognitive neuroscience in the department of clinical psychology and psychobiology at the University of Barcelona, the principal investigator of the Brainlab, and a member of UBNeuro and the Academia Europaea. All authors are affiliated with the Institut de Recerca Sant Joan de Déu in Esplugues de Llobregat, Barcelona.

10 The Hearing Journal  July 2021


a recording lasting only 30 minutes. The stimulus features two different vowels (Spanish /o/ and /a/) and a rising pitch ending (/oaá/). The short recording time is compatible with constraints typical of hospital settings and represents a reduction of almost 50% in duration compared to usual speech sound discrimination studies.14,21

For our study, we recruited 34 healthy term newborns (17 females; mean gestational age = 40.19 ± 1.08 weeks; aged 14–78 h after birth) from the SJD Barcelona Children’s Hospital. Obstetric pathologies, high-risk gestations, and other risk factors related to hearing impairments were treated as exclusion criteria. For comparison purposes, we also recruited a sample of 18 healthy young adults (14 females; mean age = 26.94 ± 3.78 years) with no self-reported history of neurological, psychiatric, or hearing impairments.

FFRs to the /oaá/ stimulus were recorded from sleeping newborns and from awake adult participants who were relaxed with their eyes closed. Our analyses revealed: (1) no differences in the neural encoding of voice pitch between age groups (see Fig. 1A); (2) a weaker but present neural encoding of vowel formants with lower frequency spectral content (below 500 Hz) in newborns as compared with adults (Fig. 1B, left); and (3) an absent neural representation of vowel formants with higher frequency spectral content (above 500 Hz) in newborns (Fig. 1B, right).

Thus, our results confirm previous findings showing that the ability of neonates to encode voice pitch is no different from that of adults, but crucially indicate that the neural encoding of vowel formants is not yet fully mature at birth, especially for formants with a spectral content above 500 Hz. Intriguingly, 500 Hz is the assumed cutoff frequency of the low-pass sound filter formed by the mother’s womb,5,22–24 which suggests that adult-like processing of vowel formants defined by higher spectral frequencies depends [...]

Research groups around the world have carried out a wealth of studies in infants that relate abnormal speech encoding to delays and impairments in the development of literacy skills, both in reading and writing.25–27 Changes in voice pitch contour are crucial for phoneme acquisition in tonal languages such as Mandarin, but much less so in non-tonal languages like English or Spanish, in which formant structure determines vowel identity4,28 and voice pitch encoding is more related to facilitating speaker recognition.29 Those studies could not reliably show differences in the neural encoding of temporal fine structure, mainly because the typical consonant-vowel stimuli used posed several technical limitations.

Because our results represent the first snapshot of the human brain’s functional state of maturation regarding the encoding of the temporal fine structure of speech sounds, they constitute a key step in the establishment of the FFR as a potential biomarker for the early detection of possible future literacy impairments.13,17 Likewise, they leave many open questions and exciting new avenues for future research.

We need to know the developmental sequence of vowel formant neural encoding throughout normal child growth and relate it to the anatomical development of the auditory system. This will provide a normative frame against which we could compare neurophysiological responses of babies born with characteristics that put them at risk of developing language impairments and even other neurodevelopmental disorders such as dyslexia or autism.30,31 Also, such a longitudinal assessment will certainly reveal sensitive periods that could become targets to compensate for these putative future language impairments, or even to accelerate maturation by taking advantage of the extraordinary brain plasticity that characterizes the first two years of life.

As a first step, we are retesting our study participants, who are now nearly 2 years old. Our main goal is to relate the maturation speed of vowel formant encoding to observable deficits or delays in current language skills, pursuing the definitive establishment of the FFR as a biomarker of language development.

References for this article can be found at

Figure 1. Distribution of signal-to-noise ratios (SNRs) for adults and newborns at the three frequency peaks of interest, extracted from the FFR. In each plot, a black-filled diamond indicates the mean; a horizontal black line, the median; and vertical lines, the interquartile range. A: The SNR at the fundamental frequency of the stimulus (F0 = 113 Hz) was not different between adults and newborns, suggesting that babies exhibit adult-like pitch encoding skills. B, left: The SNR at the /o/ vowel first formant (F1 = 452 Hz) was higher during the /o/ section than the /a/ section in both groups. B, right: Only in adults, the SNR at the /a/ vowel first formant (F1 = 678 Hz) was higher during the /a/ section than the /o/ section, while newborns exhibited SNRs close to 0, suggesting a deficient encoding of higher frequency formants.
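
The article does not describe how the SNRs reported in Figure 1 were computed. For readers who want a concrete picture of what a spectral SNR at a frequency of interest can look like, the following is a minimal sketch under our own assumptions: the function name, the window widths, the decibel scale, and the synthetic test signal are all illustrative and are not taken from the study. Only the three frequencies (113, 452, and 678 Hz) come from the figure caption.

```python
import numpy as np


def spectral_snr_db(signal, fs, target_hz, peak_half_width_hz=2.0,
                    noise_half_width_hz=30.0):
    """Illustrative SNR in dB: mean spectral amplitude in a narrow band
    around target_hz divided by the mean amplitude in the flanking bands.
    The band widths are assumptions, not values from the article."""
    n = len(signal)
    # A Hann window limits spectral leakage from neighbouring components.
    amplitude = np.abs(np.fft.rfft(signal * np.hanning(n)))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    offset = np.abs(freqs - target_hz)
    peak_bins = offset <= peak_half_width_hz
    noise_bins = (offset > peak_half_width_hz) & (offset <= noise_half_width_hz)
    return 20.0 * np.log10(amplitude[peak_bins].mean() / amplitude[noise_bins].mean())


# Synthetic stand-in for a response (NOT real FFR data): energy at the
# stimulus F0 and at the /o/ first formant, none at the /a/ first formant.
fs = 2000                       # sampling rate in Hz (assumed)
t = np.arange(fs) / fs          # 1 second of samples
rng = np.random.default_rng(0)
response = (np.sin(2 * np.pi * 113 * t)
            + 0.5 * np.sin(2 * np.pi * 452 * t)
            + 0.5 * rng.standard_normal(t.size))

snr_f0 = spectral_snr_db(response, fs, 113.0)    # F0 of the stimulus
snr_f1_o = spectral_snr_db(response, fs, 452.0)  # /o/ first formant
snr_f1_a = spectral_snr_db(response, fs, 678.0)  # /a/ first formant
print(f"F0: {snr_f0:.1f} dB, F1 /o/: {snr_f1_o:.1f} dB, F1 /a/: {snr_f1_a:.1f} dB")
```

In this toy signal the two frequencies that carry energy stand well above their neighbouring bins, while the 678 Hz measure stays near 0 dB because nothing distinguishes that band from the noise floor. This is the same qualitative pattern the figure caption describes for newborns at the /a/ first formant, although the study's actual measure and scale may differ.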


