Nihms 1772536

HHS Public Access

Author manuscript
J Fluency Disord. Author manuscript; available in PMC 2023 March 01.

Published in final edited form as:
J Fluency Disord. 2022 March; 71: 105896. doi:10.1016/j.jfludis.2022.105896.

The Effect of Gap Duration on the Perception of Fluent Versus Disfluent Speech
Haley J. Warner (a,*), D.H. Whalen (b,c,d), Daphna Harel (e), Eric S. Jackson, PhD, CCC-SLP (a)
(a) Department of Communicative Sciences and Disorders, New York University, 665 Broadway, 9th Floor, New York, NY 10012
(b) Program in Speech-Language-Hearing Sciences, City University of New York, 365 Fifth Ave., New York, NY 10016
(c) Haskins Laboratories, 300 George St., Suite 900, New Haven, CT 06511
(d) Yale University, New Haven, CT
(e) Department of Applied Statistics, Social Science and Humanities, New York University, Kimball, 246 Greene Street, 3rd Floor, New York, NY 10003

Abstract
Purpose—Gap duration contributes to the perception of utterances as fluent or disfluent, but few
studies have systematically investigated the impact of gap duration on fluency judgments. The
purposes of this study were to determine how gaps impact disfluency perception, and how listener
background and experience impact these judgments.

Methods—Sixty participants (20 adults who stutter [AWS], 20 speech-language pathologists
[SLPs], and 20 naïve listeners) listened to four tokens of the utterance, “Buy Bobby a
puppy,” produced at typical speech rates. The gap duration between “Buy” and “Bobby” was
systematically manipulated with gaps ranging from 23.59 ms to 325.44 ms. Participants identified
stimuli as fluent or disfluent.

Results—The disfluency threshold – the point at which 50% of trials were categorized as
disfluent – occurred at a gap duration of 126.46 ms, across all participants and tokens. The SLPs
exhibited higher disfluency thresholds than the AWS and the naïve listeners.

Conclusion—This study determined, based on the specific set of stimuli used, when the
perception of utterances tends to shift from fluent to disfluent. Group differences indicated that
SLPs are less inclined to identify disfluencies in speech potentially because they aim to be less
critical of speech that deviates from “typical.”

*Corresponding author.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our
customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review
of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered
which could affect the content, and all legal disclaimers that apply to the journal pertain.
Keywords
fluency; disfluency; stuttering; gap duration

1. Introduction
Fluency refers to the uninterrupted, automatic, effortless, and continuous flow of speech
(Fillmore, 1979; Starkweather, 1987). A breakdown in any of these characteristics could
lead to speech being perceived as disfluent. For example, a short hesitation or pause during
speech may contribute to the perception of disfluency by the listener. However, identifying
fluent versus disfluent speech presents challenges because fluency operates on a continuum
and is based on subjective assessment which is impacted by listener background and life
experiences (Brundage et al., 2006; Cordes et al., 1992; Finn & Ingham, 1989). This has
important research and clinical implications for identifying disorders such as stuttering,
which can rely on the classification of fluent and disfluent speech. To begin to delimit how
listeners perceive an utterance as fluent versus disfluent, the current study examined the
impact of one parameter of fluency, gap duration, on perceptions of fluent versus disfluent
speech within specific range and frequency parameters (e.g., Durlach & Braida, 1969;
Parducci, 2012; Parducci et al., 1960). In addition, this study assessed the impact of listener
background and experience on the perception of fluent versus disfluent speech.

1.1. What determines whether a listener perceives an utterance as fluent or disfluent?


Studies examining fluency in stuttering or in second language use show that speech rate,
stress timing, repairs, and gap duration and frequency (i.e., silent and filled pauses) are
all linked to the perception of utterance fluency (Bosker et al., 2014; Cucchiarini et al.,
2010; Duez, 1982, 1985; Goldman-Eisler, 1961; Krivokapić, 2007; Love & Jeffress, 1971;
Trofimovich & Baker, 2006). For instance, studies of non-native speakers that examined gap
duration and its impact on fluency typically examined many variables (e.g., interjections) in
addition to gap duration, to determine which variables contribute to global fluency ratings
(Bosker et al., 2014; Cucchiarini et al., 2010). Listeners take into account many, if not all, of
these characteristics when assessing whether the utterance they heard was fluent or disfluent
(Bosker et al., 2014; Segalowitz, 2010). However, in the context of speech disorders such
as stuttering, some of these characteristics may be more salient than others. Gap duration
is a particularly salient variable in judging fluency, and listeners are able to perceive gaps
easily (Few & Lingwall, 1972; Love & Jeffress, 1971; Martin & Strange, 1968; Prosek &
Runyan, 1983). Gap duration refers to the length of time between syllables, represented
by the absence (or low level) of acoustic energy (Lickley, 1994). As the overt features of
stuttering include syllable repetitions and audible and inaudible prolongations of sounds,
extended gaps in the speech of people who stutter (i.e., blocks) are inevitable. However, not
all gaps contribute to the perception of disfluency. Rather, it is those gaps that disrupt the
flow of speech that cause listeners to perceive speech as disfluent (Lövgren & Doorn, 2005).
These gaps are referred to as hesitations or within-constituent pauses (Duez, 1982, 1985;
Ruder & Jensen, 1972) and typically occur at non-syntactic boundaries (where pausing is
intrinsically more variable). The current study focused on the impact of gap duration on
perceptions of fluent and disfluent speech based on utterances produced at a typical speaking
rate.

Most studies of the impact of gap duration on fluency perception focused on the observably
“fluent” speech of speakers who stutter to determine whether pauses contributed to listeners
judging entire speech samples as fluent or disfluent. Some studies found that listeners
(including trained listeners) were unable to distinguish between the fluent speech of
stuttering and non-stuttering speakers based on gap duration (Few & Lingwall, 1972;
Krikorian & Runyan, 1983; Love & Jeffress, 1971). Other studies found listeners were able
to distinguish between the fluent speech of adults who stutter (AWS) and adults who do not
stutter (AWNS) based on gap duration (Brown & Colcord, 1987; Howell & Wingfield, 1990;
Young, 1984). Studies of both stuttering and non-native speakers have typically included
only gap durations greater than 200 ms, which is problematic because listeners are able to
perceive gaps in speech below 150 ms (Campione & Véronis, 2002; Hieke et al., 1983).
None of these studies systematically manipulated gap duration to determine the specific
impact of gap duration on the perception of disfluency.

One study systematically manipulated gap duration. Lövgren and Doorn (2005) collected
short speech samples containing natural silent pauses from various news programs.
These natural pauses were systematically manipulated, resulting in four conditions: C1
(unmanipulated) to C4 (longest durations). With four source utterances and four pause
conditions per sample, this resulted in 16 samples. Mean gap durations, per condition, from
the 16 speech samples were: C1 = 57 ms; C2 = 117 ms; C3 = 212 ms; C4 = 403 ms.
Participants judged whether each speech sample was fluent or disfluent in a forced-choice
task. Results indicated that C2 was judged disfluent 34% of the time, whereas C3 was
judged disfluent 72% of the time. Lövgren and Doorn (2005) reported a gap duration
threshold for fluency between 117 and 212 ms. This threshold could have been impacted by
context (e.g., speech rate, prosody) and range effects, indicating that the threshold is specific
to the gap durations included in that study (e.g., Durlach & Braida, 1969; Parducci, 2012;
Parducci et al., 1960). In addition, the methodology was somewhat coarse-grained, as using the
mean gap duration of separate instances of pauses within a 15–20 s speech sample precludes
examining the impact of specific gap durations on fluency perception. Thus, an examination
of the impact of gap duration on fluency perception using a more fine-grained manipulation
for a single, short utterance is warranted.
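To make the coarseness of that earlier estimate concrete, the sketch below linearly interpolates between the two condition means reported by Lövgren and Doorn (2005) to find where the disfluent proportion would cross 50%. The function name and the assumption of linearity between C2 and C3 are illustrative only; the original authors reported a range, not a point estimate.

```python
def interpolate_crossing(x_low, p_low, x_high, p_high, target=0.5):
    """Linearly interpolate the x value at which a proportion reaches `target`."""
    if not (p_low <= target <= p_high):
        raise ValueError("target must lie between the two observed proportions")
    fraction = (target - p_low) / (p_high - p_low)
    return x_low + fraction * (x_high - x_low)


# C2: 117 ms judged disfluent 34% of the time; C3: 212 ms judged disfluent 72%.
crossing_ms = interpolate_crossing(117, 0.34, 212, 0.72)
print(f"Interpolated 50% crossing: {crossing_ms:.1f} ms")
```

With the reported values the interpolated crossing falls near 157 ms, but with only two condition means bracketing the 50% point the data cannot distinguish among values between 117 and 212 ms, which is the limitation a finer-grained manipulation is meant to address.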

1.2. The Impact of Listener Background and Experience on Fluency Judgments


Different listener backgrounds and experience impact perceptions of disfluency. This is
due to previous experience hearing different communication styles or impaired speakers
(e.g., speakers who stutter), as well as to differences in professional training (e.g., naïve
listeners vs. speech-language pathologists [SLPs]). For example, Cordes et al. (1992)
studied the impact of differing amounts of clinical experience (i.e., undergraduate students,
graduate students, experienced clinicians) on fluency perception. The undergraduate
students’ previous experience with stuttering only included a few hours of lecture in an
introductory speech and hearing course, which included viewing approximately one half-
hour of videotaped people who stutter. The graduate student group included students from
the speech-language pathology department. Experienced clinicians included clinicians and
researchers with extensive experience in stuttering (i.e., a combination of doctoral students,
clinical supervisors and professors specializing in stuttering). Cordes et al. (1992) reported
that the graduate student group and the experienced clinician group exhibited similar
increased agreement in identifying disfluencies, when compared to the undergraduate group.
Additionally, the graduate students and the experienced clinicians identified more subtle
stuttering events (i.e., stuttering events lasting less than 200 ms), versus the undergraduate
group. These results signify that additional experience leads to changes in identifying
disfluencies only when compared to a novice group, the undergraduate student group.
Furthermore, in a comparison of students, speech-language pathologists (not stuttering
experts), and highly experienced judges, Brundage et al. (2006) found that students and
non-expert SLPs did not exhibit differences in intrajudge and interjudge agreement, and
that both groups identified less than 50% as much stuttering as highly experienced judges.
These findings indicated that consistency in stuttering judgments and accuracy in identifying
stuttering improve only among judges who are experts in the field. The student group comprised
university students enrolled in Communicative Sciences and Disorders classes; the majority
of these students had not taken a class related to stuttering. The non-expert SLPs were
American Speech-Language-Hearing Association (ASHA) certified SLPs and had a range of
clinical experience, from 1.5 to 30 years. The experienced judges were deemed stuttering
specialists because they had a publication record in the area of stuttering spanning more than eight
years (Cordes & Ingham, 1995). Interestingly, AWS are more likely to report disfluencies
than AWNS (Lickley et al., 2005). For example, when AWS and AWNS listen to utterances
containing single instances of disfluency, AWS are more likely to identify speech as
disfluent, regardless of whether that speech is produced by an AWS or an AWNS. These
findings suggest that the life experiences of AWS cause them to have more sensitive criteria
for perceiving disfluencies in speech than AWNS. AWS may have increased motivation to
listen to speech, versus other groups of listeners, which may have impacted their speech
perception skills. Comparing gap duration thresholds between AWS, SLPs, and naïve
listeners will elucidate the impact of listener background and experience on perceiving
utterances as fluent or disfluent.

1.3. Purpose
The purposes of this study were 1) to estimate a gap duration threshold for disfluency
perception (the point in time at which a gap typically leads to perceptions of disfluent
speech), for material consisting of short utterances with typical speech rate, which contain
one manipulated gap within a specific duration range (i.e., 23.59 ms - 325.44 ms), and 2)
to determine the impact of listener experience and background on disfluency perception. As
no previous studies have found a threshold for disfluency including the range and frequency
of gaps used in the present study, our first research question was exploratory, and we did
not hypothesize a gap duration threshold for disfluency. Concerning the second research
question, we tested the impact of listener background and experience on fluency perception
by comparing three groups: SLPs, AWS, and naïve listeners. Based on Lickley et al. (2005),
we hypothesized that AWS would have the lowest gap threshold. Based on Brundage et al.
(2006), we hypothesized that SLPs and naïve listeners would exhibit similar thresholds.

2. Methods
2.1. Participants
This study was approved by the Institutional Review Board at New York University (NYU).
Participants provided informed consent to participate. Sixty participants ranging in age from
18 to 50 years old (M=28.9, SD=6.0) comprised three groups: 1) 20 ASHA-certified SLPs
(19 women; mean age = 33.6, SD=5.1); 2) 20 AWS (six women; mean age = 28, SD=6.9);
and 3) 20 naïve listeners, defined as neurotypical adults who were not SLPs or AWS (10
women; mean age = 25.8, SD=1.4). Gender ratios were intended to reflect the gender ratios
of the population each participant group represents. The SLP participant group was predominantly female as
96% of SLPs are female (ASHA, 2019), the AWS group was primarily male as AWS are
predominantly male (Yairi et al., 1996), and the naïve group was evenly distributed between
male and female participants. All participants were self-reported native speakers of Standard
American English (i.e., learned it before age six and spoke it daily). Participants had typical
hearing, confirmed by a hearing screening on the day of the experiment. Participants passed
the hearing screening at 25 dB HL at three frequencies (500 Hz, 2000 Hz, and 8000 Hz),
modeled after the standard NYU clinical hearing screening protocol. One participant did
not respond to pure tones at 8000 Hz during the screening but was still included
in the experiment because the acoustic parameter of interest did not depend on the high
frequencies. Participants reported a negative history of neurological and speech-language
impairment, except for the subgroup of AWS who reported stuttering. The SLP participants
were all ASHA certified speech-language pathologists whose years of clinical practice
ranged from 2–23 years (M=9.0, SD=5.2). SLP participants had a variety of clinical
experience with fluency disorders; however, none specialized in stuttering, and none
were AWS. The AWS were recruited from the fourth author’s research database of stutterers,
and all were previously diagnosed as AWS by a certified SLP who specializes in stuttering
intervention. Naïve and SLP participants were recruited via word of mouth as well as flyers
posted around the NYU campus.

2.2. Stimuli
Stimuli were generated using two fluent and two disfluent versions of the utterance, “Buy
Bobby a puppy.” The original and unmanipulated utterances (i.e., tokens) were collected
from a previous study (Jackson, Tiede, Beal, et al., 2016). Four tokens were used to increase
generalizability of the findings. These tokens were selected in part because they exhibit
a typical speech rate, comparable to the “normal rate” condition in Smith and Kleinow
(2000; see their Figure 1). Oscillograms and spectrograms for the four tokens are provided
in Appendix B. Token 1 and Token 2 were produced by a female AWS and a male AWNS,
respectively. Token 3 and Token 4 were produced by the same female AWS and a different
male AWNS. The disfluent tokens, Token 3 and Token 4, had a naturally occurring longer
gap between “Buy” and “Bobby” than the fluent tokens, Token 1 and Token 2. The fluent
and disfluent tokens were categorized by an SLP with five years of experience and expertise
in stuttering intervention and confirmed by two additional SLPs. The gaps occurred between
the words “Buy” and “Bobby,” and this position was used for all tokens. The gap began at
the conclusion of the diphthong /aɪ/ in “Buy” and ended at the stop release for the first /b/ in
“Bobby,” confirmed by no visible acoustic energy on the spectrogram. No other disfluencies
occurred within the tokens besides the gap between the words “Buy” and “Bobby.” These
tokens were manipulated by adding or deleting silence using Audacity® (Audacity Team,
1999). Gap duration was systematically manipulated in increments of 20 ms across
the four tokens, except for the gap in Token 4, which was first decreased by 150 ms and
then by 20 ms in each subsequent manipulation. The purpose of reducing the gap duration for
Token 4 first by 150 ms was to increase the similarity in speech rate between Token 4 and
Tokens 1–3. Gap times were increased for the fluent tokens and decreased for the disfluent
tokens. Appendix A includes utterance durations and manipulated gap durations both as
absolute values and as percentages of total utterance durations. Calculating gap duration as
a percentage of the total utterance accounted for potential differences in speaker rate across
samples.
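The manipulation scheme can be sketched as a small helper that steps a gap up or down in 20 ms increments and expresses each gap as a percentage of utterance duration. This is an illustration, not the authors' procedure (the stimuli were edited in Audacity). The starting gaps of 23.59 ms (Token 1) and 185.33 ms (Token 3) are inferred from the manipulation values reported in the Results, the function names are hypothetical, and the 1000 ms utterance duration is a placeholder, since the actual durations appear only in Appendix A.

```python
def gap_series(original_gap_ms, step_ms=20.0, n_versions=8, lengthen=True):
    """Return gap durations (ms) for the original token plus its manipulations."""
    sign = 1 if lengthen else -1
    return [round(original_gap_ms + sign * step_ms * i, 2) for i in range(n_versions)]


def gap_percent(gap_ms, utterance_ms):
    """Express a gap as a percentage of total utterance duration."""
    return 100.0 * gap_ms / utterance_ms


# Fluent token: silence is added, so the gap grows with each manipulation.
token1_gaps = gap_series(23.59, lengthen=True)
# Disfluent token: silence is deleted, so the gap shrinks with each manipulation.
token3_gaps = gap_series(185.33, lengthen=False)

# Placeholder duration (assumption); real values are listed in Appendix A.
HYPOTHETICAL_UTTERANCE_MS = 1000.0

print(token1_gaps)
print(token3_gaps)
print(round(gap_percent(token1_gaps[0], HYPOTHETICAL_UTTERANCE_MS), 2))
```

Token 4 is left out of the sketch because its first step differs from the 20 ms increment used elsewhere.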

2.3. Procedure
Each experiment lasted approximately 30 minutes. The experiment was self-paced and took
place in a quiet location of the participants’ choosing (e.g., an office, their home) or in
a research room in the NYU Communicative Sciences and Disorders Speech-Language
Hearing Clinic. For the duration of the experiment, participants were seated in front of a
MacBook Air 13-inch laptop computer (Apple). To minimize distraction, a plastic keyboard
cover hid all laptop keys except for the “D” and “F” keys. Participants used the “D” laptop
key to indicate if the utterance was disfluent and the “F” key to indicate fluent. The test
procedure was created using the software PsychoPy v3.0.0 (Peirce & MacAskill, 2018).
Before beginning the experiment, participants were verbally provided with the following
instructions and definition of fluency:

Disfluency (versus fluency) can be reflected as pauses or hesitations in speech, for
example, in the middle of a word or phrase. Disfluencies are sometimes obvious,
sometimes they are very subtle. You will be hearing several versions of the same
utterance, “Buy Bobby a puppy.” After you hear an utterance please decide if
what you heard was fluent or disfluent by pressing the “D” or “F” keys on the
laptop. These instructions will be on the screen when you begin. Do you have any
questions?

Whereas other definitions of fluency typically include additional characteristics (e.g.,
automaticity, continuity, effort; Fillmore, 1979; Starkweather, 1987), the provided definition
focused on pauses and hesitations, because gap duration was the only marker of disfluency,
and the only parameter that was manipulated. The same instructions were provided on
the computer screen for participants to read until the stimuli were presented. Participants
completed five practice trials and then the stimuli were presented in a randomized order. For
each trial, the different versions of “Buy Bobby a puppy” were presented and followed by
the response screen (i.e., a grey screen with the white text “Disfluent (D) or Fluent (F)?”).
The participant was prompted to select ‘D’ or ‘F’ on the laptop. Each of the eight versions of the four
original utterances (i.e., the original plus seven manipulations) was presented ten times per
participant for a total of 320 trials (i.e., 8×4×10). Due to a computer error, one naïve participant heard only eight
repetitions of each version, totaling 256 trials (i.e., 8×4×8). The trials were divided
into four blocks, with a self-paced break offered in between each block. Participants had
unlimited time to select their response, although the sound files could not be replayed.
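The trial structure described above (8 versions × 4 tokens × 10 repetitions, randomized and split into four blocks) can be sketched in plain Python. This is not the authors' PsychoPy script. Equal block sizes and randomization over the whole trial list rather than within blocks are assumptions, and all names are hypothetical.

```python
import random

N_TOKENS, N_VERSIONS, N_REPS, N_BLOCKS = 4, 8, 10, 4


def build_blocks(seed=0):
    """Build a shuffled trial list and split it into equally sized blocks."""
    trials = [
        (token, version)
        for token in range(1, N_TOKENS + 1)
        for version in range(1, N_VERSIONS + 1)
        for _ in range(N_REPS)
    ]
    rng = random.Random(seed)  # fixed seed so the order is reproducible
    rng.shuffle(trials)
    block_size = len(trials) // N_BLOCKS
    return [trials[i * block_size:(i + 1) * block_size] for i in range(N_BLOCKS)]


blocks = build_blocks()
print(len(blocks), [len(block) for block in blocks])
```

Setting `N_REPS = 8` reproduces the 256-trial session that one naïve participant completed.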

2.4. Data Analysis
2.4.1. Disfluency threshold – Research Question 1—To examine the relationship
between gap duration and fluency perception, calculated as percent [of trials perceived as]
fluent, the Spearman correlation coefficient was determined. The variable, “percent fluent”
was calculated per manipulation, per participant. For example, if a participant marked nine
out of 10 trials fluent (versus disfluent) for the utterance containing a gap of 23.59 ms, the
“percent fluent” for that particular utterance was 0.9. To determine the disfluency threshold,
a logistic regression model was fit using the generalized linear model (glm) function in
the lme4 package (Bates et al., 2014) in R (R Core Team, 2014). This model included data
from all tokens, with absolute gap duration as the independent variable and response (i.e.,
fluent or disfluent) as the outcome variable. A logistic regression curve was plotted using
the ggplot2 package (R Core Team, 2014; Wickham, 2016). The disfluency threshold was
determined as the point at which the logistic regression curve crossed the 50th percentile,
found using the predict function with the findInt and uniroot numerical solvers (Brent,
1973). The disfluency threshold was calculated both for gap duration in absolute length
and gap duration as a percent of the total utterance; the latter was used as a normalized
metric to account for speech rate. The 50th percentile indicated that in 50% of trials, the
utterance was categorized as disfluent. In addition, thresholds of disfluency with higher
certainty (i.e., 80%, 90%, 95%) were calculated. To compare the impact of each incremental
(20 ms) manipulation on fluency judgments, pairwise t-tests with Bonferroni correction were
applied.
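The threshold computation can be illustrated with a minimal sketch: fit a two-parameter logistic curve to binary responses, then find the gap at which the predicted probability of a "disfluent" response reaches a target value. The analysis itself was run in R; the Python below is a stand-in that uses a hand-written Newton-Raphson fit and plain bisection in place of R's fitting routine and its Brent-type root finder. The data are synthetic and all function names are hypothetical.

```python
import math


def fit_logistic(x, y, n_iter=100, tol=1e-12):
    """Fit P(y=1 | x) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton-Raphson."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((xi - mean) ** 2 for xi in x) / n)
    z = [(xi - mean) / sd for xi in x]  # standardize for numerical stability
    a0, a1 = 0.0, 0.0                   # coefficients on the standardized scale
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for zi, yi in zip(z, y):
            p = 1.0 / (1.0 + math.exp(-(a0 + a1 * zi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient of the log-likelihood
            g1 += (yi - p) * zi
            h00 += w                # entries of the information matrix
            h01 += w * zi
            h11 += w * zi * zi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        a0, a1 = a0 + d0, a1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    # Convert back to the original (ms) scale.
    return a0 - a1 * mean / sd, a1 / sd


def threshold(b0, b1, p=0.5):
    """Closed-form gap (ms) at which the fitted curve reaches probability p."""
    return (math.log(p / (1.0 - p)) - b0) / b1


def threshold_by_bisection(b0, b1, p=0.5, lo=0.0, hi=400.0, tol=1e-9):
    """Same threshold found numerically, as a root finder would."""
    def f(gap):
        return 1.0 / (1.0 + math.exp(-(b0 + b1 * gap))) - p
    if f(lo) * f(hi) > 0:
        raise ValueError("root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0


# Synthetic data (assumption): 10 trials per gap, coded 1 = disfluent.
gaps = [40, 80, 120, 160, 200]
disfluent_counts = [1, 3, 5, 7, 9]
x, y = [], []
for gap, k in zip(gaps, disfluent_counts):
    x += [gap] * 10
    y += [1] * k + [0] * (10 - k)

b0, b1 = fit_logistic(x, y)
t50 = threshold(b0, b1, 0.5)
t80 = threshold(b0, b1, 0.8)
t50_numeric = threshold_by_bisection(b0, b1, 0.5)
print(round(t50, 2), round(t80, 2))
```

For a single predictor the root has a closed form, so the numerical solver is only needed when the curve comes from a more complex model; both routes are shown here so they can be checked against each other.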

2.4.2. Impact of Listener Background and Experience – Research Question 2—Pairwise t-tests
were used to compare responses between the AWS, the SLPs, and the
naïve listeners within tokens. Additionally, disfluency thresholds were determined per group
and per token using a glm model function with gap duration as the independent variable
and response (i.e., fluent or disfluent) as the outcome variable. The predict function with the
findInt and uniroot numerical solvers was used to determine disfluency thresholds (Brent,
1973).

Group differences across tokens were compared using the generalized linear mixed model
(glmer) function in the lme4 package (Bates et al., 2014), and p-values were calculated
with the lmerTest package for R (Kuznetsova et al., 2017), using the Satterthwaite approximation for the degrees of freedom
of the t distribution. A mixed-effects logistic regression
model was chosen to further assess group differences as it allows for the test of several
variables on a binary outcome variable (i.e., fluent or disfluent) while incorporating fixed
and random effects. Our model included response (i.e., fluent or disfluent) as the outcome
variable with absolute gap duration and group as fixed factors. We used a manual stepwise
regression procedure to select the fixed effects of our model. This model fitting procedure
involves testing each variable in an additive manner and comparing Akaike information
criterion (AIC) values. Random intercepts were included for the variables participant and
token as both are repeated measures within our data (Harel & McAllister, 2019). Including
participant and token as random factors accounts for within-participant and within-token
variability. Additionally, the model fit was estimated by calculating R-squared values
(Nakagawa et al., 2017) using the r.squaredGLMM function in the MuMIn package (Bartoń,
2020). Group comparisons included AWS versus SLPs, AWS versus naïve listeners, and naïve listeners
versus SLPs, using a releveling procedure. Lastly, the kappa statistic was calculated for each
group to test interrater reliability of fluency judgments within that group (McHugh, 2012).
See supplemental material for statistical analyses.
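As a reminder of what the kappa statistic measures, the sketch below computes Cohen's kappa for two listeners' fluent (F) and disfluent (D) judgments of the same ten stimuli. The ratings are invented for illustration, and how pairwise values were aggregated across the 20 listeners in each group is not described here, so this shows only the two-rater case.

```python
def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    labels = set(ratings_a) | set(ratings_b)
    # Chance agreement: product of each rater's marginal rate, summed over labels.
    expected = sum(
        (ratings_a.count(label) / n) * (ratings_b.count(label) / n)
        for label in labels
    )
    return (observed - expected) / (1.0 - expected)


# Hypothetical judgments from two listeners on the same ten stimuli.
listener_a = list("FFFFFDDDDD")
listener_b = list("FFFFDDDDDF")

kappa = cohens_kappa(listener_a, listener_b)
print(round(kappa, 2))
```

Here the listeners agree on 8 of 10 stimuli, but half of that agreement is expected by chance, which is why kappa is lower than the raw agreement rate.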

3. Results
3.1. Disfluency threshold – Research Question 1
Unsurprisingly, there was a negative correlation between gap duration and percent
designated as fluent across all tokens (r = −.688, p < .001). The 50% disfluency threshold
occurred at a gap duration of 126.46 ms (or 11.52% of the total utterance) across all
participants and tokens, as shown in Figure 1. This disfluency threshold indicates a gap
duration at which 50% of the utterances were categorized as disfluent. The 80% fluent
and disfluent thresholds were indicated by gap durations of 67.41 ms (6.66% of the total
utterance) and 185.50 ms (16.38% of the total utterance), respectively. The 90% fluent
and disfluent thresholds were indicated by gap durations of 32.88 ms (3.82% of the total
utterance) and 220.04 ms (19.21% of the total utterance), respectively. The 95% fluent
and disfluent thresholds were indicated by gap durations of 24.14 ms (2.57% of the total
utterance) and 251.87 ms (21.83% of the total utterance), respectively.
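Because a logistic curve is symmetric about its 50% point, the reported thresholds can be cross-checked against one another. The sketch below assumes a single two-parameter logistic curve, derives its slope from the reported 50% and 80% disfluent thresholds, and then predicts the 90% thresholds. The slope is a derived quantity, not a value reported in the text, and the names are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


T50_MS = 126.46            # reported 50% disfluency threshold
T80_DISFLUENT_MS = 185.50  # reported 80% disfluent threshold

# Slope implied by the two reported thresholds (log-odds per ms).
slope_per_ms = logit(0.80) / (T80_DISFLUENT_MS - T50_MS)


def predicted_threshold(p_disfluent):
    """Gap (ms) at which the implied curve predicts p_disfluent."""
    return T50_MS + logit(p_disfluent) / slope_per_ms


pred_90_disfluent = predicted_threshold(0.90)  # compare with reported 220.04 ms
pred_90_fluent = predicted_threshold(0.10)     # compare with reported 32.88 ms
print(round(pred_90_disfluent, 2), round(pred_90_fluent, 2))
```

Both predictions agree with the reported 90% values to within rounding, as expected if all thresholds were read from the same fitted curve.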

Significant differences were primarily found in responses (fluent/disfluent) between
consecutive gap manipulations at or around our proposed disfluency threshold (i.e., the
point in time where judgments tended to shift from fluent to disfluent; Table 1). For Token
1, significant differences were found between consecutive gap manipulations 3 (63.59 ms)
and 4 (83.59 ms; t = 4.76, p = .001, 95% CI [0.107, 0.260]), 4 and 5 (103.59 ms; t = 3.06, p
= .010, 95% CI [0.055, 0.255]), and 5 and 6 (123.59 ms; t = 2.81, p = .009, 95% CI [0.046,
0.266]), after Bonferroni correction for multiple comparisons. For Token 2, a significant
difference was observed between consecutive manipulations 4 (98.2 ms) and 5 (118.2 ms; t
= 2.86, p = .100, 95% CI [0.045, 0.248]). For Token 3, a significant difference was found
between gap manipulation 3 (145.33 ms) and 4 (125.33 ms; t = 2.97, p = .009, 95% CI
[0.052, 0.262]). For Token 4, significant differences were found between manipulation 2
(175.44 ms) and 3 (125.44 ms; t = 4.07, p = .001, 95% CI [0.049, 0.210]) and 5 (85.44 ms)
and 6 (65.44 ms; t = 3.14, p = .006, 95% CI [0.112, 0.221]).
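The Bonferroni adjustment applied to these consecutive comparisons can be sketched as follows. With eight versions per token there are seven consecutive pairs. Whether the correction was applied across the seven comparisons within a token or across all tokens is not stated, so the per-token family used here is an assumption, and the raw p-values are invented for illustration.

```python
def consecutive_pairs(n_versions=8):
    """Pairs of adjacent manipulations: (1, 2), (2, 3), ..., (7, 8)."""
    return [(i, i + 1) for i in range(1, n_versions)]


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


pairs = consecutive_pairs()

# Hypothetical uncorrected p-values, one per consecutive pair.
raw_p = [0.20, 0.04, 0.001, 0.003, 0.0005, 0.30, 0.60]
adjusted_p = bonferroni(raw_p)

for pair, p_adj in zip(pairs, adjusted_p):
    print(pair, round(p_adj, 4), "significant" if p_adj < 0.05 else "n.s.")
```

In this made-up example the second comparison is significant before correction (p = .04) but not after, which illustrates why only the steepest part of the response curve tends to survive the adjustment.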

3.2. Group differences – Research Question 2


As shown in Table 2, the within-token t-tests revealed significant differences between SLPs
and AWS for three out of four tokens, and differences between naïve listeners and SLPs for
two out of four tokens. No differences were observed between the AWS and naïve groups.
The SLPs consistently had the highest thresholds for disfluency and the AWS had the lowest
thresholds (Table 2). Figure 2 plots, for each token and group, the mean percentage of items
judged fluent at each manipulation against gap duration, with error bars
representing standard error of the mean. Visual inspection of the data indicated that SLPs
required longer gap times to mark utterances as disfluent in all tokens aside from Token 2,
where responses were similar between groups.

The mixed model, which accounted for within-participant and within-token variability, revealed
that the SLPs marked fewer utterances as disfluent than the AWS as gap duration increased
(β = 0.580, z = 2.047, p = .042, 95% CI [0.017, 1.552]). There was no evidence of a
difference between naïve listeners and the AWS group (β = 0.174, z = 0.615, p = .538, 95%
CI [−0.518, 1.016]) or between naïve listeners and the SLP group (β = 0.406, z = 1.431, p = .152, 95% CI
[−0.518, 1.016]). Regarding model fit, our model led to a marginal R-squared value of 0.391
and a conditional R-squared value of 0.590.
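Coefficients from a logistic mixed model are on the log-odds scale, so exponentiating them gives odds ratios that are often easier to interpret. The short sketch below converts the reported group coefficients. It assumes the coefficients are ordinary log-odds contrasts between groups; which response level served as the reference category is not stated here, so the direction of each ratio should be read with that caveat.

```python
import math


def odds_ratio(beta):
    """Convert a log-odds coefficient to an odds ratio."""
    return math.exp(beta)


# Reported group coefficients (log-odds scale).
beta_slp_vs_aws = 0.580
beta_naive_vs_aws = 0.174

or_slp_vs_aws = odds_ratio(beta_slp_vs_aws)
or_naive_vs_aws = odds_ratio(beta_naive_vs_aws)
print(round(or_slp_vs_aws, 3), round(or_naive_vs_aws, 3))
```

The same transformation applies to confidence limits: an interval that excludes 0 on the log-odds scale excludes 1 on the odds-ratio scale.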

Regarding consistency within groups, Cohen’s kappa indicated that AWS exhibited the
greatest consistency in fluency judgments (kappa = 0.54) as compared with the naïve group
(kappa = 0.45) and SLPs (kappa = 0.44). The kappa values for all three groups indicate a
moderate level of agreement as they fall between 0.41–0.60 (McHugh, 2012).

4. Discussion
The purposes of this study were to determine a gap duration threshold for when speech
perception is likely to shift from fluent to disfluent within a particular set of stimuli, and to
assess the impact of listener background and experience on this threshold. Judgments were
made based on short utterances with typical speech rates containing gaps with range and
frequency parameters specific to this study. Disfluency perception operates on a continuum;
however, across groups and tokens, judgments tended to change from fluent to disfluent at
a gap duration of 126.46 ms or 11.52% of the total duration of the utterance. The SLPs
exhibited higher disfluency thresholds than the AWS. There was some evidence that the
naïve group exhibited lower thresholds than SLPs, but this result was token-dependent
and impacted by within-participant variation. These findings are discussed in greater detail
below.

4.1. Disfluency threshold – Research Question 1


Gap duration, particularly at non-syntactic boundaries, is a critical indicator of whether
speech is perceived as fluent or disfluent. The current study systematically manipulated gap
duration of a single utterance and was therefore able to estimate a point in time, based on
the current set of stimuli, at which judgments tended to shift from fluent to disfluent. We
found that utterances containing gaps of approximately 126 ms or greater are likely to be
perceived as disfluent. In the only previous study to manipulate gap duration, Lövgren and
Doorn (2005) found that short speech samples with pauses ranging from 98 – 479 ms were
judged to be disfluent significantly more than speech samples with pauses ranging from 27 –
100 ms. While Lövgren and Doorn (2005) systematically manipulated gap duration, fluency
judgments were based on entire speech samples (15–20 s in length), within which there were
gaps of differing durations. In contrast, the utterances in the present study contained only one
gap, allowing for a more specific disfluency threshold to be proposed.

Interestingly, 20 ms increases in gap duration, specifically those in close proximity to gap
duration thresholds, had significant impacts on disfluency judgments. For example, a shift
from a gap duration of 103.59 ms to 123.59 ms for Token 1 contributed to significantly
more disfluent judgments, but a shift from a gap duration of 143.59 ms to 163.59 ms did
not contribute to significantly more disfluent judgments. Thus, utterances containing gaps
close to the proposed threshold are more ambiguous, which is in line with previous work on
categorical perception (Rosen & Howell, 1981). It may be that utterances with gap durations
near the disfluency threshold, as identified in the current study, represent tenuous fluency
(Adams & Runyan, 1981), or speech that appears fluent but is on the cusp of breaking
down and may only be identifiable using physiological measurements (e.g., motion tracking,
acoustic measurements). Adding a threshold such as that determined in the current study may
improve investigators’ ability to identify instances of tenuous fluency.

Importantly, fluency judgments are driven by context, and therefore, it is unlikely that there
is a fixed threshold at which judgments shift from fluent to disfluent. For example,
changes in speaking rate may impact fluency judgments. The tokens in the current study
were in part chosen because they reflect a typical speaking rate (i.e., approximately 6
syllables/second). These rates are comparable to the “normal” rate of the same utterance
(“Buy Bobby a puppy”) used in Smith and Kleinow (2000). Further, we included gap
duration as a percentage of utterance duration to account for the subtle changes in speech
rate across the utterances. These results mirrored those for the absolute measure, suggesting
that disfluency perception in the present study may be based on absolute time and not
dependent upon speech rate. Future studies could examine greater ranges of speech rate or
duration – for example, “slow” and “fast” rates as in Smith and Kleinow (2000).
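
The relative measure described above can be sketched in a few lines of Python. The sentence and gap durations are taken from Appendix A; the function name and the selection of rows are illustrative choices.

```python
def gap_percent(gap_ms, utterance_ms):
    """Gap duration expressed as a percentage of total utterance duration."""
    return 100.0 * gap_ms / utterance_ms


# (token, sentence duration in ms, gap duration in ms), from Appendix A.
rows = [
    (1, 930.00, 23.59),
    (1, 1030.00, 123.59),
    (3, 1079.33, 125.33),
    (4, 1076.53, 125.44),
    (4, 1276.53, 325.44),
]

for token, utterance_ms, gap_ms in rows:
    pct = gap_percent(gap_ms, utterance_ms)
    print(f"Token {token}: {gap_ms:7.2f} ms gap = {pct:5.2f}% of {utterance_ms:.2f} ms")
```

Because each manipulation lengthens or shortens the whole utterance by the same amount as the gap, gaps of similar absolute duration map onto slightly different percentages for tokens with different speech portions.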

In addition, the disfluency threshold determined here only applies to the range and frequency
of gap durations used in the stimuli. That is, a different range of gap durations may have
yielded a different gap duration threshold. It should be noted, however, that range effects
observed in previous studies have been due to responses being at the center of the range and
impacted by the order of stimuli presented (e.g., Bailey et al., 1977; Brady & Darwin, 1978).
Our trials were presented randomly and the disfluency threshold was not in the middle of
the range of gap durations (i.e., median = 104.46 ms). Further, while all utterance versions
were presented the same number of times, disfluency perception in natural, spontaneous
speech could be influenced by frequency effects. Pauses in spontaneous speech will not
have the same frequency distribution for durations as in the current study and this could
affect threshold estimates (e.g., Durlach & Braida, 1969; Parducci, 2012; Parducci et al.,
1960). The threshold determined in the current study may not reflect a threshold determined
during spontaneous speech. Additionally, the gap in the current study occurred in the same
place between the same two words for all stimuli (i.e., the gap began at the diphthong /aɪ/
in “Buy” and ended at the stop release for the first /b/ in “Bobby”). Future work should
examine the perception of disfluency including gaps of differing quantities and durations, as
well as examine gaps between other types of phonemes or gaps in additional word positions
and within different utterances.
The threshold proposed in this study could be used in conjunction with other parameters
of fluency (e.g., speech rate, prosody) as a step in understanding whether speech is likely
to be perceived as fluent or disfluent. When connected speech with gap durations near
the disfluency threshold (i.e., 126 ms) is included in research, the speech samples should
be examined with increased attention as fluency judgments may shift around gaps of this
duration. Utterances with gaps around 126 ms are not likely to be unanimously perceived
as fluent or disfluent. For example, this threshold is relevant for research that compares the
speech motor skills of AWS and AWNS, which typically excludes disfluent speech from
analysis (e.g., Few & Lingwall, 1972; Healey & Ramig, 1986; Love & Jeffress, 1971; Peters
et al., 1989; Smith et al., 2010; Young, 1964; Zimmermann & Hanley, 1983). Objective
measures of disfluency are especially important for research examining speech variability
over repeated trials, because variations in the timing of trials, such as insertions of pauses,
can impact variability measures (e.g., Jackson, Tiede, Riley, et al., 2016; Kleinow & Smith,
2000; Lucero, 2005). Studies with the goal of examining subtle speech differences between
AWS and AWNS could possibly benefit from considering the gap duration thresholds
proposed in the current study.
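
One way a threshold of this kind might be used when screening speech samples is sketched below in Python. The 126.46 ms value is the threshold reported in this study; the 20 ms margin (chosen to match the step size of the manipulations), the three labels, and the function name are assumptions for illustration, and the threshold itself applies only to stimuli like those used here.

```python
THRESHOLD_MS = 126.46  # disfluency threshold reported in the current study


def classify_gap(gap_ms, threshold_ms=THRESHOLD_MS, margin_ms=20.0):
    """Label a gap relative to the threshold.

    margin_ms is a hypothetical tolerance: gaps within this distance of the
    threshold are flagged for closer inspection rather than classified.
    """
    if abs(gap_ms - threshold_ms) <= margin_ms:
        return "ambiguous"
    if gap_ms > threshold_ms:
        return "likely disfluent"
    return "likely fluent"


# Gap durations (ms) taken from Appendix A.
for gap in (23.59, 105.33, 125.33, 143.59, 325.44):
    print(f"{gap:7.2f} ms -> {classify_gap(gap)}")
```

Flagging a band around the threshold, rather than applying a hard cut-off, reflects the observation that judgments near 126 ms are not likely to be unanimous.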

4.2. Group differences – Research Question 2


The second aim of this study was to determine if listeners with different backgrounds differ
in their perceptions of disfluency. The SLPs exhibited higher disfluency thresholds than the
AWS, and there was some evidence that they exhibited higher thresholds than the naïve
listeners. It appears that the professional training of SLPs does not make them more likely
to report speech disfluencies. This is a curious finding. The SLP participants had a range
of years of clinical experience, and none of them specialized in fluency disorders. Our
findings are consistent with Brundage et al. (2006), which showed that only SLPs with
expertise in fluency disorders are more skilled at identifying disfluencies. The focus of
SLP education is generally on atypical or stutter-like disfluencies, as opposed to typical
disfluencies. This focus on atypical disfluencies could have led generalist SLPs to disregard
short pauses in speech, despite the task instructions indicating that disfluency can include
quick and subtle pauses in speech. An alternative explanation is that SLPs are accustomed
to encouraging people who stutter, for example, by being less critical of disfluencies. The
SLPs in the current study could have been striving to be less critical of speech that deviates
from “typical” speech, perhaps due to the integration of counseling and empathy-building in
fluency disorders courses. On the other hand, AWS and naïve listeners do not have this
training or background. Moreover, AWS are often overly critical of themselves and their speech
(Lickley et al., 2005).

Regarding the differences between the SLPs and the naïve listeners, the tokens for which
there were differences between groups were produced by a female speaker (Token 1/Token
3), whereas the tokens for which there were no differences were both produced by male
speakers (Token 2/Token 4). It may be that gender contributed to the differences, or that
specific characteristics of the female speaker contributed to the differences. Given the
limited number of speakers, and that the only parameter manipulated was gap duration, we
cannot answer this question. Future work could increase the number of stimuli by using
more speakers, particularly male speakers as the majority of adult stutterers are male, as
well as manipulate other parameters (e.g., prosody) to determine their impact on fluency
perception.

4.3. Considerations
Four discrete tokens were used; therefore, it is possible that other speech parameters, in
addition to gap duration, contributed to the participants’ perception of disfluency. Two
tokens were produced by an AWS and two by AWNS, and two tokens were produced by
a male and two by a female speaker. Fundamental frequency or resonances of the speakers
could have contributed to differences in responses between tokens. Additionally, because the
majority of AWS are male, participants may be more accustomed to hearing disfluencies
in the speech of males, which may have impacted responses. Further, although the tokens
contained neutral stress patterns, gaps in speech can be perceived as a characteristic of
stress (Duez, 1985; Lickley, 1994) and it is possible that participants interpreted gaps
as markers of stress as opposed to markers of disfluency. Specifically, participants could
have perceived the gap between “Buy” and “Bobby” to indicate the following word (i.e.,
“Bobby”) as stressed versus identifying the utterance as disfluent. Future research could
examine other parameters of fluency (e.g., speech rate, prosody). For example, prosody
could be systematically manipulated while keeping other parameters of fluency consistent
(e.g., gap duration, speech rate), to determine how changes in prosody impact perception of
short utterances as fluent or disfluent.
Lastly, the provided definition of fluency in the instructions may have differentially
impacted the groups. The instructions aimed to draw participants’ attention to pauses or
hesitations in speech as gap duration was the only marker of disfluency within the stimuli,
as well as the only manipulated variable. However, due to their personal and professional
experiences, the AWS or SLPs may have judged the utterances based on pre-existing
definitions of fluency, whereas the naïve group may have judged the utterances based on
the provided definition of fluency. It is therefore possible that the differences found between
groups were impacted by listeners applying different guidelines for fluency and disfluency.

5. Conclusion
This study estimated a threshold for the perception of disfluency based on short utterances
spoken at typical rates, which contained manipulated gaps within a study-specific range.
This threshold approximates when the perception of these utterances shifted from fluent to
disfluent, based on gap duration. SLPs exhibited the highest disfluency thresholds, indicating
that generalist SLPs are less inclined than naïve listeners or AWS to identify disfluencies in
speech. Future work could include greater variation in gap durations and speakers to further
study the impact of context on the perception of disfluency.

Supplementary Material
Refer to Web version on PubMed Central for supplementary material.

Acknowledgments:
The authors thank all of the individuals who participated in this study. This work was funded in part by NIH grant
DC-002717 to Haskins Laboratories.

Bios
Haley J. Warner is a doctoral student in the stuttering and vvariability (savvy) lab at New
York University and a speech-language pathologist. She received her master’s degree in
Communicative Sciences and Disorders from New York University.

D.H. Whalen is a Distinguished Professor of Speech-Language-Hearing Sciences at the
City University of New York’s Graduate Center. He is also affiliated with the Linguistics
Departments of CUNY and of Yale University. At the independent research institute Haskins
Laboratories in New Haven, CT, he is Vice-President of Research.

Daphna Harel is an Associate Professor of Applied Statistics at the Department of Applied
Statistics, Social Science, and Humanities at New York University. Her research focuses on
modeling and measurement in a variety of health-related fields as well as the social sciences.

Eric S. Jackson, PhD, CCC-SLP is an Assistant Professor and Director of the stuttering
and vvariability (savvy) lab at NYU. His research program investigates the contextual
variability of stuttering and the factors that drive variability, including social interaction and
anticipation. Dr. Jackson is also a speech-language pathologist with more than 10 years of
experience.

Appendix A.: Stimuli, duration of gap between “Buy” and “Bobby.” With
un-altered/original productions highlighted in grey.

Token  Manipulation number  Sentence Duration (ms)  Gap Duration (ms)  Gap Duration (% of utterance)
1      1                    930                     23.59              2.54%
1      2                    950                     43.59              4.59%
1      3                    970                     63.59              6.56%
1      4                    990                     83.59              8.44%
1      5                    1010                    103.59             10.26%
1      6                    1030                    123.59             12.00%
1      7                    1050                    143.59             13.68%
1      8                    1070                    163.59             15.29%
2      1                    1053.69                 38.2               3.63%
2      2                    1073.69                 58.2               5.42%
2      3                    1093.69                 78.2               7.15%
2      4                    1113.69                 98.2               8.82%
2      5                    1133.69                 118.2              10.43%
2      6                    1153.69                 138.2              11.98%
2      7                    1173.69                 158.2              13.48%
2      8                    1193.69                 178.2              14.93%
3      1                    1139.33                 185.33             16.27%
3      2                    1119.33                 165.33             14.77%
3      3                    1099.33                 145.33             13.22%
3      4                    1079.33                 125.33             11.61%
3      5                    1059.33                 105.33             9.94%
3      6                    1039.33                 85.33              8.21%
3      7                    1019.33                 65.33              6.41%
3      8                    999.33                  45.33              4.54%
4      1                    1276.53                 325.44             25.49%
4      2                    1126.53                 175.44             15.57%
4      3                    1076.53                 125.44             11.65%
4      4                    1056.53                 105.44             9.98%
4      5                    1036.53                 85.44              8.24%
4      6                    1016.53                 65.44              6.44%
4      7                    996.53                  45.44              4.56%
4      8                    976.53                  25.44              2.61%


Appendix B.: Spectrograms and Oscillograms for each un-manipulated
utterance (i.e., “Buy Bobby a Puppy”).

References
Adams MR, & Runyan CM (1981). Stuttering and fluency: Exclusive events or points on a continuum?
Journal of Fluency Disorders, 6(3), 197–218.
American Speech-Language-Hearing Association. (2019). A demographic snapshot of SLPs. The
ASHA Leader, 24(7).
Bailey PJ, Summerfield Q, & Dorman M (1977). On the identification of sine-wave analogues of
certain speech sounds. Haskins Laboratories Status Report, SR-51/52, 1–25.
Bartoń K (2020). MuMIn: Multi-Model Inference (R package version 1.43.17) [Computer software]
https://CRAN.R-project.org/package=MuMIn
Bates D, Mächler M, Bolker B, & Walker S (2014). Fitting Linear Mixed-Effects Models using lme4.
ArXiv:1406.5823 [Stat] http://arxiv.org/abs/1406.5823
Bosker HR, Quené H, Sanders T, & De Jong NH (2014). The perception of fluency in native and
nonnative speech. Language Learning, 64(3), 579–614.
Brady SA, & Darwin CJ (1978). Range effect in the perception of voicing. Journal of the Acoustical
Society of America, 63, 1556–1558.
Brent R (1973). Algorithms for Minimization without Derivatives.
Brown SL, & Colcord RD (1987). Perceptual comparisons of adolescent stutterers’ and nonstutterers’
fluent speech. Journal of Fluency Disorders, 12(6), 419–427.
Brundage SB, Bothe AK, Lengeling AN, & Evans JJ (2006). Comparing judgments of stuttering
made by students, clinicians, and highly experienced judges. Journal of Fluency Disorders, 31(4),
271–283. [PubMed: 16982086]
Campione E, & Véronis J (2002). A large-scale multilingual study of silent pause duration. Speech
Prosody 2002, International Conference.
Cordes AK, & Ingham RJ (1995). Judgments of Stuttered and Nonstuttered Intervals by Recognized
Authorities in Stuttering Research. Journal of Speech, Language, and Hearing Research, 38(1),
33–41.
Cordes AK, Ingham RJ, Frank P, & Ingham JC (1992). Time-Interval Analysis of Interjudge and
Intrajudge Agreement for Stuttering Event Judgments. Journal of Speech, Language, and Hearing
Research, 35(3), 483–494.
Cucchiarini C, Doremalen J. van, & Strik H. (2010). Fluency in non-native read and spontaneous
speech. DiSS-LPSS Joint Workshop 2010.
Duez D (1982). Silent and non-silent pauses in three speech styles. Language and Speech, 25(1),
11–28.
Duez D (1985). Perception of silent pauses in continuous speech. Language and Speech, 28(4), 377–
389. [PubMed: 3842875]
Durlach NI, & Braida LD (1969). Intensity Perception. I. Preliminary Theory of Intensity Resolution.
The Journal of the Acoustical Society of America, 46(2B), 372–383. [PubMed: 5804107]
Few LR, & Lingwall JB (1972). A further analysis of fluency within stuttered speech. Journal of
Speech and Hearing Research, 15(2), 356–363. [PubMed: 5047873]
Fillmore C (1979). On fluency. Individual Differences in Language Ability and Language Behavior,
81, 85–102.
Finn P, & Ingham RJ (1989). The Selection of “Fluent” Samples in Research on Stuttering: Conceptual
and Methodological Considerations. Journal of Speech, Language, and Hearing Research, 32(2),
401–418.
Goldman-Eisler F (1961). The distribution of pause durations in speech. Language and Speech, 4(4),
232–237.
Harel D, & McAllister T (2019). Multilevel Models for Communication Sciences and Disorders.
Journal of Speech, Language, and Hearing Research, 62(4), 783–801.
Healey C, & Ramig P (1986). Acoustic Measures of Stutterers’ and Nonstutterers’ Fluency in Two
Speech Contexts. Journal of Speech Language and Hearing Research, 29(3).
Hieke AE, Kowal S, & O’Connell DC (1983). The trouble with “articulatory” pauses. Language and
Speech, 26(3).
Howell P, & Wingfield T (1990). Perceptual and acoustic evidence for reduced fluency in the vicinity
of stuttering episodes. Language and Speech, 33(1), 31–46. [PubMed: 2283919]
Jackson ES, Tiede M, Beal D, & Whalen DH (2016). The Impact of Social–Cognitive Stress on
Speech Variability, Determinism, and Stability in Adults Who Do and Do Not Stutter. Journal of
Speech Language and Hearing Research, 59(6), 1295.
Jackson ES, Tiede M, Riley MA, & Whalen DH (2016). Recurrence Quantification Analysis of
Sentence-Level Speech Kinematics. Journal of Speech, Language, and Hearing Research, 59(6),
1315–1326.
Kleinow J, & Smith S (2000). Influences of Length and Syntactic Complexity on the Speech Motor
Stability of the Fluent Speech of Adults Who Stutter. Journal of Speech, Language, and Hearing
Research, 43(2), 548–559.
Krikorian CM, & Runyan CM (1983). A perceptual comparison: Stuttering and nonstuttering
children’s nonstuttered speech. Journal of Fluency Disorders, 8(4), 283–290.
Krivokapić J (2007). Prosodic planning: Effects of phrasal length and complexity on pause duration.
Journal of Phonetics, 35(2), 162–179. [PubMed: 18379639]
Kuznetsova A, Brockhoff PB, & Christensen R (2017). lmerTest Package: Tests in Linear Mixed
Effects Models. Journal of Statistical Software, 82(13), 1–26.
Lickley RJ (1994). Detecting disfluency in spontaneous speech [Ph.D., University of Edinburgh]
http://hdl.handle.net/1842/21358
Lickley RJ, Hartsuiker RJ, Corley M, Russell M, & Nelson R (2005). Judgment of Disfluency in
People who Stutter and People who do not Stutter: Results from Magnitude Estimation. Language
and Speech, 48(3), 299–312. [PubMed: 16416939]
Love LR, & Jeffress LA (1971). Identification of Brief Pauses in the Fluent Speech of Stutterers and
Nonstutterers. Journal of Speech and Hearing Research, 14(2), 229–240. [PubMed: 5558075]
Lövgren T, & Doorn JV (2005). Influence of manipulation of short silent pause duration on speech
fluency. Proc. DISS2005, 123–126.
Lucero JC (2005). Comparison of Measures of Variability of Speech Movement Trajectories Using
Synthetic Records. Journal of Speech, Language, and Hearing Research, 48(2), 336–344.
MacAskill MR, & Peirce JW (2018). Building Experiments in PsychoPy. Sage.
Martin JG, & Strange W (1968). The perception of hesitation in spontaneous speech. Perception &
Psychophysics, 3(6), 427–438.
McHugh ML (2012). Interrater reliability: The kappa statistic. Biochemia Medica, 22(3), 276–282.
[PubMed: 23092060]
Nakagawa S, Johnson PCD, & Schielzeth H (2017). The coefficient of determination R2 and intra-
class correlation coefficient from generalized linear mixed-effects models revisited and expanded.
11.
Parducci A (1965). Category judgment: A range-frequency model. Psychological Review, 72(6), 407–
418. [PubMed: 5852241]
Parducci A (2012). Contextual effects: A Range-Frequency Analysis. In Carterette E, Psychophysical
Judgment and Measurement (pp. 127–140). Elsevier.
Parducci A, Calfee RC, Marshall LM, & Davidson LP (1960). Context effects in judgment: Adaptation
level as a function of the mean, midpoint, and median of the stimuli. Journal of Experimental
Psychology, 60(2), 65. [PubMed: 14430377]
Peters HFM, Hulstijn W, & Starkweather CW (1989). Acoustic and Physiological Reaction Times of
Stutterers and Nonstutterers. Journal of Speech, Language, and Hearing Research, 32(3).
Prosek RA, & Runyan CM (1983). Effects of segment and pause manipulations on the identification of
treated stutterers. Journal of Speech, Language, and Hearing Research, 26(4), 510–516.
R Core Team. (2014). R: A language and environment for statistical computing.
http://www.R-project.org/
Robb M, & Blomgren M (1997). Analysis of F2 Transitions in the speech of stutterers and
nonstutterers. Journal of Fluency Disorders, 22(1), 1–16.
Rosen SM, & Howell P (1981). Plucks and bows are not categorically perceived. Perception &
Psychophysics, 30(2), 156–168. [PubMed: 7301516]
Ruder KF, & Jensen PJ (1972). Fluent and hesitation pauses as a function of syntactic complexity.
Journal of Speech and Hearing Research, 15(1), 49–60. [PubMed: 5012811]
Segalowitz N (2010). Cognitive bases of second language fluency. Routledge.
Smith A, Sadagopan N, Walsh B, & Weber-Fox C (2010). Increasing phonological complexity reveals
heightened instability in inter-articulatory coordination in adults who stutter. Journal of Fluency
Disorders, 35(1), 1–18. [PubMed: 20412979]
Starkweather CW (1987). Fluency and stuttering. Prentice-Hall, Inc.
Trofimovich P, & Baker W (2006). Learning second language suprasegmentals: Effect of L2
experience on prosody and fluency characteristics of L2 speech. Studies in Second Language
Acquisition, 28(1), 1–30.
Vasic N, & Wijnen FNK (2005). Stuttering as a monitoring deficit. In Hartsuiker RJ, Bastiaanse R,
Postma A, & Wijnen F (Eds.), Phonological encoding and monitoring in normal and pathological speech
(pp. 226–247). Hove: Psychology Press.
Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag.
https://ggplot2.tidyverse.org
Yairi E, Ambrose N, & Cox N (1996). Genetics of stuttering: A critical review. Journal of Speech and
Hearing Research, 39, 771–784. [PubMed: 8844557]
Young MA (1964). Identification of Stutterers from Recorded Samples of Their “Fluent” Speech.
Journal of Speech and Hearing Research, 7(3), 302–303. [PubMed: 14213509]
Young MA (1984). Identification of stuttering and stutterers. Nature and Treatment of Stuttering: New
Directions, 13–30.
Zimmermann GN, & Hanley JM (1983). A Cinefluorographic Investigation of Repeated Fluent
Productions of Stutterers in an Adaptation Procedure. Journal of Speech, Language, and Hearing
Research, 26(1), 35–42.

Highlights

• We report a threshold for disfluency perception based on a fixed set of stimuli
with equal frequency, for utterances with typical speech rates.

• Adults who stutter have a lower disfluency threshold for these materials than
speech-language pathologists.

• Background and life experience impact the perception of disfluency.


Figure 1.
Logistic regression curve based on the generalized linear model (glm) determined by
percentages of utterances selected as fluent against gap duration (ms) across all participants
and tokens. Horizontal line at the 50th percentile for fluency/disfluency and vertical dotted
line at the intersection of the curve and the 50th percentile to visualize the disfluency threshold.
Intersection value, or disfluency threshold, shown in figure in milliseconds and as a percentage of
the total utterance duration.

Figure 2.
Means of percentages of items selected fluent against gap duration (ms) for each token per
participant group. Error bars represent standard error of the mean. Horizontal line at the 50th
percentile for fluency/disfluency.

Table 1.

Results from pairwise t-tests with Bonferroni correction comparing the impact of each incremental (20 ms)
gap manipulation on disfluency judgments. Base manipulation indicates the original, unaltered utterance
production. Asterisk indicates significant differences (p < .05) in responses between manipulations.

Token  Manipulation  Gap Duration (ms)  Response differences between manipulations  95% CI
1      Base          23.59
1      2             43.59              1 – 2: t = 2.03, p = 1                      [0.001, 0.730]
1      3             63.59              2 – 3: t = 2.11, p = 1                      [0.003, 0.104]
1      4             83.59              3 – 4: t = 4.76, p = .001*                  [0.107, 0.260]
1      5             103.59             4 – 5: t = 3.06, p = .010*                  [0.055, 0.255]
1      6             123.59             5 – 6: t = 2.81, p = .009*                  [0.046, 0.266]
1      7             143.59             6 – 7: t = 0.07, p = 1                      [0.046, 0.266]
1      8             163.59             7 – 8: t = 1.91, p = .719                   [−0.106, 0.114]
2      Base          38.2
2      2             58.2               1 – 2: t = 1.10, p = 1                      [−0.031, 0.108]
2      3             78.2               2 – 3: t = 1.65, p = 1                      [−0.013, 0.148]
2      4             98.2               3 – 4: t = 1.88, p = 1                      [−0.005, 0.181]
2      5             118.2              4 – 5: t = 2.86, p = .100*                  [0.045, 0.248]
2      6             138.2              5 – 6: t = 2.34, p = .670                   [0.019, 0.232]
2      7             158.2              6 – 7: t = 2.10, p = 1                      [0.006, 0.224]
2      8             178.2              7 – 8: t = 1.70, p = 1                      [−0.014, 0.179]
3      Base          185.33
3      2             165.33             1 – 2: t = 1.95, p = .778                   [−0.002, 0.192]
3      3             145.33             2 – 3: t = 0.71, p = 1                      [−0.068, 0.143]
3      4             125.33             3 – 4: t = 2.97, p = .009*                  [0.052, 0.262]
3      5             105.33             4 – 5: t = 2.23, p = .246                   [0.013, 0.215]
3      6             85.33              5 – 6: t = 3.38, p = .018*                  [0.062, 0.236]
3      7             65.33              6 – 7: t = 2.78, p = 1                      [0.025, 0.149]
3      8             45.33              7 – 8: t = 3.02, p = 1                      [0.019, 0.091]
4      Base          325.44
4      2             175.44             1 – 2: t = 3.57, p = .398                   [−0.003, 0.120]
4      3             125.44             2 – 3: t = 4.07, p = .001*                  [0.049, 0.210]
4      4             105.44             3 – 4: t = 2.21, p = .165                   [0.058, 0.256]
4      5             85.44              4 – 5: t = 1.55, p = 1                      [−0.023, 0.187]
4      6             65.44              5 – 6: t = 3.14, p = .006*                  [0.112, 0.221]
4      7             45.44              6 – 7: t = 3.17, p = .065                   [0.095, 0.275]
4      8             25.44              7 – 8: t = 1.87, p = 1                      [0.046, 0.161]

Note. CI = confidence interval.
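
For readers unfamiliar with the Bonferroni correction named in the Table 1 caption, a minimal Python sketch follows. Each raw p-value is multiplied by the number of comparisons and capped at 1, which is consistent with the several "p = 1" entries in the table. The raw p-values and the number of comparisons below are hypothetical; the values the authors actually used are not stated in this section.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p-values, each capped at 1.0."""
    m = len(p_values)  # number of comparisons in the family
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for seven adjacent-step comparisons
# (manipulation 1 vs. 2, 2 vs. 3, ..., 7 vs. 8) within one token.
raw_p = [0.30, 0.21, 0.0002, 0.0015, 0.0013, 0.94, 0.10]
adjusted_p = bonferroni(raw_p)

for step, (raw, adj) in enumerate(zip(raw_p, adjusted_p), start=1):
    flag = "*" if adj < 0.05 else ""
    print(f"{step} - {step + 1}: raw p = {raw:.4f}, adjusted p = {adj:.4f}{flag}")
```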

Table 2.

Threshold for disfluency for each group (i.e., AWS, naïve, SLP) and results from pairwise t-tests comparing
groups. Disfluency threshold determined by the intersection between logistic regression curve and 50th
fluency percentile, separated by token. Fitted logistic regression curve was generated with gap duration
as the independent variable and response (i.e., fluent or disfluent) as the outcome variable. T-tests were
conducted, per token, comparing responses between groups. Asterisk indicates significant differences (p < .05)
in responses between groups.

                 Token 1                Token 2                Token 3                Token 4
AWS              113.15 ms              120.44 ms              130.25 ms              97.97 ms
AWS vs. Naïve    t = 0.31, p = .755     t = −0.81, p = .419    t = −0.37, p = .715    t = −0.94, p = .350
  95% CI         [−0.063, 0.087]        [−0.105, 0.044]        [−0.876, 0.060]        [−0.114, 0.040]
Naïve            114.37 ms              127.35 ms              134.12 ms              112.07 ms
Naïve vs. SLP    t = −2.24, p = .011*   t = −0.68, p = .499    t = −2.61, p = .009*   t = −1.91, p = .057
  95% CI         [0.022, 0.176]         [−0.050, 0.103]        [0.024, 0.172]         [−0.002, 0.158]
SLP              129.87 ms              134.27 ms              154.63 ms              131.55 ms
SLP vs. AWS      t = 2.24, p = .026*    t = 1.44, p = .151     t = 2.94, p = .004*    t = 2.76, p = .006*
  95% CI         [0.011, 0.163]         [−0.021, 0.135]        [0.037, 0.187]         [0.033, 0.196]

Note. CI = confidence interval.

