
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/226490067

A Data-Driven Methodology for Evaluating and Optimizing Call Center IVRs

Article  in  International Journal of Speech Technology · January 2002


DOI: 10.1023/A:1013674413897


All content following this page was uploaded by Bernhard Suhm on 29 April 2015.



INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY 5, 23–37, 2002

© 2002 Kluwer Academic Publishers. Manufactured in The Netherlands.

A Data-Driven Methodology for Evaluating and Optimizing Call Center IVRs∗

BERNHARD SUHM AND PAT PETERSON


BBN Technologies, Speech and Language Processing, 70 Fawcett Street, Cambridge, MA 02138, USA
bsuhm@bbn.com
patp@bbn.com

Received May 29, 2001; Revised August 22, 2001

Abstract. Usability of many call center IVRs (Interactive Voice Response systems) is dismal. Callers dislike
touch-tone IVRs and seek agent assistance at the first opportunity. However, because of high agent costs, call center
managers continue to seek automation with IVRs. The challenge for call centers is providing user-friendly, yet
cost-efficient, customer service. This article describes a comprehensive methodology for usability re-engineering
of telephone voice user interfaces based on detailed call center assessment and call flow redesign. At the core of our
methodology is a data-driven IVR assessment, in which we analyze end-to-end recordings of thousands of calls to
evaluate IVR cost effectiveness and usability. Because agent time is the major cost driver in call center operations,
we quantify cost-effectiveness in terms of agent time saved by automation in the IVR. We identify usability problems
by carefully inspecting user-path diagrams, a visual representation of the sequence of events of thousands of calls
as they flow through the IVR. Such an IVR assessment leads directly into call-flow redesign. Assessment insights
lead to specific suggestions on how to improve a call-flow design. In addition, the assessment enables us to estimate
the cost savings of a new design, thus providing the necessary business justification. We illustrate our IVR usability
and re-engineering methodology with examples from large commercial call centers, demonstrating how the staged
process maximizes the payback for the call center while minimizing risk.

Keywords: call centers, interactive voice response, assessment, user-centric design, usability re-engineering,
cost-benefit analysis

1. Introduction

Many companies have enthusiastically adopted touch-tone interactive voice response systems (IVRs), introduced
more than two decades ago, to increase customer service efficiency. Consumers, on the other hand, who have dealt
with many IVRs that are difficult to use, attempt to bypass the automated system and prefer to speak to live
agents. This paradox can be explained by considering that on the one hand, financial pressures force call centers
to cut operating costs, and on the other hand, IVR usability and its impact on call center operations is poorly
understood. While the demise of touch-tone IVRs has been predicted (Tatchell, 1996), they are still widespread.
Recently, with the maturation of speech recognition technology, speech-enabled IVRs have begun to replace
touch-tone IVRs in some domains, but they come with their own limitations, such as poor recognition accuracy
under noisy conditions, lack of privacy, and high upfront investment costs. Meanwhile, despite the acknowledged
dismal usability of many deployed telephone voice user interfaces,¹ call centers find it difficult to diagnose
usability problems and to redesign call flows to optimize both customer satisfaction and business benefit.

∗ The tools and processes described in this paper are the subject of pending patents.

Decision makers in call centers lack adequate information. Standard IVR performance reports do not capture
information on usability and lack sufficient detail. Call center managers are often misled to believe that the
existing IVR is performing well. Even if they recognize that something is wrong, they cannot identify the specific
problems, much less how to remedy them. Without understanding the value of usability and its impact on the
business, IVR usability engineering is rarely taken seriously.

Usability design and re-engineering know-how for IVRs ranges from style guides for touch-tone IVRs
(Halstead-Nussloch, 1989) to comprehensive collections of best practices for IVR design (Resnick and Virzi, 1995),
which also cover state-of-the-art speech-enabled IVRs (Balentine and Morgan, 1999). While applying best practices
can often improve IVR usability, the measurement and validation of those improvements can be quite difficult,
especially in the context of a production IVR with many different tasks and a wide range of callers. When IVR
design methods yield different, plausible designs, it is often impossible to decide which design works best just
by applying guidelines without some form of empirical evaluation.

1.1. Evaluation of Telephone Voice User Interfaces

A search of the literature reveals little research on the basic design problems in IVRs, such as prompting and
menu design in touch-tone IVRs. Some research identified guidelines for menu and form styles, including their
respective usability characteristics. Few standard usability evaluation methods have been applied to IVRs
(Edwards et al., 1997; Delogu et al., 1998). These methods have also been applied successfully to the evaluation
of research speech user interfaces, commonly believed to be the next generation of IVRs (Yankelovich et al., 1995;
Bennacef et al., 1996; Walker et al., 1997). However, standard usability tests, measuring task completion times
and rates in a laboratory study, are not practical for complex call center IVRs that offer many tasks and have to
accommodate a wide range of users. Usability walkthroughs (Nielsen, 1993), on the other hand, while fast and
inexpensive, do not provide any quantitative data and may miss subtle usability problems.

Instead of considering usability measures, call center managers commonly evaluate IVRs using reports generated by
various system components. Such reports typically contain measures such as “IVR utilization” and IVR/agent
“average handling time.” IVR utilization (or “IVR take-rate”) is commonly defined as the difference between the
percentage of callers entering the IVR and the percentage leaving the IVR to talk to a live agent. While often
interpreted as the success rate for serving callers in an automated fashion, IVR take-rate is a poor measure of
IVR automation, because callers hanging up in the IVR may not have received any useful information. In several
large call centers we have seen that the majority of callers hanging up in a touch-tone IVR have actually received
no useful information and therefore have not been served. For example, based on standard IVR reports, one call
center believed that its touch-tone IVR served more than 30% of the callers in the automated system. Our detailed
analyses revealed that only 1.6% of all callers were actually served, and almost 20% hung up without receiving any
useful information.

1.2. Speech: The Next Generation of Telephone Voice User Interfaces

While there is no doubt that existing touch-tone IVRs are inadequate, it is unclear what will actually become the
new generation of IVRs. Multimodal interfaces (e.g., Gibbon et al., 2000) are expected to improve interface
usability in many areas. Conceivably, overcoming the limitation of voice prompts for system output and DTMF for
user input would improve IVR usability. Not surprisingly, a study showed that users prefer a telephone augmented
with a display over two variations of a standard, voice-only touch-tone telephone (Roberts and Engelbeck, 1989).

After several successful commercial deployments, speech-enabled IVRs are being marketed as the next generation of
IVRs, citing studies that suggest callers generally prefer speech-enabled over touch-tone IVRs (Nuance, 2000;
Bers et al., 2001). However, other studies showed that users may actually prefer touch-tone interaction over
speech for certain tasks (Fay, 1994). Certain applications may be more conducive to touch-tone interaction,
allowing users to complete their jobs faster using a touch-tone IVR. Furthermore, voice recognition comes with
inherent limitations, such as lack of privacy and deterioration of recognition accuracy in noisy environments or
for speakers with foreign accents. Under such circumstances, users may want to switch to touch-tone interaction
even if they otherwise prefer speech on the specific application.

Clearly, “multimodal” IVRs show promise to become the next generation of IVRs. Multimodal in this context means at
least speech and touch-tone interaction, and possibly other modalities, such as a (small) display. With further
maturation of speech recognition technology, speech-enabled IVRs will be able to conduct increasingly complex
dialogs with callers. Several companies have developed speech-enabled IVRs that can process responses to
open-ended prompts (Gorin et al., 1996; Lee et al., 2000). Such natural language call routing eliminates the need
for layered menus, which are unavoidable when using either touch-tone or spoken keyword menu designs for more than
just a few routing destinations. Layered menus are one of the main causes of poor usability in existing IVRs.

1.3. A Novel IVR Assessment Methodology

To improve our ability to conduct effective IVR usability engineering and to advance research towards the next
generation of IVRs, this article presents a methodology for IVR usability evaluation and redesign. At the core of
the methodology is a novel evaluation measure that combines objective usability and cost-effectiveness of IVRs in
a single measure. This novel measure also overcomes the flaws of standard IVR performance measures. By quantifying
the benefit of an IVR, we can accurately estimate the cost-savings potential for call-flow redesign, and usability
practitioners can compare alternative IVR designs objectively. The article also presents a number of usability
analyses that identify specific IVR usability problems and lead to concrete suggestions for how to improve the
design. Although our assessment does capture the customer experience, this article does not discuss subjective
usability explicitly. We believe that standard methods for evaluating subjective usability, such as questionnaires
and surveys, are adequate for telephone voice user interfaces.

Our assessment methodology includes a process for collecting detailed usability data from commercially deployed
IVRs and tools for processing the data efficiently. In an IVR assessment, we record thousands of live calls in an
unobtrusive fashion, and we apply automated tools to determine the complete IVR event sequence for each call.

Section 2 describes our methods for collecting data from thousands of live calls and efficiently processing that
data into a database of event traces for complete calls. We have developed algorithms that infer the complete IVR
sequence from call recordings, and we employ human transcribers to annotate significant events in agent-caller
dialogs. Section 3 describes our IVR assessment analyses, which evaluate cost-effectiveness and usability of IVRs.
We compile the event sequences of thousands of calls into a single number, total IVR benefit, that quantifies both
cost effectiveness and objective usability (Suhm and Peterson, 2001). To identify usability problems, we inspect
user-path diagrams. User-path diagrams visually represent the complete path of thousands of calls through the IVR.
Section 4 illustrates how to apply our IVR assessment methodology to call-flow redesign, using case studies from
several large commercial call centers. We illustrate how the assessment analyses allow us to identify usability
problems and to generate concrete suggestions for call-flow redesign. We describe how we quantitatively compare
alternative call-flow designs using our comparative IVR analysis. Comparative IVR analyses enable us to identify
the most effective design possible and project the benefit and cost-savings for a redesign. Section 4 concludes by
introducing natural language call routing as the ultimate remedy for curing the touch-tone menu blues, and
Section 5 closes this article with a summary.

2. Data Capture from Commercial IVRs

The only complete records of user and system behavior in IVRs are complete calls. Therefore, a comprehensive
usability assessment of IVRs must be based on end-to-end recordings of calls. A call typically begins in a dialog
with an automated (IVR) system, called the IVR-caller dialog, which may be followed by a dialog with a live agent,
called the agent-caller dialog. This section describes methods for collecting end-to-end data for calls, including
both IVR-caller and agent-caller dialogs. Following subsections describe how we transform call recordings into a
complete sequence of events (or an event trace), to make processing efficient. This includes automatic methods to
infer the complete IVR event sequence and annotation to capture significant events in the agent-caller dialog.

2.1. End-to-End Recording and Complete Call Event Trace

Calls can be recorded end-to-end either on-site or off-site. For on-site recording, standard recording
equipment can be employed. For example, if the IVR is hosted on a PC, incoming calls could be recorded on the PC
using an appropriate hardware card. This recording procedure is attractive for research purposes. In complex call
centers, however, on-site end-to-end recording is difficult because a call is handled typically in more than one
piece of equipment. Furthermore, a call may be transferred to remote sites. This happens often in large call
centers that handle specific types of calls with specialized agent queues (called “skill-based routing”).
Frequently, such specialist agent queues are distributed across several geographically disparate call centers.
Therefore, calls that are handled initially in one location frequently must be transferred to another location.
In such situations, off-site recording may be the only way to record calls end-to-end.

Recordings of complete calls represent a large amount of data that is difficult to analyze in its raw form. To
make the analysis of call data efficient, we transform the recordings into a trace of significant events for each
call. Significant events in the IVR-caller dialog include system prompts and caller input, either touch-tone or
speech. In the agent-caller dialog, we look at events such as exchanges of various kinds of information (e.g.,
account numbers, dollar amounts), description of the reason for the call (e.g., question about a bill, inquiry
into flight schedules), and completion of transactions (e.g., making a payment arrangement or flight reservation).
While most of our IVR assessment analyses are based on this event trace, the ability to switch between call
recording and its representation as event sequence is crucial throughout the analysis process.

The following two subsections describe how we extract the complete call event trace from end-to-end recordings. We
begin with a method to obtain the event trace for the IVR-caller dialog, called IVR analysis. This method can be
applied to both touch-tone and speech-enabled IVRs. We then outline how we obtain an event sequence for
agent-caller dialog.

2.2. Automatic Analysis of IVR-Caller Dialogs

The preferred method for capturing the IVR event sequence would be an event log that is generated by the IVR.
However, the reports that current IVR platforms generate are generally inadequate and inaccurate. They are
inadequate because they typically are based on “peg counts”, which indicate how many times a prompt or menu was
visited overall, but provide no information on specific calls. Peg counts are unable to identify even the most
basic usability problems, such as callers getting trapped in “voice mail jail” or “touch-tone hell”. To illustrate
IVR report inaccuracy, consider the reporting of hang-ups in the IVR segment of a call. Without knowing whether
specific calls actually accomplished anything in the IVR, IVR reports frequently count all calls that hang up in
the IVR segment as “resolved” (or caller self-serve), regardless of whether or not the caller obtained any useful
information or accomplished anything in the IVR.

While IVR logging can be customized to include event traces for calls, the IVR code would have to be modified to
write to an event log at appropriate states in the call. Generating such code is error-prone and intrusive to call
center operations. To obtain a complete event trace independent of IVR reports, we have developed a method (which
we call IVR analysis) that infers the complete IVR event sequence from the call recording alone.

2.2.1. Touch-Tone IVR Events. Our IVR analysis employs three main tools to capture the event sequence for the
IVR-caller dialog: a prompt detector, a DTMF detector, and a prompt inference tool. First, we use a commercially
available DTMF detector to detect touch-tones. Next, our prompt detector recognizes important known prompts in
recordings. Finally, whenever the IVR is so complex that detection of all prompts would be impractical, we employ
a prompt inference tool to infer the complete prompt sequence efficiently.

An additional, crucial step is to determine the exit condition from the IVR-caller dialog. The exit condition
indicates whether the call ended in the IVR with a hangup or was transferred to an agent. We detect IVR transfer
prompts, such as “Please wait for the next available representative,” to determine whether a call was transferred
to an agent. If the prompt detector detects the transfer prompt, we infer that the call was transferred to an
agent. Otherwise, we assume that the caller hung up. This method of inferring the IVR exit condition fails when
the caller hangs up during the hold time, before reaching an agent. However, such cases can be corrected during
the annotation analysis, which we describe below in Section 2.3.

2.2.2. Speech-Enabled IVR Events. We follow a similar process to capture events in speech-enabled IVRs, with the
following modifications.

First, the analysis must rely on prompt detection to disambiguate the event sequence after any speech input
from the caller. Unlike touch-tone IVRs, where we can recognize touch-tone input by the caller reliably,
recognition of user speech input is error-prone. Therefore, the state transition after speech input cannot be
inferred reliably from the speech alone.

Second, to evaluate speech recognition performance, all segments of a recording that contain user speech must be
identified and annotated with the sequence of words that was actually spoken. Speech segments can be identified
using speech detection algorithms, which are in the public domain. Then, since speech recognizers are prone to
recognition errors, the true sequence of words on those spoken segments must be annotated manually, using human
transcribers.

2.3. Annotation of Agent-Caller Dialogs

Our annotation analysis captures the sequence of significant events for anything that follows the IVR-caller
dialog, i.e., waiting on hold and agent-caller dialogs. Significant events include start of the agent-caller
dialog, the reason for the call (and topics discussed), exchanges of information between caller and agent, and
completion of transactions. In addition, the annotation analysis may characterize the call as a whole according to
certain attributes, such as the degree to which the call was resolved and agent courtesy. We currently employ
human transcribers to perform these annotations, either based on end-to-end recordings or, if recordings of the
caller-agent dialog are not available, by annotating in real time while listening to a call.

Annotating the reason for calls in randomly selected agent-caller dialogs allows us to estimate their frequency
distribution. The distribution of the reasons for calls (referred to as call types in the remainder of this
article) is a first and crucial step towards understanding why customers are calling a call center, but it is
frequently not available in commercial call centers. Call centers sometimes infer call type distributions based on
peg counts of IVR sections, i.e., based on how often callers access certain IVR sections. However, these
distributions are inaccurate because callers can bypass the IVR completely by transferring to a live agent, and
callers who do cooperate frequently make wrong choices in the IVR, thus routing themselves to the wrong IVR
section. In our experience, only 35% to 75% of all callers get to the right place using touch-tone menus. Instead
of relying on inaccurate IVR reports, we infer the call type distribution from the caller-agent dialog, provided
that the call was served by an agent. If callers are fully served in the IVR for certain call types, we adjust the
call-type distribution accordingly. Table 1 shows such a call-type distribution from one of our case studies. We
use the call type distribution to identify IVR usability problems and to estimate upper bounds on IVR automation.
The call type distribution is also valuable during call-flow redesign, since frequently-asked questions should be
offered near the top of touch-tone menus. Section 4.1 will describe these applications of call type distribution
in IVR usability re-engineering in more detail.

Table 1. Call type distribution example.

Call type                                  % Calls
Sales                                           24
Establish new account                           17
Payment information and arrangements            11
Billing questions                               10
Repair                                           7
Other                                           31

Figure 1 illustrates the process for capturing the IVR event trace using IVR analysis and call annotation,
followed by its application to IVR usability re-engineering. The data capture phase can be viewed as building a
call database; each record of the database represents the complete event trace of one specific call. The IVR
analysis determines the events of the IVR-caller dialog, and annotation determines significant events of the
caller-agent dialog. Based on this call database, in the analysis phase we evaluate usability and
cost-effectiveness of IVRs comprehensively by analyzing the call event traces in various ways. The following
section describes our IVR assessment analyses. In particular, we introduce user-path diagrams as an effective
diagnostic tool that visualizes traffic and levels of call resolution. We also describe our automation analysis,
which quantifies the benefit of an IVR to both the caller and the call-center in a single number, called total IVR
benefit. Such an assessment leads to IVR usability re-engineering: usability problems are identified, alternative
designs can be compared quantitatively, and the re-engineering cost can be justified (by quantifying the
improvement opportunity).

Figure 1. Overview of our IVR assessment and redesign methodology.

3. Evaluating Cost-Effectiveness and Usability of IVRs

The IVR usability evaluation methodology presented in this section analyzes call event traces to identify IVR
usability problems and to quantify the benefit of an IVR, both to the user and to the call center.

To quantify benefit in a single number, the first subsection describes total IVR benefit as a measure that
combines (objective) usability and cost-effectiveness of an IVR. Total IVR benefit is calculated using our IVR
automation analysis. To measure IVR automation, we apply the common usability measures of task completion rates
and task completion time in a form that is more suitable for evaluating production IVRs. Beyond quantifying
benefit of an existing IVR, total IVR benefit allows us to measure the potential for improvement by estimating
upper bounds on total IVR benefit based on annotations of caller-agent dialogs. This step is crucial to obtain the
necessary business justification for call-flow usability reengineering.

IVR automation analysis typically reveals general problem areas of an IVR, for example, whether callers get to the
right place, whether callers can be identified
efficiently, and whether callers succeed in obtaining useful information. To identify specific usability problems,
we employ user-path diagrams, an application of state-transition diagrams to the evaluation of IVRs. User-path
diagrams allow a usability practitioner to identify specific usability problems and generate concrete improvement
suggestions. In our experience with evaluating large commercial call centers, user-path diagrams have proven to be
a very useful diagnostic tool.

Beyond evaluating objective usability, a comprehensive usability evaluation methodology must also address
subjective usability, especially in call centers, where delivering superior customer service is very important.
Subjective usability sometimes even outweighs objective performance. However, we believe that standard methods for
evaluating subjective usability, such as surveys and questionnaires, are adequate for quantifying customer
satisfaction in call centers. Methods for evaluating subjective usability of IVRs are therefore not discussed
further in this article.

3.1. Evaluating IVR Cost Effectiveness

Evaluating call center IVRs is difficult. Evaluation criteria from the caller’s point of view (usability) and from
the call center’s point of view (cost-effectiveness) appear difficult to reconcile. Existing evaluation methods
are inadequate and address either usability or cost-effectiveness in isolation. As mentioned earlier, standard
reports generated by IVR hardware are inaccurate and do not report usability measures. Methods to evaluate
subjective usability exist, but they do not quantify the cost for the call center. Common laboratory usability
evaluations, using task-based measures in controlled experiments on a few tasks, are impractical for complex call
center IVRs, which can offer many different functions (tasks). We therefore introduce the total IVR benefit as a
single measure that combines IVR usability and cost-effectiveness.

3.1.1. Total IVR Benefit. How can we quantify both usability and cost effectiveness of a telephone voice user
interface? On the one hand, callers want to accomplish their goals quickly and easily over the phone. Therefore,
objective usability can be quantified by the standard measures of task completion rates and times. On the other
hand, agent time dominates the cost in most call centers. The ratio between cost of agents and all other costs,
such as telecommunications time, IVR hardware and software, and facilities charges, is at least 4:1. Therefore, we
quantify cost-effectiveness of a telephone IVR in terms of agent time. We define the total IVR benefit as the
agent time that is saved by the IVR, compared to handling the complete call by live agents.

An IVR “saves” agent time whenever it performs tasks successfully that otherwise would have to be performed by an
agent. Tasks that typically can be performed within an IVR include identifying the caller, providing information
to the caller, performing transactions, and routing the caller to specialized agents. In some cases, completing
these tasks successfully may resolve the call so that the caller hangs up without any assistance from an agent. We
refer to such calls as self-serve or full automation. It is important to note, however, that even if a call is not
fully automated, the IVR can still provide significant savings through partial automation. Table 2 shows typical
agent-time savings across categories of “automatable” tasks, i.e., tasks that can be performed within an IVR.
These savings can be derived from benchmark assumptions or measured in annotated agent-caller dialogs. While the
emphasis in this context is on cost, we note that IVR automation rates correspond to sub-task completion rates.
Hence, IVR automation is a more differentiated version of the standard task-completion usability measure, and
total IVR benefit thus combines cost-effectiveness with task completion.

Table 2. Typical agent-time savings for automated tasks.

Automated task         Caller identification   Routing   Useful information   Completion of transactions
Saved agent seconds                       15        40                   40                           40

The key to our IVR evaluation methodology is the measurement of cost-effectiveness in terms of agent time saved at
the task level, by first quantifying IVR automation and then calculating an overall benefit measure, as described
next.

3.1.2. Quantifying IVR Automation. Total IVR benefit could be measured directly by timing the length of
agent-caller dialogs. But as agent time has a large variation, the length of thousands of agent-caller dialogs
would have to be measured, which currently requires manual annotation of calls. Furthermore, it is impossible to
obtain unbiased data from commercial call centers, because many factors may have a significant

Table 3. IVR automation analysis, with two agent categories (“specialist”, “floor”). The automation category
columns give agent seconds saved per call; the benefit columns are in agent seconds. A dash marks an empty cell.

                                        Traffic           Automation category              Benefit [agent secs]
Call profile                            Calls   % Calls   Account   Routing   Info delivery   One call    Net
Fully-automated calls                     307       5.6        15        40              40         95    5.3
Transfers to specialist with readout       99       1.8        15        40              40         95    1.7
Transfers to agent with readout           101       1.8        15         -              40         55    1.0
Transfers to specialist with ID           641      11.6        15        40               -         55    6.4
Transfers to specialist, no ID            545       9.9         -        40               -         40    3.9
Transfers to agent with ID                471       8.5        15         -               -         15    1.3
Transfers to agent, no ID                2927      52.9         -         -               -          -      -
Abandons                                  439       7.9         -         -               -          -      -
Total                                    5530     100.0       29%       29%              9%          -   19.6
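
The arithmetic behind Table 3 can be reproduced with a short Python sketch. It takes the call counts and per-task
savings listed in Table 3 and derives the per-call benefit of each profile, the automation rate of each task, and
the net benefit over all calls. The variable and function names are illustrative assumptions, and misrouting
penalties (the negative routing benefit discussed in the text) are not modeled, since Table 3 does not list them.

```python
# Agent seconds saved per automated task, as used in the columns of Table 3.
SAVINGS = {"account": 15, "routing": 40, "info": 40}

# (call profile, number of calls, tasks automated in the IVR), from Table 3.
PROFILES = [
    ("Fully-automated calls", 307, {"account", "routing", "info"}),
    ("Transfers to specialist with readout", 99, {"account", "routing", "info"}),
    ("Transfers to agent with readout", 101, {"account", "info"}),
    ("Transfers to specialist with ID", 641, {"account", "routing"}),
    ("Transfers to specialist, no ID", 545, {"routing"}),
    ("Transfers to agent with ID", 471, {"account"}),
    ("Transfers to agent, no ID", 2927, set()),
    ("Abandons", 439, set()),
]

total_calls = sum(calls for _, calls, _ in PROFILES)


def per_call_benefit(tasks):
    """Agent seconds saved on one call with the given set of automated tasks."""
    return sum(SAVINGS[task] for task in tasks)


# Automation rate per task: share of all calls whose profile includes the task.
automation_rates = {
    task: 100.0 * sum(calls for _, calls, tasks in PROFILES if task in tasks) / total_calls
    for task in SAVINGS
}

# Net benefit per profile: per-call benefit weighted by the profile's share of traffic.
net_benefit = {
    name: per_call_benefit(tasks) * calls / total_calls
    for name, calls, tasks in PROFILES
}

# Total IVR benefit: average agent seconds saved per call over all traffic.
total_ivr_benefit = sum(net_benefit.values())

for name, benefit in net_benefit.items():
    print(f"{name:40s} {benefit:5.1f}")
print(f"{'Total IVR benefit':40s} {total_ivr_benefit:5.1f}")
```

With the Table 3 counts, this sketch gives automation rates of roughly 29%, 29%, and 9% and a total of about 19.6
agent seconds saved per call, in agreement with the bottom row of the table.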

impact on caller behavior and agent handling time. We therefore have developed a method to estimate total IVR
benefit based on call event-sequence data, called IVR automation analysis.

As the first step in IVR automation analysis, we define tasks that can be automated in the IVR, as shown in
Table 2. Typically, the completion of a task can be associated with reaching a certain state in the IVR. Thus, the
set of completed tasks can be inferred directly from the event sequence data for a call, using a simple lookup
table that documents which IVR states correspond to the completion of which tasks.

We make one important exception to the assumption that IVR states indicate successful task completion.
Specifically, we do not assume that routing decisions made in the IVR are necessarily correct. Rather, we look at
subsequent agent-caller interactions to determine, based on the annotated reason for a call, whether the call was
correctly routed or misrouted to an agent. Calls that misroute to specialists usually need to be transferred
somewhere else and, therefore, incur a cost equal to the time it takes the specialist to reroute the call, which
can be thought of as a negative routing benefit.

Given the definition of tasks that can be completed within an IVR, we characterize each call according to distinct
combinations of automated tasks, which we refer to as call profiles. Given a set of calls with their event
sequence data, we annotate every call with its set of completed tasks and use the pattern of completed tasks to
accumulate counts for each call profile. The call traffic into an IVR is thus partitioned into a set of call
profiles, each representing a distinct pattern of automation.

Automation rates are defined as the percentage of automation achieved over all calls for each automatable task.
This percentage can be calculated simply by adding the percentages of all call profiles that include the specific
automatable task.

Table 3 shows an example IVR automation analysis, in which we distinguish two agent types, “specialist” and
“floor.” The left column lists the call profiles. The next two columns (labeled “Traffic”) show the breakdown of
the total data set, which consists of 5530 calls, into the various profiles. For example, 5.6% of the calls were
fully automated, and 7.9% of the calls were abandoned without the caller getting anything done. Then, the three
“Automation” columns show the automation categories for each profile. This analysis is based on three automation
categories: capture of the caller’s account number, routing, and delivery of information. In each “Automation
Category” column we enter the associated agent time savings from Table 2 for those call profiles in which that
automation component was achieved. For example, the profile “Transfers to agent with readout” achieved capture of
the account number and automated delivery of information. The bottom row in Table 3, for the three “Automation”
columns, shows the automation rates by category: 29% capture of account number, 29% routing, and 9% information
delivery.

For each call profile, the saved agent time over all calls into the center (shown as the last column in Table 3)
is the product of the total agent time saved for one call with the corresponding percentage of traffic. For
example, the call profile “transfers to specialist with ID” saves 55 seconds of agent time, because the
call was transferred to the right place (routing automation), and the caller was identified (account number automation). Since 11.6% of all calls fit this profile, the net saving of agent time is estimated as 11.6% times 55 seconds, which equals 6.4 agent seconds over all calls to the center. The total IVR benefit, then, is the sum of the net IVR benefits for all call profiles. For the example in Table 3, our analysis estimates a total IVR benefit of 19.6 agent seconds saved, shown in the bottom right corner cell of Table 3. In other words, we estimate that this IVR shortens, on the average, the agent handling time for every call by 19.6 seconds.

3.2. Evaluating IVR Usability

Evaluating usability typically encompasses quantifying usability, identifying usability problems, and evaluating subjective usability factors. Our assessment methodology currently quantifies usability and provides methods to identify usability problems, but we do not (yet) formally evaluate subjective usability factors, such as user satisfaction.

Common usability measures include task completion rates and task completion times. Our IVR automation analysis provides task completion rates in a form that is suitable to the problem of evaluating telephone user interfaces. The automation analysis can also be used to quantify usability of telephone user interfaces. Specifically, low automation rates point to usability problems. In the example above, the low success rate for capturing account numbers (only 29% of all callers) reveals a severe shortcoming and usability problem in this call flow.

In addition to IVR automation analysis, we have developed a number of other tools for evaluating usability. In this article, we describe user-path diagrams as a diagnostic tool for identifying IVR usability problems, and as an analytic tool for estimating the impact of design changes.

User-path diagrams visualize user behavior in the IVR by representing event sequence data as a tree, similar to state-transition diagrams. State-transition diagrams have been applied to many engineering problems, including user interface design (Parnas, 1969). Applied to visualizing user behavior in IVRs, state-transition diagrams can visualize the paths of many users through an IVR, hence the name user-path diagram. To manage the complexity of user-path trees, we cluster individual IVR states into sub-dialogs, such as ID entry or menu selection. Such sub-dialogs may encompass many IVR events and multiple IVR-caller interactions in the captured sequence data.

The nodes of the tree correspond to IVR states, arcs correspond to state transitions, and leaves correspond to end conditions of calls. Each node and leaf is marked with the percentage of all calls that reached the node or leaf. In addition, arcs may be marked with the user input that causes the corresponding state transition, such as pressing a certain touch-tone in response to a prompt. We found it helpful to distinguish at least three end conditions. “Self-serve” refers to calls that are resolved in the IVR, i.e., where the customer completes the call in the IVR, without talking to a live agent. “To agent” are calls that transfer to an agent. “Abandon” refers to calls where the caller hangs up, either in the IVR without obtaining any useful information, or on hold before reaching a live agent. If the call center operates with distinct categories of agents, the “to agent” category is typically broken down into various subcategories, each representing a distinct routing destination from an operational point of view.

Figure 2. Excerpt from a user-path diagram.

Figure 2 shows an excerpt from a user-path diagram. Rectangular boxes represent IVR states, arrows represent call traffic, and circles indicate places where calls leave the IVR. In this example, 82% of all callers make it past the opening menu to a state that prompts the callers to key in their account number, called “ID Entry”. In this figure, 8.5% of all callers abandon the call while attempting to provide their account number, shown as an arrow to the right. On the other hand, 63.9% of all callers enter their account number successfully and reach the main menu. At the main menu, 28.5% of the callers select an option that routes them
to a specialist agent, while 0.8% route themselves to a general (floor) agent, and 1.7% abandon the call.

We identify usability problems by inspecting user-path diagrams. Usability problems are found by looking at those areas of the call flow that receive little or no caller traffic or that have high rates of abandoned calls or transfers to an agent. In Fig. 2, for example, the state cluster named “ALT ID Entry” receives 9.6% of all calls, but 86% of these calls either are abandoned or are transferred to a floor agent, and the account number is correctly entered in only 14%. Obviously, this part of the IVR is ineffective. Section 4.1 presents a more detailed example of how to identify IVR usability problems by inspecting user-path diagrams.

4. Application to IVR Redesign

Using case studies from several large call centers, this section illustrates our IVR assessment methodology and its application to call-flow redesign. The first two subsections elaborate how we employ user-path diagrams and call-type distributions to identify IVR usability problems, and how this leads to quick-hit and call-flow redesign recommendations. The next subsection (4.3) presents our comparative IVR analysis: by applying the same assessment methodology that is used to identify usability problems, alternative designs can be compared quantitatively. Instead of relying on more or less educated guesses, comparative IVR analysis allows us to determine the overall best design. This section closes by demonstrating how our assessment methodology also leads to building a business case, empowering call center managers to make informed decisions between touch-tone re-engineering and speech-enabling their IVR.

4.1. Identifying IVR Usability Problems

This section demonstrates how to identify and quantify IVR usability problems by analyzing user-path diagrams and call-type distributions. The call center in this example serves many functions, including sales, billing questions, and repair services. The user-path diagram in Fig. 3 shows the first two menu layers in detail, but abbreviates the provision of automated information as “Automated Billing Information” and “Automated Fulfillment”.

Figure 3. Identifying IVR usability problems by inspecting a user-path diagram.
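
A user-path diagram of this kind can be derived mechanically from event sequences. The following is a minimal sketch under stated assumptions, not the authors' implementation: each call is assumed to be available as a list of IVR state names that ends in an end condition, the tree is built as a simple prefix tree, and the names Node, build_tree, render, and leak_rate, as well as the toy call data, are hypothetical.

```python
from collections import defaultdict

# End conditions that indicate a call left the IVR without self-service.
LEAK_EXITS = ("To agent", "Abandon")


class Node:
    """One IVR state (or end condition) in the user-path tree."""

    def __init__(self):
        self.count = 0                      # calls that reached this node
        self.children = defaultdict(Node)   # next state -> child node


def build_tree(calls):
    """Merge per-call state sequences into one prefix tree with counts."""
    root = Node()
    for path in calls:
        root.count += 1
        node = root
        for state in path:
            node = node.children[state]
            node.count += 1
    return root


def render(node, total, label="All calls", depth=0):
    """Return indented lines giving the percentage of all calls per node."""
    lines = [f"{'  ' * depth}{label}: {100.0 * node.count / total:.1f}%"]
    for child_label, child in node.children.items():
        lines.extend(render(child, total, child_label, depth + 1))
    return lines


def leak_rate(node):
    """Share of calls at this node that next transfer to an agent or abandon."""
    leaked = sum(child.count for name, child in node.children.items()
                 if name in LEAK_EXITS)
    return leaked / node.count if node.count else 0.0


# Hypothetical toy data: ten calls, each a sequence of (clustered) IVR states.
calls = (
    [["Main Menu", "To agent"]] * 3
    + [["Main Menu", "Billing", "Self-serve"]] * 2
    + [["Main Menu", "Billing", "To agent"]] * 4
    + [["Main Menu", "Abandon"]]
)

tree = build_tree(calls)
print("\n".join(render(tree, tree.count)))
billing = tree.children["Main Menu"].children["Billing"]
print(f"Billing leak rate: {leak_rate(billing):.0%}")
```

Nodes with little traffic, or with a high leak rate relative to the calls that reach them, are the candidates for the kind of inspection described in this section.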


Visual inspection of the user-path diagram in Fig. 3 reveals the following IVR usability problems, which are marked in the figure.

(1) About 30% of calls either are abandoned or are transferred “cold” to an agent at the main menu. This traffic represents the callers who attempt to bail out of the IVR at the first opportunity. While we can empathize with such callers, they are likely to be transferred to the wrong agent, who then has to transfer the caller, which means a second period of waiting on hold for the correct agent.
(2) While 18% of callers choose “other billing question,” only 3% actually find the billing IVR on this alternative path, and 15% bail out to an agent, after spending more than 1 minute in the IVR without having received or provided any useful information.
(3) The billing IVR achieves very little automation, because only 5% of all callers find “Automated Billing Information”. Only 3% of the callers obtain automated information in the billing IVR.

By contrast, a standard IVR report for this call center would indicate a 19% IVR take rate, which really just means that 19% of all callers hung up in the IVR. The IVR report would not reveal that fewer than one in six such callers (3% overall) actually obtained useful information!

Many call centers commit the mistake of inferring the call-type distribution from IVR reports. In our example, IVR peg counts would indicate that 10% of callers reach the billing IVR, but this does not mean that 10% of all incoming calls are about billing questions, because many callers may not find the billing IVR! In conjunction with knowledge of the true call-type distribution, presented in Table 1 in Section 2.3, the following additional issues become obvious.

(4) 21% call about billing-related questions, but only 10% of the callers find the billing IVR.
(5) 24% of the calls should be handled by a sales representative (as indicated in the call-type distribution), but only 6% of the callers are transferred to a sales representative out of the IVR.

Anecdotally, our IVR assessment in this case study also revealed that hardly anyone was choosing a specific automated fulfillment. To offer this service, the company had recently purchased a speech-enabled system that essentially was wasted, as our assessment showed. An early assessment might have prevented this waste.

Based on our assessments of call centers across several industries, we have identified the following common IVR usability problems:

• Excessive complexity: many IVR functions are underused because customers get confused early in the call.
• Caller identification difficulties: IVR scripting that attempts to identify the caller is frequently too difficult, preventing many callers from reaching the parts of the IVR that deliver automated customer service. Even with effective use of Automatic Number Identification (ANI), the success rate may be low because customers call from phones other than the one registered with their account.
• Confusing touch-tone menus: menu wording is often based on call center operations terminology and may not reflect how the customers think about their reason for the call. The customers make selections that do not lead to self-service and instead require assistance from an agent.

4.2. Assessment Quick Hits and Re-engineering

In the course of evaluating existing telephone user interfaces, we usually observe a number of usability problems in enough detail to diagnose the problems and recommend solutions. We refer to solutions that are clear-cut and non-controversial as quick hits. When the diagnosis is clear but the solution is not, we frequently recommend call-flow reengineering, where alternative designs are tested side-by-side for efficacy.

We have encountered many obvious quick hits, some more than once. For example, in a few cases we observed a suspiciously large proportion of callers being bumped out of a touch-tone numeric entry task because of numbers that had too few digits. By listening to call recordings, we realized that callers were struggling to enter long digit strings, and that they were cut off before completing their entry. The solution was to increase digit timeout parameters. Another quick hit addresses call flows that unnecessarily invite callers to bail out early, thus effectively bypassing what the automated system is intended to achieve. We refer to this problem as call-flow “leakage.” Frequently, simple wording changes can encourage callers to make a conscious selection at least at the first menu, which may deliver very significant benefits by getting them transferred to the right destination.

Call-flow reengineering is called for in cases where the best design is not obvious. An important example
is in touch-tone menus. Our detailed routing analysis can identify specific menus that are not effective. While simple wording changes may help, identifying wording that works best is difficult, and a comprehensive solution frequently involves changes to the menu structure. In the example of Fig. 3, our analysis revealed that routing to billing and sales does not work. However, it is not obvious how to merge the two alternative paths to the billing IVR, which confuse many callers. In such cases, we develop several alternative designs and quantitatively compare them by applying our assessment methodology, which we describe next.

4.3. Comparative IVR Design

As part of the reengineering process, we typically evaluate alternative designs side-by-side with real traffic. We call this process comparative IVR analysis. For each design we measure automation rates and calculate IVR benefit. Differences in automation rates indicate which IVR design is better for each automatable task. For overall comparisons, differences in total IVR benefit reveal which design is superior on the whole. Comparative IVR analysis can thus validate that a new IVR design is indeed better, and furthermore, it can quantify the cost savings.

In another case study, IVR automation analysis (see Table 3) revealed that the baseline call flow was ineffective in capturing the caller’s account number. The baseline call flow, shown as “Touch-tone Design A” in Fig. 4, required that callers first make a selection at the main menu. Only for specific choices were callers asked to provide their account number; callers bailing out to an agent at the main menu were not even asked to enter their account number. We suggested asking callers for their account number before presenting them the choices of the main menu. To determine whether this design (shown as “Touch-tone Design B” in Fig. 4) was superior, we exposed both designs to thousands of live calls and conducted a comparative IVR analysis. The increase in total IVR benefit, by nine agent seconds, as seen in Fig. 4, proved that Design B was indeed much more effective than Design A. The improvement was due to increases in successful capture of account numbers (+29%), delivery of information (+11%), and improved routing (+5%). This figure also shows the automation analysis for our speech-enabled natural language call router, which is discussed in more detail in the following subsection (4.4).

Figure 4. Comparative IVR analysis example.

For a statistical treatment of such comparisons, standard tests on the difference of proportions can be applied to differences in automation rates and call profile rates, after an adequate Bonferroni adjustment to the significance level. In our case study, even increases of 1.5% in any automation rate are significant (z = 3.68; p < 0.01). Hence, the increases in all three automation categories reported above were significant.

The statistical treatment of total IVR benefit is more complex. An analysis that we cannot present here due to space limitations shows that a difference of more than
one agent second is significant (p < 0.05). Hence, the nine agent-second increase in benefit of the redesigned IVR is highly significant.

4.4. Natural Language Call Routing

Touch-tone menus are inherently limited in their ability to get callers to the right destination. Touch-tone menus force callers to match the reason for their call with just one of a few options, which are often expressed using call center jargon. Moreover, as menu complexity increases, IVR usage decreases because callers become frustrated, and routing mistakes increase because of caller confusion. In a case study, our end-to-end analysis of calls showed that 25% of all calls were routed to specialists, but less than 80% of these (20% overall) went to the correct specialist. Callers misrouted themselves at one of four menu layers because they could not determine which touch-tone option best matched their question. When a mistake is made, the opportunity to automate the call or to save agent time is lost because the caller hangs up out of frustration or times out to a customer service agent.

Natural language call routing helps to solve these problems by cutting through the tangle of call-flow options and letting callers state their purpose in their own words. We recently conducted a trial of the BBN Call Director in a large call center. Results show that natural language call routing delivers significant improvements over touch-tone menus, both in terms of customer satisfaction and agent labor savings. Of more than 10,000 callers who experienced the BBN Call Director and had a preference, an overwhelming 82% said that they preferred describing their problem with words to navigating touch-tone menus. Most said that they preferred it because it was easier, more natural, and more efficient to use than touch-tone menus.

Our assessment analyses from the trial showed that the natural language call router provided benefit even beyond the improved touch-tone design, as shown in Fig. 4. Overall, the number of successful routes in the IVR increased by a factor of three over the original touch-tone system. After accounting for the part of the gain that could be attributed to call-flow redesign, the speech-enabled call router increased IVR benefit by an additional nine agent seconds, thus effectively doubling the total IVR benefit compared to the baseline.

4.5. Benefit Projections

Due to the cost of IVR changes in large call centers, the redesign of telephone user interfaces must be justified with a business case. Our IVR automation analysis and benefit calculation can provide the necessary business justification for IVR redesign because the cost savings of the redesigned IVR can be estimated. Based on an automation analysis of the existing IVR and knowledge of usability problems, we can derive bounds for improvements in the various automation categories. From these bounds, we can project total IVR benefit to determine upper limits on annual cost savings, which are then used to justify reengineering effort.

Our re-engineering methodology, which is based on evaluating designs with real callers, eventually produces very tight benefit projections. In the example below, the numbers for reengineered touch-tone and speech-enabled systems are based on comparative evaluations with real callers.

Figure 5. Benefit projection example.

Figure 5 compares four IVR designs: an initial touch-tone baseline; a quick-hit touch-tone design;
a reengineered design representing a practical upper limit for touch-tone; and a speech-enabled design that uses BBN’s Call Director natural language call router. The height of the columns indicates the total IVR benefit. The first two columns represent the redesign described in Section 4.3 above. With further changes to the touch-tone menus, we could realize an additional five agent seconds of benefit (see the third column in Fig. 5), but that is probably close to the limit of what can be done with a purely touch-tone interface. In contrast, we projected the benefit of re-engineering with speech to be ten agent seconds beyond the quick-hit redesign, which is represented by the second column.

Such projections of IVR benefit can be translated easily into cost savings using basic call center cost parameters, such as total call volume and agent cost. In this example, the increase in IVR benefit from the baseline to the “quick-hit” modified call flow corresponds to annual agent cost savings of more than $1M.

5. Summary and Conclusions

Telephone voice user interfaces, an important class of human-computer interfaces, have been neglected by researchers in the field of human-computer interaction. Usability evaluation and engineering methods for IVRs are not well developed. Decision-makers in call centers, under strong financial pressures, strive to cut costs without being able to assess the significant impact of usability on customer satisfaction and the financial bottom line. To remedy this situation, we have presented an IVR assessment methodology that evaluates both cost-effectiveness and usability. Moving beyond previous laboratory studies of research spoken dialog systems, which evaluate only task completion rate and time, our methodology allows practitioners to evaluate IVR usability in the field in a systematic and comprehensive fashion.

An evaluation of a telephone voice user interface must be based on thousands of end-to-end calls. Calls must be recorded in their entirety to capture the complete user experience, and thousands of calls are necessary to obtain statistical significance in the analyses. We have presented methods to analyze such large amounts of audio data efficiently. Our analysis transforms gigabytes of audio data into detailed event traces. For the IVR section, the event sequence is captured in a fully automated procedure, while manual transcription is necessary to annotate events in agent-caller dialogs.

We described several tools for IVR evaluation and usability re-engineering, including IVR automation analysis, user-path diagrams, and comparative IVR analyses. These tools enable IVR usability practitioners to solve the tough problems in IVR redesign. By identifying IVR usability problems and comparing alternative designs, an IVR assessment can tell call center managers very specifically what is wrong with their existing IVR and how to improve it. By quantifying the improvement opportunity and measuring potential cost savings, we justify the cost of call-flow re-engineering and help call center managers to prioritize the use of their limited resources. An assessment typically delivers immediate cost savings with quick-hit recommendations, with payback periods of much less than a year.

Our methodology of quantifying IVR automation and benefit is far superior to standard IVR reports. In particular, we have shown that the standard measure of “IVR take rate” can mislead call center managers to believe that their IVR is quite effective, while IVR usability may in effect be very poor. We have presented total IVR benefit as an accurate, quantifiable measure that combines objective usability and cost-effectiveness. We recommend adoption of total IVR benefit as the standard benchmark for IVR performance.

The dependence of the analysis of agent-caller dialogs on human annotators significantly impacts the cost for an assessment. In the future, we hope that audio mining technology will lower costs of transcription analysis and allow call center managers to monitor their performance in a fully automated fashion.

Our methodology currently does not formally evaluate user satisfaction or any other subjective usability measure. While the impact of user satisfaction on customer attrition can be large, most managers of call centers focus on operational savings and ignore user satisfaction, because it is difficult to quantify. We believe that standard methods developed in the human factors community are sufficient to evaluate user satisfaction of telephone voice user interfaces. Some of these methods, such as expert walkthroughs and surveys in the evaluation phase, and usability tests or focus groups in the redesign phase, are complementary to our data-driven assessment. With each method having its own strengths and weaknesses, a combination of complementary methods can be powerful, bringing in the user perspective in various ways throughout the entire evaluation and design process.

As the ultimate cure for the touch-tone menu blues, this article referred to natural language call routing.
Natural language call routing avoids menus by allowing callers to describe problems in their own words. While this technology has been investigated for many years, only recent breakthroughs have increased the accuracy of natural language routing to levels that are far superior to those of touch-tone menus. With such superior technology and a solid IVR evaluation and re-engineering methodology, we are poised to make large improvements in the usability of telephone voice user interfaces.

Acknowledgments

The assessment methodology presented in this paper was developed over four years of research and consulting for several large call centers. The authors gratefully acknowledge the contribution of all members of the Call Director team at BBN Technologies, past and present. Sincere thanks also to Dan McCarthy for his comments and for proofreading the article.

Note

1. For simplicity, this article uses the term “IVR” often instead of the more correct, yet longer term “telephone voice user interface”. Technically, the latter refers to a class of human-computer interfaces (and may be more intuitive to readers with a human-computer interaction background), while the former refers to a specific instance of such an interface (and should be very familiar to most readers with a background in call centers). We apologize to the readers for any confusion this may cause.

References

Balentine, B. and Morgan, D.P. (1999). How to Build a Speech Recognition Application. San Ramon, CA: Enterprise Integration Group.
Bennacef, S., Devillers, L., Rosset, S., and Lamel, L. (1996). Dialog in the RAILTEL telephone-based system. International Conference on Spoken Language Systems (ICSLP). Philadelphia, PA: IEEE, Vol. 1, pp. 550–553.
Delogu, C., Di Carlo, A., Rotundi, P., and Satori, D. (1998). A comparison between DTMF and ASR IVR services through objective and subjective evaluation. Interactive Voice Technology for Telecommunications Applications (IVTTA). Italy: IEEE, pp. 145–150.
Edwards, K., Quinn, K., Dalziel, P.B., and Jack, M.A. (1997). Evaluating commercial speech recognition and DTMF technology for automated telephone banking services. IEEE Colloquium on Advances in Interactive Voice Technologies for Telecommunication Services, pp. 1–6.
Fay, D. (1994). User acceptance of automatic speech recognition in telephone services. International Conference on Spoken Language Systems (ICSLP). Yokohama, Japan: IEEE, Vol. 3, pp. 1303–1306.
Gibbon, D., Mertens, I., and Moore, R. (Eds.). (2000). Handbook of Multimodal and Spoken Dialogue Systems: Resources, Terminology, and Product Evaluation. Dordrecht, Netherlands: Kluwer Academic Publishers, pp. 102–203.
Gorin, A., Parker, B., Sachs, R., and Wilpon, J. (1996). How may I help you? Interactive Voice Technology for Telecommunications Applications (IVTTA). Italy: IEEE, pp. 57–60.
Halstead-Nussloch, R. (1989). The design of phone-based interfaces for consumers. International Conference for Human Factors in Computing Systems (CHI). New York: ACM, Vol. 1, pp. 347–352.
Lee, C.H., Carpenter, B., Chou, W., Chu-Carroll, J., Reichl, W., Saad, A., and Zhou, Q. (2000). On natural language call routing. Speech Communication, 31:309–320.
Nielsen, J. (1993). Usability Engineering. Morristown, NJ: AP Professional.
Nuance. (2000). 2000 Speech User Scorecard. Menlo Park, CA: Nuance Communications.
Parnas, D.L. (1969). On the use of transition diagrams in the design of a user interface of interactive computer systems. Proceedings of ACM Conference. pp. 379–385.
Resnick, P. and Virzi, R.A. (1995). Relief from the audio interface blues: expanding the spectrum of menu, list, and form styles. Transactions on Computer-Human Interaction, 2:145–176.
Roberts, T.L. and Engelbeck, G. (1989). The effects of device technology on the usability of advanced telephone functions. International Conference on Human Factors in Computing Systems (CHI). New York: ACM, Vol. 1, pp. 331–338.
Suhm, B. and Peterson, P. (2001). Evaluating commercial touch-tone and speech-enabled telephone voice user interfaces using a single measure. International Conference on Human Factors in Computing Systems (CHI). Seattle, WA: ACM, Vol. 2, pp. 129–130.
Tatchell, G.R. (1996). Problems with the existing telephony customer interface: The pending eclipse of touch-tone and dial-tone. International Conference on Human Factors in Computing Systems (CHI). Vancouver, BC: ACM, Vol. 2, pp. 242–243.
Walker, M.A., Litman, D., Kamm, C., and Abella, A. (1997). PARADISE: A framework for evaluating spoken dialogue agents. 35th Annual Meeting of the Association of Computational Linguistics. Madrid: Morgan Kaufmann, pp. 271–280.
Yankelovich, N., Levow, G.-A., and Marx, M. (1995). Designing SpeechActs: Issues in speech user interfaces. International Conference on Human Factors in Computing Systems (CHI). Denver, CO: ACM, Vol. 1, pp. 369–376.
