Speech Analytics - The Complete Semantic Index

The Complete Semantic Index
An Overview of Verint's Speech Analytics Technology

April 2009

This document contains proprietary and confidential information of Verint
Systems Inc. and may not be distributed to any persons or organizations for
which it was not intended.
Unauthorized use, duplication, or modification of this document in whole or in part without the written consent of Verint
Systems Inc. is strictly prohibited.
By providing this document, Verint Systems Inc. is not making any representations regarding the correctness or
completeness of its contents and reserves the right to alter this document at any time without notice.
Features listed in this document are subject to change. Please contact Verint for current product features and specifications.
All marks referenced herein with the ® or ™ symbol are registered trademarks or trademarks of Verint Systems Inc. or its
subsidiaries. All rights reserved. All other marks are trademarks of their respective owners.
© 2009 Verint Systems Inc. All rights reserved worldwide.

© 2009 Verint Systems Inc. All Rights Reserved Worldwide.
Confidential and Proprietary Information of Verint Systems Inc.
All materials (regardless of form and including, without limitation, software applications, documentation, and any other
information relating to Verint Systems, its products or services) are the exclusive property of Verint Systems Inc. Only
expressly authorized individuals under obligations of confidentiality are permitted to review materials in this
document. By reviewing these materials, you agree to not disclose these materials to any third party unless
expressly authorized by Verint Systems, and to protect the materials as confidential and trade secret information. Any
unauthorized review, retransmission, dissemination or other use of these materials is strictly prohibited. If you are not
authorized to review these materials, please return these materials (and any copies) from where they were obtained.
All materials found herein are provided AS IS and without warranty of any kind.
The Verint Systems Inc. products are protected by one or more of the following U.S., European or International
Patents: USPN 5,659,768; USPN 5,790,798; USPN 6,278,978; USPN 6,370,574; USPN 6,404,857; USPN
6,510,220; USPN 6,757,361; USPN 6,782,093; USPN 6,952,732; USPN 6,959,405; USPN 7,047,296; USPN
7,149,788; USPN 7,155,399; USPN 7,203,285; USPN 6,959,078; USPN 6,724,887; USPN 7,216,162; European
Patent 0 833 489; GB 2374249; and other provisional rights from one or more of the following Published US Patent
Applications: US 10/061,469; US 10/061,489; US 10/061,491; US 11/388,854; US 11/388,944; US 11/389,471; US
10/818,787; US 11/166,630; US 11/129,811; US 11/477,124; US 11/509,553; US 11/509,550; US 11/509,554; US
11/509,552; US 11/509,549; US 11/509,551; US 11/583,381; US 10/181,103; US 09/825,589; US 09/899,895; US
11/037,604; US 11/237,456; US 09/680,131; US 11/359,356; US 11/359,319; US 11/359,532; US 11/359,359; US
11/359,358; US 11/359,357; US 11/359,195; US 11/385,499; US 11/394,496; US 11/393,286; US 11/396,061; US
11/395,992; US 11/394,410; US 11/394,794; US 11/395,350; US 11/395,759; US 60/799,228; US 11/479,926; US
11/479,841; US 11/479,925; US 11/479,056; US 11/478,714; US 11/479,899; US 11/479,506; US 11/479,267; US
60/837,816; US 11/528,267; US 11/529,132; US 11/540,281; US 11/540,322; US 11/529,947; US 11/540,902; US
11/541,056; US 11/529,942; US 11/540,282; US 11/529,946; US 11/540,320; US 11/529,842; US 11/540,904; US
11/541,252; US 11/541,313; US 11/540,086; US 11/540,739; US 11/540,185; US 11/540,107; US 11/540,900; US
10/610,780; US 10/832,509; US 11/608,340; US 11/608,350; US 11/608,358; US 10/771,315; US 10/771,409. Other
U.S. and International Patents Pending.
VERINT, the VERINT logo, ACTIONABLE INTELLIGENCE, POWERING ACTIONABLE INTELLIGENCE, WITNESS
ACTIONABLE SOLUTIONS, STAR-GATE, RELIANT, VANTAGE, X-TRACT, NEXTIVA, ULTRA, AUDIOLOG,
WITNESS, the WITNESS logo, IMPACT 360, the IMPACT 360 logo, IMPROVE EVERYTHING, EQUALITY,
CONTACTSTORE, CLICK2STAFF, and SMARTSIGHT are trademarks or registered trademarks of Verint Systems
Inc. or its subsidiaries. Other trademarks mentioned are the property of their respective owners.

Table of Contents
Verint Impact 360 Speech Analytics
Accuracy
Efficiency
Intelligence
Conclusion

Verint Impact 360 Speech Analytics
Verint's market-leading, 5th-generation Impact 360 Speech Analytics solutions are based on a unique
technology called the Complete Semantic Index. This technology helps compensate for the limitations of
existing off-the-shelf speech recognition by combining best-of-breed ASR (Automatic Speech
Recognition) engines with a unique categorization and indexing layer that adds significant
improvements in accuracy, efficiency and intelligence.
The first step in building a Complete Semantic Index is phonetic processing, which translates the
audio into a string of the basic sound elements of human speech (called phonemes). The output of this
initial stage alone can provide basic audio search capability, but it has limited accuracy and no linguistic
context. For this reason, Impact 360 Speech Analytics includes additional processing stages that
add linguistic and acoustic mapping on top of the initial phonetic output, leveraging a dictionary of up to
100,000 terms (a.k.a. a full-language dictionary) and a best-of-breed LVCSR (Large Vocabulary Continuous
Speech Recognition) engine. Here, Verint's technology, coupled with the knowledge gleaned from deployments
across hundreds of customer sites, comes into play. This additional stage contains rich linguistic and
acoustic models that embed extensive knowledge of how different people express themselves in contact
center conversations across multiple verticals, both in their choice of words and in their accents and
dialects. Armed with these statistical models, the Complete Semantic Index is able to significantly improve
accuracy compared with identifying terms based only on strings of sounds (phonemes), and to provide a
transcription of the call content. Verint continuously evaluates the state of the art in speech technology,
and the result is a speech engine that incorporates richer linguistic and acoustic models, providing even
higher accuracy and performance.
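
To illustrate why a statistical language model helps beyond raw phoneme matching, the following is a minimal sketch, not Verint's implementation. Two candidate word sequences that sound nearly alike are scored with a toy bigram model, and the more plausible one wins. The bigram counts, the vocabulary size, and the function names are all invented for illustration.

```python
import math

# Hypothetical bigram counts, invented for this example only.
BIGRAM_COUNTS = {
    ("want", "to"): 50,
    ("to", "cancel"): 20,
    ("cancel", "my"): 15,
    ("my", "account"): 30,
    ("to", "can"): 1,
    ("can", "sell"): 2,
    ("sell", "my"): 1,
}
VOCAB_SIZE = 1000  # assumed vocabulary size, used for add-one smoothing


def unigram_count(word):
    """Count how often `word` appears as the first word of a bigram."""
    return sum(c for (first, _), c in BIGRAM_COUNTS.items() if first == word)


def log_score(words):
    """Sum of add-one-smoothed bigram log-probabilities for a word sequence."""
    total = 0.0
    for prev, cur in zip(words, words[1:]):
        numerator = BIGRAM_COUNTS.get((prev, cur), 0) + 1
        denominator = unigram_count(prev) + VOCAB_SIZE
        total += math.log(numerator / denominator)
    return total


def best_hypothesis(candidates):
    """Return the candidate word sequence with the highest score."""
    return max(candidates, key=log_score)


candidates = [
    "want to cancel my account".split(),
    "want to can sell my account".split(),
]
print(" ".join(best_hypothesis(candidates)))
```

A phoneme-only matcher has no basis for preferring one of these candidates. Even this crude model favors the sequence that is common in contact center speech.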

In the next step, transcriptions of the recently processed calls are added to an index, which is updated on
an hourly basis. This index contains all the content of all processed calls, as well as additional statistical
information and call metadata. This index enables the powerful and unique functionality of Impact 360
Speech Analytics solutions, such as the Category Wizard, automated term suggestions and
TellMeWhy. These capabilities, compared with phonetic search solutions, are reviewed in the following
three sections, covering Accuracy, Efficiency and Intelligence.
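
As a rough illustration of why a transcript index makes ad hoc queries fast, the sketch below builds a toy inverted index that maps each word to the calls containing it and keeps per-call metadata alongside. It is an assumption-laden simplification: the class name, method names, and metadata fields are hypothetical and do not describe the actual structure of the Complete Semantic Index.

```python
from collections import defaultdict


class TranscriptIndex:
    """Toy inverted index: term -> set of call IDs, plus per-call metadata."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.metadata = {}

    def add_call(self, call_id, transcript, **metadata):
        """Index every word of a transcript and store the call metadata."""
        self.metadata[call_id] = metadata
        for word in transcript.lower().split():
            self.postings[word].add(call_id)

    def search(self, *terms):
        """Return the sorted IDs of calls that contain all of the given terms."""
        sets = [self.postings.get(t.lower(), set()) for t in terms]
        if not sets:
            return []
        return sorted(set.intersection(*sets))


index = TranscriptIndex()
index.add_call(1, "I want to cancel my account", duration_min=4)
index.add_call(2, "I forgot my password again", duration_min=12)
index.add_call(3, "please cancel the order", duration_min=3)

print(index.search("cancel"))        # [1, 3]
print(index.search("cancel", "my"))  # [1]
```

Because a query only touches the posting sets of the searched terms, its cost does not depend on rescanning the audio.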

Accuracy
The unique linguistic and acoustic models that the Complete Semantic Index leverages significantly
reduce false positive results and provide improved accuracy. These layers can help distinguish
between similar-sounding words and phrases, such as the word "cancel" and the phrase "can sell", which
would not likely be distinguished by a phonetics-only search. Since only about 40 distinct
phonemes make up spoken English, there are many cases such as this one
where words and phrases sound similar but convey very different meanings. In fact, a phonetics-only
search solution is likely to generate about 3 such false alarms (false positive term recognitions) for every
word searched in each hour of audio processed. The additional linguistic and acoustic models used by
the Complete Semantic Index can reduce these false positive results by a factor of 10: at the same
probability of detection (recall), a phonetics-only search generates ten times more inaccurate results (false
positives) than the same search using a Complete Semantic Index. With phonetics-only
search, a multiple-term search or category run against 10,000 hours of contact center calls may translate
into tens of thousands of false positives, or false alarms. For example, a 500-seat contact center can
generate on average 60,000 hours of audio per month, which can equate to 700,000 calls. A single
phonetic search on this data can result, on average, in over 180,000 errors.
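
The arithmetic behind these figures can be checked with a short sketch. The rate of 3 false alarms per term per hour, the factor-of-10 reduction, and the 60,000 hours per month come from the text above. The function name and the simple linear model are assumptions made for illustration.

```python
def expected_false_alarms(terms, audio_hours, rate_per_term_hour=3.0):
    """Expected false positives, assuming a constant rate per term per hour."""
    return terms * audio_hours * rate_per_term_hour


# One searched term against a month of audio from a 500-seat contact center.
phonetic_only = expected_false_alarms(terms=1, audio_hours=60_000)

# The text states a reduction by a factor of 10 at the same recall.
with_semantic_index = phonetic_only / 10

print(phonetic_only)        # 180000.0
print(with_semantic_index)  # 18000.0
```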
Although for the majority of terms the linguistic layers used by the Complete Semantic Index significantly
improve search accuracy (Impact 360 Speech Analytics also identifies many acronyms while
automatically suggesting additional alternative terms), for rare non-English terms and names a
phonetics-only search solution can have some advantages. As phonetic processing does not leverage a full
language model, it can, in some cases, identify unique names that are not part of everyday contact center
language. In the case where a proper name or customer-specific term is missing from the language model,
Verint leverages the phonetic layer to add a phonetic match of the name. This tuning process can help
customers quickly and easily add unique names to the model if required. As discussed above, the search
accuracy of the terms added using this phonetic layer may not be as high as that of other terms, but this
provides a quick and easy way to address potential out-of-vocabulary situations. In cases where the
accuracy of these added terms is not sufficient, Verint offers a process that allows the linguistic and
acoustic models to be customized to a specific environment, thus bringing even non-English terms and
names to the same accuracy level as all other terms and providing higher overall recognition accuracy.
The Verint approach leverages the best of the phonetic and LVCSR recognition layers, providing a complete
solution that produces higher overall accuracy together with the flexibility needed in contact center
environments. Beyond accuracy, the unique Complete Semantic Index provides many additional benefits,
discussed in the following two sections.

Efficiency
The phonetic pre-analysis processing stage is short, which translates into relatively fast call processing.
However, the resulting phonetic index presents several challenges that greatly reduce the efficiency of a
solution that relies mostly on a phonetic index. A phonetic index is very similar in size to the original
audio files, which makes it harder to scale to large call volumes and makes searching
against it much slower (about 600 times slower than a parallel search against the Complete Semantic
Index). Each new search against a phonetic index may take anywhere from minutes to hours to return
results on large volumes of audio, depending on the number of terms searched and the audio
sample size. This makes ad hoc searching and category definition a very time-consuming process. For
example, a phonetics-only ad hoc search against a month's worth of calls for a 500-seat contact center will
take approximately 70 minutes, using the fastest available phonetic technology. In comparison, the
Complete Semantic Index produces new search results for any number of term or phrase combinations
in less than 3 seconds for millions of calls.

Figure: Category Building Wizard automatically suggests relevant terms

In addition to the large phonetic index, most 3rd-party solutions also create an additional copy of the
original audio, in effect tripling the total storage required compared to a fully integrated Speech Analytics
solution.
Verint has greatly improved the efficiency of the Complete Semantic Index processing
phase. Verint can now process calls at speeds of 40 times real time (40x RT), transcribing 500-1000 hours of
audio per day on a single standard off-the-shelf server. In most multi-site enterprise environments, at
least one transcription server is allocated to each site, to avoid transferring large call sets from site to site.
In such environments, the total number of servers required for a phonetics-only search solution is similar
to the number required for Verint's Complete Semantic Index solution, which provides much higher value
and accuracy. Overall, Verint provides the most hardware-efficient solution that can transcribe and
analyze the entire content of all processed calls.
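
The throughput figures can be related to each other with a back-of-the-envelope sketch. The 40x real-time speed and the 60,000 hours of audio per month for a 500-seat center are taken from this document. The 30-day month, the `utilization` parameter, and the function names are assumptions introduced here, and the result ignores the one-server-per-site practice mentioned above.

```python
import math


def audio_hours_per_day(speed_x_real_time, utilization=1.0):
    """Hours of audio one server can transcribe in a 24-hour day."""
    return 24 * speed_x_real_time * utilization


def servers_needed(audio_hours_per_month, speed_x_real_time,
                   days_per_month=30, utilization=1.0):
    """Minimum number of servers to keep up with the average daily load."""
    daily_load = audio_hours_per_month / days_per_month
    capacity = audio_hours_per_day(speed_x_real_time, utilization)
    return math.ceil(daily_load / capacity)


print(audio_hours_per_day(40))                      # 960.0
print(servers_needed(60_000, 40))                   # 3
print(servers_needed(60_000, 40, utilization=0.5))  # 5
```

At full utilization, 40x real time gives 960 hours of audio per day, which sits at the upper end of the 500-1000 hour range quoted above. Lower utilization accounts for the lower end.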
The Complete Semantic Index also introduces many other unique benefits that increase efficiency. For
example, category definition is fast and easy. By analyzing the entire content of calls, the category
definition wizard proactively suggests correlated terms and phrases that can be used to accurately
categorize calls. Often, the system will surface terms that users may not otherwise think of. Without the
ability to analyze the entire content of calls, users have to guess the terms they will use and do not know
how effective or relevant these terms are, or whether they may be missing significant sets of related
calls. For example, "refund", "return", "cash back" and "my money" can all mean the same thing but are
phonetically different, so with a phonetics-only search you need to know and input all of the terms separately.
With Verint's Speech Analytics, a user can input one of these terms and the system will surface the
correlated alternatives actually being spoken in that specific environment, providing much faster and more
accurate category definitions.
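
One simple way to picture term suggestion is co-occurrence counting over full transcripts. This is a minimal sketch under stated assumptions, not the wizard's actual algorithm: the stopword list, the sample transcripts, and the function name `suggest_terms` are all hypothetical.

```python
from collections import Counter

# Assumed stopword list, kept tiny for the example.
STOPWORDS = {"i", "my", "a", "the", "to", "for", "want", "please", "is"}


def suggest_terms(transcripts, seed, top_n=3):
    """Rank words by the number of seed-containing calls they also appear in."""
    counts = Counter()
    for text in transcripts:
        words = set(text.lower().split())
        if seed in words:
            counts.update(words - STOPWORDS - {seed})
    # Sort by descending count, then alphabetically, for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]


calls = [
    "i want a refund for the return",
    "refund my money please",
    "the return label is missing refund",
    "i forgot my password",
]
print(suggest_terms(calls, "refund", top_n=2))  # ['return', 'label']
```

A suggestion like this is only possible when the whole transcript is indexed. A phonetic index can report where a known term was heard but cannot list the other words spoken around it.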
Building a robust category with Impact 360™ Speech Analytics typically takes between 30 minutes and 2
hours. Defining categories in a phonetics-only search solution is a long process that includes many
iterations and long waits for each new search to return results. Consequently, building a robust category
may take days or weeks and require much more experienced users or even professional services from
the phonetic search solution vendor. Verint also provides the ability to easily retune categories to keep
up with changes in the business environment, by automatically surfacing new correlated terms that can
be added to the category definition.

Figure: Impact 360™ Speech Analytics automatically surfaced that problems with passwords are the key
driver of calls relating to the self-service portal

Once a category is defined, all calls are instantly re-categorized, enabling immediate drilldown into the
root causes driving this category. Verint automatically increases category accuracy by using a unique
ranking algorithm backed by self-learning capabilities. All of these capabilities are unique to Verint's
Complete Semantic Index technology and are not available in any phonetic search solution, which can only
return results on phonetic matches, based only on terms users know to search for.

Intelligence
Large contact centers handle millions of calls in an ever-changing, dynamic environment. There are many
issues that drive calls into the contact center, and every new day introduces new reasons and challenges
that need to be recognized, evaluated and resolved. For a speech analytics solution to
continuously analyze this massive amount of data and surface key trends, even if they are just emerging,
it needs to provide more than just smart search capabilities. Phonetics-only search solutions cannot
surface unknown issues. Such solutions require users to know exactly what issues to search for and
exactly how they are articulated by agents and customers.

Verint's new TellMeWhy functionality automatically surfaces unknown issues. For every call set,
whether it is billing-related calls or calls with a duration of 10 minutes and above, Verint's TellMeWhy can
surface the key drivers of these calls. By leveraging the Complete Semantic Index, the system compares
the content of the specific subset of calls to the entire content of the rest of the calls, identifying terms and
phrases that are unique to this call subset. TellMeWhy then clusters these terms and prioritizes the top
five root cause groups, automatically suggesting the top driving reasons for any call set. This functionality
can help focus users' attention on the call drivers, whether known or unknown, that have the most impact on
their business. It answers common call center management questions such as "Why did I have a call
spike yesterday?", "What drives my longest calls?", "What were customers calling about yesterday?" and
"What is the biggest complaint driver?"
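
The subset-versus-rest comparison described above can be sketched as a frequency-ratio ranking. This is an illustrative simplification, not the TellMeWhy algorithm: it omits the clustering step entirely, and the smoothing constant, the sample transcripts, and the function name `distinctive_terms` are assumptions.

```python
from collections import Counter


def distinctive_terms(subset, rest, top_n=5, smoothing=1.0):
    """Rank terms by the smoothed ratio of their call frequency in the subset
    to their call frequency in the rest of the calls."""

    def call_frequency(calls):
        freq = Counter()
        for text in calls:
            freq.update(set(text.lower().split()))
        return freq

    sub_freq, rest_freq = call_frequency(subset), call_frequency(rest)
    scores = {}
    for term, count in sub_freq.items():
        sub_rate = (count + smoothing) / (len(subset) + smoothing)
        rest_rate = (rest_freq[term] + smoothing) / (len(rest) + smoothing)
        scores[term] = sub_rate / rest_rate
    # Highest ratio first, ties broken alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]


long_calls = [
    "i forgot my password for the portal",
    "the portal rejected my password",
    "password reset link for the portal failed",
]
other_calls = [
    "i want to pay my bill",
    "what is my balance",
    "change my address for the account",
    "the bill is wrong",
]
print(distinctive_terms(long_calls, other_calls, top_n=2))  # ['password', 'portal']
```

Common words such as "my" or "the" appear in both sets and score close to 1, so they fall to the bottom without needing a stopword list.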

Verint's TellMeWhy functionality enables instant drilldown into each root cause call set, pulling and
prioritizing the relevant calls so users can quickly get to the information they need.
When playing a call, Verint presents a unique visual map of the call. This visual audio navigation aid
consists of an energy envelope (waveform) of the call, with the terms suggested by the system highlighted
above it to help users get to the relevant segment of the call. A synchronized interactive transcription of the
entire call is also available, with terms that relate to distinct business issues highlighted in different colors.
Users can recognize additional issues in the transcription and skip directly to the relevant area in the call
by clicking on any of the terms. This visual transcript map can save over 90% of the time it takes to review
a call, compared with solutions that do not have a synchronized visual transcription of every processed
call. It enables quick and efficient call review while providing easy access to additional insights and
intelligence embedded in these calls.

Conclusion
Verint's market-leading, 5th-generation Impact 360 Speech Analytics solution leverages the full content
and context of every processed call, with a phonetic layer for adding unique terms, providing the most
complete solution with unique advantages and capabilities, including:
1. Highest accuracy - with contextual understanding
2. High flexibility - a combination of phonetic, LVCSR, and unique indexing technology
3. Instant query and categorization results - no need to predefine searches in advance
4. Call relevancy ranking - prioritizing the most pertinent calls
5. Proactive category wizard - defining categories in less than 2 hours with minimal training
6. TellMeWhy - surfacing what users may not know to search for
7. Visual map of the call - quick and efficient call review with additional insights
8. Fully integrated solution - enabling lower total cost of ownership, increased security and many
WFO application synergies with quality monitoring, data analytics and reporting applications.
