VXML

VoiceXML Programmer's Guide
BeVocal, Inc.
685 Clyde Avenue
Mountain View, CA 94043
Part No. 520-0001-02
Copyright 2005. BeVocal, Inc. All rights reserved.
Table of Contents
Preface
    Audience
    Conventions
    How to Use This Guide
    References
1. Getting Started
    VoiceXML
        Environment
        Tags and Elements
        Simple Example
        Documents
        Applications
        Dialogs
        Properties
        Grammars
        Events
        Links
        Universal Commands and Grammars
        Procedural Logic
    User Interaction
    Flow of Execution
        Explicit Transition
        Recognition-Triggered Transition
        Subdialogs
    Collecting Input and Playing Prompts
2. Forms
    Form Items
    Form-Item Variables
    Execution of a Form
    User Interaction
3. Event Handling
    Predefined Events
    Default Event Handlers
    Application-Defined Event Handlers
    Events in Subdialogs
    Throwing Events
    Application-Defined Events
4. Fetching and Caching Resources
    How Fetching and Caching Work
        Requests and Responses
        Using Multiple Caches
        Fundamentals of Controlling Caching
    What You Can Control from VoiceXML
    Prefetching Resources
        Prefetch Cache Details
        Fetch Hints
        Restrictions on Prefetching
    Handling Fetching Delays
        Timeouts
        Background Audio
        Queued Prompts when Fetching
    Controlling the Use of Cached Resources
        Maximum Age
        Maximum Stale Time
        Caching
        Mimicking Response Headers
    Submitting Complex JavaScript Objects
5. Using Multiple-Recognition
    Multiple Recognition Features
        N-Best Recognition
        Multiple Interpretations
        Combining the Features
        Working with Multiple Recognition
    Using N-Best Recognition
        Enabling N-Best Recognition
        Checking for Multiple Utterances
        Selecting an Utterance
        Example
    Using Multiple Interpretations
        Enabling Multiple Interpretations
        Checking for Multiple Interpretations
        Selecting an Interpretation
        Example
    Using Both Features Together
        Enabling Both Features
        Checking for Multiple Results
        Selecting a Result
        Simple Example
        Generating a Subdialog
6. Controlling Outbound Calls
    Call Status
    Limitations on Outbound Calls
    Interactions Without an Outbound Call
    Interactions During a Transfer
        Without a Transfer Grammar
        With Transfer Grammars
    Interactions During a Dialed Call
    Putting a Dialed Call on Hold
    Listening to a Dialed Call
        With Listen Grammars
        Without a Listen Grammar
    Interrupting a Dialed Call
    VoIP and Outbound Calls
7. Go-Back Facility
    Retracting User Responses
    Go-Back Stack
        Stack Entries
    Go-Back Destinations
        Menus
        Mixed-Initiative Forms
        Input Items
    Enabling the Go-Back Facility
        Activating the Universal Grammar
        Setting the Minimum Stack Size
    Controlling Go-Back Behavior
        Suppressing Retraction
        Customizing Go-Back
    Using the Go-Back Facility
        Selecting the Minimum the Stack Size
        Using Blocks
        Using Subdialogs
        Using Variables
8. TTS and Recorded Voice Selection
    Specifying TTS Voices
        Supported TTS Voices
        Property syntax
        Property Description
        Property Example
        Errors
        Voice tag Syntax
        Voice tag description
    Specifying Recorded Voices
        Supported Recorded Voices
        Example
    Lists of Fallback Voices
        Rationale
        Algorithm for Selecting the Voice
        Selecting Voice Example
        Best Practices for Voice Selection
    Overriding Recorded Voices
9. Dynamic SSML
    Introduction
    Using Dynamic SSML
    Examples and Notes
    Errors
    SSML Document
    Extensions to the SSML spec
10. SOAP Client Facility
    Locating and Identifying SOAP Services
    Calling SOAP Methods
        Type conversion
        SOAP Headers
        SOAP Methods are JavaScript objects
        Error Handling For WSDL-Based Services
        Error Handling for Non-WSDL-Based Services
11. Tags
    Tag Summary
    Tag Index
    Tag Descriptions
        <assign>
        <audio>
        <bevocal:connect>
        <bevocal:dial>
        <bevocal:disconnect>
        <bevocal:enroll>
        <bevocal:foreach>
        <bevocal:hold>
        <bevocal:listen>
        <bevocal:register>
        <bevocal:verify>
        <bevocal:whisper>
        <block>
        <break>
        <catch>
        <choice>
        <clear>
        <data>
        <disconnect>
        <div>
        <dtmf>
        <else>
        <elseif>
        <emp>
        <emphasis>
        <enumerate>
        <error>
        <example>
        <exit>
        <field>
        <filled>
        <foreach>
        <form>
        <goto>
        <grammar>
        <help>
        <if>
        <initial>
        <item>
        <lexicon>
        <link>
        <log>
        <mark>
        <menu>
        <meta>
        <metadata>
        <noinput>
        <nomatch>
        <object>
        <one-of>
        <option>
        <p>
        <paragraph>
        <param>
        <phoneme>
        <prompt>
        <property>
        <pros>
        <prosody>
        <record>
        <reprompt>
        <rethrow>
        <return>
        <rule>
        <ruleref>
        <s>
        <say-as>
        <sayas>
        <script>
        <send>
        <sentence>
        <speak>
        <sub>
        <subdialog>
        <submit>
        <tag>
        <throw>
        <token>
        <transfer>
        <value>
        <var>
        <voice>
        <vxml>
12. Properties
    Property Summary
    Property Index
    Property Descriptions
        audiofetchhint
        audiomaxage
        audiomaxstale
        bargein
        bargeintype
        bevocal.audio.capture
        bevocal.audio.outputvolume
        bevocal.dtmf.flushbuffer
        bevocal.fetchaudio.allfetches
        bevocal.fetchaudio.extend
        bevocal.fetchaudio.flushqueue
        bevocal.fetchaudio.sounds
        bevocal.finaltimeout
        bevocal.goback
        bevocal.grammar.interpretationtype
        bevocal.grammar.phoneticpruning
        bevocal.grammar.weightfactor
        bevocal.grammar.wordtransitionpenalty
        bevocal.hotwordmax
        bevocal.hotwordmin
        bevocal.incrementErrorOnNSP
        bevocal.locale
        bevocal.logging
        bevocal.maxdialogerrors
        bevocal.maxerrors
        bevocal.maxinterpretations
        bevocal.mingoback
        bevocal.securelogging.enabled
        bevocal.securelogging.key
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350 bevocal.security.key . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351 bevocal.sounds.listening . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351 bevocal.sounds.maskrecognitionlatency. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351 bevocal.sounds.recognition. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351 bevocal.transfer.terminatetones . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352 bevocal.utterance.prefix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352 bevocal.voice.name . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352 bevocal.vxml.maxrecognitionlatency. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352 caching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352 completetimeout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353 confidencelevel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353 datafetchhint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353 datamaxage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353 datamaxstale . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354 documentfetchhint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354 documentmaxage . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354 documentmaxstale . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354 fetchaudio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354 fetchaudiodelay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355 fetchaudiominimum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
10
fetchtimeout. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355 grammarfetchhint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355 grammarmaxage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355 grammarmaxstale . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356 incompletetimeout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356 inputmodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356 interdigittimeout. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356 maxnbest. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357 maxspeechtimeout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358 recordutterance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358 recordutterancetype . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358 scriptfetchhint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359 scriptmaxage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359 scriptmaxstale . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359 ssmlfetchhint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359 ssmlmaxage . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360 ssmlmaxstale . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360 sensitivity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360 speedvsaccuracy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360 termchar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360 termtimeout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360 timeout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361 universals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361
13. Variables
  Variable Summary
  Variable Index
  Variable Descriptions
    _event
    _message
    application.lastaudio$
    application.lastresult$
    session.bevocal.timeincall
    session.bevocal.version
    session.iidigits
    session.telephone.ani
    session.telephone.dnis
14. JavaScript Functions and Objects
  JavaScript Constants
    bevocal.outboundrequestid
    bevocal.sessionid
  _addHeader
  bevocal.cookies.addClientCookie
  bevocal.cookies.deleteClientCookie
  bevocal.cookies.getClientCookie
  bevocal.cookies.getClientCookies
  bevocal.enroll.removeEnrolledPhrase
  bevocal.getProperty
  bevocal.getVersion
  bevocal.log
  bevocal.soap.serviceFromWSDL
  bevocal.soap.serviceFromEndpoint
  bevocal.soap.locateService
  bevocal.soap.SoapException
  bevocal.soap.SoapFault
  bevocal.soap.FaultDetails
Preface
VoiceXML is a markup language for writing telephone-based speech applications. This document describes BeVocal VoiceXML, which is compliant with the W3C VoiceXML Version 2.0 Specification.
Audience
This document is for software developers using the BeVocal Café development environment. It assumes you are familiar with the basic concepts of HTML.
Conventions
Italic font is used for:
- Introducing terms that will be used throughout the document
- Emphasis

Bold font is used for headings.

Fixed width font is used for:
- Code examples
- Tags and attributes
- Values or text that must be typed as shown
- Filenames and pathnames

Italic fixed width font is used for:
- Variables
- Prototypes or templates; what you actually type will be similar in format, but not the exact same characters as shown
How to Use This Guide

Part I of this guide explains how to use VoiceXML features. A new application developer typically reads these chapters completely and in order.
- Chapter 1, Getting Started introduces VoiceXML and its major features.
- Chapter 2, Forms describes VoiceXML forms.
- Chapter 3, Event Handling describes events that can be thrown during the execution of a VoiceXML application and how events are handled.
- Chapter 4, Fetching and Caching Resources explains how an application can control the way VoiceXML documents and other resources are fetched and cached.
- Chapter 5, Using Multiple-Recognition describes how the BeVocal VoiceXML interpreter can provide multiple recognition results.
Part II of this guide explains how to use Extended VoiceXML features. A new application developer typically reads only the chapters that are relevant to the application being built.
- Chapter 6, Controlling Outbound Calls describes the BeVocal VoiceXML call-control features, an extension to VoiceXML.
- Chapter 7, Go-Back Facility describes the BeVocal VoiceXML go-back facility, an experimental extension to VoiceXML.
- Chapter 8, TTS and Recorded Voice Selection describes the BeVocal VoiceXML TTS and Recorded Voice Selection facility, an experimental extension to VoiceXML.
- Chapter 9, Dynamic SSML describes the BeVocal VoiceXML Dynamic SSML facility, an experimental extension to VoiceXML.
- Chapter 10, SOAP Client Facility describes the BeVocal VoiceXML SOAP Client facility, an experimental extension to VoiceXML.
Part III of this guide provides reference descriptions of the various components of the VoiceXML language. Application developers typically do not read these chapters from start to finish, but instead use them to look up information about the various tags, properties, and so on.
- Chapter 11, Tags describes the tags that make up VoiceXML.
- Chapter 12, Properties describes the properties that can be set to control the behavior of a VoiceXML application.
- Chapter 13, Variables describes predefined variables that are available in VoiceXML applications.
- Chapter 14, JavaScript Functions and Objects describes predefined JavaScript functions that are available in VoiceXML applications.
References
For additional or related information, you can refer to:
- VoiceXML Version 2.0 Specification. VoiceXML Forum. (http://www.w3c.org/TR/voicexml20)
- VoiceXML Tag Summary. BeVocal. (http://cafe.bevocal.com/docs/vxml_summary/index.html)
- Grammar Reference. BeVocal. (http://cafe.bevocal.com/docs/grammar/index.html)
- JavaScript Quick Reference. BeVocal. (http://cafe.bevocal.com/docs/javascript_quick_reference/index.html)
PART 1
Using VoiceXML
This part explains how to use VoiceXML features:
- Chapter 1, Getting Started
- Chapter 2, Forms
- Chapter 3, Event Handling
- Chapter 4, Fetching and Caching Resources
- Chapter 5, Using Multiple-Recognition
Getting Started
VoiceXML is a markup language derived from XML for writing telephone-based speech applications. Users call applications by telephone. They listen to spoken instructions and questions instead of viewing a screen display; they provide input using the spoken word and the touchtone keypad instead of entering information with a keyboard or mouse.

This chapter describes:
- VoiceXML
- User Interaction
- Flow of Execution
- Collecting Input and Playing Prompts
VoiceXML
Just as a web browser renders HTML documents visually, a VoiceXML interpreter renders VoiceXML documents audibly. You can think of the VoiceXML interpreter as a telephone-based voice browser. As with HTML documents, VoiceXML documents have web URIs and can be located on any web server. However, a standard web browser runs locally on your machine, whereas the VoiceXML interpreter runs remotely (at the VoiceXML hosting site, for example), and you use your telephone to access it.

Environment

To support a telephone interface, the VoiceXML interpreter runs within an execution environment that includes a telephony component, a text-to-speech (TTS) speech-synthesis component, and a speech-recognition component. The VoiceXML interpreter transparently interacts with these infrastructure components as needed. For example:
- Text strings in output elements are rendered using TTS.
- Connection issues (picking up the incoming call, detecting a hang-up, transferring a call) are handled by the telephony component.
- Listening to spoken input from the user and identifying its meaning is handled by the speech-recognition component.
Tags and Elements

VoiceXML uses markup tags and plain text. A tag is a keyword enclosed by the angle bracket characters (< and >). A tag may have attributes inside the angle brackets. Each attribute consists of a name and a value, separated by an equal sign (=); the value must be enclosed in quotes.
Tags occur in pairs; corresponding to the start tag <keyword> is the end tag </keyword>. Between the start and end tag, other tags and text may appear. Everything from the start tag to the end tag is called an element. For example, the following three lines constitute a prompt element:

    <prompt>
      What is your telephone number?
    </prompt>

If there are no other tags or text between the start and end tag, a syntactic shorthand is permitted. You can precede the closing angle bracket ( > ) of the start tag with a slash ( / ) and omit the end tag. For example, instead of writing a value element as:

    <value expr="result"></value>

you can use the shorthand notation:

    <value expr="result"/>

Because the syntax specifies the end of each element, the VoiceXML interpreter can check that the entire document has been received.

If one element contains another, the containing element is called the parent element of the contained element. The contained element is called a child element of its containing element. The parent element may also be called a container.

Although both HTML and VoiceXML use markup tags, the two languages use tags differently. Whereas the markup tags in HTML describe how to render the data, the markup tags in XML (and consequently in VoiceXML) describe the data itself. This allows an XML interpreter or browser to display the data in whatever way is appropriate.

BeVocal VoiceXML generally complies with the VoiceXML 2.0 Specification. It also includes several extensions that you can use if you choose. VoiceXML Tag Summary lists any differences between BeVocal VoiceXML and the standard.

Tip: VoiceXML conforms to XML standards; the formats for VoiceXML tags are more strictly defined than are the formats in HTML. If you are used to HTML and not XML, remember that all container elements require end tags and all attribute values must be in quotes.
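As a rough illustration of the terms above (start and end tags, quoted attributes, the shorthand notation, and parent and child elements), the following sketch nests a few elements. The form id and the prompt wording are made up for this illustration.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- <form> is the parent (container) of <block>;
       <block> is the parent of <prompt>. -->
  <form id="greeting">
    <block>
      <!-- bargein is an attribute: a name, an equal sign,
           and a value enclosed in quotes. -->
      <prompt bargein="false">
        Two plus two is
        <!-- <value> has no content, so the shorthand form is used. -->
        <value expr="2 + 2"/>
      </prompt>
    </block>
  </form>
</vxml>
```

Every start tag in the sketch has a matching end tag or uses the shorthand, which is what lets the interpreter confirm that the whole document arrived.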
Simple Example

In VoiceXML, the <form> element is analogous to an HTML form that contains items for the user to enter. In VoiceXML forms, each logical piece of information to be collected from the user is identified with a <field> tag. The form in the following example collects one piece of information from the user. Once this information is obtained, execution proceeds to the field's <filled> element.

Other tags used in the example include the following:
- The <script> tag specifies a block of client-side JavaScript code.
- The <var> tag declares a variable to be used within the form.
- The <prompt> tag produces audio output for the user.
- The <assign> tag assigns a value to a variable.
- The <value> tag evaluates an expression and produces spoken output of the result.
This example requests a number from the caller, computes the factorial of that number, and repeats the answer to the caller.

    <?xml version="1.0" ?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
      "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
      <script>
        <![CDATA[
          function factorial(n) {return (n <= 1) ? 1 : n * factorial(n-1);}
        ]]>
      </script>
      <form id="computefactorial">
        <var name="result"/>
        <field name="num" type="number">
          <prompt>please say a number </prompt>
          <filled>
            <assign name="result" expr="factorial(num)" />
            <prompt>The factorial of <value expr="num"/> is <value expr="result"/> </prompt>
          </filled>
        </field>
      </form>
    </vxml>

VoiceXML contains no explicit instructions about how to present the prompt "please say a number" or how to present the results. In theory, these could be presented textually on a different kind of browser. In practice, the example document is run as a telephone application and results in conversations such as the following.

    Application: Please say a number.
    User:        4
    Application: The factorial of 4 is 24.
Documents

An executable VoiceXML file is called a document. The VoiceXML interpreter loads a document file to execute it. Every VoiceXML document must start with header information that conforms to the XML standard:

    <?xml version="1.0" ?>
    <!DOCTYPE vxml PUBLIC "-//W3C//DTD VOICEXML 2.0//EN"
      "http://www.w3.org/TR/voicexml20/vxml.dtd">
    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">

These headers describe the language in which the document is written:

- The first tag indicates that the document is an XML document. This tag is required. Always use this tag exactly as specified; it must come at the very start of the document. To be a legal XML document, the first 4 characters of any XML file (including a VoiceXML document) must be:

      <?xm

  No characters, not even whitespace characters such as space or newline, can come before these 4 characters in a VoiceXML document.

- The second tag identifies the Document Type Definition (DTD), which is used to validate that the contents represent valid VoiceXML. This tag is optional. A DTD describes the format of the data that might appear in an XML document. That is, the DTD defines the valid tags by specifying what attributes each tag can have and what child tags or other content each tag can contain. If your document contains only standard VoiceXML elements, you can use the DTD shown above. If you use any of the BeVocal extensions to VoiceXML, you'll need to use the BeVocal DTD. In this case, you replace the DOCTYPE element with the following:

      <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
        "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">

  You should include a DOCTYPE declaration during development, as it allows better error checking by the interpreter. You may remove it during deployment for performance.

- The third tag identifies the version of VoiceXML used in this document and the designated namespace for VoiceXML. This tag is required. For VoiceXML 2.0, this tag should always include these 2 attributes. It can also include optional attributes described in the section on the <vxml> tag.

Apart from headers and possibly comments, all the content in a VoiceXML document is contained within a <vxml> element, that is, between the <vxml> start tag and the </vxml> end tag.

Applications

A VoiceXML application consists of one or more documents. Any multidocument application has a single application root document. Each document in an application identifies the application root document with the application attribute of the <vxml> tag:

    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
          application="myAppRoot.vxml" >

Whenever the interpreter executes a document, it loads that document. If the document specifies an application root document, that document is also loaded.
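Putting the header lines and the application attribute together, a complete document in a multidocument application might look like the following minimal sketch. The form id and prompt text are illustrative only; myAppRoot.vxml is the root document name used in the text above.

```xml
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
      application="myAppRoot.vxml">
  <!-- Everything other than the headers and comments
       sits inside the <vxml> element. -->
  <form id="hello">
    <block>
      <prompt>Hello from this document.</prompt>
    </block>
  </form>
</vxml>
```

Note that the XML declaration is the very first thing in the file, with no whitespace or comment before it.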
You can use an application root document for global items or interactions that you want to be active throughout the application. For example, suppose the application root document myAppRoot.vxml declares a variable named company that has an initial value of BeVocal. Because the expr attribute holds a JavaScript expression, the string value is written as a quoted literal:

    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
      <var name="company" expr="'BeVocal'"/>
      ...

This variable has application scope. That is, any document in the application can use the variable.

Dialogs

Within a document, a user interacts with dialogs, in which the application produces auditory output, typically asking for information. The user provides input by speaking or pressing keys on the telephone. User speech must be recognized and its meaning interpreted. The telephone key input is interpreted as a sequence of tones in the Dual Tone Multifrequency (DTMF) signalling system.

VoiceXML has two kinds of dialogs: forms and menus.
- A form interacts with the user to fill in a number of fields. Every field has an associated variable, called its input-item variable, or just input variable. Initially, the variable has a value of undefined. It is filled in when the speech-recognition engine recognizes a valid response in a user utterance.
  Note: In VoiceXML 1.0, an input-item variable was known as a field-item variable.
- A menu presents the user with a number of choices; it transitions to a different dialog based on the user's selection.
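The sketch below ties together two ideas from this section: a variable declared in the application root document, and a field's input variable. It assumes the root document myAppRoot.vxml declares company as described above; the form id, field name, and prompt wording are hypothetical.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
      application="myAppRoot.vxml">
  <form id="newsOffer">
    <!-- wantsNews is the field's input variable; it is undefined
         until the recognizer matches a valid response. -->
    <field name="wantsNews" type="boolean">
      <prompt>
        <!-- company is declared in the root document,
             so it has application scope. -->
        Welcome to <value expr="application.company"/>.
        Would you like to hear the news?
      </prompt>
      <filled>
        <if cond="wantsNews">
          <prompt>Here is the news.</prompt>
        <else/>
          <prompt>Okay, no news.</prompt>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```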
Forms

The VoiceXML <form> tag defines a form and the <field> tag defines a field in a form. You specify the name of the input variable with the name attribute of the <field> tag. You can use the input variable's name in expressions to refer to the stored value. In the example in "Simple Example", the input variable is named num:

    <field name="num" type="number">

When the user says the number, the number is stored in the num variable. Then the interpreter proceeds to execute the field's <filled> element. Here, the num variable in the <assign> element is evaluated before being passed as the parameter to the factorial function.

    <assign name="result" expr="factorial(num)" />

Menus

The <menu> tag defines a menu; each choice consists of a <choice> element. The next attribute of a <choice> element specifies the destination dialog to which the interpreter should transition when the user selects that choice. If a <form> or <menu> element is to be the destination of a transition, the id attribute for the destination dialog should specify a unique identifier. For example, the following menu consists of three choices.

    <menu>
      <prompt> Please choose one of <enumerate/> </prompt>
      <choice next="#MovieForm"> local movies </choice>
      <choice next="localBroadcast.vxml#RadioForm">
        local radio stations
      </choice>
      <choice next="http://www.nationTV.org/tv.vxml">
        national TV listings
      </choice>
    </menu>

The prompt in this menu includes an <enumerate> tag. This tag lets you set up a template for an automatically generated description of the choices. By default, the <enumerate> template simply lists all the choices. In the above example, the prompt is "Please choose one of local movies, local radio stations, national TV listings."

The destination dialog specified by the next attribute can be in the current document or in a different document:
- If the user says "local movies", the interpreter transitions to the dialog named MovieForm in the same document.
- If the user says "local radio stations", the interpreter transitions to the dialog named RadioForm in the document localBroadcast.vxml.
- If the user says "national TV listings", the interpreter transitions to the first dialog in the document tv.vxml in the national TV web site.
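To show how a choice's next attribute lines up with the id of its destination dialog, here is a minimal single-document sketch. MovieForm matches the name used in the menu above; the menu id, the second form, and the prompt text are invented for the illustration.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <menu id="mainMenu">
    <prompt>Please choose one of <enumerate/></prompt>
    <!-- "#MovieForm" refers to the dialog whose id is MovieForm
         in this same document. -->
    <choice next="#MovieForm">local movies</choice>
    <choice next="#WeatherForm">local weather</choice>
  </menu>

  <!-- Destination dialogs: each id is unique within the document. -->
  <form id="MovieForm">
    <block>
      <prompt>Here are the local movie listings.</prompt>
    </block>
  </form>

  <form id="WeatherForm">
    <block>
      <prompt>Here is the local weather.</prompt>
    </block>
  </form>
</vxml>
```

Because the menu is the first dialog in the document, it is the one the caller hears first; the forms run only when a choice transitions to them.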
Properties

You can set properties to customize the behavior of the interpreter. The <property> tag specifies the property to set and the value for that property. Various properties control how the interpreter behaves when prompting the user for input, recognizing speech or DTMF input, and fetching documents and other resources. For additional information, see Chapter 12, Properties.

Grammars

The speech-recognition engine uses grammars to interpret user input. See the Grammar Reference for details on creating and using grammars. Here, we cover only a portion of the relevant information.
- Each field in a form can have a grammar that specifies the valid user responses for that field.
- An entire form can have a grammar that specifies how to fill multiple input variables from a single user utterance.
- Each choice in a menu has a grammar that specifies the user input that can select the choice.

A VoiceXML application can use built-in grammars and application-defined grammars.
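As a sketch of the <property> tag described under Properties above, the document below sets the timeout and bargein properties, both of which are listed in Chapter 12. The values, form id, and field name are illustrative. The sketch also relies on standard VoiceXML 2.0 behavior, in which a property set on an inner element overrides the same property set on an enclosing element.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Document level: applies to every dialog unless overridden. -->
  <property name="timeout" value="5s"/>

  <form id="accountEntry">
    <field name="account" type="digits">
      <!-- Field level: overrides the document-level timeout
           for this field only. -->
      <property name="timeout" value="10s"/>
      <!-- Callers must hear the whole prompt before answering. -->
      <property name="bargein" value="false"/>
      <prompt>Please say or key in your account number.</prompt>
    </field>
  </form>
</vxml>
```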
Built-in Grammars

The following basic grammars are built into all standard VoiceXML interpreters:

    Grammar Type   Description
    boolean        Recognizes a positive or negative response.
    currency       Recognizes an amount of money, in dollars.
    date           Recognizes a calendar date.
    digits         Recognizes a sequence of digits.
    number         Recognizes a number.
    phone          Recognizes a telephone number adhering to the North American Dialing Plan (with no extension).
    time           Recognizes a clock time.
BeVocal VoiceXML contains additional built-in grammars as an extension to standard VoiceXML:

    Grammar Type   Description
    airport        Recognizes an airport name or code, such as DFW or Dallas-Fort Worth.
    airline        Recognizes an airline name or code, such as AA or American Airlines.
    equity         Recognizes a company symbol or full name, such as IBM or Cisco Systems.
    citystate      Recognizes US city and state names, for example, Sunnyvale, California.
    stockindex     Recognizes the names of the major US stock indexes, such as Nasdaq.
    street         Recognizes a street name (with or without street number).
    streetaddress  Recognizes a street name and a street number.
You can reference a built-in grammar in either of two ways:

- You can name a standard built-in grammar in the type attribute of a <field> element. The example in "Simple Example" uses the built-in number grammar:

      <field name="num" type="number">

  This means that the speech-recognition engine tries to interpret what the user says as a number.

- You can use any built-in grammar (standard or BeVocal VoiceXML extension) in a <grammar> element by specifying the src attribute with a URI of the form:

      builtin:grammar/typeName

  For example:

      <grammar src="builtin:grammar/boolean"/>

Application-Defined Grammars

Although the built-in grammars can be useful, you typically need to define your own grammars. An application-defined grammar can be specified in the following forms:
- Augmented BNF (ABNF) form of the W3C Speech Recognition Grammar Format
- XML form of the W3C Speech Recognition Grammar Format
- Nuance Grammar Specification Language (GSL)
- Java Speech Grammar Format (JSGF)
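The two ways of referencing a built-in grammar described above can be sketched side by side in one form. The first field uses the type attribute with a standard grammar; the second uses a <grammar> element with the builtin: URI form to reach the BeVocal airport extension. The form id, field names, and prompts are made up for this illustration.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="trip">
    <!-- Way 1: a standard built-in grammar named in the type attribute. -->
    <field name="flyingToday" type="boolean">
      <prompt>Are you flying today?</prompt>
    </field>

    <!-- Way 2: any built-in grammar, including a BeVocal extension,
         referenced through a builtin: URI. -->
    <field name="origin">
      <grammar src="builtin:grammar/airport"/>
      <prompt>Which airport are you leaving from?</prompt>
    </field>
  </form>
</vxml>
```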
A simple grammar can be defined in the document. An inline grammar is defined within the <grammar> element itself. For example, the following inline ABNF grammar matches the words add and subtract. <field name="operator"> <grammar> #ABNF 1.0; root $op; $op = add | subtract; </grammar> ... With this grammar, if the user says add, the input variable operator is set to add. More complex grammars can be written externally. An external grammar is defined in a file separate from the VoiceXML document file and is referenced by the src attribute of the <grammar> element. For example, the following field uses a grammar rule named Colors in an external XML grammar defined in the file partGrammar.grxml. <field name="part"> <grammar src="http://www.mySite/partGrammar.grxml#Colors"/> ... The named rule (Colors in the preceding example) is the one the interpreter will use to start recognition. The specified file may include other grammar rules, which may be used as subrules of the this rule. The grammar for a menu choice can be specified explicitly with a <grammar> child of the <choice> element. Alternatively, a grammar can be generated automatically from the choice text. If the accept attribute of the <menu> tag is set to approximate, the user can say a subset of the words in the choice text to select that choice. Adding this attribute to the preceding example allows the user to say TV listings or just TV to select the third choice: <menu accept="approximate"> ... <choice ...> national TV listings </choice> </menu> Note that the words must be spoken in the correct order; listings, TV would not be recognized. If you want some choices to be matched exactly and others to allow a subset of the words, you can specify the accept attribute on individual <choice> elements. Active Grammars The speech-recognition engine uses active grammars to interpret user input. A field grammar is active whenever the interpreter is executing that field. 
- A menu-choice grammar is active whenever the interpreter is executing the containing menu.
- A form grammar is active whenever the interpreter is executing the containing form.

A form grammar or the collection of choice grammars in a menu can optionally be made active at higher scopes:

- A grammar with document scope is active whenever the interpreter is executing any dialog in the document.
- A grammar with application scope is active whenever the interpreter is executing any document in the application.
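The scoping rules above can be sketched as follows. This is a minimal illustration under stated assumptions, not an example from the guide: the form ids main and weather, the field name city, and the small ABNF grammars are all hypothetical. The scope attribute on the second form gives its form grammar document scope, so the word weather should be recognized even while the first form is executing.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">

  <!-- First dialog in document order; its field grammar is active
       only while this field is being executed. -->
  <form id="main">
    <field name="city">
      <grammar>
        #ABNF 1.0;
        root $city;
        $city = boston | chicago;
      </grammar>
      <prompt>Which city?</prompt>
    </field>
  </form>

  <!-- Hypothetical second dialog. scope="document" keeps its form
       grammar active while any dialog in this document executes. -->
  <form id="weather" scope="document">
    <grammar>
      #ABNF 1.0;
      root $cmd;
      $cmd = weather;
    </grammar>
    <block>Here is the weather report.</block>
  </form>

</vxml>
```

With this arrangement, a user who answers the city prompt with weather would be expected to land in the second form rather than trigger a no-match event.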
If the interpreter is executing one dialog and the user's input matches an active grammar for a different dialog, control transfers to the latter dialog. If the grammar is in application scope, control might transfer to a dialog in a different document. Note that within a field, you can temporarily turn off grammars from higher scopes by setting the field's modal attribute to true.

Events

The VoiceXML interpreter can throw a number of predefined events based on errors, telephone disconnects, or user input. For example:

- A no-input event is thrown if the user does not respond to a question.
- A no-match event is thrown when the user does not respond intelligibly, that is, when the user's utterance does not match any active grammar.
- A help event is thrown when the user requests help.
- An error event is thrown when any kind of error occurs.
An application can define additional events and can use a <throw> element to throw an event of a specified kind. An application can catch an event and respond appropriately in an event handler. A <catch> element is a general-purpose event handler; its event attribute specifies the kinds of event that it handles. Additional event-handling tags are syntactic shorthand: <noinput>, <nomatch>, <help>, and <error>. Each of these shorthand tags catches one type of event, indicated by its name. For example, a <nomatch> element catches no-match events.

When an event is thrown, the associated event handler, if it exists, is invoked. If the handler does not cause the application to terminate, execution resumes in the element that was being executed when the event was thrown. For more information, see Chapter 3, Event Handling.

Links

A link specifies a grammar that is independent of any particular dialog. A <link> element defines a link. Each <link> element contains a <grammar> element. A link's grammar is active in the scope of the element that contains the link. For example, if the link is in a form, its grammar is active when the interpreter is executing that form. If a link is under a <vxml> element, its grammar has document scope; if the link is in the application root document, its grammar has application scope. Links in a <vxml> element can implement global behaviors.

A link can specify one of two possible actions to take if the speech-recognition engine matches its grammar:

- The link can cause a transition to a different location; in that case, its next attribute specifies the destination of the transition. Links, like menu choices, can cause transitions to other dialogs or documents.
- The link can throw an event; in that case, its event attribute specifies the event to throw. After the event is handled, execution resumes with the element that was being executed when the link grammar was matched.
For example, the following link is defined at document level; its grammar is active whenever the interpreter is executing any dialog in the document. If the user says operator, the link transfers control to a different document.

    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
      <link next="operator_xfer.vxml">
        <grammar type="application/x-nuance-gsl"> operator </grammar>
      </link>
      ...

Universal Commands and Grammars

A universal command is always available: the user can give the command at any point in an interaction. A universal grammar specifies user utterances that can be recognized as a universal command.

Predefined Universal Grammars

The following predefined universal grammars are available to all applications:

    Grammar   Description
    help      The user asked for help.
    exit      The user asked to exit.
    cancel    The user asked to cancel the prompt that is playing.
    goback    The user wants to retract the last response and go back to an earlier part of the interaction.
If one of these predefined universal grammars is activated and a user utterance matches the grammar, an event of the same name is thrown. For example, a help event is thrown when the user says help.

Application-Defined Universal Grammars

An application creates its own universal command by defining and enabling a new universal grammar and implementing its response to the command. To define a universal grammar, set the universal attribute in the <grammar> tag that defines the grammar for the command. The attribute value is a name that uniquely identifies the grammar among all universal grammars in the application. In the following example, the new universal grammar is named joke; the user utterance Tell me a joke will be a universal command when this universal grammar is activated.

    <grammar universal="joke" type="application/x-nuance-gsl">
      (tell me a joke)
    </grammar>

Activating Universal Grammars

An application can activate any of the universal grammars to enable the corresponding universal commands. When a universal grammar is activated, a user utterance that matches the grammar is treated as a universal command. All universal grammars are deactivated by default. The application can activate some or all universal grammars by setting the universals property. This property specifies which of the universal grammars should be active; all other universal grammars are deactivated.
- Set the universals property to all to activate all universal grammars (both predefined and application-defined):

    <property name="universals" value="all" />

- Set the universals property to a space-separated list of grammar names to activate those universal grammars and deactivate all others:

    <property name="universals" value="help goback joke" />

- Set the universals property to none to deactivate all previously activated universal grammars in the current scope:

    <property name="universals" value="none" />
Note: (VoiceXML 1.0 only) If the <vxml> tag's version attribute is 1.0, all universal grammars are activated by default.

Responding to Application-Specific Universal Grammars

A <link> element containing a universal grammar implements the application's response to the corresponding universal command. Your application can respond to the command in whatever manner is appropriate. Typically, the response is to throw an event or to transition to a different form. If you throw an application-specific event, you must provide an event handler to take the appropriate action. For example:

    <link event="joke">
      <grammar universal="joke" type="application/x-nuance-gsl">
        (tell me a joke)
      </grammar>
    </link>

    <catch event="joke">
      <subdialog name="joker" src="telljoke.vxml"/>
    </catch>

Procedural Logic

You can use procedural logic, called executable content, within a few basic elements: <block>, <filled>, and event handlers. Within executable content, you can declare and assign values to variables, use simple conditional logic, perform iteration (a BeVocal VoiceXML extension), output speech or audio to the user, or run a JavaScript script.

Variables

Variables are declared by the <var> tag. Declarations can appear in a document, a form, or executable content. The <var> tag can optionally specify the variable's initial value; if it doesn't, the variable is initialized to undefined.
A variable has the scope of the element that contains its declaration:

- A variable has document scope if it is declared in a <vxml> element, or in a <block> or event handler that is a child of the <vxml> element. If the document is the application root document, then the variable has application scope. You can refer to a variable x with document scope either as x or document.x (for clarity or to resolve ambiguity). If the variable is in the application root document, then you can refer to it in other documents as application.x.
- A variable has dialog scope if it is declared in a <form> element, or in a <block> or <filled> element that is a child of a <form> element, or an event handler that is a child of a <form> or <menu> element. You can refer to a variable x with dialog scope either as x or dialog.x.
- A variable has an anonymous scope, local to a field, if it is declared in an event handler or <filled> element that is a child of a <field> element.
If a <var> element specifies a variable that is already in scope, it does not declare a new variable with the same name, but simply assigns a value to the existing variable. If the <var> element has an expr attribute, the variable is assigned the specified value; otherwise, the variable is assigned the value undefined. You can set a variable's value with the <assign> tag. VoiceXML variables are in all respects equivalent to JavaScript variables: they are part of the same namespace. For additional information, see Scripts on page 17.

Conditional Logic

You can use an <if> element to execute a block of code if a condition is satisfied. Within that element, you can use a sequence of <elseif> elements to execute alternative blocks of code if all previous conditions failed and the condition of the <elseif> element is satisfied. You can use an <else> element to execute an alternative block of code if all previous conditions failed. The conditions in <if> and <elseif> elements are expressed as Boolean-valued JavaScript expressions.

Tip: If your JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;.
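The variable and conditional elements described above might be combined as in the following sketch. The form id check_age, the variable category, and the age thresholds are hypothetical names and values chosen for illustration. Note how the comparison operators in the cond attributes are written with escape sequences.

```xml
<form id="check_age">
  <!-- Dialog-scope variable; no expr attribute, so it starts as undefined. -->
  <var name="category"/>

  <field name="age" type="number">
    <prompt>How old are you?</prompt>
    <filled>
      <!-- The conditions are JavaScript expressions; the characters
           <, > and & are escaped as &lt;, &gt; and &amp;. -->
      <if cond="age &lt; 13">
        <assign name="category" expr="'child'"/>
      <elseif cond="age &gt;= 13 &amp;&amp; age &lt; 20"/>
        <assign name="category" expr="'teenager'"/>
      <else/>
        <assign name="category" expr="'adult'"/>
      </if>
      <prompt>You are in the <value expr="category"/> category.</prompt>
    </filled>
  </field>
</form>
```

The string literals in the expr attributes are quoted with single quotes because expr is evaluated as a JavaScript expression, not taken as literal text.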
Iteration

You can use the BeVocal VoiceXML extension <bevocal:foreach> to execute the contained elements once for each element of a specified array.

Audio Output

A <prompt> or <reprompt> element generates speech output; an <audio> element plays a prerecorded audio clip. The <value> tag evaluates an expression and produces spoken output of the result. Prompts can appear in executable content as well as in elements for collecting user input. Anywhere a <prompt> is valid, text is interpreted as a prompt even if the enclosing <prompt> and </prompt> tags are omitted.

Each input item, and the <initial> item of a mixed-initiative form, has a prompt counter that lets you play different prompts if the user revisits the item several times. For example, you may want to play shorter descriptions after the first or second time the user is prompted for the same information. The prompt counters are reset on each form invocation.
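A prompt counter is typically used through the count attribute of the <prompt> element. The following is a minimal sketch, assuming a hypothetical field named pin; the wording of the prompts is illustrative only. The long prompt should play on the first two visits to the field and the short one from the third visit on.

```xml
<field name="pin" type="digits">
  <!-- Used while the prompt counter is 1 or 2. -->
  <prompt count="1">
    Please say or key in your four digit personal identification number.
  </prompt>
  <!-- Used once the prompt counter reaches 3. -->
  <prompt count="3">
    Your PIN, please.
  </prompt>
</field>
```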
Scripts

A <script> element executes a JavaScript script, which is run in the scope of the parent element. A <script> element can also define functions that can be called by JavaScript expressions in the same scope. VoiceXML variables are equivalent to JavaScript variables and are part of the same namespace. VoiceXML variables can be used in a script just as variables defined in a <script> element can be used in VoiceXML. Declaring a variable using a <var> element is equivalent to using a var statement in a <script> element.

If your JavaScript expression contains any of the characters <, >, or &, that character must be escaped. Inside a <script> element, you can do so in one of two ways. You can replace the individual characters with the corresponding escape sequence &lt;, &gt;, or &amp;. This may result in code that is difficult to read. Alternatively, you can place the entire script inside a CDATA section. For example, either of the following is correct:

    <script>
      function factorial(n) {
        return (n &lt;= 1) ? 1 : n * factorial(n-1);
      }
    </script>

or

    <script>
      <![CDATA[
        function factorial(n) {
          return (n <= 1) ? 1 : n * factorial(n-1);
        }
      ]]>
    </script>

You might argue that the second is a little easier to read.
User Interaction
VoiceXML supports both application-directed and mixed-initiative interactions with a user.

In an application-directed (or simply directed) interaction, the application prompts for the information it needs and the user supplies the requested information by answering the prompts. The application controls the interaction; the user cannot volunteer information. To be more accurate, the application does not understand volunteered information:

- If the application is executing a form, the only active grammar is the one for the current field of the form. The only valid user input is one that provides a value for the current field's variable.
- If the application is executing a menu, the only active grammars are the grammars of the menu's choices. The only valid user input is one that selects a choice for the current menu.
In a mixed-initiative interaction, the user and the application both participate in determining what the application does next. A single utterance from the user may provide input for multiple input variables in a form. In response to a prompt in one dialog, the user may provide input that matches a grammar defined in a different form. When this happens, the interpreter transitions to that dialog and fills its input variables from the user input. Similarly, the user may provide input that selects a choice from a different menu or that matches a link grammar, causing a transition to the destination specified by that choice or link. If an application does not use links or grammars with document or application scope, it may still include mixed-initiative forms. A mixed-initiative form includes a form grammar. It can include an <initial> element to control the initial interaction in the form. This element can request user input or perform other non-interactive initialization tasks. In response to a prompt from the <initial> element, the user could
provide input that fills in multiple input variables. If the form prompts for individual fields, any user input that matches the form grammar is valid, even if that input does not fill in the field for which the user was just prompted.

Note: Fewer speech-recognition errors occur in directed interactions than in mixed-initiative interactions.
Flow of Execution
Execution within a VoiceXML document flows in document order until a dialog (form or menu) is entered. Execution flows from the current dialog to a different dialog or document, based on either:

- An explicit transition statement in the current dialog.
- Speech recognition in the current dialog that causes a transition to a different dialog.
In addition, execution can temporarily leave the current dialog to execute a subdialog, returning to the current dialog when execution of the subdialog is complete. If the current dialog completes execution without transitioning to a different location, the application exits. You can also use an <exit> element to end the application explicitly.

Explicit Transition

You can set up explicit transitions to other dialogs or documents in your application using <goto> or <submit> tags. These transition elements can be placed inside <block> or <filled> elements or event handlers.

The <goto> element lets you transition to another input item in the current form, to another dialog in the current document, or to another document. When you make the transition to the new location, the local variables from the old form or document are lost. This happens even if you transition to the same form you were in before. However, the values of local variables are not affected when you use <goto> to transition between items within a form.

The <submit> tag lets you pass values to another document using an HTTP GET or POST request. Since you use a URI to specify the next document, it need not be a static VoiceXML document; for example, it could be a CGI script.

Recognition-Triggered Transition

User input to a dialog may cause a transition to a different location:

- If the speech-recognition engine matches the grammar of a menu's <choice> element that has a next or expr attribute, the interpreter transitions to the destination specified by that attribute.
- If the speech-recognition engine matches the grammar of a <link> element that has a next or expr attribute, the interpreter transitions to the destination specified by that attribute.
- If the speech-recognition engine matches a grammar with document or application scope that is defined in a different dialog, the interpreter transitions to that dialog.
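The explicit transitions described above can be sketched as follows. This is an illustration under stated assumptions: the form id order, the field names item and confirm, and the URL http://www.example.com/order.cgi are hypothetical. On confirmation the form submits the collected value with an HTTP POST; otherwise it uses <goto> to re-enter the same form, which discards the form's variables so that both questions are asked again.

```xml
<form id="order">
  <field name="item">
    <grammar>
      #ABNF 1.0;
      root $item;
      $item = coffee | tea;
    </grammar>
    <prompt>Would you like coffee or tea?</prompt>
  </field>

  <field name="confirm" type="boolean">
    <prompt>You chose <value expr="item"/>. Is that right?</prompt>
    <filled>
      <if cond="confirm">
        <!-- Pass the value of item to a (hypothetical) server-side script. -->
        <submit next="http://www.example.com/order.cgi"
                method="post" namelist="item"/>
      <else/>
        <!-- Re-entering the form loses the old values of item and confirm. -->
        <goto next="#order"/>
      </if>
    </filled>
  </field>
</form>
```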
Subdialogs

A subdialog is a reusable VoiceXML dialog that you can pass data to and get return values from:

- The current dialog passes control to a subdialog with a <subdialog> element. It can pass data to the subdialog with <param> elements inside the <subdialog> element.
- A subdialog returns control to the calling dialog with the <return> element. It can pass values back using the namelist attribute of the <return> element.
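The calling and returning halves of a subdialog might look like the following sketch. The file name get_pin.vxml, the parameter max_tries, and the returned variable pin are hypothetical. The calling form passes one value in with <param> and reads the returned value as a property of the subdialog's form-item variable:

```xml
<form id="caller">
  <subdialog name="result" src="get_pin.vxml">
    <!-- Passed to the subdialog; it must declare a matching variable. -->
    <param name="max_tries" expr="3"/>
    <filled>
      <!-- Returned values appear as properties of result. -->
      <prompt>Thank you. Your PIN is <value expr="result.pin"/>.</prompt>
    </filled>
  </subdialog>
</form>
```

The subdialog itself, assumed here to live in get_pin.vxml, declares the parameter with <var> and hands the collected value back with <return>:

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="get_pin">
    <!-- Receives the value supplied by the caller's <param> element. -->
    <var name="max_tries"/>
    <field name="pin" type="digits">
      <prompt>Please say your PIN.</prompt>
      <filled>
        <return namelist="pin"/>
      </filled>
    </field>
  </form>
</vxml>
```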
Collecting Input and Playing Prompts

At any moment, the VoiceXML interpreter is either waiting for input in an input item, such as a field, or transitioning between input items in response to some input. In this sense, input can be a spoken user utterance, a series of DTMF key presses, or an input-related event such as invalid input. What happens in the waiting and transitioning states is rather intertwined.

While waiting for input (also referred to as being in a recognition state), the interpreter is listening for and attempting to match spoken utterances or DTMF key presses against the currently active grammars. When the interpreter listens for speech input, it constantly compares the incoming audio stream to all active grammars, looking for a match. At some point after the user stops talking, the interpreter decides whether the input is valid. The timing for this is controlled by several properties; the properties are different for spoken grammars and for DTMF grammars. For details on how these properties interact, see Chapter 12, Properties.

While transitioning between input items, the interpreter completely ignores spoken utterances. If the property bevocal.dtmf.flushbuffer is set to false, then it does listen for DTMF key presses. It queues (or buffers) any key presses for the next recognition state and it keeps track of timing information for the key presses. The interpreter also queues asynchronously generated events that are not related directly to execution of the transition (such as the user hanging up).

During this transitioning state, prompts and audio are queued to be played and a program's executable content is run. Prompts get played either at the start of the next waiting state or sometimes when the interpreter goes off to fetch a resource, such as another document. For details on fetching resources, see Chapter 4, Fetching and Caching Resources.

At the beginning of a waiting state, there may be DTMF key presses queued during the previous transitioning state.
By default, those key presses are not available for the waiting state to use for recognition. If you do want to use those keys, set the bevocal.dtmf.flushbuffer property to false.
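As a brief sketch of the setting just described, the property can be placed in a form so that keys pressed during a transition are kept for the next recognition state. The form id menu_keys and the field name choice are hypothetical, and placing the property at form level (rather than document or field level) is only one possible choice.

```xml
<form id="menu_keys">
  <!-- Keep DTMF keys that were pressed while the interpreter was
       transitioning, instead of discarding them. -->
  <property name="bevocal.dtmf.flushbuffer" value="false"/>

  <field name="choice" type="digits">
    <prompt>Enter your selection.</prompt>
  </field>
</form>
```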
Forms
The main elements of a document (within the <vxml> element) are forms. VoiceXML forms are analogous to web forms; you use them to collect (voice) input from the user. This chapter describes:

- Form Items
- Form-Item Variables
- Execution of a Form
- User Interaction
Form Items
So far, the only form item we've discussed is the <field> element. However, forms can contain either input items or control items:

Input items are elements for collecting user input or results. An input item is any one of the following:

- A field, defined with the <field> tag, asks the user for a piece of information.
- A record item, defined with the <record> tag, records what the user says (perhaps for a voicemail message).
- A subdialog, defined with the <subdialog> tag, invokes a reusable dialog.
- A transfer item, defined with the <transfer> tag, transfers the user to another telephone number.
Control items are tags that can contain procedural items for audio output or computation. A control item is either of the following:

- A block, defined with the <block> tag, is a container for procedural elements.
- An initial item, defined with the <initial> tag, controls the initial interaction of a mixed-initiative form.
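To make the distinction concrete, here is a small sketch of a form that mixes a control item with two input items. The form id leave_message, the item names, and the 30-second limit are hypothetical. The cond attribute on the record item is one of the guard conditions covered under Execution of a Form.

```xml
<form id="leave_message">
  <!-- Control item: a block that only plays an introduction. -->
  <block>Welcome to the message center.</block>

  <!-- Input item: a field that collects a yes or no answer. -->
  <field name="wants_to_record" type="boolean">
    <prompt>Would you like to leave a message?</prompt>
  </field>

  <!-- Input item: a record item, eligible only if the user said yes. -->
  <record name="message" beep="true" maxtime="30s" cond="wants_to_record">
    <prompt>Record your message after the beep.</prompt>
  </record>
</form>
```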
Form-Item Variables
Each form item has an associated form-item variable. When a form is entered, all form-item variables are initially undefined. When a form item is visited, its variable is set to the result of interpreting that form item. For example, visiting a <block> element sets its form-item variable to true. The form-item variable for an input item is also called an input-item variable (or simply input variable); after an input item is visited, its input-item variable is set to the value collected from the user. For more details on setting form-item and input-item variables, see the Grammar Reference.
Execution of a Form
Within a form, the flow of execution is governed by the Form Interpretation Algorithm (FIA), a looping algorithm. On each iteration, the FIA selects the form item to visit next. A form item's guard conditions determine whether it can be selected on a given iteration:

- The value of the form-item variable must be undefined.
- The value of any cond expression contained in the form item must evaluate to true.
Both guard conditions must be met in order for a form item to be selected. The FIA examines the form items in document order, selecting the first one whose guard conditions are met. If the guard conditions for all form items fail, the form (and the application) exits. By default, every form-item variable has an initial value of undefined so every form item that does not specify a cond expression is eligible for selection. After the form item is visited, its variable is set to a value, which prevents the same form item from being selected again on the next iteration. You can explicitly control the execution of any form item if you give its variable a name and an initial value other than undefined. Doing so prevents the form item from being eligible for selection until you explicitly use the <clear> tag to reset its variable. Typically, input-item variables are given names but control-item variables are not.
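The guard conditions and the use of <clear> described above can be sketched as follows. The form id survey and the item names instructions, owns_car, and car_age are hypothetical. The named block starts with a value of true, so the FIA skips it until the help handler clears its variable; the second field is guarded by a cond expression and is visited only when the first answer was yes.

```xml
<form id="survey">
  <!-- Named block with an initial value other than undefined:
       not eligible for selection until its variable is cleared. -->
  <block name="instructions" expr="true">
    Say yes or no after each question.
  </block>

  <field name="owns_car" type="boolean">
    <prompt>Do you own a car?</prompt>
    <help>
      <!-- Resetting the variable makes the block eligible, so the FIA
           selects it before returning to this field. -->
      <clear namelist="instructions"/>
    </help>
  </field>

  <!-- Guarded by cond: selected only if owns_car is true. -->
  <field name="car_age" type="number" cond="owns_car">
    <prompt>How many years old is your car?</prompt>
  </field>
</form>
```

If the user answers no to the first question, neither remaining form item satisfies its guard conditions, so the form exits.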
User Interaction
User interaction with a form can be directed or mixed initiative.

A directed form has no form grammar, only grammars for its individual fields. A directed form gives the user explicit directions about what to say and when. For example, a directed form might result in the following dialog:

    Application: Would you like to buy, sell, or receive a stock quote?
    User:        Get a quote.
    Application: What stock or stocks would you like a quote for?
    User:        Intel.
A form that includes its own grammar is a mixed-initiative form. The form grammar allows several input variables to be filled in as a result of a single user utterance. A mixed-initiative form allows the user to speak more naturally. For example, a mixed-initiative form might result in the following dialog:

    Application: Stock assistant here. How can I help you?
    User:        I'd like to get a quote for Intel.
One disadvantage of mixed-initiative forms is that form grammars are more complicated and can result in more recognition errors.

The grammar for a field sets a value for the field's variable. For example, the grammar in the following field, specified in ABNF, assigns the value june to the variable month if the user says June.

    <field name="month">
      <grammar>
        #ABNF 1.0;
        root $month;
        $month = june | july | august;
      </grammar>
    </field>

The grammar for a form must specify both the input variable to be set by a grammar rule and the value for that variable. For example, the ABNF grammar in the following file, foo.gram, sets values for two variables, quantity and fruit:

    #ABNF 1.0;
    root $main;
    $main = [$amount] $fruit | $amount [$fruit] | $amount $fruit ;
    $amount = one   { quantity=1 }
            | two   { quantity=2 }
            | three { quantity=3 } ;
    $fruit = (apple | apples)   { fruit=apples }
           | (orange | oranges) { fruit=oranges } ;
This grammar is used by the following mixed-initiative form:

    <form id="foo">
      <grammar src="foo.gram#main"/>
      <initial>
        <prompt>How many apples or oranges do you want?</prompt>
      </initial>

      <field name="fruit">
        <grammar src="foo.gram#fruit"/>
        <prompt>Do you want apples or oranges?</prompt>
      </field>
      <field name="quantity">
        <grammar src="foo.gram#amount"/>
        <prompt>How many <value expr="fruit"/> do you want?</prompt>
      </field>
      <filled>
        <prompt>
          Ok, you want <value expr="quantity"/> <value expr="fruit"/>
        </prompt>
      </filled>
    </form>
Event Handling
The VoiceXML interpreter can throw a number of predefined events based on errors, telephone disconnects, or user requests. You can also throw events you define that are specific to your application. When an event is thrown, the associated event handler, if it exists, is invoked. Then execution resumes in the element that was being executed when the event was thrown. This chapter describes:

- Predefined Events
- Default Event Handlers
- Application-Defined Event Handlers
- Events in Subdialogs
- Throwing Events
- Application-Defined Events
Predefined Events
The following standard events are predefined:

    Event                            Description
    exit                             The user asked to exit.
    help                             The user asked for help.
    noinput                          The user did not provide timely input.
    nomatch                          The user did not provide meaningful input.
    cancel                           The user asked to cancel the prompt that is being played.
    connection.disconnect.hangup     The user hung up. New in VoiceXML 2.0.
    connection.disconnect.transfer   The user's call was transferred. New in VoiceXML 2.0.
The following additional events are defined as BeVocal VoiceXML extensions:

    Event                                   Description
    goback                                  The user wants to retract the last response and go back to an earlier part of the interaction. See Chapter 7, Go-Back Facility.
    connection.far_end.busy                 The number for an outbound telephone call was busy.
    connection.far_end.disconnect           An outbound telephone call was disconnected because the called third party hung up. Outbound telephone calls are described in Chapter 6, Controlling Outbound Calls.
    connection.far_end.disconnect.timeout   An outbound telephone call exceeded its maximum allowed duration.
    connection.far_end.noanswer             An outbound telephone call was not answered within the time allowed for making the connection.
The following standard errors are predefined:

    Event                              Description
    error.badfetch                     An error occurred while the interpreter was fetching a document or resource.
    error.noauthorization              The user is not authorized to perform the requested action.
    error.semantic                     A runtime error occurred in the VoiceXML code.
    error.connection.baddestination    The destination URI for an outbound telephone call was invalid.
    error.connection.noauthorization   An attempt was made to place an unauthorized outbound telephone call, for example, one that exceeds the maximum allowed duration.
    error.connection.noresource        An audio input or output resource is unavailable.
    error.noresource                   An audio input or output resource is unavailable.
    error.unsupported.format           The requested resource format is not supported.
    error.unsupported.element          The requested element is not supported (for example, error.unsupported.subdialog).
The following additional errors are defined as BeVocal VoiceXML extensions:

    Event                                    Description
    error.internal                           A serious internal error occurred in the interpreter.
    error.bevocal.maxdialogerrors_exceeded   The maximum number of speech errors was exceeded in a particular execution of a particular form.
    error.bevocal.maxerrors_exceeded         The maximum number of speech errors was exceeded during the call.
Note: In a VoiceXML 2.0 document (when the value of the version attribute of the vxml tag is 2.0), the telephone.disconnect.* and error.telephone.* events have been changed to connection.disconnect.* and error.connection.*. See above.
Backward Compatibility with VoiceXML 1.0. The following predefined events are still supported in VoiceXML 1.0:

    Event                           Description
    telephone.disconnect.hangup     The user hung up.
    telephone.disconnect.transfer   The user's call was transferred.
The following predefined errors are still supported in VoiceXML 1.0:

    Event                             Description
    error.telephone.baddestination    The destination URI for an outbound telephone call was invalid.
    error.telephone.noauthorization   An attempt was made to place an unauthorized outbound telephone call, for example, one that exceeds the maximum allowed duration.
    error.telephone.noresource        A telephone resource is unavailable, for example because the application tried to make an outbound telephone call while another outbound call was active.
Default Event Handlers

The BeVocal interpreter provides the following default event handlers for the predefined events and errors:

    Event Handler                  Description
    exit                           Exit the interpreter.
    help                           Play a default audio help message and reprompt. The help message says: No help available right now.
    noinput                        Play a default audio message and reprompt. The message says: I'm sorry, I didn't hear you.
    nomatch                        Play a default audio message and reprompt. The message says: I'm sorry, I didn't understand you.
    cancel                         Stop playing audio.
    error                          Exit the interpreter.
    connection.disconnect.hangup   Exit the interpreter. New in VoiceXML 2.0.
    goback                         Undo whatever actions resulted from the last response, then prompt the user for a new response.
    All others                     Play a default audio error message and exit the interpreter.
Backward Compatibility with VoiceXML 1.0: The following predefined event handler is still supported for VoiceXML 1.0.
    Event Handler                 Description
    telephone.disconnect.hangup   Exit the interpreter.
Application-Defined Event Handlers

Although the system provides default handlers for the predefined events, you can override these handlers by providing your own event handlers in any element that can throw an event. The <catch>, <error>, <help>, <noinput>, and <nomatch> elements are event handlers.

An element in which an event may be thrown also inherits event handlers defined in its ancestor elements. For example, an event thrown within a field element may be caught by a handler in that element, or in its form, or in its document, or in its application. This inheritance of event handlers allows you to provide consistency in event handling by defining handlers at a higher level.

The method by which event handlers are inherited from ancestor elements is called "as if by copy" semantics in the VoiceXML 2.0 specification. It helps to think of the appropriate event handler as literally being copied into the scope where the event was thrown. Variable references are resolved relative to the scope of the element where the event was thrown, and URL references are resolved relative to the document from which the event was thrown. For example, if you have a <catch> handler in an application root document, which is in a different directory from the main document that threw the event, URLs in the handler are resolved relative to the directory of the main document. Resolving URLs relative to the originating document is considered 2.0 behavior and applies only when the <vxml> tag's version attribute is set to "2.0" or greater.

Form items contain event counters that let you respond differently if the same event is thrown multiple times. For example, you may want to provide more details each time the user asks for help. The event counters are reset on each form invocation. When an event occurs, its counter is used to select applicable event handlers:

1. All handlers in the scope in which the event occurred and its containing scopes are considered.
2. A handler for the event is eligible if its count attribute is less than or equal to the event's counter.
3. Those eligible handlers with the highest count are selected as applicable (more than one handler may have the same highest count).
4. The applicable handlers are ordered by scope, with the innermost event handlers first; within a given scope, the applicable handlers are examined in the order in which they occur in the VoiceXML document. The first applicable handler in this ordering is selected to handle the event.

You can set up event handlers that catch all events with a given prefix (for example, error.unsupported). Note, however, that the interpreter selects a handler based on count, scope, and document order only. A more specific handler does not take precedence. For example, if an error.unsupported.format event is thrown and the first applicable handler is for all events beginning with the prefix error.unsupported, that handler will be invoked even if the next applicable handler is for the specific event error.unsupported.format.

Within an event handler, the _event variable contains the name of the event currently being handled; the _message variable contains the message string that provides additional information about the event. If no message was supplied when the event was thrown, the _message variable is undefined.
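The selection rules above might be exercised by a form like the following sketch. The form id account, the field name acct, and the prompt wording are hypothetical. The two <nomatch> handlers differ only in their count attributes, and the <catch> handler uses a prefix to handle every error.unsupported event, reporting the actual event name through the _event variable.

```xml
<form id="account">
  <!-- Prefix match: handles error.unsupported.format,
       error.unsupported.subdialog, and so on. -->
  <catch event="error.unsupported">
    <prompt>
      Sorry, that feature is not available. The event was
      <value expr="_event"/>.
    </prompt>
    <exit/>
  </catch>

  <field name="acct" type="digits">
    <prompt>What is your account number?</prompt>

    <!-- Eligible from the first no-match event; selected while
         the event counter is 1 or 2. -->
    <nomatch count="1">
      I didn't get that. <reprompt/>
    </nomatch>

    <!-- Eligible once the counter reaches 3; selected from then on
         because it has the highest eligible count. -->
    <nomatch count="3">
      Please key in your account number on the telephone keypad.
      <reprompt/>
    </nomatch>
  </field>
</form>
```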
Tips:
Always set up default <help>, <nomatch>, and <noinput> messages of your own, at top-level scope. For example:
  <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
    <help>
      I'm sorry. There's no help available here.
      <reprompt/>
    </help>
    <noinput>
      I'm sorry. I didn't hear anything.
      <reprompt/>
    </noinput>
    <nomatch>
      I didn't get that.
      <reprompt/>
    </nomatch>
    ...
If you want to execute both an event handler in an inner scope and a handler for the same event in an outer scope, the inner handler can use a <rethrow> element to rethrow the event.
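As a hedged sketch of the rethrow tip: the guide names the <rethrow> element but does not show it in this section, so writing it as an empty element with no attributes is an assumption, as are the form id, field name, and option values.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Outer (document-scope) handler: the common recovery for all dialogs. -->
  <nomatch>
    I didn't get that.
    <reprompt/>
  </nomatch>

  <form id="order">
    <field name="size">
      <prompt>Would you like small or large?</prompt>
      <option>small</option>
      <option>large</option>

      <!-- Inner (field-scope) handler: does its field-specific work first,
           then passes the same event on to the outer handler.
           Assumption: <rethrow> is written as an empty element. -->
      <nomatch>
        <log>nomatch in the size field</log>
        <rethrow/>
      </nomatch>
    </field>
  </form>
</vxml>
```

Without the <rethrow> element, only the field-scope handler would run, because the innermost applicable handler is selected first.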
Events in Subdialogs
A subdialog must catch any event that is thrown while the subdialog is being executed. If no handler for the event is found in the subdialog's execution context, a fatal error occurs, causing the interpreter to exit.
(VoiceXML 1.0 only) In VoiceXML 1.0, when a subdialog throws an event, the result depends on whether the subdialog is modal. Subdialogs are modal by default; a subdialog can be made non-modal by setting the modal attribute to false. If an event is thrown within a modal subdialog and no handler for the event is found in the subdialog's execution context, a fatal error occurs, causing the interpreter to exit. If an event is thrown within a non-modal subdialog and no handler for the event is found in the subdialog's execution context, the interpreter causes the subdialog's context to return and rethrows the event in the calling context, restarting the search for the event handler in that context.
Note: In VoiceXML 2.0, all subdialogs are modal.
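One way to follow this rule is to catch events inside the subdialog and hand a single application-defined event back to the caller. The sketch below assumes a subdialog defined in the same document; the form ids, the field name, and the event name app.account.failed are hypothetical.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="main">
    <subdialog name="acct" src="#get_account">
      <!-- Runs in the calling context when the subdialog returns an event. -->
      <catch event="app.account.failed">
        <prompt>Sorry, I could not get your account number.</prompt>
        <exit/>
      </catch>
      <filled>
        <prompt>Your account number is <value expr="acct.account"/>.</prompt>
      </filled>
    </subdialog>
  </form>

  <!-- The subdialog. It handles its own events so that none goes unhandled
       in the subdialog's execution context. -->
  <form id="get_account">
    <!-- Prefix handler for all error events thrown inside the subdialog. -->
    <catch event="error">
      <return event="app.account.failed"/>
    </catch>

    <field name="account" type="digits">
      <prompt>Please say your account number.</prompt>
      <!-- Give up after the third silence and report failure to the caller. -->
      <noinput count="3">
        <return event="app.account.failed"/>
      </noinput>
      <filled>
        <return namelist="account"/>
      </filled>
    </field>
  </form>
</vxml>
```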
Throwing Events
An application can throw events as follows:
- A <throw> element throws an event; it can occur within executable content, that is, in a <block> or <filled> element, or an event handler.
- A <link> element can specify an event to be thrown when the link's grammar is matched.
- A <choice> element in a menu can specify an event to be thrown when the choice's grammar is matched.
- A <return> element in a subdialog can specify an event to be thrown after control returns to the calling dialog.
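The first three mechanisms can be combined in one document, sketched below under stated assumptions: the event names (app.operator, app.goodbye, app.too_young), ids, prompts, and the inline grammar are all hypothetical. The <return> element is omitted because it is valid only inside a subdialog.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <!-- <link>: throws app.operator whenever its grammar is matched. -->
  <link event="app.operator">
    <grammar mode="voice" version="1.0" root="operator">
      <rule id="operator" scope="public">operator</rule>
    </grammar>
  </link>

  <catch event="app.operator">
    <prompt>An operator is not available right now.</prompt>
    <exit/>
  </catch>
  <catch event="app.goodbye">
    <prompt>Goodbye.</prompt>
    <exit/>
  </catch>
  <catch event="app.too_young">
    <prompt>Sorry, you must be eighteen or older.</prompt>
    <exit/>
  </catch>

  <menu id="main">
    <prompt>Say continue or goodbye.</prompt>
    <choice next="#age_check">continue</choice>
    <!-- <choice>: throws app.goodbye when this choice is matched. -->
    <choice event="app.goodbye">goodbye</choice>
  </menu>

  <form id="age_check">
    <field name="age" type="number">
      <prompt>How old are you?</prompt>
      <filled>
        <!-- <throw>: throws an event from executable content. -->
        <if cond="age &lt; 18">
          <throw event="app.too_young"/>
        </if>
        <prompt>Thank you.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```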
Application-Defined Events
An application can define additional events implicitly. If a tag that throws an event specifies an event other than one of the predefined events, it implicitly defines the specified event. For example, the following tag implicitly defines an event named myEvent and throws that event.
  <throw event="myEvent"/>
An application can use a <catch> element to catch and handle an application-defined event. For example:
  <catch event="myEvent">
    ...
  </catch>
Fetching and Caching Resources
You can think of the VoiceXML interpreter as a telephone-based web browser. As with HTML documents, VoiceXML documents have Web URIs and can be located on any Web server. In addition to VoiceXML documents, a VoiceXML application can use several different types of files, including recorded audio data, speech and DTMF grammars, and XML data (this last is a BeVocal extension). Some data may be obtained from streaming sources, servers responding to form requests, CGI script output, and so on. We follow the World Wide Web Consortium's convention of using the term resources to refer collectively to files, streams, and other data sources. All of these resources can be accessed with standard Web URIs and can be located on any Web server.
One significant difference between an HTML application intended for a standard Web browser and a VoiceXML application intended for your telephone is that a standard Web browser runs locally on your machine, whereas the VoiceXML interpreter does not run on your telephone but runs remotely, for example at the VoiceXML hosting site. On every call to a VoiceXML application, all of the resources needed by that application may need to be retrieved (or fetched) from a location other than where the VoiceXML interpreter runs. Those resources may then be stored locally at the hosting site (that is, cached) by the VoiceXML interpreter for later use by the same or a different application on another call.
This chapter discusses concepts related to retrieval and caching of your application's resources:
- How Fetching and Caching Work
- What You Can Control from VoiceXML
- Prefetching Resources
- Handling Fetching Delays
- Controlling the Use of Cached Resources
- Submitting Complex JavaScript Objects
This chapter describes the mechanics of this process. For information on how to take advantage of these features when designing your application for best performance, see the VoiceXML Performance Guide.
How Fetching and Caching Work

The BeVocal VoiceXML interpreter follows the VoiceXML 2.0 and Hypertext Transfer Protocol (HTTP) 1.1 standards for fetching and caching resources. The HTTP standard in particular provides a lot of flexibility in how fetching and caching can be implemented. This discussion does not describe all of the details, not even all of the details that apply to VoiceXML. It merely gives an overview of the basic model and the most common ways to use this model in VoiceXML.
Note: Fetches made using secure HTTPS are not cached in the proxy server.
Requests and Responses
Basically, any application that follows the HTTP 1.1 standard for fetching and caching sends a request to a server for a particular resource and then gets a response back from that server. The request consists of a request type (typically a GET request for VoiceXML requests), a URI that specifies what resource is wanted, and various headers that specify details about what will be acceptable in the returned resource.
The response consists of a response code containing information about the type of response, various headers with details about what can be done with the resource, and possibly a body containing the actual resource.
As anyone who has waited for the download of a graphic-intensive Web page knows, fetching resources over the Web can be a time-consuming process. Consequently, it's a very good idea to avoid, as much as possible, going out over the Web to get new copies of things that have not changed since the last time you got them. This is where caching comes in; the fundamental idea of caching is to avoid sending a body in the response unless absolutely necessary.
Because of the importance of caching to reduce the time involved in fetching resources, many of the headers (both request and response headers) contain information about how old the resource is, how old the resource can be and still be fresh or unexpired, and what restrictions, if any, there are on caching the resource.
For standard HTML applications, you can directly set request and response headers. For VoiceXML applications, on the other hand, you still directly set response headers, but you use VoiceXML attributes and properties to set request headers. The VoiceXML interpreter translates the VoiceXML attributes and properties into the appropriate HTTP request headers.
Using Multiple Caches
Your standard Web browser typically sets aside some amount of disk space on your machine for caching Web pages. This means that as you explore back and forth amongst the multiple pages of a Web site, the browser does not have to download every page every time you visit it. This corresponds to a single phone call to a VoiceXML application. In theory, the VoiceXML interpreter could cache resources for a single phone call; however, single phone calls don't tend to be long enough for this to be worth the overhead. What is most assuredly worth the overhead is...
If you open multiple Web browsers on your machine at the same time, you can visit the same or different Web pages in each Web browser. These multiple browser instances can share the same cache for Web pages. That is, pages that are cached by one instance are available for use by the other instances. Correspondingly, at a VoiceXML hosting site, a single VoiceXML Media Gateway may service multiple phone calls at the same time. All of those phone calls share the same VoiceXML interpreter cache. So, resources downloaded for an application on one phone call may be available to the next phone call to that or to a different application on the same Media Gateway.
Many of us use the Web through our company's connection to it. Many companies are set up with proxy servers that sit between the company's machines and the outside world of the World Wide Web. A proxy server may store Web pages and then have those Web pages available to any machine that goes through that server out to the Web. Thus, even if you haven't visited a site from your machine, if someone else in your company has visited that site, its pages may be stored in the proxy server cache and be available to you from that cache instead of going out over the Web. While not as fast as if it were stored on your machine, the proxy cache is frequently still faster than downloading the page from the Web.
A BeVocal hosting site typically contains multiple VoiceXML Media Gateways, to facilitate handling more calls simultaneously. Just as your company uses a proxy cache for Web pages, a hosting site uses a proxy cache for VoiceXML resources. So, if anybody has used the Horoscopes application, its pages may be available in the persistent site-wide proxy cache. When a new call comes in for that application, that call may be able to use the pages from this cache, rather than going out and downloading the information again.
The following diagram shows how requests flow from your application through the various levels of caching and finally out to other servers on the Web.
When a user interacts with your application, she typically calls a phone number that is associated with a particular BeVocal hosting site (such as the one that hosts BeVocal Café applications). That site starts a local copy of the VoiceXML interpreter for your application and runs your application on one of its Media Gateways, retrieving resources as needed from wherever they live. As noted earlier, the resources for your application may live on multiple servers throughout the Web. Even if your resources are hosted at the hosting site, however, they must be fetched by the interpreter when the user calls.
Note: There is a naming ambiguity you should be aware of. Every phone call to a BeVocal hosting site gets its own local copy of the VoiceXML interpreter; it does not share that interpreter with other phone calls. However, the VoiceXML Interpreter Cache is a cache for an entire VoiceXML Media Gateway, not for an individual phone call into that Media Gateway. An individual instance of a VoiceXML interpreter (that is, a single phone call) does not have a completely separate cache.
At any one time, a lot of different users may be using a lot of different applications all at the same BeVocal hosting site. Each call will use a different set of resources; sometimes these resources overlap and sometimes they do not. For example, if 10 users simultaneously call the same Horoscopes application, they all need access to common documents and audio files for that application. If at the same time another 5 people call a Sports application, that second group will need a similar set of resources for the Sports application, but these resources probably won't overlap much with the resources needed by the group calling the Horoscopes application.
At its simplest, if all resources are fresh forever, the sequence would be as follows: The first time a call accesses a particular resource, the interpreter generates the request, looks in the local VoiceXML interpreter cache, then in the site-wide proxy cache, and finally goes out and retrieves the resource from the appropriate server on the Web.
[Figure: request/response flow. Within the BeVocal Hosting Site (many machines), calls to VXML applications 1, 2, and 3 run on VoiceXML Media Gateways (1 machine each). Each gateway has a VoiceXML Interpreter Cache, which exchanges requests and responses with the Site-wide Proxy Cache, which in turn exchanges them across the World Wide Web with Your Server and Other Servers.]
When a successful response comes back from that server, the resource is first stored in the site-wide proxy cache, then passed to the local interpreter cache where it is also stored, before finally being used by the running application. Later in the same call, if the same resource is requested again, the interpreter gets it directly from the local interpreter cache, without having to download it again from the proxy cache, let alone from the original server. When that call ends, the copy remains both in the local VoiceXML interpreter cache for the VoiceXML Media Gateway and in the site-wide proxy cache. If another call comes in a few minutes later to the same Media Gateway and asks for the same resource, that call can use the copy in the local cache. If another call comes in to a different Media Gateway and asks for the same resource, that call does not have the resource in its local cache, but can use the resource copy in the proxy cache.
Sounds relatively simple, doesn't it? Things get complicated for a variety of reasons:
- Sometimes the server providing the resource wants to control for how long or even whether the information is cached. For example, for a resource that has security implications, the server may instruct the BeVocal platform to not cache the resource.
- Most resources are time sensitive, for some time period. For example, if the resource is the current stock price for some company, that resource may only be valid at the time requested. Or, if the resource is an audio file containing a daily horoscope, it may be valid for a single 24-hour period. On the other hand, if the resource is an audio file corresponding to the application's main menu, it may be theoretically valid for weeks or even months.
- Conversely, sometimes the VoiceXML application may want to control when to use cached information. For example, you may not have control over the server that stores your audio files and so cannot say when those audio files are no longer useful. In that case, you'd want the VoiceXML application to provide this information.
Typically, you do not need to be concerned with the differences between the VoiceXML interpreter's local cache and the proxy cache for the entire site. The facilities available for controlling caching within a VoiceXML application do not distinguish which cache you are controlling; they talk about whether a resource can be cached at all and for how long it can be cached. Those settings apply to all relevant caches within the BeVocal platform and to any relevant caches out on the Web, for example, at the application's Web server.
Also note that this discussion only addresses the caches that are present on the VoiceXML side. The remote server may have its own caches and proxies for storing resources before sending them to your application. For simplicity, most of the rest of this document just talks about the cache, without distinguishing between the local interpreter cache and the proxy cache.
Fundamentals of Controlling Caching
As we've said, you control caching with the HTTP response and request headers; that is, either on the responding server or from the requesting application. You can specify control information in both of these places. Typically, however, unless you're a caching expert, for a single type of resource (VoiceXML document, grammar file, audio file, SSML file, or XML file), you should decide whether you want to control that type of resource from the server or from the application. Things can get complicated if you try to control a single resource from both ends at once.
The primary concepts for controlling caching are the freshness of a resource (whether or not it has expired) and the time interval during which you can use a resource, based on its freshness or on when it was originally fetched.
Application Control
If you control a resource completely from your VoiceXML application, the normal HTTP fetch sequence for a GET request is basically as follows:
1. The first time the resource is requested, the request goes to the origin server; that is, it goes to the server on which the resource resides.
2. The origin server returns a response.
3. The date and time of receipt (the fetch date) are recorded by the requestor, the resource is stored in the cache, and the resource is returned to the requesting application. The response always includes the Date header, indicating when the response was generated. It might include a Last-Modified header, indicating when the resource was last changed on the server. The response may include an Etag header, which is a way of uniquely identifying the actual content of the resource.
4. Right now, we're assuming that the server has not specified any expiration information, so the resource expires immediately.
5. By default, the very next time the resource is requested, the interpreter must make a new request for the resource.
6. The new request can include one or both of the If-Modified-Since header, set to the fetch time of the cached resource, and the If-None-Match header, set to the cached Etag. If both are sent, the If-None-Match header takes precedence; if neither is present, the request must be for a completely new copy of the resource. Etag and If-None-Match are both HTTP 1.1 features. HTTP 1.1 servers give them precedence over the If-Modified-Since header. However, HTTP 1.0 servers use If-Modified-Since.
7. The server uses these headers to determine whether a new copy of the resource needs to be sent in the body of the response or whether it should simply indicate that the requesting application can use the copy in its cache.
8. If the response includes a new copy of the resource, the new copy and its headers replace the old copy in the cache.
9. Even if the response does not include a new copy of the resource, the cached header information can change. For example, the server can, and often does, send a new expiration date for the resource. Even if the server does not update the expiration date, the VoiceXML cache updates the default expiration time to the new fetch time.
That sequence of events also applies to a POST request, but only if the response includes an Expires or Cache-Control header.
When controlling caching from your application, you change step 5 above by specifying how stale the resource can be. That is, you can use the Cache-Control: max-stale request header to indicate a number of seconds after the expiration of a resource during which your application can use the expired resource. (Remember that you use VoiceXML attributes and properties to set this header information; see Maximum Stale Time on page 44.)
Here it helps to remember the difference between the local VoiceXML interpreter cache and the site-wide proxy cache. If a particular call needs a resource and that resource is already in its local cache, fetched within max-stale seconds, the interpreter doesn't need to send a request anywhere. However, if the resource is either not in the local cache at all or the local copy is too old, a request is sent to the proxy cache, to see if it contains a copy fetched within max-stale seconds. If the proxy cache does not contain an appropriately dated resource, the proxy cache sends a request to the server for a new copy.
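As a hedged illustration of application control, the sketch below sets a maximum stale time for all audio fetched by a document and overrides it for one file. It assumes the standard VoiceXML 2.0 audiomaxstale property and maxstale attribute, with values in seconds; the audio file paths and prompts are hypothetical.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- For audio fetched by this document, accept a cached copy for up to
       one hour (3600 seconds) after it expires. -->
  <property name="audiomaxstale" value="3600"/>

  <form id="welcome">
    <block>
      <prompt>
        <!-- Uses the document-wide setting: may be served from the cache
             without a new request while it is within the stale limit. -->
        <audio src="audio/welcome.wav">Welcome.</audio>

        <!-- Overrides the setting for this one fetch: an expired copy is
             never acceptable, so a new request is made once it expires. -->
        <audio src="audio/todays_special.wav" maxstale="0">
          Here is today's special.
        </audio>
      </prompt>
    </block>
  </form>
</vxml>
```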
Server Control
If you control a resource primarily from the server, the sequence is basically as follows:
1. The first time the resource is requested, the request goes to the origin server (as a GET request).
2. The origin server returns a response.
3. Unless the response header includes a Cache-Control: no-cache or Cache-Control: no-store header, the response headers are cached for later use by the requestor. The fetch date is recorded, the resource is stored, and the resource is returned to the requesting application.
Note: In HTTP 1.1, the relevant header is Cache-Control, not pragma. HTTP 1.0 used pragma, but this directive is no longer understood by many servers.
The response still includes the Date header and might include Last-Modified or Etag headers. With server control, the response typically includes either an Expires header or a Cache-Control: max-age header. If it contains both, the max-age header takes precedence over the Expires header. Expires indicates an exact date and time at which the resource expires. Cache-Control: max-age indicates a number of seconds after the Date at which the resource expires. For example, these 2 sets of headers are equivalent:
  Date: 12 December 2002 15:34:00 GMT
  Expires: 12 December 2002 15:36:00 GMT
or
  Date: 12 December 2002 15:34:00 GMT
  Cache-Control: max-age=120
4. The next time the resource is requested, if the resource was cached, the interpreter uses the appropriate combination of Expires, max-age, and Date headers to determine whether or not the resource has expired.
5. If the resource has not expired, the interpreter returns the cached copy. In our example, if the next request is at 12 December 2002 15:34:30 GMT, the cached copy is returned.
6. If the resource has expired, the interpreter makes a new request for the resource. In our example, if the next request is at 12 December 2002 15:36:30 GMT, a new request is sent to the origin server.
7. If available, the new request includes both the If-Modified-Since header, set to the Last-Modified header of the original response, and the If-None-Match header, set to the Etag of the original response. If both are sent, the If-None-Match header takes precedence; if neither is present, the request must be for a completely new copy of the resource. Etag and If-None-Match are both HTTP 1.1 features. HTTP 1.1 servers give them precedence over the If-Modified-Since header. However, HTTP 1.0 servers use If-Modified-Since.
8. The server uses these headers to determine whether a new copy of the resource needs to be sent in the body of the response or whether it should simply indicate that the requesting application can use the copy in its cache.
9. If the response includes a new copy of the resource, the new copy and its headers replace the old copy in the cache.
10. Even if the response does not include a new copy of the resource, the cached header information can change. For example, the server can, and often does, send a new expiration date for the resource. Even if the server does not update the expiration date, the VoiceXML cache updates the default expiration time to the new fetch time.
You may have no control over the expiration times sent by your server. For example, you may know that some resources are modified more frequently than the server indicates with its expiration times. In this situation, you have some options for changing the caching behavior.
You might choose to request a new resource based not on when the resource in your cache expires, but on how long ago you fetched it. You use the Cache-Control: max-age request header for this purpose. (Remember that you use VoiceXML attributes and properties to set this header information; see Maximum Age on page 44.)
Note: In a response header, max-age indicates when the resource will expire, regardless of when it was fetched. In a request header, it indicates the opposite; max-age indicates how long ago the resource could have been fetched, regardless of when it will expire.
You can specify both max-age and max-stale headers in the same request. If you do so, the max-age header takes precedence over the max-stale header.
For VoiceXML document and grammar document resources, you have another alternative. You can use the http-equiv attribute of the <meta> tag to act as if you had set response headers for these resource types. When the VoiceXML interpreter parses the VoiceXML document or grammar, if it encounters a <meta> tag that specifies caching information, the interpreter must then go and modify the cache to include this information.
Note: This method is not recommended because it can only affect the local VoiceXML cache. The overhead of having the intermediate site-wide proxy caches interpret every file would be prohibitive. The proxy caches do not interpret the contents of files; they only look at the headers. Consequently, proxy caches do not understand or implement the caching behavior specified in <meta> tags.
Summary of HTTP Headers
Remember that for your VoiceXML application, you do not create HTTP request headers manually.
Rather, you use the appropriate attributes or properties that are described later in this chapter. The VoiceXML interpreter translates these attributes and properties in a fairly obvious way to the HTTP request headers described here.
The HTTP request headers most commonly relevant for caching with the BeVocal platform and VoiceXML interpreter are:
Request Header
    Description
User-Agent: BeVocal/ivers VoiceXML/vvers BVPlatform/pvers
    Every request contains a User-Agent header of this form. ivers is the version of the BeVocal VoiceXML interpreter, vvers is the version of VoiceXML used in the requesting document, and pvers is the BeVocal platform version. For example:
    User-Agent: BeVocal/2.4 VoiceXML/2.0 BVPlatform/1.8.0.4
    If you need to detect the User-Agent for a request, you should probably ignore the last 2 digits in the platform version, as these are likely to change occasionally.
If-Modified-Since: date
    If the modification date on the requested resource is after this time, send a new copy.
If-None-Match: tag
    tag is a unique identifier for a resource. If the resource that would be provided for the request has the same identifier as tag, don't send a new copy.
Cache-Control: max-age=N
    A number of seconds after which the resource must be fetched again from the origin server, regardless of whether or not it has expired. Note that max-age in a response header is quite different from max-age in a request header. In a response header, it lets the server specify a number of seconds during which the resource is still fresh. In a request header, it lets the application specify a number of seconds after which a resource must be refetched, even if the server says the resource is still fresh.
Cache-Control: max-stale=N
    A number of seconds after the expiration time during which the application is still willing to use an expired resource.
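The two Cache-Control request headers above are set through the maxage and maxstale attributes (or the corresponding properties). The sketch below is illustrative only: the grammar file, the quote.jsp URL, and the field name are hypothetical, and the header values named in the comments are what the translation would be expected to produce, not output captured from the platform.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="stock">
    <field name="company">
      <!-- Expected request header: Cache-Control: max-age=86400.
           A cached copy fetched within the last day is acceptable. -->
      <grammar src="grammars/companies.grxml" type="application/srgs+xml"
               maxage="86400"/>
      <prompt>Which company?</prompt>
      <filled>
        <!-- Expected request headers: Cache-Control: max-age=30 and
             max-stale=0. A cached result older than 30 seconds is refetched,
             and an expired result is never used. -->
        <submit next="quote.jsp" namelist="company"
                maxage="30" maxstale="0"/>
      </filled>
    </field>
  </form>
</vxml>
```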
If you have control over the response sent by the server, you can directly set response headers (or even configure the server itself). The HTTP response headers most commonly relevant for caching with the BeVocal platform and VoiceXML interpreter are:
Response Header
    Description
Etag: tag
    A unique identifier for this response. The details of how the server creates the identifier are server-specific; what is important is that the identifier indicates a particular version of the resource so the server can determine if it has been modified.
Date: date
    The date and time the response was generated. This header is required for all HTTP 1.1 responses.
Expires: date
    The date and time on which this response expires. This is an HTTP 1.0 header, but is still widely used with HTTP 1.1. If Expires is set to 0, the interpreter does not cache the resource.
Cache-Control: max-age=N
    The resource expires N seconds after generation. That is, to determine the expiration date, add N seconds to the date specified with the Date header. Note that max-age in a response header is quite different from max-age in a request header. In a response header, it lets the server specify a number of seconds during which the resource is still fresh. In a request header, it lets the application specify a number of seconds after which a resource must be refetched, even if the server says the resource is still fresh.
Cache-Control: s-maxage=N
    For a shared cache (but not for a private cache), the maximum age specified by this directive overrides the maximum age specified by either the max-age directive or the Expires header.
Cache-Control: no-cache
    Do not store this resource in the cache.
Cache-Control: no-store
    For VoiceXML applications, has the same effect as the no-cache directive.
Cache-Control: must-revalidate
    The cache must not use the resource after its expiration time, even if the request says that stale information is acceptable.
Cache-Control: private
    All or part of the response is intended for a single user. The response must not be cached by a shared cache; it can be cached by a private (non-shared) cache.
Cache-Control: public
    The response is cachable by any cache, even if it would normally be non-cachable.
Cache-Control: proxy-revalidate
    For VoiceXML applications, this is equivalent to the must-revalidate directive.
What You Can Control from VoiceXML

From your VoiceXML application, you can control various things about fetching and caching of resources. The various attributes and properties that provide this control are collectively referred to as the application's fetch policies. The fetch policies govern the following aspects of fetching:
- Prefetching Resources: The VoiceXML interpreter can try to start fetching resources before they are actually needed (prefetch them), in an attempt to have them already available when actually required.
- Handling Fetching Delays: No matter what else you do, there inevitably will be noticeable delays between when a resource is requested and when it is available.
- Controlling the Use of Cached Resources: These are the policies that control request and response headers. See How Fetching and Caching Work on page 31 for how request headers affect caching.
Some fetch policies are set by a single property for all types of resource. Other fetch policies can be set separately for different types of resource. For these policies, there is usually a corresponding set of properties, one for each of these resource types:
- VoiceXML documents
- Recorded audio data
- Grammar files
- JavaScript source files
- SSML files (Extension)
- XML data files (Extension)
In addition, for all of the fetch policies, the appropriate VoiceXML tags support a corresponding attribute. For example, to optimize fetch operations, you can use the audiofetchhint, documentfetchhint, grammarfetchhint, scriptfetchhint, and ssmlfetchhint properties. In addition, the <audio>, <choice>, <data>, <dtmf>, <goto>, <grammar>, <link>, <script>, and <subdialog> tags all support the fetchhint attribute.
All policies have default settings. An application can change any default setting with a <property> element that sets a property corresponding to the policy to be changed. Any tag that requests a fetch operation includes attributes that can be set to override the current policy settings during that one fetch operation:
- A property set in the <vxml> element of a single-document application or the application root document of a multidocument application sets the policy for fetching resources from that document and the application, overriding the default setting.
- A property set in the <vxml> element of a non-root document of a multidocument application sets the policy for fetching resources from that document, overriding the setting for the application.
- A property set in a <form> or <menu> element sets the policy for fetching resources from that dialog, overriding the setting for the containing document.
- A property set in a form item sets the policy for fetching resources from that form item, overriding the setting for the containing form.
- An attribute set in an element that fetches a resource sets the policy for that fetch, overriding any other setting of the policy.
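The override order in the list above can be sketched in a single document. The audiofetchhint property and the fetchhint attribute are named in the text; the values prefetch and safe are the standard VoiceXML 2.0 fetch hints, and the audio file names, ids, and options are hypothetical.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Document scope: overrides the platform default (or the setting made
       in an application root document, if there is one). -->
  <property name="audiofetchhint" value="safe"/>

  <form id="main_menu">
    <!-- Dialog scope: overrides the document setting within this form. -->
    <property name="audiofetchhint" value="prefetch"/>

    <field name="choice">
      <!-- Form item scope: overrides the form setting within this field. -->
      <property name="audiofetchhint" value="safe"/>

      <prompt>
        <!-- Fetched with the field's setting (safe). -->
        <audio src="audio/menu.wav">Say news or weather.</audio>
        <!-- The attribute overrides every property for this one fetch. -->
        <audio src="audio/jingle.wav" fetchhint="prefetch"/>
      </prompt>
      <option>news</option>
      <option>weather</option>
    </field>
  </form>
</vxml>
```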
There are a couple of subtleties you need to be clear about:
- When you set fetch properties within the document foo.vxml, you are setting properties for resources that foo.vxml fetches; you are not setting properties that affect fetching foo.vxml itself.
- Fetch properties are always set by some VoiceXML document. For any given phone call to an application, the initial document of the application is not fetched from another VoiceXML document. Consequently, there is no way to set fetching policies for the initial document.
- The application root (for example, root.vxml) for document foo.vxml is set in the <vxml> tag of foo.vxml. The application root document is not directly fetched by another document and it inherits the fetch policies of the document of which it is the root. For example, if the document bar.vxml fetches foo.vxml, and bar.vxml sets maxage=120 for foo.vxml, then the VoiceXML interpreter also uses maxage=120 for root.vxml.
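The bar.vxml example above might look like the following sketch. The file names come from the text; the form id and the documentmaxage value are illustrative assumptions, and foo.vxml is described only in a comment because it is a separate document.

```xml
<?xml version="1.0"?>
<!-- bar.vxml -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Applies to documents that bar.vxml fetches. It has no effect on how
       bar.vxml itself was fetched. -->
  <property name="documentmaxage" value="600"/>

  <form id="start">
    <block>
      <!-- The attribute overrides the property for this one fetch, so
           foo.vxml is fetched with maxage=120. If foo.vxml names root.vxml
           in the application attribute of its vxml tag, root.vxml is
           fetched with maxage=120 as well. -->
      <goto next="foo.vxml" maxage="120"/>
    </block>
  </form>
</vxml>
```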
The following sections describe the policies and their default settings, and also list the properties that can be used to set each policy. For a detailed description of the various properties, see Chapter 12, Properties.
Prefetching Resources
The interpreter can attempt to optimize dialog interpretation by prefetching files that might be needed. The interpreter prefetches resources used by a document by starting to fetch them as soon as a document is loaded, rather than waiting until execution of the VoiceXML tags that reference those resources. Prefetching can improve an application's performance by allowing it to fetch resources during free time while the user is speaking or listening to a dialog.
The interpreter prefetches resources in the order in which they appear in the document. Consequently, those near the top of the document are retrieved first, unless there are delays at the server, heavy Internet traffic, and so on. The interpreter prefetches several resources at once; a delay retrieving one resource does not affect others.
Note: Prefetching resources can generate many simultaneous requests on your server.
Prefetch Cache Details
The VoiceXML interpreter prefetches resources separately for each phone call and for each document executed within that phone call. While the interpreter executes a single VoiceXML document on a single phone call, it has a queue of the resources that it can prefetch for that document. During execution of that document on that call, the interpreter prefetches as many resources as it can and puts them in the prefetch cache.
During the execution of a document, the VoiceXML interpreter always checks the prefetch cache for a resource before initiating a new fetch operation. If the resource is in the prefetch cache, the interpreter uses it, even if the resource expires between when it is prefetched and when it is needed. When execution leaves the document (either by the call ending or by transitioning to another document in the same call), the interpreter flushes the prefetch queue and cache and starts over for the next document.
Note that if there are multiple simultaneous calls to the same application, they may be executing the same document at the same time.
However, the resources for each phone call will be in separate prefetch caches. This means that in some cases, the interpreter will fetch a new copy of a resource for one phone call even though another phone call is using an unexpired copy.
To illustrate all this, assume that an application has documents d1.vxml and d2.vxml, both of which refer to the same audio file, foo.wav. If there are 2 simultaneous calls to this application, then at the same time foo.wav might be stored in the prefetch cache of d1.vxml for the first call and the separate prefetch cache of d1.vxml for the second call. Or, it might be in the prefetch cache of d1.vxml for the first call and the prefetch cache for d2.vxml for the second call. foo.wav cannot, however, be in the prefetch caches for d1.vxml for the first call and for d2.vxml for the first call, because 2 different documents on the same phone call cannot be executing at the same time and so cannot have active prefetch caches at the same time.
Fetch Hints
Prefetching is controlled by instructions called hints, which are specified by the fetchhint attribute and by the typefetchhint properties, where type is a placeholder for the type of resource to be fetched. For example, the audiofetchhint property controls prefetching of audio files. The fetch hint policy can be set to one of the following values:
- prefetch: Fetch the resource when the page is loaded.
- safe: Fetch the resource only when it is needed.
Any tag that can fetch a resource has a fetchhint attribute that specifies how to fetch the resource. If this attribute is not set, the interpreter uses the current value of the appropriate typefetchhint property, where type is a placeholder for the type of resource to be fetched, as shown in the table below.

| Resource type | Tags that support the fetchhint attribute | Property | Default value (for property) |
| --- | --- | --- | --- |
| Recorded audio data | <audio> | audiofetchhint | prefetch |
| VoiceXML documents | <choice> <goto> <link> <subdialog> | documentfetchhint | safe |
| SSML document | <audio> | ssmlfetchhint | safe |
| Grammar files | <grammar>, <dtmf> (VoiceXML 1.0 only) | grammarfetchhint | prefetch |
| JavaScript source files | <script> | scriptfetchhint | prefetch |
| SSML files (Extension) | <audio> | ssmlfetchhint | prefetch |
| XML data files (Extension) | <data> | datafetchhint | safe |

Restrictions on Prefetching
Prefetching is disabled when the URI or other attributes of a tag are computed at runtime. In these cases, even if the fetching hints specify prefetch, the interpreter cannot fetch the resource until the tag is executed and the exact values of the attributes are determined. For example, programmers sometimes simplify their job by writing VoiceXML such as:
<audio expr="audioURI('hello')">
where audioURI() is a JavaScript function that adds a prefix such as http://mycompany.com/audio/ and an ending such as .wav to the parameter, resulting in a complete URI of http://mycompany.com/audio/hello.wav. This technique saves some typing and simplifies program maintenance. However, the interpreter cannot prefetch the audio file in this case, because the exact URI is not known until the tag is executed.
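For reference, a helper like the audioURI() function mentioned above might look like the following. This is only a sketch of one possible definition; the guide does not show the function body, and the prefix and suffix are the example values from the text.

```javascript
// One possible definition of the audioURI() helper described above.
// The prefix and suffix are the example values from the text.
function audioURI(name) {
  var prefix = "http://mycompany.com/audio/";
  var suffix = ".wav";
  return prefix + name + suffix;
}

// The URI exists only after the function runs, which is why the
// interpreter cannot prefetch the file when expr is used.
var helloURI = audioURI("hello");
```

If prefetching matters more than convenience for a particular prompt, writing the complete URI as a literal src value keeps it known at load time.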
Handling Fetching Delays

Regardless of how well you set the various caching and prefetching policies, inevitably fetching resources from remote servers will sometimes generate delays. Various fetch policies control how the interpreter handles these delays.
Timeouts
By default, the interpreter waits up to one minute for a resource or document to be fetched. The application can control this behavior with the fetchtimeout attribute of a tag that fetches a resource. That attribute specifies how long the interpreter waits for a resource to arrive. If the resource does not arrive within the specified time, the interpreter throws an error.badfetch event. The value is a number representing the time in milliseconds. This attribute is available for all tags that fetch resources, specifically:
- <audio>
- <choice>
- <data> (Extension)
- <goto>
- <grammar>
- <link>
- <script>
- <subdialog>
- <submit>
- <dtmf> (VoiceXML 1.0 only)
- <send> (VoiceXML 1.0 only; Extension)
When this attribute is not specified, the interpreter uses the current value of the fetchtimeout property, whose default value is 60 seconds.
Background Audio
By default, the user does not hear any audio output while the interpreter is fetching a resource of any kind. The application can change this behavior with the fetchaudio policy, which specifies the URI of a background audio file to be played while the interpreter fetches a VoiceXML document or XML data file. Background audio can be helpful if the fetch operation may cause a noticeable delay in processing, such as when an on-line purchase is being verified and processed by a transaction server. The audio file can contain music, a "please wait" message, and so on.
Background audio is never played while the interpreter fetches grammar, audio, or script files. It is only played when fetching VoiceXML documents or XML data files.
The fetchaudio policy is controlled with the fetchaudio property and the fetchaudio attribute of the following tags:
- All tags that fetch VoiceXML documents: <choice>, <goto>, <link>, <subdialog>, and <submit>.
- Extension. The <data> tag, which fetches XML data files; this attribute is relevant only if the bevocal.fetchaudio.allfetches property is true.
- Extension; VoiceXML 1.0 only. The <send> tag, which submits values to a Web server.
If the fetchaudio attribute is not specified, the interpreter uses the current value of the fetchaudio property. This property does not have a default value; that is, by default no background audio is played.
When a background audio file is specified for a fetch operation, the fetching of the background audio file itself is governed by the audiofetchhint, audiomaxage, audiomaxstale, and fetchtimeout properties that are in effect at the time of the fetch. (In VoiceXML 1.0 applications, the caching property is used in place of audiomaxage and audiomaxstale.)
Note: The interpreter plays the background audio file only once during a given fetch operation; it does not loop (repeat).
Two properties govern the playing of the background audio clip:
- The interpreter does not start to play the audio file unless the time to fetch the resource exceeds a limit set by the fetchaudiodelay property. This can prevent the user from hearing very short audio clips when there are very slight delays in fetching resources. The default value of this property is 0.
- The value of the fetchaudiominimum property is the minimum time interval to play the fetchaudio source, once started, even if the fetch operation completes during play. The default value of this property is 0; with this default, the interpreter interrupts the audio playback as soon as the resource is fetched, and resumes normal processing. If set to a larger value, it prevents the user from hearing a short clip of background audio that is immediately cut off.
Queued Prompts when Fetching
By default in VoiceXML 2.0, queued prompts are not played in the background during the execution of a fetch. However, for those tags which fetch data and for which background audio can be played during the fetch (<choice>, <goto>, <link>, <subdialog>, <submit>, and <data>), if background audio will be played (see Background Audio), then queued prompts are played during the fetch and before the background audio is played. If background audio will not be played, but the bevocal.fetchaudio.flushqueue property is set to true, then queued prompts will still be played during the fetch.
Controlling the Use of Cached Resources

After a resource expires, it remains in the cache although it is stale. If the same file is needed in the future, the interpreter takes one of the following actions:
- Uses the stale cached file as is.
- Revalidates the stale cached file with a Get-If-Modified request to the resource's server. If the server replies that the resource has not been modified, the interpreter uses the cached copy.
- Refetches the resource unconditionally.
If the interpreter needs a resource and the cache contains a copy of that resource, there are 3 primary policies governing whether the interpreter uses the cached copy:
- The Maximum Age for the cached file
- The Maximum Stale Time for the cached file
- (VoiceXML 1.0 only) The Caching policy for the file
These policies all affect the headers sent when a resource is requested. In addition, you can set caching information that would normally be in the response header for a VoiceXML document or a grammar document. (See Mimicking Response Headers on page 46.) See Prefetch Cache Details on page 40 for how the prefetch cache interacts with these policies.
Maximum Age
An application can specify that it will use a cached resource only if its time in the cache does not exceed a maximum age:
- If the cached copy is older than the maximum, it will be refetched with a Get-If-Modified header.
- If the cached copy is within the maximum age and has not expired, it will be used.
- If the cached copy is within the maximum age but has expired, the relevant maximum-stale-time policy determines whether the interpreter uses the expired cached file. See Maximum Stale Time on page 44.
Any tag that can fetch a resource has a maxage attribute that specifies the maximum age in seconds of a cached resource. If this attribute is not set, the interpreter uses the current value of the appropriate typemaxage property, where type is a placeholder for the type of resource to be fetched. For example, the audiomaxage property specifies the maximum age for audio files. In VoiceXML 1.0 applications, when no value is set for maxage, the caching attribute controls whether an unexpired cached file is used.

| Resource type | Tags that support the maxage attribute | Property |
| --- | --- | --- |
| Recorded audio data | <audio> | audiomaxage |
| VoiceXML documents | <choice> <goto> <link> <subdialog> <submit> | documentmaxage |
| SSML documents | <audio> | ssmlmaxage |
| Grammar files | <grammar> | grammarmaxage |
| JavaScript source files | <script> | scriptmaxage |
| XML data files (Extension) | <data> | datamaxage |
| SSML data (Extension) | <audio> | ssmlmaxage |
No default is set for these properties, which means that any unexpired cached file will be used. If you set a maximum-age property to a non-zero value, you ensure that:
- The interpreter uses an unexpired resource whose age is less than or equal to the maximum age, without doing a Get-If-Modified request to verify that the cached file is up to date.
- The interpreter fetches a fresh copy of a resource whose age is more than the maximum age, even if the cached file has not yet expired.
For example, suppose you fetch a VoiceXML document file that expires in 60 seconds, and after 40 seconds you need the same file. If documentmaxage is set to 30, the application will refetch the document file; if documentmaxage is set to 60, it will use the cached file. You can set a maximum-age property to 0 to ensure that a fresh copy is fetched if the resource has been modified since it was last fetched.
Maximum Stale Time
An application can specify that an expired file that is not too stale can still be used. The maximum stale time for a file is the time by which its expiration time can be exceeded. Within this allowable stale time, an expired cached file will be used without being refetched. If an expired cached file is needed after its maximum stale time has been exceeded, the file will be refetched.
Any tag that can fetch a resource has a maxstale attribute that specifies the maximum time in seconds during which a stale (expired) cached resource may be used. If this attribute is not set, the interpreter uses the current value of the appropriate typemaxstale property, where type is a placeholder for the type of resource to be fetched. For example, the audiomaxstale property specifies the maximum stale time for audio files. The following table specifies the appropriate typemaxstale property for each resource type.

| Resource type | Tags that support the maxstale attribute | Property | Default value (for property) |
| --- | --- | --- | --- |
| Recorded audio data | <audio> | audiomaxstale | 300s |
| VoiceXML documents | <choice> <goto> <link> <subdialog> <submit> | documentmaxstale | 0s |
| SSML documents | <audio> | ssmlmaxstale | 0s |
| Grammar files | <grammar> | grammarmaxstale | 0s |
| JavaScript source files | <script> | scriptmaxstale | 0s |
| XML data files (Extension) | <data> | datamaxstale | 0s |
| SSML data (Extension) | <audio> | ssmlmaxstale | 300s |
The maximum stale time is relevant either when the expired file is within the maximum age or when no maximum age is set for the file. If the number of seconds since the cached file expired is less than or equal to the maximum stale time, the cached file is used. If the file has been expired for longer than the maximum stale time, the interpreter does a Get-If-Modified request to update the cached file, if necessary.
Caching
VoiceXML 1.0 only. In a VoiceXML 1.0 application, when the relevant maximum-age property is not set, the caching policy determines whether the interpreter uses an unexpired cached copy of a file:
- If the caching policy is fast (the default), the cached copy is used if it has not expired.
- If the caching policy is safe, the interpreter sends a Get-If-Modified request to the server, even if the resource is still in the cache. This ensures that the most recent copy of the resource is always used; however, it does introduce some extra delays, because the interpreter must contact the resource's server. The safe setting is intended mainly for use during development and debugging, when documents and other files may be updated frequently; it is equivalent to maxage=0.
In VoiceXML 1.0, any tag that can fetch a resource has a caching attribute that specifies the caching policy for the resource:
- <audio>
- <choice>
- <data> (Extension)
- <dtmf>
- <goto>
- <grammar>
- <link>
- <script>
- <subdialog>
- <submit>
If this attribute is not set, the interpreter uses the current value of the caching property. Note that the default value for the property is fast. That is, fast is the normal condition for any tag that does not explicitly specify caching="safe". If the cached file has expired, the relevant maximum-stale-time policy determines whether the interpreter uses the expired cached file.
Note: This attribute is used only when all the following conditions are met:
- The version attribute of the <vxml> tag is 1.0.
- The maxage attribute does not have a value.
- The cache contains an unexpired copy of the resource.
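The three conditions above can be read as a guard in front of the caching policy. The sketch below models only the case in which the cache already holds an unexpired copy; unexpiredCopyAction and its option names are hypothetical, and "revalidate" again stands for a Get-If-Modified request.

```javascript
// Illustrative model for an UNEXPIRED cached copy; names are hypothetical.
// opts.version: value of the version attribute of <vxml>, e.g. "1.0"
// opts.maxage:  maximum age in seconds, or undefined if not set
// opts.ageSec:  seconds since the copy was fetched (used only with maxage)
// opts.caching: "fast" or "safe"; defaults to "fast"
function unexpiredCopyAction(opts) {
  var caching = opts.caching || "fast";
  // When maxage has a value, it decides and caching is ignored.
  if (opts.maxage !== undefined) {
    return opts.ageSec <= opts.maxage ? "use-cached" : "revalidate";
  }
  // The caching policy applies only in VoiceXML 1.0 documents.
  if (opts.version === "1.0" && caching === "safe") {
    return "revalidate";
  }
  return "use-cached";
}

var fastDefault = unexpiredCopyAction({ version: "1.0" });
var safeInV1 = unexpiredCopyAction({ version: "1.0", caching: "safe" });
var safeInV2 = unexpiredCopyAction({ version: "2.0", caching: "safe" });
```

Here safeInV2 comes out as "use-cached" because the sketch treats the caching setting as ignored outside VoiceXML 1.0, which is how the conditions above read.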
Mimicking Response Headers
You may not have direct control over the response headers sent by your server. If you do not, then for VoiceXML documents and XML grammar files, you can use the <meta> tag with its http-equiv attribute to mimic the use of HTTP response headers. For example:
<meta http-equiv="Cache-control" content="max-age=10"/>
<meta http-equiv="Expires" content="02 Feb 2002 23:59:59 GMT"/>
When the VoiceXML interpreter parses a VoiceXML document or a grammar file in the XML format, it interprets these instances of the <meta> tag as though the HTTP response had sent these response headers. The interpreter goes back to its cache and changes the associated information.
Note: This method is not recommended because it can only affect the local VoiceXML cache. The overhead of having the intermediate site-wide proxy caches interpret every file would be prohibitive. The proxy caches do not interpret the contents of files; they only look at the headers. Consequently, proxy caches do not understand or implement the caching behavior specified in <meta> tags.
Submitting Complex JavaScript Objects

The following tags submit variables to a server:
- <subdialog>
- <submit>
- <data> (Extension)
- <send> (Extension; VoiceXML 1.0 only)
The tag's namelist attribute specifies the variables to be submitted. If one of the specified variables is set to a complex JavaScript object, all component values in the object are submitted as separate variables. For example, the following <submit> tag submits the object foo:
<script>
  var foo = new Object;
  foo[0] = 1;
  foo[1] = 7;
  foo[2] = "hello";
</script>
<submit src="bar.jsp" namelist="foo"/>
The interpreter submits three individual variables to the server with a URI of the form:
bar.jsp?foo[0]=1&foo[1]=7&foo[2]=hello
The URI includes the necessary encoding escapes around the bracket characters [ and ].
An arbitrarily complex object can be submitted in this way. The individual values at each level of the structure are submitted individually. For example, the following <submit> tag submits the object top:
<script>
  var subObj = new Object;
  subObj.A = 2;
  subObj.B = 4;
  var top = new Object;
  top.size = 2;
  top.name = "Test";
  top.part = subObj;
</script>
<submit src="bar.jsp" namelist="top"/>
The interpreter submits four individual variables to the server with a URI of the form:
bar.jsp?top.size=2&top.name=Test&top.part.A=2&top.part.B=4
A server-side package that receives individual component variables in this form can put them back together into the appropriate objects.
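To make the naming scheme concrete, the following sketch reproduces the two query strings shown above from plain JavaScript objects. It is an approximation written for illustration, not the interpreter's code: flatten and toQuery are hypothetical names, numeric keys are assumed to use bracket notation and all other keys dot notation (as in the two examples), and the URI escaping of the bracket characters is omitted for readability.

```javascript
// Hypothetical helpers that mimic the submitted variable names.
// URI escaping is intentionally omitted in this sketch.
function flatten(name, value, out) {
  out = out || [];
  if (value !== null && typeof value === "object") {
    for (var key in value) {
      if (Object.prototype.hasOwnProperty.call(value, key)) {
        // Numeric keys become name[key]; other keys become name.key.
        var child = /^\d+$/.test(key)
          ? name + "[" + key + "]"
          : name + "." + key;
        flatten(child, value[key], out);
      }
    }
  } else {
    out.push(name + "=" + value); // leaf value: one submitted variable
  }
  return out;
}

function toQuery(name, value) {
  return flatten(name, value).join("&");
}

var fooQuery = toQuery("foo", { 0: 1, 1: 7, 2: "hello" });
var topQuery = toQuery("top", { size: 2, name: "Test", part: { A: 2, B: 4 } });
```

A server-side package would apply the reverse mapping, splitting each variable name on the dots and brackets to rebuild the nested object.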
Using Multiple-Recognition
This chapter describes how the speech-recognition engine used by the BeVocal VoiceXML interpreter can provide multiple recognition results. This capability consists of two features: N-best recognition and multiple interpretations of spoken input. This chapter describes:
- Multiple Recognition Features
- Using N-Best Recognition
- Using Multiple Interpretations
- Using Both Features Together
Multiple Recognition Features

A VoiceXML application typically obtains a single value as the result of each speech-recognition event. If the user's response was not clear, the result is the one utterance that the speech-recognition engine judges to be the most likely. This utterance may provide values for several slots in a grammar, but it is still a single recognized utterance. If the application uses an ambiguous grammar and the utterance matches more than one rule, slots would be filled according to an arbitrary one of those rules.
Two features improve recognition, providing multiple recognition results:
- N-best recognition: Instead of returning the single most likely utterance, the speech-recognition engine can return a list of the most likely utterances.
- Multiple interpretations: If any given utterance matches multiple grammar rules, the speech-recognition engine returns those alternative interpretations of the utterance.
Both these features are disabled by default. They can be enabled separately or jointly in applications that want to accept multiple recognition results.
N-Best Recognition
In some advanced voice applications, a single result may not be sufficient. For example, an airline reservation application might ask the user for destination and departure cities. If a speaker mumbles, the speech-recognition engine might not be able to distinguish between two possible utterances, "Austin" and "Boston." Ideally, the application would obtain both these possible results so that it could prompt for clarification, "Did you mean Austin, Texas; or Boston, Massachusetts?"
Using N-best recognition, the speech-recognition engine returns a list of different possible utterances whose confidence levels are high enough for consideration. See Using N-Best Recognition on page 51.
Multiple Interpretations
In some applications, a single recognized utterance may have multiple interpretations, indicating that the utterance is ambiguous. For example, an application might include a GSL grammar with two rules that match the utterance "Portland."
Cities [
  ...
  (portland ?maine) {<city Portland> <state ME>}
  (portland ?oregon) {<city Portland> <state OR>}
  ...
]
If the user clearly says "Portland," this utterance does not allow the speech-recognition engine to choose between the two possible interpretations. Ideally, the application would obtain both interpretations so it could prompt for more information, "Do you mean Portland, Maine; or Portland, Oregon?"
Multiple interpretations lets an application access the different interpretations for a given recognized utterance. If multiple grammar rules match the recognized utterance, all resulting interpretations are returned. See Using Multiple Interpretations on page 55.
Combining the Features
The two multiple-recognition features can be used together. If both features are enabled, each possible utterance may have multiple interpretations. For example, an airline reservation application might enable both features for a field whose ambiguous GSL grammar includes two rules that match the utterance "Austin."
Cities [
  ...
  (austin ?texas) {<city Austin> <state TX>}
  (austin ?california) {<city Austin> <state CA>}
  (boston ?massachusetts) {<city Boston> <state MA>}
  ...
]
If the user mutters something that sounds like either "Austin" or "Boston," the speech-recognition engine would find two possible results, "Austin" and "Boston." The first of these results would have two possible interpretations: Austin, Texas and Austin, California. A sophisticated application could prompt the user, "Did you mean Austin, Texas; Austin, California; or Boston, Massachusetts?" Combining both features provides the maximum flexibility. See Using Both Features Together on page 59.
Working with Multiple Recognition
An application can selectively enable the two multiple-recognition features, specifying the maximum number of results to be returned. The following VoiceXML language features support recognition of multiple results:
- The property maxnbest controls whether N-best recognition is enabled.
- The property bevocal.maxinterpretations controls whether multiple interpretations is enabled.
- The read-only variable application.lastresult$ is set by the speech-recognition engine. It contains information about the result of the most recent speech-recognition event. If multiple recognition was enabled, this variable may contain more than one result.
  - If only N-best recognition is enabled, the results represent different recognized utterances, each with a single interpretation.
  - If only multiple interpretations is enabled, the results represent a single recognized utterance with a number of different interpretations.
  - If both features are enabled, the results can represent a number of different recognized utterances, some or all of which can have multiple interpretations.
The two properties maxnbest and bevocal.maxinterpretations control how many results are returned from speech recognition.
- If only N-best recognition is enabled, maxnbest is the maximum number of results to be returned.
- If only multiple interpretations is enabled, bevocal.maxinterpretations is the maximum number of results to be returned.
- When both features are enabled, the two properties are used together. You can set these properties to limit the total number of results without distinguishing whether a particular result is a different utterance or a different interpretation of a given utterance. Alternatively, you can specify the maximum number of distinct utterances and the maximum number of distinct interpretations for any given utterance.
Using N-Best Recognition

N-best recognition can be invoked whenever the user's spoken input is matched against a grammar. By default, N-best recognition is disabled. If you want to use this feature, you must explicitly enable it. After recognition in which this feature is enabled, you check to see whether more than one result was recognized. If so, you can prompt the user to select among the possible results.
Enabling N-Best Recognition
You enable N-best recognition by setting the maxnbest property to a value greater than one. If only N-best recognition is enabled, the value is the maximum number of distinct utterances that the speech-recognition engine should return. If multiple interpretations is also enabled, the interpretation of the maxnbest value depends on the value of the bevocal.maxinterpretations property. This section describes using N-best recognition alone. Using Both Features Together on page 59 describes how to combine N-best recognition with multiple interpretations.
By default, when you set maxnbest to a number greater than one, you enable both N-best recognition and multiple interpretations. To disable multiple interpretations, set the bevocal.maxinterpretations property to 1.
When N-best recognition is enabled, the speech-recognition engine may find multiple utterances. The most common use of N-best recognition is in recognizing input in <field> and <initial> elements. It also can be used in recognizing input that matches a <link> or <choice> grammar.
N-best recognition slows down the recognition process; you should enable this feature only when you need it. For example, you might enable it for a particular field or form in which you anticipate that user inputs might sound similar to more than one expected response. You should set maxnbest to a fairly small number and your application should be able to handle the specified number of results.
Checking for Multiple Utterances
After speech recognition occurs while N-best recognition is enabled, you should check whether multiple likely utterances were found.
- If recognition occurs in a <field> element, you check the results in the <filled> element of that field.
- If recognition occurs in an <initial> element, you check the results in the <filled> element of the containing form.
- If recognition occurs in a <link> or <choice> element, you check the results in a <block> at the top of the dialog or document to which the <link> or <choice> element sent you.
In the most common case, you check for multiple utterances to decide how to set input variables following speech recognition in a <field> or <initial> element. Whether or not N-best recognition is enabled, the most likely recognized utterance is used to set relevant input variables. If the most likely utterance matches more than one grammar rule, the relevant input variables are set according to an arbitrary one of those rules.
To check whether more than one result was found, you examine the application.lastresult$ array, which may contain up to maxnbest elements; in most cases, fewer results are returned. The application.lastresult$ array contains at least one element, namely, application.lastresult$[0]. You can check application.lastresult$.length to see how many elements are in the array. For a given index i, application.lastresult$[i] is undefined if the array contains no object at that index.
If you find that only one result was returned, you do not need to take special actions; you can use the results in the input variables just as if N-best recognition were disabled. Otherwise, you can ask the user to select among the various results.
Selecting an Utterance
Once you have determined that the application.lastresult$ array contains more than one result, your application can interact with the user to determine which result was intended. Each object in the array corresponds to one likely result; its utterance property is the recognized utterance and its interpretation property is the interpretation of that utterance.
You can ask the user to select among the possible results. Each result corresponds to a different possible utterance; the utterances are ordered by speech-recognition engine confidence level. After the user selects a result, you can set input variables accordingly.
For example, if the user selects the third recognition result, the interpretation is in:
application.lastresult$[2].interpretation
The interpretation has a property for each slot that is filled in by the matching grammar rule. You can access these properties to get the values for input variables. Typically, a slot name is identical to the name of an input variable. For example, the value for the city field of the interpretation is in:
application.lastresult$[2].interpretation.city
If speech recognition occurs in a <link> or <choice> element, you typically don't use the selected result to set input variables. Instead, you use it to decide which dialog or document to visit.
Example
This application allows a user to schedule a visit with one of the company's offices, identified by the city where the office is located. The grammar includes three cities whose names have somewhat similar sounds: Austin, Boston, and Houston. To allow for the situation in which the speech-recognition engine cannot distinguish among those names, the maxnbest property for the office field is set to 3. Note that N-best recognition is enabled only during interpretation of the office field.
If the application receives more than one recognition result for the office field, it prompts the user to select a number corresponding to one of the possible utterances. It also lets the user start over (in case none of the possible utterances is correct). The application keeps track of the number of recognized utterances. If the user gives an inappropriate number when asked for clarification, the application prompts again.
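The clarification step of this example comes down to three outcomes: start over, prompt again, or accept one of the results. The sketch below shows that decision in plain JavaScript outside the interpreter. The mockResults array is a stand-in for application.lastresult$ with made-up contents, and handleChoice is a hypothetical helper, not part of the sample application.

```javascript
// Stand-in for application.lastresult$; contents are made up.
var mockResults = [
  { utterance: "boston", interpretation: { office: "boston" } },
  { utterance: "austin", interpretation: { office: "austin" } }
];

// Hypothetical helper: map the caller's numeric answer to an outcome.
// 0 means start over; 1..results.length selects a result (1-based).
function handleChoice(results, choice) {
  if (choice === 0) {
    return { action: "restart" };
  }
  if (choice < 0 || choice > results.length) {
    return { action: "reprompt" }; // no result with that number
  }
  var selected = results[choice - 1];
  return { action: "select", office: selected.interpretation.office };
}

var picked = handleChoice(mockResults, 2);   // selects the second result
var invalid = handleChoice(mockResults, 3);  // only two results exist
var restart = handleChoice(mockResults, 0);
```

The subtraction in results[choice - 1] reflects that callers hear options numbered from 1 while the array is indexed from 0.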
Sample Interactions

In this interaction, the user's answer is clear.

Application: Which office would you like to visit?
User: Denver.
Application: Scheduling a visit to the Denver office.

In this interaction, the application cannot distinguish among possible responses.

Application: Which office would you like to visit?
User: (Garbled) estin.
Application: Please answer 1 if you said Austin; 2 if you said Boston; 3 if you said Houston; if you want to start over, answer 0.
User: Two.
Application: Scheduling a visit to the Boston office.

In this interaction, the user wants to give the city again instead of selecting one of the options.

Application: Which office would you like to visit?
User: (Garbled) ahstin.
Application: Please answer 1 if you said Boston; 2 if you said Austin; if you want to start over, answer 0.
User: Zero.
Application: Which office would you like to visit?
User: Houston.
Application: Scheduling a visit to the Houston office.

In this interaction, the user enters an invalid selection when asked for clarification.

Application: Which office would you like to visit?
User: (Garbled) ahstin.
Application: Please answer 1 if you said Boston; 2 if you said Austin; if you want to start over, answer 0.
User: Three.
Application: Unrecognized option. Please answer 1 if you said Boston; 2 if you said Austin; if you want to start over, answer 0.
User: Two.
Application: Scheduling a visit to the Austin office.
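The clarification step in these interactions can be sketched in plain JavaScript. This is an illustrative model only, not part of the application: resolveChoice and its return shape are hypothetical names, and the results argument merely stands in for the application.lastresult$ array.

```javascript
// Hypothetical helper: decide what to do with the number the user answered.
// `results` models application.lastresult$ (objects with an `utterance`
// property); `choice` is the digit string returned by the choice field.
function resolveChoice(results, choice) {
  var n = Number(choice);
  if (n === 0) {
    return { action: "restart" };      // user wants to give the city again
  }
  if (n % 1 !== 0 || n < 1 || n > results.length) {
    return { action: "reprompt" };     // not one of the offered options
  }
  // Options are numbered from 1, array elements from 0.
  return { action: "select", office: results[n - 1].utterance };
}
```

With two recognized utterances, an answer of 3 leads to a reprompt, which corresponds to the last sample interaction above.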
Application Code

<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <var name="myResults"/>
    <var name="choicePrompt"/>
    <script>
      <![CDATA[
        // Build the clarification prompt, numbering the recognized
        // utterances starting from 1
        function listResults(allResults) {
          var promptmsg = "Please answer ";
          var promptIndex = 1;
          for (var i = 0; i < allResults.length; i++) {
            promptmsg = promptmsg + promptIndex + " if you said "
                        + allResults[i].utterance + "; ";
            ++promptIndex;
          }
          promptmsg = promptmsg + "If you want to start over, answer 0.";
          return promptmsg;
        }
        // Return the utterance that matches the user's 1-based selection
        function findResult(allResults, strindex) {
          return allResults[strindex - 1].utterance;
        }
      ]]>
    </script>
    <field name="office">
      <property name="maxnbest" value="3"/>
      <property name="bevocal.maxinterpretations" value="1"/>
      <grammar type="application/x-nuance-gsl">
        <![CDATA[([
          ( austin ) { <office austin> }
          ( boston ) { <office boston> }
          ( chicago ) { <office chicago> }
          ( denver ) { <office denver> }
          ( houston ) { <office houston> }
        ])]]>
      </grammar>
      <prompt>Which office would you like to visit?</prompt>
      <filled>
        <if cond="application.lastresult$.length > 1">
          <assign name="myResults" expr="application.lastresult$"/>
          <assign name="choicePrompt" expr="listResults(myResults)"/>
        <else/>
          <assign name="choice" expr="0"/>
        </if>
      </filled>
    </field>
    <field name="choice" type="digits">
      <prompt> <value expr="choicePrompt"/> </prompt>
      <filled>
        <if cond="choice == 0">
          <clear/>
        <elseif cond="choice > myResults.length"/>
          <prompt>Unrecognized option.</prompt>
          <clear namelist="choice"/>
        <else/>
          <assign name="office" expr="findResult(myResults, choice)"/>
        </if>
      </filled>
    </field>
    <block>
      <prompt>Scheduling a visit with the <value expr="office"/> office</prompt>
    </block>
  </form>
</vxml>

Using Multiple Interpretations

If the grammar that is used for a particular field or form is ambiguous, you can enable multiple interpretations when the user's spoken input is matched against that grammar. By default, multiple interpretations is disabled. If you want to use this feature, you must explicitly enable it. After recognition in which this feature is enabled, you check to see whether more than one interpretation was found. If so, you can prompt the user to select among the possible interpretations.

Enabling Multiple Interpretations

You enable multiple interpretations by setting the bevocal.maxinterpretations property to a value other than one. If only multiple interpretations is enabled, the value is the maximum number of distinct interpretations that the speech-recognition engine should return. If N-best recognition is also enabled, the value for bevocal.maxinterpretations is used in conjunction with the value of the maxnbest property to determine how many results to return. This section describes using multiple interpretations alone. "Using Both Features Together" describes how to combine N-best recognition with multiple interpretations.

When multiple interpretations is enabled, if the user's utterance matches more than one rule in an ambiguous grammar, all corresponding interpretations are included in the recognition results. The most common use of ambiguous grammars is in recognizing input in <field> and <initial> elements. You typically should avoid using an ambiguous <link> or <choice> grammar. The remainder of this section assumes that multiple interpretations are found for spoken input in a <field> or <initial> element.

You should enable multiple interpretations only when you need it, namely in a particular field or form in which you use an ambiguous grammar. Your application should be able to handle the specified number of interpretations.
Checking for Multiple Interpretations

You should check for multiple interpretations in the <filled> element of a field or mixed-initiative form that has an ambiguous grammar. If the user's response matched more than one grammar rule, the relevant input variables are set according to an arbitrary one of those rules. If multiple interpretations is enabled, you should check whether additional results were returned. The number of results returned from the speech-recognition engine does not necessarily equal bevocal.maxinterpretations; in most cases, fewer results are returned.

To check whether more than one result was found, you examine the application.lastresult$ array, which may contain up to bevocal.maxinterpretations elements. The application.lastresult$ array contains at least one element, namely, application.lastresult$[0]. You can check application.lastresult$.length to see how many elements are in the array. For a given index i, application.lastresult$[i] is undefined if the array contains no object at that index.

If you find that only one result was returned, you do not need to take special actions; you can use the values of the input variables just as if multiple interpretations were disabled. Otherwise, you can ask the user to select among the various interpretations.

Selecting an Interpretation

Once you have determined that the application.lastresult$ array contains more than one result, your application can interact with the user to determine which result was intended. Each object in the array corresponds to one likely result; its utterance property is the recognized utterance and its interpretation property is the interpretation of that utterance. You can ask the user to select among the possible interpretations. Each result corresponds to a different interpretation of the most likely utterance; the different interpretations are in an undefined order.
After the user selects a result, you can set input variables accordingly. For example, if the user selects the third recognition result, the interpretation is in:

application.lastresult$[2].interpretation

The interpretation has a property for each slot that is filled in by the matching grammar rule. You can access these properties to get the values for input variables. Typically, a slot name is identical to the name of an input variable. For example, the value for the city field of the interpretation is in:

application.lastresult$[2].interpretation.city

Example

This application prompts the user for an employee. The grammar allows the user to identify an employee by first name only, by first name and last name, by nickname, or by nickname and last name. The first name Robert is ambiguous: it could mean either Bob Smith or Rob Black. To allow for an ambiguous answer, the bevocal.maxinterpretations property for the employee field is set to 2. Multiple interpretations is enabled only during interpretation of the employee field.

Sample Interactions

In this interaction, the user's answer is unambiguous.

Application: Which employee do you want to call?
User: Alice.
Application: Placing call to Alice Brown.
In this interaction, the user gives an ambiguous name.

Application: Which employee do you want to call?
User: Robert.
Application: Please say 1 if you mean Robert Smith; 2 if you mean Robert Black.
User: One.
Application: Placing call to Robert Smith.

In this interaction, the user enters an invalid selection when asked for clarification.

Application: Which employee do you want to call?
User: Robert.
Application: Please say 1 if you mean Robert Smith; 2 if you mean Robert Black.
User: Three.
Application: Unrecognized option. Please say 1 if you mean Robert Smith; 2 if you mean Robert Black.
User: Two.
Application: Placing call to Robert Black.
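The check and the selection described in this section can be sketched as two small JavaScript functions. This is only a sketch under stated assumptions: both function names are hypothetical, and lastresult is a plain array standing in for application.lastresult$, whose elements carry the utterance and interpretation properties described above.

```javascript
// Hypothetical helper: collect every interpretation of the most likely
// utterance. `lastresult` models application.lastresult$ and is assumed
// to contain at least one element.
function interpretationsOfTopUtterance(lastresult) {
  var top = lastresult[0].utterance;
  var matches = [];
  for (var i = 0; i < lastresult.length; i++) {
    if (lastresult[i].utterance === top) {
      matches.push(lastresult[i].interpretation);
    }
  }
  return matches;
}

// Hypothetical helper: map the user's 1-based answer to an interpretation,
// or return undefined when the answer is not one of the offered options.
function selectInterpretation(lastresult, choice) {
  var options = interpretationsOfTopUtterance(lastresult);
  var n = Number(choice);
  if (n % 1 !== 0 || n < 1 || n > options.length) {
    return undefined;
  }
  return options[n - 1];
}
```

Clarification is needed only when interpretationsOfTopUtterance returns more than one element; an undefined return from selectInterpretation corresponds to the "Unrecognized option" reprompt in the sample interaction.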
Application Code

<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <var name="myResults"/>
    <var name="nInterps"/>
    <var name="choicePrompt"/>
    <script>
      <![CDATA[
        // Create a prompt for clarification, asking the user to choose
        // an interpretation of the most likely utterance
        function listInterps(allInterps) {
          var promptmsg = "Please say ";
          var promptIndex = 0;
          for (var i = 0; i < allInterps.length; i++) {
            if (allInterps[i].utterance == allInterps.utterance) {
              promptmsg = promptmsg + ++promptIndex + " if you mean "
                          + allInterps[i].interpretation.employee + "; ";
            }
          }
          nInterps = promptIndex;
          return promptmsg;
        }
        // Return the interpretation chosen by the user
        function findInterp(allInterps, strindex) {
          return allInterps[strindex - 1].interpretation.employee;
        }
        // Count the number of recognition results whose utterance matches
        // the most likely utterance
        function countInterps(allResults) {
          var i, c = 0;
          for (i = 0; i < allResults.length; i++) {
            if (allResults[i].utterance == allResults.utterance) {
              c++;
            }
          }
          return c;
        }
      ]]>
    </script>
    <field name="employee">
      <property name="bevocal.maxinterpretations" value="2"/>
      <grammar type="application/x-nuance-gsl">
        <![CDATA[([
          ( alice ?brown ) { <employee "alice brown"> }
          ( robert ?smith ) { <employee "robert smith"> }
          ( bob ?smith ) { <employee "robert smith"> }
          ( robert ?black ) { <employee "robert black"> }
          ( rob ?black ) { <employee "robert black"> }
          ( joe ?jones ) { <employee "joseph jones"> }
          ( joseph ?jones ) { <employee "joseph jones"> }
        ])]]>
      </grammar>
      <prompt>Which employee do you want to call?</prompt>
      <filled>
        <var name="count" expr="countInterps(application.lastresult$)"/>
        <if cond="count > 1">
          <assign name="myResults" expr="application.lastresult$"/>
          <assign name="choicePrompt" expr="listInterps(myResults)"/>
        <else/>
          <assign name="choice" expr="0"/>
        </if>
      </filled>
    </field>
    <field name="choice" type="digits">
      <prompt> <value expr="choicePrompt"/> </prompt>
      <filled>
        <if cond="choice == 0 || choice > nInterps">
          <prompt>Unrecognized option.</prompt>
          <clear namelist="choice"/>
        <else/>
          <assign name="employee" expr="findInterp(myResults, choice)"/>
        </if>
      </filled>
    </field>
    <block>
      <prompt>Placing call to <value expr="employee"/></prompt>
    </block>
  </form>
</vxml>

Using Both Features Together

You can enable both multiple-recognition features for a particular field or form that has an ambiguous grammar and in which you anticipate that user inputs might sound similar to more than one expected response.

Enabling Both Features

You enable N-best recognition by setting the maxnbest property to a value greater than one. You enable multiple interpretations by setting the bevocal.maxinterpretations property to a value other than one.

Speech-recognition results are returned in the application.lastresult$ array. Each element corresponds to one interpretation of one likely utterance. The same utterance may have different interpretations, and two or more different utterances may have a common interpretation.

The values of the two properties are used together to limit the number of results that are returned by the speech-recognition engine. If bevocal.maxinterpretations is undefined or less than one, up to maxnbest results are returned. If bevocal.maxinterpretations is greater than one, up to maxnbest distinct utterances are returned, each of which can have up to bevocal.maxinterpretations distinct interpretations. The maximum number of results is the product of maxnbest and bevocal.maxinterpretations.
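The limit described above can be written as a one-line rule. The following JavaScript is a minimal sketch of that arithmetic only; maxResults is a hypothetical name, and the function does not query or configure the speech-recognition engine.

```javascript
// Hypothetical helper: upper bound on the number of recognition results.
// `maxNbest` models the maxnbest property; `maxInterpretations` models
// bevocal.maxinterpretations and may be left undefined.
function maxResults(maxNbest, maxInterpretations) {
  if (maxInterpretations === undefined || maxInterpretations < 1) {
    return maxNbest;                      // up to maxnbest results in total
  }
  return maxNbest * maxInterpretations;   // utterances times interpretations
}
```

When bevocal.maxinterpretations is exactly one, the product is simply maxnbest, which matches N-best recognition used alone.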
For example, if maxnbest is 3 and bevocal.maxinterpretations is 0, a maximum of 3 results can be returned; if maxnbest is 3 and bevocal.maxinterpretations is 2, a maximum of 6 results can be returned.

Checking for Multiple Results

You should check for multiple results in the <filled> element of a field or mixed-initiative form for which both multiple-recognition features are enabled. As always, the most likely recognized utterance is used to set relevant input variables. If the most likely utterance matches more than one grammar rule, the relevant input variables are set according to an arbitrary one of those rules.

To check whether more than one result was found, you examine the application.lastresult$ array. The application.lastresult$ array contains at least one element, namely, application.lastresult$[0]. You can check application.lastresult$.length to see how many elements are in the array. For a given index i, application.lastresult$[i] is undefined if the array contains no object at that index.
If you find that only one result was returned, you do not need to take special actions; you can use the results in the input variables just as if multiple recognition were disabled. Otherwise, you can ask the user to select among the various results.

Selecting a Result

Once you have determined that the application.lastresult$ array contains more than one result, your application can interact with the user to determine which result was intended. Each object in the array corresponds to one likely result; its utterance property is the recognized utterance and its interpretation property is the interpretation of that utterance. You can ask the user to select among the possible results.

You should assume that the different recognition results may correspond to different possible utterances as well as different interpretations of some utterances. Elements for different possible utterances are ordered by speech-recognition engine confidence level; elements for the different interpretations of a given utterance are in an undefined order.

After the user selects a result, you can set input variables accordingly. For example, if the user selects the third recognition result, the interpretation is in:

application.lastresult$[2].interpretation

The interpretation has a property for each slot that is filled in by the matching grammar rule. You can access these properties to get the values for input variables. Typically, a slot name is identical to the name of an input variable. For example, the value for the city field of the interpretation is in:

application.lastresult$[2].interpretation.city

Simple Example

This application allows a user to schedule a visit with one of the company's offices, identified by the city where the office is located. The grammar includes three cities whose names sound somewhat similar: Austin, Boston, and Houston. The grammar allows the user to identify an office by city only or by city and state. The city Austin is ambiguous; the company has offices in both Austin, Texas and Austin, California.

To allow for the situation in which the speech-recognition engine cannot distinguish among similar-sounding names, recognizes an ambiguous answer, or both, the maxnbest property for the office field is set to 4. The bevocal.maxinterpretations property is not set; it is undefined by default, so multiple interpretations is also enabled and the maximum number of results is 4. Multiple recognition is enabled only during interpretation of the office field.

If the application receives more than one recognition result for the office field, it prompts the user to select a number corresponding to one of the possible results. Any result whose confidence level is within 0.3 of the highest confidence level is considered. If the grammar included rules in which different similar-sounding utterances could produce the same interpretation, the application could ensure that it only asks the user about unique interpretations.
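The confidence window and the optional uniqueness check can be sketched together in JavaScript. This is a sketch under assumptions, not the example application's code: candidateOffices is a hypothetical name, results is a plain array standing in for application.lastresult$ (best result first), and the window value of 0.3 is simply passed in by the caller.

```javascript
// Hypothetical helper: keep results whose confidence is within `window`
// of the best result, then drop repeated interpretations.
// Each element of `results` has `confidence` and `interpretation.office`.
function candidateOffices(results, window) {
  var best = results[0].confidence;
  var seen = {};
  var offices = [];
  for (var i = 0; i < results.length; i++) {
    if (best - results[i].confidence >= window) {
      continue;                          // too far below the best result
    }
    var office = results[i].interpretation.office;
    if (!Object.prototype.hasOwnProperty.call(seen, office)) {
      seen[office] = true;               // first time this office appears
      offices.push(office);
    }
  }
  return offices;
}
```

Two similar-sounding utterances that produce the same office would then yield a single option in the clarification prompt.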
Sample Interactions

In this interaction, the application cannot distinguish between two possible responses, one of which is ambiguous.

Application: Which office would you like to visit?
User: (Garbled) ahstin.
Application: Please say 1 if you mean Boston Massachusetts; 2 if you mean Austin Texas; 3 if you mean Austin California. If you want to start over, answer 0.
User: Two.
Application: Scheduling a visit with the Austin Texas office.
Application Code

<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <var name="myResults"/>
    <var name="nInterps"/>
    <var name="choicePrompt"/>
    <script>
      <![CDATA[
        // Build the clarification prompt from every result whose confidence
        // is within 0.3 of the highest confidence
        function listInterps(allInterps) {
          var promptmsg = "Please say ";
          var index = 0;
          var promptIndex = 1;
          var maxConfidence = allInterps[0].confidence;
          while (allInterps[index] != undefined &&
                 maxConfidence - allInterps[index].confidence < 0.3) {
            promptmsg = promptmsg + promptIndex + " if you mean "
                        + allInterps[index].interpretation.office + "; ";
            ++index;
            ++promptIndex;
          }
          nInterps = index;
          promptmsg = promptmsg + "if you want to start over, say 0.";
          return promptmsg;
        }
        // Return the office for the user's 1-based selection
        function findInterp(allInterps, strindex) {
          return allInterps[strindex - 1].interpretation.office;
        }
      ]]>
    </script>
    <field name="office">
      <property name="maxnbest" value="4"/>
      <grammar type="application/x-nuance-gsl">
        <![CDATA[([
          ( austin ?texas ) { <office "austin texas"> }
          ( austin ?california ) { <office "austin california"> }
          ( boston ?massachusetts ) { <office "boston massachusetts"> }
          ( chicago ?illinois ) { <office "chicago illinois"> }
          ( denver ?colorado ) { <office "denver colorado"> }
          ( houston ?texas ) { <office "houston texas"> }
        ])]]>
      </grammar>
      <prompt>Which office would you like to visit?</prompt>
      <filled>
        <if cond="application.lastresult$.length > 1 &amp;&amp; application.lastresult$[0].confidence - application.lastresult$[1].confidence &lt; 0.3">
          <assign name="myResults" expr="application.lastresult$"/>
          <assign name="choicePrompt" expr="listInterps(myResults)"/>
        <else/>
          <assign name="choice" expr="0"/>
        </if>
      </filled>
    </field>
    <field name="choice" type="digits">
      <prompt> <value expr="choicePrompt"/> </prompt>
      <filled>
        <if cond="choice == 0">
          <clear/>
        <elseif cond="choice > nInterps"/>
          <prompt>Unrecognized option.</prompt>
          <clear namelist="choice"/>
        <else/>
          <assign name="office" expr="findInterp(myResults, choice)"/>
        </if>
      </filled>
    </field>
    <block>
      <prompt>Scheduling a visit with the <value expr="office"/> office</prompt>
    </block>
  </form>
</vxml>

Generating a Subdialog

The preceding examples asked the user to enter a number corresponding to the intended response. You can produce a more sophisticated interaction by generating a subdialog from the value of application.lastresult$ and using the subdialog to request disambiguation.

This application consists of a mixed-initiative form that prompts the user for a city and state. As in the preceding example, the grammar is ambiguous and some possible city names sound similar. After the user has filled in the city and state, the application checks application.lastresult$ to see whether multiple recognition results were found and, if so, whether the confidence levels of the first two results are within 0.3 of each other. If so, the application calls a subdialog, which is generated from the value of application.lastresult$ by a Perl script. The Perl script receives the array of recognition results as a POST parameter named results.

Sample Interaction

In this interaction, the application cannot distinguish between two possible responses, both of which are ambiguous.

Application: Please name a city and state.
User: (Garbled) ahstin.
Application: I didn't quite get that. Please say 'that one' when you hear the city you want. Boston, Massachusetts. (Pause) Boston, Maine. (Pause) Austin, Texas.
User: Yes.
Application: You chose Austin Texas.
Application Code

<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <property name="maxnbest" value="5"/>
    <var name="results"/>
    <grammar mode="voice" type="application/x-nuance-gsl">
      <![CDATA[([
        ( austin ?california ) { <city austin> <state california> }
        ( austin ?texas ) { <city austin> <state texas> }
        ( boston ?maine ) { <city boston> <state maine> }
        ( boston ?massachusetts ) { <city boston> <state massachusetts> }
        ( chicago ?illinois ) { <city chicago> <state illinois> }
        ( denver ?colorado ) { <city denver> <state colorado> }
        ( houston ?texas ) { <city houston> <state texas> }
      ])]]>
    </grammar>
    <initial>
      <prompt>Please name a city and state</prompt>
    </initial>
    <field name="city">
      Choose a city
      <grammar type="application/x-nuance-gsl">
        [ austin boston chicago denver houston ]
      </grammar>
    </field>
    <field name="state">
      Which state?
      <grammar type="application/x-nuance-gsl">
        [ california colorado illinois maine massachusetts texas ]
      </grammar>
    </field>
    <filled namelist="city state" mode="all">
      <log>Last result is <value expr="application.lastresult$"/> </log>
      <if cond="application.lastresult$.length > 1 &amp;&amp; application.lastresult$[0].confidence - application.lastresult$[1].confidence &lt; 0.3">
        <assign name="results" expr="application.lastresult$"/>
        <goto nextitem="disambig"/>
      </if>
    </filled>
    <subdialog name="disambig" cond="false" src="disambig.pl"
               namelist="results" method="post">
      <filled>
        <assign name="city" expr="disambig.city"/>
        <assign name="state" expr="disambig.state"/>
      </filled>
    </subdialog>
    <block>
      <prompt>You chose <value expr="city"/>, <value expr="state"/>.</prompt>
    </block>
  </form>
</vxml>
Perl Script

#!/usr/local/bin/perl5
#
# This sample Perl-based CGI demonstrates a server-side technique for
# disambiguating the multiple utterances or multiple interpretations
# from speech recognition.
#
# This script assumes the recognition was from a grammar that filled
# two slots: "city" and "state".
#
# The expected HTTP request parameters are:
#   results.length - The number of results from the recognition.
#   For i from 0 to results.length - 1:
#     results[i].confidence - The confidence of this result, 0 to 1
#     results[i].interpretation.city - The city recognized for this result
#     results[i].interpretation.state - The state recognized for this result

use CGI;

# Print out the XML and VoiceXML headers
#
print "\n";
print "<?xml version=\"1.0\"?>\n\n";
print "<!DOCTYPE vxml PUBLIC \"-//BeVocal Inc//VoiceXML 2.0//EN\"\n";
print "\"http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd\">\n\n";
print "<vxml version=\"2.0\">\n";

# Print the beginning of the disambiguation dialog
#
print " <form>\n";
print " <block>\n";
print " I didn't quite get that.\n";
print " Please say 'that one' when you hear the city you want.\n";
print " </block>\n";

# Create two variables to store the disambiguation results
#
print " <var name=\"city\"/>\n";
print " <var name=\"state\"/>\n";

# Figure out how many results we have to deal with
$length = CGI::param("results.length");

# Save the maximum confidence of any result (which we know is always the first one)
#
$maxConfidence = ( CGI::param("results[0].confidence") );

for ($i = 0; $i < $length; $i++) {
    my ($fname) = ( "f$i" );    # field name
    my ($city)  = ( CGI::param("results[$i].interpretation.city") );
    my ($state) = ( CGI::param("results[$i].interpretation.state") );
    my ($level) = ( CGI::param("results[$i].confidence") );

    # If the confidence of this result is close enough to the maximum
    # one, then use it in the disambiguation
    #
    if (($maxConfidence - $level) < 0.3) {
        # Generate a field that prompts this city/state name
        # and then pauses briefly to let the user say "that one".
        # If the user remains silent, our custom <noinput> handler
        # simply goes to the next field.
        #
        print " <field name=\"$fname\">\n";
        print " <grammar>[ yes ( that one ) ]</grammar>\n";
        print " <prompt timeout=\"0.75s\">\n";
        print " $city, $state\n";
        print " </prompt>\n";

        # If the user said "that one" (or "yes"), fill in the result variables.
        # The form-level <filled> block below will then return them.
        #
        print " <filled>\n";
        print " <assign name=\"city\" expr=\"'$city'\"/>\n";
        print " <assign name=\"state\" expr=\"'$state'\"/>\n";
        print " </filled>\n";

        # The user didn't say anything. Tell the interpreter to go to the
        # next field by setting this field's form item variable to true.
        #
        print " <noinput>\n";
        print " <reprompt/>\n";
        print " <assign name=\"$fname\" expr=\"true\"/>\n";
        print " </noinput>\n";
        print " </field>\n";
    } # End if
}; # End for

# If we get here, it means the user didn't say "that one" on any of the fields.
# Clear them all out and try again.
#
print " <block>\n";
print " <clear/>\n";
print " </block>\n";

# This form-level filled block returns the city and state variables
# as soon as any of the fields are filled by the user saying "that one"
#
print " <filled mode=\"any\">\n";
print " <return namelist=\"city state\"/>\n";
print " </filled>\n";
print " </form>\n";
print "</vxml>\n";
print "\n";
exit 0;
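The request parameters listed in the script's header comment can be illustrated with a short JavaScript sketch. It is an assumption-laden illustration only: toRequestParams is a hypothetical name, and the function merely produces the parameter names that the script reads with CGI::param. It is not a description of how the interpreter itself encodes the namelist.

```javascript
// Hypothetical helper: flatten recognition results into the parameter
// names that the Perl script reads. `results` models the array submitted
// under the name "results"; each element has `confidence` and an
// `interpretation` with `city` and `state` slots.
function toRequestParams(results) {
  var params = { "results.length": String(results.length) };
  for (var i = 0; i < results.length; i++) {
    var prefix = "results[" + i + "]";
    params[prefix + ".confidence"] = String(results[i].confidence);
    params[prefix + ".interpretation.city"] = results[i].interpretation.city;
    params[prefix + ".interpretation.state"] = results[i].interpretation.state;
  }
  return params;
}
```

A map of this shape is convenient for exercising the script outside the voice platform, for example by posting the pairs as form fields.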
PART 2
VoiceXML Extended
This part explains how to use BeVocal VoiceXML extensions:
- Chapter 6, Controlling Outbound Calls
- Chapter 7, Go-Back Facility
- Chapter 8, TTS and Recorded Voice Selection
- Chapter 9, Dynamic SSML
- Chapter 10, SOAP Client Facility
Controlling Outbound Calls
A session begins when a call is either received or placed, and then associated with an application via the BeVocal VXML Interpreter. The application can then initiate an outbound call to a third party, allowing the user to talk to the called third party:
- The standard VoiceXML tag <transfer> transfers the user's call to the third party, either terminating execution of the application or temporarily suspending execution of the application (except for recognition of speech that matches the transfer grammar).
- The BeVocal VoiceXML extension <bevocal:dial> initiates an outbound call to a third party and continues executing the application. Additional extension tags can modify the outbound call in any of the following ways:
  - The outbound call can be put on hold.
  - The application can listen to the user, waiting for a recognized utterance.
  - The application can put the user on hold and play audio output to the called third party.

This chapter describes interactions among the user, the application, and the called third party during outbound calls:
- Call Status
- Limitations on Outbound Calls
- Interactions Without an Outbound Call
- Interactions During a Transfer
- Interactions During a Dialed Call
- Putting a Dialed Call on Hold
- Listening to a Dialed Call
- Interrupting a Dialed Call
- VoIP and Outbound Calls
Note: Committees are currently working to standardize call-control features for VoiceXML, and their current approach is different from the BeVocal VoiceXML implementation. Because the approval of any call-control standards will be quite some time in coming, however, BeVocal VoiceXML contains this extension to allow developers to start taking advantage of call-control features. BeVocal will continue to monitor the development within these committees. If the call-control features become a part of VoiceXML or a separate call-control standard, BeVocal VoiceXML will implement that standard. We will then deprecate the current extension and provide developers with information on how to convert their applications to the new standard.
Call Status
A call is in progress from the time a successful connection is made until the call is terminated. A call is active from the time the call is connected until the Interpreter detects a hangup.
Limitations on Outbound Calls

Only one outbound call can be in progress at a given time:
- During an outbound call placed by a <transfer> tag, execution of the application is suspended. Consequently, it is not possible to place a second outbound call at the same time.
- After an outbound call is placed by a <bevocal:dial> tag, execution continues; that call must be terminated before another <bevocal:dial> tag or a <transfer> tag is executed.

Note: Putting the outbound call on hold is not sufficient; it must be terminated before another outbound call can be placed successfully.

Additional restrictions apply to outbound calls placed by a Café customer's VoiceXML application:
- The application may place a local or long-distance outbound call, but not an international call.
- The maximum duration of the outbound call is 60 seconds.
- Inbound calls using VoIP cannot currently make outbound calls.

These restrictions do not apply to a hosting customer's VoiceXML applications.
Interactions Without an Outbound Call

When no outbound call is in progress, the user speaks and sends DTMF signals to the application; the application listens to the user and plays audio output to the user.

[Figure: Inbound Call. The User sends speech and DTMF through the VoiceXML Platform to the VoiceXML Application (listening, executing), which returns audio output to the User.]
Interactions During a Transfer

A <transfer> element places an outbound call to a third party. During the outbound call, the user and the called third party talk to each other; either the application is quiet or it is completely terminated. BeVocal VoiceXML currently supports three transfer methods: bridge, blind and supervised. During a bridging transfer, execution of the application is suspended. When the outbound call terminates, execution resumes. At that time, child elements of the <transfer> element (for example, <filled>) are executed, if appropriate. Then, the interpreter proceeds as usual, looking for the next form item to execute.
The application may participate in the outbound call, depending on whether the <transfer> element includes child grammars.

BeVocal VoiceXML also supports blind and supervised transfers. In a blind transfer, as soon as the session starts the outbound call, the VoiceXML session ends. Regardless of the success or failure of the outbound call, control does not return to the VoiceXML application. In a supervised transfer, once the outbound call is successfully connected, the VoiceXML session ends and control cannot return to the VoiceXML application; however, if the outbound call is not successfully connected (for example, due to no answer or line busy), then control returns to the VoiceXML application.

Note: To use blind or supervised transfers, contact BeVocal Customer Support.

Without a Transfer Grammar

If the <transfer> element has no child grammar, the VoiceXML application simply waits for the outbound call to terminate. The application does not listen to the user; execution is suspended, so the application cannot play audio output to the user. Because the inbound communication channel is not used in this situation, the inbound call is inactive.

[Figure: Outbound Call with Inbound Call (inactive). The User and the Called Third Party exchange speech (and possibly DTMF) through the VoiceXML Platform; the VoiceXML Application is waiting.]
With Transfer Grammars

If the <transfer> element includes any child grammars, the application listens to the user. If a user utterance matches a child grammar, the outbound call is terminated.

[Figure: Outbound Call with Inbound Call. The User and the Called Third Party exchange speech (and possibly DTMF) through the VoiceXML Platform; the VoiceXML Application is listening and waiting.]
Interactions During a Dialed Call

A <bevocal:dial> tag places an outbound call to a third party. After the call is placed, the user and the called third party talk to each other. Execution of the application continues as soon as the call is placed.
The inbound communication channel remains open, and the application continues to listen to the user. If a user utterance matches an active grammar, the application responds to the recognized utterance as usual.

[Figure: Outbound Call with Inbound Call. The User talks with the Called Third Party; the VoiceXML Application continues to play audio output to the User over the inbound call.]

The name attribute of the <bevocal:dial> tag is required; its value is the name of a variable. If the specified name does not match an existing variable name, a new variable is created. When the call is placed successfully, the specified variable is set to a JavaScript object referring to the call. That variable can be used in the call attribute of other tags to terminate the call or to modify interactions during the call.

The outbound call continues until the called third party hangs up, the application executes a <bevocal:disconnect> tag, or the call exceeds its maximum allowed duration. If the inbound call terminates (either because the user hangs up or because the application executes a <disconnect> tag), the outbound call is also terminated and the session ends.

The interactions during a dialed call can be modified in any of the following ways:
- If the destination phone number specifies an extension as post-dial digits, those digits are sent to the called third party as DTMF signals after the call is answered.
- If the onhold attribute of the <bevocal:dial> tag is true, the outbound call is put on hold as soon as the connection is made. In addition, an active call is put on hold when a <bevocal:hold> tag is executed. See "Putting a Dialed Call on Hold".
- A <bevocal:listen> element allows the application to suspend execution temporarily during the outbound call. See "Listening to a Dialed Call".
- A <bevocal:whisper> element puts the user on hold, allowing the application to play audio output to the called third party. See "Interrupting a Dialed Call".
Putting a Dialed Call on Hold

A dialed outbound call can be put on hold in either of two ways:

- A new call is put on hold as soon as the connection is made if the onhold attribute of the <bevocal:dial> tag is true; otherwise, the call remains active.
- An active outbound call is put on hold if the interpreter executes a <bevocal:hold> tag.
While the outbound call is on hold, communications are like those when no outbound call is in progress. Namely, the user speaks and sends DTMF signals to the application; the application listens to the user and plays audio output to the user. For SIP calls, you can use the transferaudio attribute to play audio to the called third party. Otherwise, the called third party hears silence. Neither the user nor the application hears the third party.

[Figure: a dialed call on hold. The outbound call to the called third party is inactive; over the inbound call, the user sends speech and DTMF to the VoiceXML platform and receives audio output.]

Note: You cannot place a new outbound call while a prior call is on hold.

Placing a call on hold does not extend its maximum allowed duration. The maxtime attribute of <bevocal:dial> limits the total time that the call can be in progress (not the time the call can be active).

An outbound call remains on hold until the interpreter executes a <bevocal:connect> tag to reconnect it; at that time, the dialed call becomes active once again. Alternatively, a call that is on hold can be terminated without becoming active.
Listening to a Dialed Call

While a dialed call is active, a <bevocal:listen> tag can temporarily suspend execution of the application. The application may listen to the user while execution is suspended, depending on whether the <bevocal:listen> element includes child grammars.

A <bevocal:listen> element is an input item; if a user utterance matches a child grammar, its input variable is set to the recognition result. After successful recognition, execution of the application resumes and the outbound call continues. At that time, child elements of the <bevocal:listen> element are executed, if appropriate. Then, the interpreter proceeds as usual, looking for the next form item to execute. If you want the outbound call to end when recognition occurs, you must terminate the call explicitly in a <filled> element.

If the outbound call terminates with no recognition, an event is thrown and execution continues. In this case, the input variable is not set.

Note: An outbound call must be active when the <bevocal:listen> element is executed. If you want to prevent the <bevocal:listen> element from being selected for execution after the outbound call terminates, you can set the input variable explicitly in an event handler.

With Listen Grammars

If a <bevocal:listen> element includes any child grammars, the application listens to the user, suspending execution until recognition occurs or the outbound call terminates.

[Figure: listening with grammars. The outbound call to the called third party is active; the VoiceXML application is listening and waiting on the inbound call.]

Without a Listen Grammar

If a <bevocal:listen> element has no child grammars, the VoiceXML application suspends execution until an event is thrown indicating that the outbound call has terminated. The application does not listen to the user; execution is suspended, so the application cannot play audio output to the user. Because the inbound communication channel is not used in this situation, the inbound call is inactive.

[Figure: listening without a grammar. The user and the called third party exchange speech (and possibly DTMF) through the VoiceXML platform over the outbound call; the VoiceXML application is waiting and the inbound call is inactive.]
Interrupting a Dialed Call

While a dialed call is active, a <bevocal:whisper> tag can interrupt the call. The user is put on hold and the application is connected to the called third party. The application can then play audio output to the third party. While the call is interrupted, the third party can hear the application, but cannot hear the user. The application ignores all speech and DTMF signals, whether from the user or from the called third party. For SIP calls, you can use the transferaudio attribute to play audio to the user. Otherwise, the user hears silence.

[Figure: an interrupted dialed call. The VoiceXML application is executing and plays audio output to the called third party over the outbound call; the inbound call is inactive.]

The user is reconnected to the third party when execution of the <bevocal:whisper> element terminates, whether execution terminates successfully or because an event is thrown.
VoIP and Outbound Calls

The BeVocal Platform can place outbound calls to both PSTN and SIP destinations. However, calls to SIP destinations must originate on a VoIP gateway. Outbound calls are not currently supported in the BeVocal Café development environment. For more information, see the BeVocal Voice Over IP (VoIP) Support Quick Reference.
Go-Back Facility
The BeVocal VoiceXML go-back facility allows the user to retract the last response or to transition back to the last location in an application. This chapter describes:

- Retracting User Responses
- Go-Back Stack
- Go-Back Destinations
- Enabling the Go-Back Facility
- Controlling Go-Back Behavior
- Using the Go-Back Facility
Note: The go-back facility is an experimental extension to VoiceXML; its implementation and behavior are subject to change. The current BeVocal VoiceXML implementation contains the feature before it has been standardized so that developers may provide feedback. If this capability becomes a standard part of a future version of VoiceXML, the BeVocal VoiceXML implementation will change as necessary to match the VoiceXML standard.
Retracting User Responses

If the go-back facility is enabled, the user can retract the last response to a VoiceXML application by saying "go back." After the interpreter removes the user's response, it prompts for the information again. For example, the following form asks for the user's home and work phone numbers:

    <form>
      <field name="home" type="phone">
        <prompt> What is your home phone number? </prompt>
      </field>
      <field name="work" type="phone">
        <prompt> What is your work phone number? </prompt>
      </field>
    </form>

Suppose a user inadvertently gives the work number when asked for the home number. The go-back facility would allow the user to correct this mistake.

    Application: What is your home phone number?
    User:        408-555-3200.
    Application: What is your work phone number?
    User:        Go back.
    Application: What is your home phone number?
    User:        408-555-3042.
    Application: What is your work phone number?
    User:        408-555-3200.
The go-back facility also allows users to change their minds after requesting one of several alternatives. For example, it would permit the following interaction.
    Application: Would you like News, Weather, or Traffic?
    User:        Weather.
    Application: What city?
    User:        Go back.
    Application: Would you like News, Weather, or Traffic?
    User:        Traffic.
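A document that could produce the News, Weather, or Traffic interaction might look like the following minimal sketch. It assumes the go-back facility is switched on with the universals and bevocal.mingoback properties covered under Enabling the Go-Back Facility. The stack size of 10, the form ids, the stub prompts, and the city.grxml grammar file are hypothetical choices made only for illustration.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Activate the goback universal grammar and keep a go-back stack. -->
  <property name="universals" value="goback"/>
  <property name="bevocal.mingoback" value="10"/>

  <!-- The menu is the go-back destination for the user's choice. -->
  <menu id="main">
    <prompt>Would you like News, Weather, or Traffic?</prompt>
    <choice next="#news">News</choice>
    <choice next="#weather">Weather</choice>
    <choice next="#traffic">Traffic</choice>
  </menu>

  <form id="weather">
    <field name="city">
      <prompt>What city?</prompt>
      <!-- Hypothetical grammar file; substitute your own city grammar. -->
      <grammar src="city.grxml" type="application/srgs+xml"/>
    </field>
  </form>

  <!-- Stub forms so that every menu choice has a target. -->
  <form id="news">
    <block>Here is the news.</block>
  </form>
  <form id="traffic">
    <block>Here is the traffic report.</block>
  </form>
</vxml>
```

If the user chooses Weather and then says "go back" at the "What city?" prompt, the behavior described in this section suggests that the interpreter would undo the transition to the weather form and execute the menu again.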
Go-Back Stack
When the user says "go back," the interpreter undoes whatever actions resulted from the last response, then prompts the user for a new response. The user can retract a sequence of responses by saying "go back" repeatedly.

Each request for user input is called a go-back destination. When the user provides the requested input, the interpreter saves information about the go-back destination as an entry on its go-back stack. If the user says "go back," the interpreter uses the saved information for the most recent go-back destination on the stack to undo the actions that resulted from the user's response. It then goes back to that go-back destination, popping the corresponding entry off the stack.

Stack Entries

Each entry on the go-back stack saves information about one step the interpreter performed during the execution of the application.

Go-Back Entries

The entries corresponding to go-back destinations are called go-back entries; they correspond to the user-visible steps in the interaction. As the user retraces these steps, the interpreter goes back to the appropriate elements within the VoiceXML application, transparently moving between dialogs and documents as necessary. For example:

- After the user fills the last field in a form, the form may transition to a different form in a different document. If the user says "go back" in response to the first question on the new form, the interpreter returns to the first form in the original document. It clears the last field in that form, but restores the values of all other form-item variables in the form.
- A user's response may match a link grammar that transitions to a different form or that throws an event that causes a transition. Or a user's response may match a document-scoped grammar in a different form, causing a transition to that form. If the user says "go back" in response to the first question in the new location, the interpreter returns to the form and field that was being visited at the time of the user's last response.

Internal Entries

In addition to the go-back entries, the go-back stack saves internal entries, which correspond to non-user-visible steps, such as transitions between forms. When the interpreter goes back to the most recent go-back destination, it also undoes each non-user-visible step that occurred after the last go-back destination and pops the corresponding internal entry off the stack.

A <block> form item does not request user input and so is not a possible go-back destination. However, any block items that are executed between one input request and the next are saved as internal stack entries that can be undone when the interpreter goes back to the preceding input request.
Go-Back Destinations
A VoiceXML application can request user input in a menu, in the initial item of a mixed-initiative form, and in an input item. These elements, therefore, can be go-back destinations.

Menus

A <menu> element asks the user to select a choice. The <menu> element is the go-back destination for the user's response. If the user says "go back" after selecting a menu choice, the menu is executed again.

Mixed-Initiative Forms

The <initial> element of a mixed-initiative form asks the user for initial input to the form. This element is the go-back destination for the user's response. If the user says "go back" after providing initial input, the initial element is executed again.

The user's answer to the initial prompt may provide values for several of the form's input variables. When the interpreter undoes an initial element, it clears not only the initial form-item variable, but also any input variables that were set by the user's response.

Input Items

For the purposes of the go-back facility, input items can be classified as follows:

- The <field> and <record> items accept a single user input. These input items appear to the user as a single request for information.
- The <transfer> item may involve a long interaction between the user and a third party. It provides the application with a single piece of information, namely the result of the transfer. During a transfer, however, the user may provide various pieces of information to the third party and may later want to retract some or all of that information.
- The <subdialog> item may accept multiple user inputs and so may appear to the user as multiple requests for information.

Single-Input Input Items

A <field> item asks the user for the value of its input variable. A <record> item asks the user for input to be recorded. These input items are go-back destinations. If the interpreter goes back to one of these items, it clears the corresponding input variable and executes the item again. Going back to a <field> item allows the user to give a different answer; going back to a <record> item allows the user to provide different input to be recorded.
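The mixed-initiative case described earlier in this section can be sketched as follows. This is only an illustration: the form id, the field names, and the trip.grxml and city.grxml grammar files are hypothetical, and the form-level grammar is assumed to be able to fill origin, destination, or both from a single utterance.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <property name="universals" value="goback"/>
  <property name="bevocal.mingoback" value="10"/>

  <form id="trip">
    <!-- Hypothetical form-level grammar for the initial, open-ended answer. -->
    <grammar src="trip.grxml" type="application/srgs+xml"/>

    <!-- Go-back destination for the user's initial response. -->
    <initial name="start">
      <prompt>Where are you traveling from and to?</prompt>
    </initial>

    <!-- Each field is a go-back destination for its own response. -->
    <field name="origin">
      <prompt>What city are you leaving from?</prompt>
      <grammar src="city.grxml" type="application/srgs+xml"/>
    </field>
    <field name="destination">
      <prompt>What city are you going to?</prompt>
      <grammar src="city.grxml" type="application/srgs+xml"/>
    </field>
  </form>
</vxml>
```

Suppose the user answers the initial prompt with only an origin and is then asked for the destination. Saying "go back" at that point would, under the rules above, clear both start and origin and replay the initial prompt.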
Transfer Items

A <transfer> item transfers the user to another destination, allowing the user to carry on a conversation with a third party. At the end of a bridging transfer, the interpreter resumes execution of the form containing the transfer item. The user might then say "go back" in response to the next request for input. In a blind or supervised transfer, the current session terminates when the transfer is made; the user has no opportunity to invoke the go-back facility at the end of the call.

Note: Currently, the BeVocal interpreter supports bridging transfers only.

A transfer item is a go-back destination. If the interpreter goes back to a transfer item, the transfer call is repeated. Any change in the information exchanged during the original transfer and during the repeated transfer is determined by the user's conversation with the third party and does not affect the VoiceXML application. For example, the original transfer might place a call in which the user orders a pizza. After that call, the user might say "go back" and add a salad to the original order.

Subdialogs

A <subdialog> item invokes another dialog as a subdialog of the current one. Each request for input made by the subdialog is a go-back destination. The <subdialog> element itself is also a go-back destination.

If the user says "go back" in response to a request for input inside the subdialog, the go-back behavior is the same as in any other form. Within the subdialog's execution context, the go-back stack is initially identical to the go-back stack in the calling dialog's execution context. As each new input is requested, another go-back destination is pushed onto the stack:

- If the user says "go back" in response to the first input request in the subdialog, the interpreter returns to the last go-back destination in the calling dialog.
- If the user says "go back" in response to a subsequent input request in the subdialog, the interpreter returns to the preceding go-back destination in the subdialog.

Once the subdialog returns to the calling dialog, however, the subdialog's execution context terminates. The go-back stack in the calling dialog's execution context does not contain any go-back destinations for the input requests made by the subdialog. A new go-back destination is added for the subdialog itself.

If the subdialog requests a single user input, the go-back behavior is the same as for any other single-input input item. If a subdialog requests more than one user input, however, the go-back behavior may not be what the user expects. For example, suppose a document contains the following forms:

    <form id="main">
      <field name="A">...</field>
      <subdialog name="B" src="#sub">
        ...
      </subdialog>
      <field name="F">...</field>
    </form>

    <form id="sub">
      <field name="C">...</field>
      <field name="D">...</field>
      <field name="E">...</field>
      <filled>
        <return namelist="C D E"/>
      </filled>
    </form>
A user who says "go back" when prompted for field F might expect to provide a different answer for field E in the subdialog. However, the interpreter goes back to subdialog B. It executes the subdialog from the beginning, prompting again for fields C, D, and E.

Note: In a future release, you may be able to specify whether "go back" goes back into the subdialog (to ask for field E in the preceding example) or to the beginning of the subdialog (as currently happens).
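One way to sidestep this surprise, following the guideline given under Using Subdialogs later in this chapter, is to keep subdialogs to short confirmations and to set bevocal.goback to false inside the <subdialog> element. The sketch below assumes that approach; the form ids, field names, prompts, and the size.grxml and topping.grxml grammar files are hypothetical.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <property name="universals" value="goback"/>
  <property name="bevocal.mingoback" value="10"/>

  <form id="main">
    <field name="size">
      <prompt>What size pizza would you like?</prompt>
      <grammar src="size.grxml" type="application/srgs+xml"/>
    </field>

    <!-- The subdialog itself is not a legal go-back destination. -->
    <subdialog name="check" src="#confirm">
      <property name="bevocal.goback" value="false"/>
      <param name="answer" expr="size"/>
    </subdialog>

    <field name="topping">
      <prompt>What topping would you like?</prompt>
      <grammar src="topping.grxml" type="application/srgs+xml"/>
    </field>
  </form>

  <!-- Single-question confirmation subdialog. -->
  <form id="confirm">
    <var name="answer"/>
    <field name="ok" type="boolean">
      <prompt>You said <value expr="answer"/>. Is that right?</prompt>
      <filled>
        <return namelist="ok"/>
      </filled>
    </field>
  </form>
</vxml>
```

With this arrangement, a "go back" request at the topping prompt should return the user to the size question rather than restarting the confirmation on its own.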
Enabling the Go-Back Facility

To enable the go-back facility, you must set two properties to:

- Activate the goback universal grammar
- Specify the minimum size of the go-back stack
You use the <property> tag to set both properties.

Activating the Universal Grammar

The goback universal grammar recognizes the spoken "go back" request. Like all universal grammars, it is deactivated by default. See Universal Commands and Grammars on page 14. When this grammar is deactivated, the speech-recognition engine does not recognize the input "go back." If the user says "go back" in response to the prompt for a field, a no-match event is thrown.

You set the universals property to activate or deactivate the goback grammar, either in the entire application, or in particular documents, forms, or fields. The go-back facility is activated in the following document, but deactivated during the execution of the first form.

    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
      <property name="universals" value="goback help exit"/>
      ...
      <form>
        <property name="universals" value="help exit"/>
        ...
      </form>
      ...
    </vxml>

Setting the Minimum Stack Size

The bevocal.mingoback property specifies the minimum size of the go-back stack. The interpreter keeps at least this many entries on the stack, except at the beginning of the call when fewer steps have been executed, and after the user has said "go back" so many consecutive times that the stack has been depleted.

By default, this property is set to 0, which means that the go-back stack is always empty and the go-back facility is effectively disabled. If the user says "go back," the go-back facility responds, "I'm sorry, I don't know where to go back to." If you want to allow the user to go back, you must set the bevocal.mingoback property to 1 or more. For example, the following application sets the minimum stack size to 20 entries.

    <?xml version="1.0" ?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
      "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
      <property name="universals" value="goback help"/>
      <property name="bevocal.mingoback" value="20"/>
      <form>
        <field name="home" type="phone">
          <prompt> What is your home phone number? </prompt>
        </field>
        <field name="work" type="phone">
          <prompt> What is your work phone number? </prompt>
        </field>
      </form>
    </vxml>
Controlling Go-Back Behavior

You can control the application's use of the go-back facility in the following ways:

- You can prevent the user from retracting certain inputs.
- You can customize the application's response to a go-back request from the user.
- You can deactivate the go-back facility in the entire application, in a particular document, in a particular dialog, or in a particular go-back destination.
Suppressing Retraction

You can prevent the user from retracting certain inputs by setting the bevocal.goback property. This property controls whether requests for user input are legal go-back destinations. By default, the property is set to true and each request for input is a legal go-back destination. When the user provides the requested input, the interpreter pushes a go-back entry for the request onto its go-back stack.

If the bevocal.goback property is false, however, a request for input is not a legal go-back destination. When the user provides the requested input, the interpreter pushes an internal entry for the request onto its go-back stack. The internal stack entry enables the interpreter to undo the information request if the user returns to an earlier go-back destination; however, it prevents the user from going back to the request itself.

A user's response is called retractable if a corresponding go-back entry is added to the stack; if, instead, an internal entry is added to the stack, the response cannot be retracted.

If you set the bevocal.goback property to false in a field, the user's input for the field is not retractable. The user cannot go back to that field, but may skip back to retract the preceding retractable input. If you set this property to false in a form, you prevent the user from retracting any input to that form.
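As a rough illustration of the form-level case, the sketch below sets bevocal.goback to false for a whole form. The form ids, field names, and prompts are hypothetical and are not taken from this guide.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <property name="universals" value="goback"/>
  <property name="bevocal.mingoback" value="10"/>

  <form id="login">
    <!-- No input to this form can be retracted. -->
    <property name="bevocal.goback" value="false"/>
    <field name="account" type="digits">
      <prompt>What is your account number?</prompt>
    </field>
    <field name="pin" type="digits">
      <prompt>What is your PIN?</prompt>
    </field>
    <block>
      <goto next="#services"/>
    </block>
  </form>

  <form id="services">
    <!-- Inputs here remain retractable (the default). -->
    <field name="wantBalance" type="boolean">
      <prompt>Do you want to hear your balance?</prompt>
    </field>
    <field name="wantHistory" type="boolean">
      <prompt>Do you want to hear recent transactions?</prompt>
    </field>
  </form>
</vxml>
```

Under the rules above, neither account nor pin is a legal go-back destination, so a "go back" request made in the services form cannot return the caller to either of those prompts.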
When several fields are treated as a single conceptual unit, you may want to suppress retraction of all but the first field. For example, the go-back facility treats the city and state fields as a unit in the following form:

    <form>
      <field name="city">
        <prompt>Choose a city</prompt>
        <grammar>...</grammar>
      </field>
      <field name="state">
        <property name="bevocal.goback" value="false"/>
        <prompt>What state?</prompt>
        <grammar>...</grammar>
      </field>
      <field name="first" type="boolean">
        <prompt> Do you want to fly first class? </prompt>
      </field>
    </form>

The user cannot retract an answer to the question about state, but can skip past it to retract the city, as illustrated in the following interaction.

    Application: Choose a city.
    User:        Albany
    Application: What state?
    User:        Georgia
    Application: Do you want to fly first class?
    User:        Go back.
    Application: Choose a city.

Customizing Go-Back

When the speech-recognition engine matches the goback grammar, a goback event is thrown. The default handler undoes entries on the go-back stack until it reaches the most recent go-back entry, corresponding to the user's last retractable response. If the go-back stack is empty, the default handler plays an audio message that says "I'm sorry, I don't know where to go back to."

If you want the application to take different actions, you can add your own event handler for go-back events. For example, an application might keep information about each user's default location. If the user requests a traffic report from the main menu, the traffic form might start to fetch the report for the user's default location without requesting the user's city. The application could use the go-back facility to allow the user to provide a different location.

    <form id="traffic">
      <catch event="goback">
        <clear/>
      </catch>
      <field name="city" expr="document.defaultCity">
        <prompt>What city?</prompt>
        <grammar>...</grammar>
      </field>
      <block>
        <prompt>
          Retrieving traffic data for <value expr="city"/>.
          Say Go Back to choose another city.
        </prompt>
      </block>
    </form>

An interaction with the application might proceed as follows.

    Application: Would you like news, weather, or traffic?
    User:        Traffic
    Application: Retrieving traffic data for San Francisco. Say Go Back to choose another city.
    User:        Go back.
    Application: What city?
    User:        San Jose.
    Application: Retrieving traffic data for San Jose. Say Go Back to choose another city.

In this case, saying "go back" takes the user to a question that has never been asked before.

If the application's go-back handler needs to take some actions and then proceed as normal to undo the user's response, it can perform the appropriate actions and then rethrow the event to the default handler:

    <catch event="goback">
      ...
      <rethrow/>
    </catch>
Using the Go-Back Facility

This section contains guidelines for using the go-back facility.

Selecting the Minimum Stack Size

You need to ensure that the go-back stack can grow large enough to enable a user to retrace as many steps as you think are likely; see Setting the Minimum Stack Size on page 83. The size of the stack limits the number of consecutive times the user can say "go back."

Remember that the stack must be large enough to accommodate internal entries as well as go-back entries. When you set the stack size, you should allow for a few internal entries for each go-back entry.

Using Blocks

You can safely put blocks between go-back destinations in a form. For example, in the following form, if the user goes back to the home field, the interpreter undoes the subsequent block, clearing its item variable and allowing the block to be visited again after the user provides a new answer for the home field:

    <form>
      <field name="home" type="phone">
        <prompt> What is your home phone number? </prompt>
      </field>
      <block>
        Your home number is <value expr="home"/>
      </block>
      <field name="work" type="phone">
        <prompt> What is your work phone number? </prompt>
      </field>
    </form>

The interaction with the user might proceed as follows.

    Application: What is your home phone number?
    User:        408-555-3200.
    Application: Your home number is 408-555-3200. What is your work phone number?
    User:        Go back.
    Application: What is your home phone number?
    User:        408-555-3042.
    Application: Your home number is 408-555-3042. What is your work phone number?
    User:        408-555-3200.

A <block> in a form is saved as an internal stack entry only if it occurs after the first go-back destination in the form. If the form's first item is a block containing a welcoming prompt, no internal stack entry is saved for the block, so it will not be revisited if the go-back facility returns to the first input request in the form.

Tip: In a mixed-initiative form, put any welcoming prompt in the <initial> element, not in a separate <block> element.

The internal stack entry for a block is undone and redone only if the interpreter returns to a go-back destination before the block. As a consequence, a block that is used to prompt for information in the subsequent field is not redone if the interpreter goes back to the field. In the following form, if the user says "go back" when asked for a work phone number, the request for the home phone number would not be replayed.

    <form>
      <block>What is your home phone number?</block>
      <field name="home" type="phone"></field>
      <block>Your home number is <value expr="home"/></block>
      <field name="work" type="phone">
        <prompt> What is your work phone number? </prompt>
      </field>
    </form>
Tip: Be sure to put the prompt for a field value inside the <field> element and not in a separate <block> element.
Using Subdialogs

To avoid any confusion that can occur if the user says "go back" after returning from a subdialog, try to limit your use of subdialogs to requests for confirmation or disambiguation. In addition, you should prevent the subdialog itself from being a legal go-back destination by setting the bevocal.goback property to false inside the <subdialog> element. If the user says "go back" after the subdialog returns, the interpreter will go back to the question preceding the subdialog, presumably the question whose answer required confirmation or clarification.

Using Variables

When the interpreter returns to a particular go-back destination in a form, it clears form-item variables for every block, initial item, and input item that needs to be undone. However, it does not change the values of any other variables declared in dialog, document, or application scope. If the interpreter undoes a transition, going back to a different form, it does not restore the variables declared in the form to the values they had when that transition left the form.

If you use the go-back facility, you should avoid saving state information in variables that cannot be reset by a go-back operation. In limited circumstances, you may be able to reset variables in an event handler for go-back events. In general, however, the event handler will not have enough context to know what variables need to be reset because the event is thrown at the location where the user says "go back," not at the go-back destination.
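The limited case in which a handler can reset a variable might look like the sketch below. It assumes a form with a single earlier input, so the handler knows which piece of state to reset; the sizeChosen variable, the field names, and the size.grxml grammar file are hypothetical. The <rethrow/> element is used as in the example under Customizing Go-Back.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <property name="universals" value="goback"/>
  <property name="bevocal.mingoback" value="10"/>

  <!-- Document-scoped state that a go-back operation does not reset by itself. -->
  <var name="sizeChosen" expr="false"/>

  <form id="order">
    <catch event="goback">
      <!-- Reset the state by hand, then let the default handler undo the response. -->
      <assign name="sizeChosen" expr="false"/>
      <rethrow/>
    </catch>

    <field name="size">
      <prompt>What size pizza would you like?</prompt>
      <grammar src="size.grxml" type="application/srgs+xml"/>
      <filled>
        <assign name="sizeChosen" expr="true"/>
      </filled>
    </field>

    <field name="extraCheese" type="boolean">
      <prompt>Do you want extra cheese?</prompt>
    </field>
  </form>
</vxml>
```

This works only because every go-back request in the form leads back to the size field. In a longer form the handler could not tell which destination the interpreter is about to return to, which is the limitation noted above.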
TTS and Recorded Voice Selection
The TTS and Recorded Voice Selection facility of BeVocal VoiceXML allows the voice developer to specify TTS and recorded voices within their VoiceXML programs. This chapter describes:

- Specifying TTS Voices
- Specifying Recorded Voices
- Lists of Fallback Voices
- Overriding Recorded Voices
Note: The TTS and Recorded Voice Selection facility is an experimental extension to VoiceXML; its implementation and behavior are subject to change. The current BeVocal VoiceXML implementation contains the feature before it has been standardized so that developers may provide feedback. If this capability becomes a standard part of a future version of VoiceXML, the BeVocal VoiceXML implementation will change as necessary to match the VoiceXML standard.
Specifying TTS Voices

Text To Speech (TTS) output can occur in many different places in a VoiceXML application. For example, TTS can be played within <block>, <prompt>, <audio>, and <say-as> tags, among others.

Supported TTS Voices

The following are the currently supported TTS voices and their characteristics:

    Voice name   Characteristics
    jennifer     Female, American English
    julie        Female, Vocalizer French Canadian
    katarina     Female, RealSpeak German
    laurie       Female, Vocalizer English
    maria        Female, Vocalizer Spanish
    mark         Male, American English
    reed         Male, Vocalizer American English
Note: If not specified, the default TTS voice is jennifer.

TTS voices can be specified in two ways: through a VoiceXML property named bevocal.voice.name, or through the name attribute of the <voice> tag. The same TTS voice names are used in both.

Property Syntax

    <property name="bevocal.voice.name" value="TTS_voice_name"/>
Property Description

BeVocal VoiceXML has introduced a property, bevocal.voice.name, which allows you to define a voice name. A TTS voice defines the characteristics of the TTS played by a TTS engine. A TTS voice may correspond to a single TTS engine with certain parameters set. Because the BeVocal interpreter can support multiple TTS engines, two TTS voices may correspond to different TTS engines.

As with all properties, the bevocal.voice.name property is taken from the innermost property scope that applies to the TTS in question. When no property is specified, the default TTS voice is used.

Property Example

The following field specifies the mark TTS voice for the prompts within the field, including the prompts "Please say a phone number" and "You said 408-555-1212":

    <field name="myphone">
      <property name="bevocal.voice.name" value="mark"/>
      <grammar src="builtin:grammar/phone"/>
      <prompt>Please say a phone number</prompt>
      <filled>
        <prompt>
          You said
          <say-as type="telephone"> <value expr="myphone"/> </say-as>
        </prompt>
      </filled>
    </field>

See the <say-as> tag for more details. When specifying TTS voices, it is important to know that some <say-as> types do extra processing on the TTS content of the <say-as> tag before giving it to the TTS engine, or set parameters on the TTS engine to output the result more naturally. In the case of the telephone type used above, inserting appropriate pauses in 10-digit US numbers helps the TTS output sound more natural. It is usually a good idea to use <say-as> when outputting a type that <say-as> supports. In special cases you can instead parse the result yourself, for example in ECMAScript, and insert the appropriate SSML or other TTS markup.

Errors

An error.noresource event is thrown when an invalid TTS voice is specified.

Voice Tag Syntax

The TTS voice can alternatively be specified using the <voice> tag's name attribute:

    <voice name="mark">
      Say Hello using Mark's voice!
    </voice>

Voice Tag Description

The voice tag syntax allows you to specify TTS voices at a finer level of granularity, in SSML, wherever <voice> tags are allowed. With just two TTS voices, one female and one male, the use cases for switching voices within SSML are few. In general it is probably not a good practice to do so. However, as an illustration, maybe you are reading out a plot from a gripping novel:

    <voice name="jennifer">
      Would you like an apple or a banana?
    </voice>
    <voice name="mark">
      Didn't you say you had oranges?
    </voice>

A more typical use case is using a more natural-sounding voice for most TTS, but using a more intelligible TTS voice to read out some email. The BeVocal interpreter can support multiple TTS engines and would surface these as TTS voices.
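The innermost-scope rule for the bevocal.voice.name property can be sketched with a document-level default and a field-level override. The form id, field name, and prompts below are hypothetical; the voice names come from the table of supported TTS voices.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Document scope: applies wherever no inner scope sets the property. -->
  <property name="bevocal.voice.name" value="jennifer"/>

  <form id="contact">
    <!-- No inner property here, so the document-level voice applies. -->
    <block>Welcome.</block>

    <field name="myphone" type="phone">
      <!-- Field scope: overrides the document-level voice for this field's prompts. -->
      <property name="bevocal.voice.name" value="mark"/>
      <prompt>Please say a phone number</prompt>
    </field>

    <!-- Back in form scope, the document-level voice applies again. -->
    <block>Thank you.</block>
  </form>
</vxml>
```

If the scoping works as described, "Welcome" and "Thank you" are spoken in the jennifer voice and the phone number prompt in the mark voice.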
Specifying Recorded Voices

A recorded voice is a set of interpreter prompts recorded by a human voice talent. A recorded voice usually has a TTS voice as its fallback; all of the recorded voices currently supported have one. A recorded voice is output when you use the <say-as> tag with the attribute bevocal:mode="recorded".

Supported Recorded Voices

    Voice name         Characteristics            Types               TTS fallback
    bv_ann_en_us       Female, American English   all                 jennifer
    bv_adam_en_us      Male, American English     equity              mark
    bv_ben_en_us       Male, American English     citystate           mark
    bv_cecelia_es_us   Female, American Spanish   all (except time)   maria
You specify the recorded voice name in exactly the same way as the TTS voice name, using the bevocal.voice.name property. The currently available recorded voices are listed in the table.

Note: Currently the recorded voice cannot be set using the SSML <voice> tag.

Example

The following example uses the bv_adam_en_us recorded voice within a field that prompts for a stock name and outputs the result using a recorded prompt:

    <field name="mystock">
      <property name="bevocal.voice.name" value="bv_adam_en_us"/>
      <grammar src="builtin:grammar/equity"/>
      <prompt>Please say a stock name</prompt>
      <filled>
        <prompt>
          You said
          <say-as type="equity" bevocal:mode="recorded">
            <value expr="mystock"/>
          </say-as>
        </prompt>
      </filled>
    </field>

In the above example, the equity is read out in the bv_adam_en_us recorded voice unless it is not available; in that case, the output falls back to TTS in the mark voice. This could happen, for example, if that equity has not yet been recorded in this voice.

Note that if the recorded voice is not available for a particular <say-as> type, the TTS fallback is used. As the table shows, only the equity type is supported for bv_adam_en_us, so if any other type is specified the output falls back to the TTS mark voice.

Important: Again, see the <say-as> tag for more details. When specifying particular <say-as> types and recorded voices, the exact format of the content of the <say-as> tag is important.
Lists of Fallback Voices

Providing lists of voices enables extensibility and flexibility as the BeVocal interpreter supports more recorded and TTS voices. It also provides a mechanism for BeVocal partners to create applications that override recorded voices for any type.

Rationale

There are some cases in which recorded output may not be available for a recorded voice:

- Recorded voices may be available for only a subset of the possible <say-as> types. For example, the bv_adam_en_us voice is currently available only for the equity type.
- Within a given <say-as> type, some content values may not be available in a specific recorded voice. For example, some street names may not be available when specifying the type street and the bv_ann_en_us recorded voice.

Also note that the interpreter may in the future provide support for:

- Additional types for existing recorded voices.
- More coverage within a type for existing recorded voices.
- New recorded voices supporting a subset of available types.
- New recorded voices with no TTS voice.

For these reasons and more, the BeVocal interpreter supports a space-delimited list of recorded and TTS voices within the bevocal.voice.name property. Remember that a voice may be:

- A recorded voice with a TTS fallback voice defined.
- A recorded voice only.
- A TTS voice only.
Algorithm for Selecting the Voice

For any space-delimited list of voices, the following algorithm is used:

1. Recorded: Use the first (leftmost) recorded voice in which the content is available.
2. TTS: Use the first (leftmost) TTS voice specified, even if it is the fallback of a recorded voice, irrespective of the recorded voice actually used in step 1.

Selecting Voice Example

For example, let's assume there is a newly available voice called bv_joe_en_us that supports all types, but the bv_adam_en_us voice has more up-to-date equity coverage. The following property declaration:

    <property name="bevocal.voice.name" value="bv_joe_en_us bv_adam_en_us"/>

and the subsequent <say-as> statement:

    <say-as type="equity" bevocal:mode="recorded">
      <value expr="mystock"/>
    </say-as>

would output the equity value in the variable mystock in the bv_joe_en_us voice if possible, but would fall back to the bv_adam_en_us voice if mystock was not available in the first. This assumes that you prefer Joe for consistency with the other types in your application but want to fall back to Adam to ensure a recorded male voice for equity.
What about TTS fallback in this case? If the bv_joe_en_us voice has a TTS fallback voice, it will be used, since it is the first. This is true even if bv_adam_en_us was selected for the recorded voice.

What if you didn't like the TTS fallback for the Joe voice and wanted to override it? Then you could use a statement like the following:

    <property name="bevocal.voice.name" value="bv_superman_TTS bv_joe_en_us bv_adam_en_us"/>

Alternatively, if the Joe voice didn't have a TTS fallback but you didn't want Adam's TTS fallback used, you could use the same statement.

Best Practices for Voice Selection

In most cases you should observe the following principles in your voice selection:

- Specify one recorded voice and one TTS voice per application for consistency in the interface.
- If you specify multiple voices, they should have similar characteristics.

For BeVocal VoiceXML recorded voices:

- All recorded voices will have TTS fallbacks.
- Similar voices should have the same TTS fallback.
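The selection algorithm described in this section can be sketched in JavaScript. This is only an illustrative model, not the interpreter's implementation: the function selectVoices, the catalog layout, the hasRecording callback, and the TTS voice name joe_tts are all assumptions introduced for the example.

```javascript
// Hypothetical catalog: each voice is either "recorded" (optionally with a
// TTS fallback) or "tts". The name "joe_tts" is invented for illustration.
var exampleCatalog = {
  bv_joe_en_us:    { kind: "recorded", ttsFallback: "joe_tts" },
  bv_adam_en_us:   { kind: "recorded", ttsFallback: "mark" },
  bv_superman_TTS: { kind: "tts" }
};

// Models the two rules of the algorithm for a space-delimited voice list.
// hasRecording(name) is an assumed callback that reports whether the
// requested content is available in the given recorded voice.
function selectVoices(voiceList, catalog, hasRecording) {
  var names = voiceList.split(/\s+/).filter(function (n) {
    return n.length > 0;
  });
  var recorded = null;
  var tts = null;
  names.forEach(function (name) {
    var voice = catalog[name];
    if (!voice) {
      return; // unknown names are simply skipped in this sketch
    }
    // Rule 1: leftmost recorded voice in which the content is available.
    if (recorded === null && voice.kind === "recorded" && hasRecording(name)) {
      recorded = name;
    }
    // Rule 2: leftmost TTS voice, counting fallbacks of recorded voices,
    // regardless of which recorded voice rule 1 picked.
    if (tts === null) {
      if (voice.kind === "tts") {
        tts = name;
      } else if (voice.ttsFallback) {
        tts = voice.ttsFallback;
      }
    }
  });
  return { recorded: recorded, tts: tts };
}
```

With the list "bv_joe_en_us bv_adam_en_us" and content available only in Adam's voice, this model picks bv_adam_en_us as the recorded voice but still takes the TTS voice from Joe's fallback, which matches the behavior described above. Placing bv_superman_TTS first in the list changes only the TTS choice.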
Overriding Recorded Voices

BeVocal provides a mechanism to allow special BeVocal partners, carrier and enterprise customers to override interpreter voices for certain types or all types. This can be combined with BeVocal Services for recording Voice talent. If you are interested in these services and capabilities, send email to CafePartners@bevocal.com.
Dynamic SSML
The BeVocal VoiceXML Dynamic SSML facility enables the loading of Speech Synthesis Markup Language (SSML) documents from URIs. This allows VoiceXML developers to generate SSML documents based on parameter values and thus generate complex prompts based on a variety of input criteria. This chapter describes:

- Introduction
- Using Dynamic SSML
- Examples and Notes
- Errors
- SSML Document
- Extensions to the SSML spec
Note: The Dynamic SSML facility is an experimental extension to VoiceXML; its implementation and behavior are subject to change. The current implementation of BeVocal VoiceXML contains the feature before it has been standardized so that developers may provide feedback. If this capability becomes a standard part of a future version of VoiceXML, the BeVocal VoiceXML implementation will change as necessary to match the VoiceXML standard. For the latest W3C working draft of the Speech Synthesis Markup Language spec, see http://www.w3c.org/TR/speech-synthesis/.
Introduction
Purpose and Scope

This chapter describes the support in the BeVocal VoiceXML interpreter for dynamic SSML. The <audio> tag has been extended with attributes that reference the URIs of SSML documents. The SSML document can be generated based on parameters of the URI used to reference it.

Goals

The goals of this design are:

- To provide an extension to the <audio> element so that developers have a flexible mechanism for dynamically generating simple as well as complex, layered prompts.
- To provide support for interpreting and executing an SSML document (compliant with the SSML spec) from within a VoiceXML context.
Using Dynamic SSML

To have dynamic SSML generated and executed from within a VoiceXML document, the <audio> tag has been extended with two new attributes:

- bevocal:ssml: The URI that generates the dynamic SSML.
- bevocal:ssmlexpr: A JavaScript expression that resolves to the value of bevocal:ssml.

Only one of bevocal:ssml or bevocal:ssmlexpr should be used. The resource that resides at the URI should be an SSML document, compliant with the SSML spec. The interpreter downloads the resource identified by the URI, parses it as an SSML document, and then places the contents of the SSML document's <speak> tag inline, replacing the <audio> tag that references the SSML document.

If there is a problem fetching the resource from the URI, or if there is a parse error in the downloaded SSML document, an error.badfetch is thrown. The alternate text for <audio> is ignored in this case; it is as if the <audio> tag is replaced by the SSML. This differs from the normal <audio> behavior, in which a fetch failure causes the alternate text, if specified, to be played instead.

Note: If an <audio> element inside an SSML document contains either the bevocal:ssml or the bevocal:ssmlexpr attribute, an error.badfetch is thrown at SSML document parse time.

Caching

The SSML document is cached according to the caching properties in effect for the <audio> element that caused the SSML download.
Examples and Notes

<audio> with bevocal:ssml attribute

The following vxml source shows how to use the bevocal:ssml attribute with <audio>:

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
      "http://cafe.bevocal.com/libraries/dtd/vxml-bevocal.dtd">
    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
      <form>
        <block>
          <prompt>
            <audio bevocal:ssml="http://www.foo.com/ssml/foo.ssml" />
          </prompt>
        </block>
      </form>
    </vxml>

The contents of foo.ssml are:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!DOCTYPE speak PUBLIC "-//W3C//DTD SYNTHESIS 1.0//EN"
      "http://www.w3.org/TR/speech-synthesis/synthesis.dtd">
    <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" >
      <voice name="mark">
        How are You ? Please say one of
        <audio src="http://cafe.bevocal.com/libraries/audio/female1/en_us/calculator/add.wav">add</audio>
        or
        <audio src="missing.wav">multiply</audio>
        or
        <audio src="http://cafe.bevocal.com/libraries/audio/female1/en_us/calculator/divide.wav">divide</audio>
      </voice>
    </speak>

When the above vxml code is executed, foo.ssml is downloaded and parsed. The resulting SSML elements are then interpreted and added to the prompt queue in place of the containing <audio> tag, as if the original <audio> element in the vxml source code had been written something like this:

    ...
    <prompt>
      <voice name="mark">
        How are You ? Please say one of
        <audio src="http://cafe.bevocal.com/libraries/audio/female1/en_us/calculator/add.wav">add</audio>
        or
        <audio src="missing.wav">multiply</audio>
        or
        <audio src="http://cafe.bevocal.com/libraries/audio/female1/en_us/calculator/divide.wav">divide</audio>
      </voice>
    </prompt>
    ...
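The bevocal:ssmlexpr attribute takes a JavaScript expression, so the SSML URI can be assembled from application variables. The following sketch shows one way such an expression might build a query string; the helper name buildSsmlUri and the server page quote.jsp are hypothetical and not part of the BeVocal platform.

```javascript
// Hypothetical helper: appends URL-encoded parameters to a base URI so that
// a server-side page can generate SSML from them.
function buildSsmlUri(baseUri, params) {
  var pairs = [];
  for (var key in params) {
    if (Object.prototype.hasOwnProperty.call(params, key)) {
      pairs.push(
        encodeURIComponent(key) + "=" + encodeURIComponent(String(params[key]))
      );
    }
  }
  // Leave the base URI untouched when there is nothing to append.
  return pairs.length > 0 ? baseUri + "?" + pairs.join("&") : baseUri;
}
```

Assuming the helper is defined in a <script> element, an <audio> element could then reference it with an attribute such as bevocal:ssmlexpr="buildSsmlUri('http://www.foo.com/ssml/quote.jsp', {symbol: mystock})", where mystock is a field variable as in the earlier recorded-voice example.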
Errors
Syntax or fetch errors downloading from the SSML URI

If there are errors downloading from the URI specified by the bevocal:ssml attribute of <audio>, an error.badfetch is thrown. The event can be caught with a <catch> handler so that appropriate action can be taken:

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
      "http://cafe.bevocal.com/libraries/dtd/vxml-bevocal.dtd">
    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
      <catch event="error.badfetch">
        SSML download failed
      </catch>
      <form>
        <block>
          <prompt>
            <audio bevocal:ssml="http://www.foo.com/ssml/foo.ssml" />
          </prompt>
        </block>
      </form>
    </vxml>

No recursive use of bevocal:ssml or bevocal:ssmlexpr

The <audio> elements in an SSML document cannot contain bevocal:ssml or bevocal:ssmlexpr attributes. The following SSML document would result in an error.badfetch at document parse time:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!DOCTYPE speak PUBLIC "-//W3C//DTD SYNTHESIS 1.0//EN"
      "http://www.w3.org/TR/speech-synthesis/synthesis.dtd">
    <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" >
      <audio bevocal:ssml="error.ssml"/>
    </speak>
SSML Document
The document that results from executing an <audio> tag with the bevocal:ssml or bevocal:ssmlexpr attribute should be an SSML document compliant with the SSML spec. The root element should be <speak>. A sample SSML document looks like this:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!DOCTYPE speak PUBLIC "-//W3C//DTD SYNTHESIS 1.0//EN"
      "http://cafe.bevocal.com/libraries/dtd/ssml-bevocal.dtd">
    <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" >
      <voice name="mark">
        How are you ?
        <prosody rate="slow"> I am fine </prosody>
      </voice>
    </speak>

Developers can refer to the SSML DTD via the PUBLIC id "-//W3C//DTD SYNTHESIS 1.0//EN", as shown in the above SSML document.

The location of the SSML document serves as the base URI for resolving any relative URI references inside the SSML document. The xml:base attribute on the <speak> element can also modify the base URI, as in VoiceXML 2.0.
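The base URI rule can be illustrated with standard relative-URI resolution. The sketch below uses the URL class found in current browsers and Node.js, which is an assumption about the environment and is not something the BeVocal interpreter exposes; the function name resolveAgainstDocument is hypothetical.

```javascript
// Resolves a reference found inside an SSML document.
// documentUri: where the SSML document was fetched from.
// xmlBase: optional value of xml:base on <speak>; it may itself be relative
//          to the document location.
function resolveAgainstDocument(reference, documentUri, xmlBase) {
  var base = xmlBase ? new URL(xmlBase, documentUri).href : documentUri;
  return new URL(reference, base).href;
}
```

Under this model, the relative reference missing.wav in a document fetched from http://www.foo.com/ssml/foo.ssml resolves to http://www.foo.com/ssml/missing.wav, while absolute references are left unchanged.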
Extensions to the SSML spec

BeVocal VoiceXML adds extended attributes to two SSML elements, in addition to what is specified in the SSML spec. The two elements and their new attributes are:

Add vxml 2.0 caching attributes to <audio>

The <audio> elements inside the downloaded SSML document can specify all the vxml 2.0 caching attributes, such as fetchhint, maxage, and maxstale, to control how the interpreter caches the audio resource. Note that specifying vxml 1.0 caching attributes such as caching results in a parse error, because only vxml 2.0 style caching attributes are supported.

Add bevocal:mode to <say-as>

The <say-as> element can specify the BeVocal VoiceXML extended attribute bevocal:mode to indicate whether the <say-as> should render its output in a TTS voice (the default) or in a recorded voice.
10
SOAP Client Facility
This chapter describes the mechanism for calling SOAP services from the BeVocal VoiceXML interpreter. It describes the binding between the JavaScript interpreter in VoiceXML and the SOAP services. It also documents the method by which the interpreter locates SOAP services.

The interpreter's SOAP API takes advantage of the dynamic nature of JavaScript. There is no need to create separate proxy classes for each SOAP service you wish to call, because the interpreter can perform the conversion between JavaScript and SOAP types and methods on the fly. This makes SOAP much easier to deal with than in most other languages.

This chapter has the following sections:

- Locating and Identifying SOAP Services
- Calling SOAP Methods
Note: The SOAP Client facility currently supports SOAP services in which messages are RPC-oriented. The facility does not currently support SOAP services in which messages are document-oriented; this will be supported in a future release. For a description of how these message styles differ, refer to http://www.w3.org/TR/wsdl#_soap:body. Note: The SOAP Client facility is an experimental extension to VoiceXML; its implementation and behavior are subject to change. The current BeVocal VoiceXML implementation contains the feature before it has been standardized so that developers may provide feedback. If this capability becomes a standard part of a future version of VoiceXML, the BeVocal VoiceXML implementation will change as necessary to match the VoiceXML standard.
Locating and Identifying SOAP Services

There is a small API for creating JavaScript objects that act as proxies for SOAP services. It is exposed through the JavaScript object bevocal.soap, which resides in the session scope and has three methods. For details on these methods, see Chapter 14, JavaScript Functions and Objects.

The first two methods are most commonly used with SOAP services accessible from the general internet or from the intranet where the BeVocal platform is installed. The last method is used only for BeVocal SOAP services that are exposed via a service registry and service ID. Briefly, the methods are as follows:

- bevocal.soap.serviceFromWSDL: Given the URL for a Web Service Definition Language (WSDL) file, create an object to act as a proxy for one of the services described in the file. (This is the preferred way to create a service proxy object, because it provides better type checking.)
- bevocal.soap.serviceFromEndpoint: Given the name and endpoint URL of a SOAP service, create an object to act as a proxy for the service. (This method should be used only when the WSDL for a service is not available.)
- bevocal.soap.locateService: Given the service identifier and possibly a version, use the BeVocal SOAP service registry to locate either the standard BeVocal SOAP service or a particular version of that service.
Note: Currently, no BeVocal SOAP services are exposed in the BeVocal SOAP service registry.
Several errors can occur while locating SOAP services. If an error occurs while creating a SOAP proxy object, an exception of type bevocal.soap.SoapException is thrown.

For example, assume you want to use a temperature service available from www.xmethods.net. At that site, you discover that the WSDL for this service is at http://www.xmethods.net/sd/2001/TemperatureService.wsdl. This WSDL looks like:

    <?xml version="1.0"?>
    <definitions name="TemperatureService"
        targetNamespace="http://www.xmethods.net/sd/TemperatureService.wsdl"
        xmlns:tns="http://www.xmethods.net/sd/TemperatureService.wsdl"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema"
        xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
        xmlns="http://schemas.xmlsoap.org/wsdl/">
      <message name="getTempRequest">
        <part name="zipcode" type="xsd:string"/>
      </message>
      <message name="getTempResponse">
        <part name="return" type="xsd:float"/>
      </message>
      <portType name="TemperaturePortType">
        <operation name="getTemp">
          <input message="tns:getTempRequest"/>
          <output message="tns:getTempResponse"/>
        </operation>
      </portType>
      <binding name="TemperatureBinding" type="tns:TemperaturePortType">
        <soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
        <operation name="getTemp">
          <soap:operation soapAction=""/>
          <input>
            <soap:body use="encoded" namespace="urn:xmethods-Temperature"
                encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
          </input>
          <output>
            <soap:body use="encoded" namespace="urn:xmethods-Temperature"
                encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
          </output>
        </operation>
      </binding>
      <service name="TemperatureService">
        <documentation>Returns current temperature in a given U.S. zipcode </documentation>
        <port name="TemperaturePort" binding="tns:TemperatureBinding">
          <soap:address location="http://services.xmethods.net:80/soap/servlet/rpcrouter"/>
        </port>
      </service>
    </definitions>
The targetNamespace attribute of the definitions element, the name attributes of the service and port elements, and the location attribute of the soap:address element provide the rest of the information you need to create a proxy object for this service in your VoiceXML application:

    <script>
    <![CDATA[
      var service = bevocal.soap.serviceFromWSDL(
        // WSDL URL
        "http://www.xmethods.net/sd/2001/TemperatureService.wsdl",
        // name attribute of port
        "TemperaturePort",
        // targetNamespace attribute of definitions
        "http://www.xmethods.net/sd/TemperatureService.wsdl",
        // name attribute of service
        "TemperatureService",
        // location attribute of soap:address child of port element
        "http://services.xmethods.net:80/soap/servlet/rpcrouter"
      );
    ]]>
    </script>

Calling SOAP Methods

Once you have a JavaScript proxy object for a SOAP service, calling the service's methods is easy. For example, if you have a service that performs temperature conversions, the code might look like this:

    <script>
      var converter = bevocal.soap.locateService("temp_converter", 1.0);
      var freezing = converter.fahrenheitToCelsius(32.0);
    </script>

The proxy object automatically converts the JavaScript fahrenheitToCelsius function into a call to the SOAP method with the same name. It converts the sole parameter, 32.0, to the SOAP encoding and passes it as a float. If the converter proxy had access to the WSDL for the service when it was created, it also knows the proper name for the argument and encodes that in the SOAP message as well.

Type conversion

When calling a SOAP method, there are currently some limitations regarding streaming out the arguments. A fixed mapping of Java classes to XML element types is used during serialization. Currently, a method's WSDL definition is not used when streaming out the method's arguments; the WSDL is used only when pre-flighting the method call. This means that the interpreter cannot always stream out complex types with the correct XML type. Most SOAP servers can deal with this, but a few cannot and will return errors.

The biggest problem is with arrays of complex types. In a SOAP array, an attribute of the Array element must contain the type of the contained object. For example:
    <SOAP-ENC:Array SOAP-ENC:arrayType="xyz:Order[1]">
      <Order>
        <Product>Apple</Product>
        <Price>1.56</Price>
      </Order>
    </SOAP-ENC:Array>

In the current implementation, the interpreter is unable to determine the type contained within the array, so it cannot decide what to use as the value of the SOAP-ENC:arrayType attribute. As a default, the interpreter uses xsd:anyType, which causes an error on some servers.

The type mappings currently performed are as follows:

JavaScript: String
SOAP: xsd:string
Description: This mapping automatically works in both directions. If a string is an argument to a SOAP call, the interpreter serializes it as an xsd:string, and if an xsd:string is in a response, the interpreter converts it into a String in the JavaScript result object.

JavaScript: Number
SOAP: xsd:int, xsd:negativeInteger, xsd:float
Description: Depending on their values, JavaScript Numbers or numeric literals in arguments to SOAP methods are converted into one of these three numeric types from the Schema specification. Any of these three types is converted to Number when received in a SOAP response.

JavaScript: Object with properties
SOAP: <multiRef>
Description: When the interpreter sees a complex JavaScript object, that is, an object with properties, it serializes it in the SOAP message as a reference to another object. For example, an <item href="..."/> becomes a <multiRef>. Each property of the JavaScript object is encoded as a separate element under the <multiRef> element. When deserializing SOAP replies, the interpreter does this in reverse. A compound object in a SOAP message is converted to a JavaScript object. Each sub-element in the SOAP value is added as a property of the JavaScript object. The local portion of the SOAP element's name (without the namespace qualifier) is used as the JavaScript property name.

JavaScript: Array
SOAP: SOAP-ENC:Array
Description: JavaScript arrays are serialized as compound objects. The root element is of type SOAP-ENC:Array and SOAP-ENC:arrayType="xsd:anyType[...]". Each element in the compound object must be of the same type. On input, a SOAP-ENC:Array is converted into a JavaScript Array.

xsd is the XML Schema namespace. The interpreter uses the namespace specified in the WSDL. For services without WSDL information, on output, it uses http://www.w3.org/2001/XMLSchema; on input, it accepts any of the following:

- http://www.w3.org/2001/XMLSchema
- http://www.w3.org/1999/XMLSchema
- http://www.w3.org/2000/10/XMLSchema
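The deserialization rule for compound objects, in which each sub-element becomes a property named by the local part of the element name, can be modeled as follows. This is a sketch only: the input shape (an array of name/value pairs standing in for already-parsed sub-elements) and the function names are assumptions made for illustration, not the interpreter's internal representation.

```javascript
// Strips a namespace qualifier, e.g. "xyz:Product" becomes "Product".
function localName(qualifiedName) {
  var colon = qualifiedName.indexOf(":");
  return colon === -1 ? qualifiedName : qualifiedName.substring(colon + 1);
}

// Builds a JavaScript object from a hypothetical list of parsed sub-elements,
// each given as { name: qualified element name, value: converted value }.
function toJavaScriptObject(subElements) {
  var result = {};
  subElements.forEach(function (element) {
    result[localName(element.name)] = element.value;
  });
  return result;
}
```

Applied to the Order element shown earlier, this model would produce an object with the properties Product and Price.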
SOAP Headers

Occasionally, you may need to add information to the header sent with your SOAP requests. For example, if you use any of the BeVocal platform services in your VoiceXML application, you must add security information to the header. To add headers, each JavaScript proxy object supports the _addHeader method. This method allows you to set an additional SOAP header for all SOAP requests. Once you use the _addHeader method on a proxy object, all future requests sent using that object have the added header. If later in the same application you need to send a request to the same service without that header, you must create a different proxy object for that request.

SOAP Methods are JavaScript objects

In JavaScript, a method of an object is really just an object property that points to a function. This is true for the interpreter's SOAP proxy objects as well, allowing you to use them as "functors". For example:

    <script>
      var converter = bevocal.soap.locateService("temp_converter", 1.0);
      var conversion = converter.fahrenheitToCelsius;
      ...
      // Someone else told me which conversion to use
      var temperature = conversion(123);
    </script>

Error Handling For WSDL-Based Services

When you attempt to call a function on a SOAP proxy that has WSDL information available, the interpreter can detect many errors before making the actual SOAP call. This leads to earlier and better error detection and makes it easier to distinguish different types of errors.

Call to Missing Method

Remember that in this JavaScript binding, SOAP functions map to JavaScript objects. Consider this statement:

    result = service.method(arg1, arg2);

It first retrieves the method property of the object service, then calls it as a function with the arguments arg1 and arg2, assigning the return value to result.
When you refer to a nonexistent method of a service proxy that is backed by WSDL, such as:

    var service = bevocal.soap.serviceFromWSDL(wsdlURL, ...);
    var result = service.missingmethod(1, 2);

the service proxy returns the JavaScript constant undefined when asked for its missingmethod property and prints a warning to the log. The JavaScript interpreter then tries to treat the undefined value as a function, immediately causing a runtime error. The log will look like this:

    JavaScript warning on line 29 of 'soap.vxml: Cannot find operation: missingmethod - none defined
    ERROR error.semantic: JavaScript error on line 29: missingmethod is not a function.

The implementation returns undefined rather than throwing an exception as soon as the property is referenced. This is an advantage in certain cases, because it allows you to write code like the following:

    // Get the highest 1.x version of "myservice"
    var service = bevocal.soap.locateService("myservice", "1");
    if (service.method2 != undefined) {
      // Call the new and improved method
      service.method2(arg1, arg2);
    } else {
      // Call the old method
      service.method1(arg1);
    }

If the interpreter immediately threw a runtime error when you tried to reference a missing method, you could not make this check for an undefined method.

Missing or Extra Parameters

If client code attempts to call a SOAP method using the wrong number of parameters, the interpreter's SOAP client throws a bevocal.soap.SoapException exception with the cause property set to CALL_ERROR. The message property contains an error message explaining that there was an argument mismatch. For example:

    try {
      myService.callMethod("too", "many", "parameters");
    } catch (error if error.type == "bevocal.soap.SoapException") {
      if (error.cause == bevocal.soap.SoapException.CALL_ERROR) {
        // Probably passed the wrong number of arguments
      }
    }

Invalid Parameter Types

The current implementation does not validate parameter types against a method's WSDL definition. It simply serializes each parameter using the XML type that seems to be the best match. If the serialized type does not match the type specified for that parameter in the WSDL, the server signals an error, which is returned as described under Server-side Errors. It is also possible that the type-conversion engine will report an error if it is unable to serialize a complex JavaScript type. This is described in Type conversion.

Server-side Errors

Errors reported by a server are returned in a SOAP message with a <Fault> element. When the interpreter detects a fault message, it throws a bevocal.soap.SoapFault.

Error Handling for Non-WSDL-Based Services

When the interpreter does not have the WSDL available for a service proxy, it can't do much pre-flighting of calls. This means that only a few types of errors are detected by the interpreter itself:

- Type conversion error marshalling or unmarshalling an object (see Type conversion).
- Network communication error.

All other errors, including invalid method names, bad parameter lists, and so on, are reported by the SOAP server or service and thrown as a bevocal.soap.SoapFault, as described above.
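Because a missing method on a WSDL-backed proxy evaluates to undefined, the version check shown under Call to Missing Method can be generalized into a small helper. The sketch below is hypothetical: callFirstAvailable is not part of the bevocal.soap API, and it assumes that defined proxy methods report typeof "function" as ordinary JavaScript functions do.

```javascript
// Tries each candidate in order and calls the first method the service
// actually defines. Each candidate is { name: methodName, args: [...] }.
function callFirstAvailable(service, candidates) {
  for (var i = 0; i < candidates.length; i++) {
    var candidate = candidates[i];
    var method = service[candidate.name];
    if (typeof method === "function") {
      // Call through the service object so the method keeps its receiver.
      return method.apply(service, candidate.args);
    }
  }
  var names = candidates.map(function (c) { return c.name; });
  throw new Error("No candidate method is defined: " + names.join(", "));
}
```

A call such as callFirstAvailable(service, [{name: "method2", args: [arg1, arg2]}, {name: "method1", args: [arg1]}]) would prefer the newer method and fall back to the older one. Note that, per the description above, each lookup of a missing method on a real proxy would still print a warning to the log.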
PART 3
VoiceXML Reference
This part contains reference descriptions of the components of VoiceXML:

- Chapter 11, Tags
- Chapter 12, Properties
- Chapter 13, Variables
- Chapter 14, JavaScript Functions and Objects
11
Tags
This chapter provides detailed information about each VoiceXML tag. See:

- Tag Summary for an overview of the tags, grouped by function
- Tag Index for an alphabetical list of all tags

Each tag description includes the following information:

Syntax         Summary of how the tag is used.
Description    Description of attributes or other details.
Usage          Table of parent and children tags. Parent tags can contain this tag and children tags can be used within this tag.
See Also       Links to related information.
Examples       Short examples you can run as simple, standalone applications.

Where the BeVocal VoiceXML interpreter deviates from the VoiceXML 2.0 Specification, the difference is clearly marked in the following ways:

Not implemented           Functionality not currently available.
Extension                 Added functionality.
Experimental Extension    Added functionality that may be included in a later specification for VoiceXML. If the extension is standardized, the BeVocal VoiceXML implementation will change as necessary to match the VoiceXML standard.
Deprecated                Non-standard or superseded feature that was supported by an earlier version but has been replaced by a new feature.
VoiceXML 1.0 only         Tag is part of the VoiceXML 1.0 standard, but has been removed from VoiceXML 2.0.
Tags and attributes that were added to the specification in VoiceXML 2.0 are marked New in VoiceXML 2.0. VoiceXML Tag Summary lists any differences between the BeVocal VoiceXML implementation and the VoiceXML 2.0 standard.
Tag Summary
The following table classifies VoiceXML tags according to their purpose.

Purpose: Defining the Application
Tags: <vxml>, <meta>, <metadata>

Purpose: Dialogs
Tags: <form>, <menu>

Purpose: Input Items of a Form
Tags: <field>, <bevocal:enroll> (Extension), <bevocal:listen> (Extension), <record>, <bevocal:register> (Extension), <subdialog>, <transfer>, <bevocal:verify> (Extension)

Purpose: Control Items of a Form
Tags: <initial>, <block>

Purpose: Menus
Tags: <menu>, <choice>, <grammar>, <enumerate>

Purpose: Fields
Tags: <field>, <enumerate>, <grammar>, <option>, <filled>, <help>, <noinput>, <nomatch>

Purpose: Subdialogs
Tags: <subdialog>, <param>, <return>

Purpose: Controlling Dialog Transitions
Tags: <goto>, <submit>, <link>, <choice>

Purpose: Handling Events
Tags: <catch>, <error>, <help>, <noinput>, <nomatch>

Purpose: Controlling Synthesized Speech
Tags: <break>, <voice> (New in VoiceXML 2.0), <emphasis> (New in VoiceXML 2.0), <prosody> (New in VoiceXML 2.0), <phoneme> (New in VoiceXML 2.0), <say-as> (New in VoiceXML 2.0), <sub> (New in VoiceXML 2.0), <mark> (New in VoiceXML 2.0), <s> (New in VoiceXML 2.0), <sentence> (New in VoiceXML 2.0), <p> (New in VoiceXML 2.0), <paragraph> (New in VoiceXML 2.0), <div> (VoiceXML 1.0 only), <emp> (VoiceXML 1.0 only), <pros> (VoiceXML 1.0 only), <sayas> (VoiceXML 1.0 only)

Purpose: Specifying Grammars
Tags: <grammar>, <rule> (New in VoiceXML 2.0), <ruleref> (New in VoiceXML 2.0), <token> (New in VoiceXML 2.0), <one-of> (New in VoiceXML 2.0), <item> (New in VoiceXML 2.0), <tag> (New in VoiceXML 2.0), <example> (New in VoiceXML 2.0), <lexicon> (New in VoiceXML 2.0), <dtmf> (VoiceXML 1.0 only)

Purpose: Speaker Verification (Extension)
Tags: <bevocal:register> (Extension), <bevocal:verify> (Extension)

Purpose: Controlling the Interpreter
Tags: <property>

Purpose: Containers for Executable Content
Tags: <block>, <filled>, <if>, <bevocal:foreach> (Extension), <catch>, <error>, <help>, <noinput>, <nomatch>

Purpose: Declaring and Setting Variables (executable content)
Tags: <object>, <var>, <assign>, <clear>

Purpose: Procedural Logic (executable content)
Tags: <if>, <else>, <elseif>, <bevocal:foreach> (Extension)

Purpose: Scripting logic (executable content)
Tags: <script>

Purpose: Producing Audio Output (executable content)
Tags: <audio>, <prompt>, <reprompt>, <value>, <enumerate>

Purpose: Throwing Events (executable content)
Tags: <throw>, <rethrow> (Extension)

Purpose: Terminating the Session (executable content)
Tags: <disconnect>, <exit>

Purpose: Placing and Controlling Outbound Calls (executable content)
Tags: <bevocal:dial> (Extension), <bevocal:hold> (Extension), <bevocal:connect> (Extension), <bevocal:whisper> (Extension), <bevocal:disconnect> (Extension)

Purpose: Sending and Fetching Data (executable content)
Tags: <submit>, <send> (VoiceXML 1.0 only, Extension), <data> (Extension)

Purpose: Debugging
Tags: <log> (New in VoiceXML 2.0)
Tag Index
The following table lists the available tags, including tags for:

- VoiceXML elements
- Elements in the XML form of the W3C Speech Recognition Grammar Format, which are used to define grammars in that format
- Elements in the W3C Speech Synthesis Markup Language
- VoiceXML 1.0 only. Elements in the Java Speech Markup Language

Tag                    Description
<assign>               Assigns a value to a variable.
<audio>                Plays an audio clip to the user.
<bevocal:connect>      Extension. Reconnects the user with an outbound call that was placed on hold.
<bevocal:dial>         Extension. Initiates an outbound call, allowing the user to talk to a third party at another destination.
<bevocal:disconnect>   Extension. Disconnects an outbound call or the inbound call.
<bevocal:enroll>       Extension. Creates and modifies enrolled grammars.
<bevocal:foreach>      Extension. Iterates over the elements of an array.
<bevocal:hold>         Extension. Places an outbound call on hold.
<bevocal:listen>       Extension. Allows the application to suspend execution and listen to the user during an outbound call.
<bevocal:whisper>      Extension. Interrupts an outbound call, allowing the application to play audio output to the called third party while the user is on hold.
<block>                Contains (non-interactive) executable code.
<break>                Speech Synthesis Markup Language element that inserts a pause in audio output.
<catch>                Catches an event.
<choice>               Defines a menu item.
<clear>                Clears one or more form-item variables.
<data>                 Experimental Extension. Fetches arbitrary XML data from an HTTP server, or submits values to a server.
<disconnect>           Disconnects a telephone session.
<div>                  VoiceXML 1.0 only. Java Speech Markup Language element that classifies a region of text as a particular type.
<dtmf>                 VoiceXML 1.0 only. Specifies a touch-tone key grammar.
<else>                 Marks the beginning of an else clause within an <if> element.
<elseif>               Marks the beginning of an else-if clause within an <if> element.
<emp>                  VoiceXML 1.0 only. Java Speech Markup Language element that changes the emphasis of speech output.
<emphasis>             New in VoiceXML 2.0. Speech Synthesis Markup Language element that changes the emphasis of speech output.
<enumerate>            Generates audio output that enumerates the options in a field or the choices in a menu.
<error>                Catches an error event.
<example>              New in VoiceXML 2.0. XML grammar element with an example phrase that matches the containing grammar rule.
<exit>                 Exits a session.
<field>                Declares an input field in a form.
<filled>               Contains actions to be executed when fields are filled.
<form>                 Presents information and collects data.
<goto>                 Goes to another location in the same or different document.
<grammar>              Specifies a speech-recognition grammar.
<help>                 Catches a help event.
<if>                   Executes actions conditionally.
<initial>              Declares initial logic upon entry into a (mixed-initiative) form.
<item>                 New in VoiceXML 2.0. XML grammar input element that indicates optional or repeated user input.
<lexicon>              New in VoiceXML 2.0. XML grammar input element that indicates a source of pronunciation information.
<link>                 Specifies a transition common to all dialogs in the link's scope.
<log>                  New in VoiceXML 2.0. Writes debugging information to a BeVocal Café call log, which you can view on the Café web site.
<mark>                 New in VoiceXML 2.0. Speech Synthesis Markup Language element that places a marker into the output stream for asynchronous notification.
<menu>                 Allows user to choose among alternative destinations.
<meta>                 Defines a meta-data item as a name/value pair.
<metadata>             Currently this tag has no effect.
<noinput>              Catches a no-input event.
<nomatch>              Catches a no-match event.
<object>               Always throws an unsupported object exception.
<one-of>               New in VoiceXML 2.0. XML grammar input element that indicates alternative user inputs.
<option>               Specifies an option in a <field>.
<p>                    New in VoiceXML 2.0. Speech Synthesis Markup Language element that classifies a region of text as a paragraph.
<paragraph>            New in VoiceXML 2.0. Speech Synthesis Markup Language element that classifies a region of text as a paragraph.
<param>                Specifies a parameter in a <subdialog> element.
<phoneme>              New in VoiceXML 2.0. Speech Synthesis Markup Language element that provides a phonetic pronunciation for the contained text.
<prompt>               Queues TTS and audio output to the user.
<property>             Controls settings specific to the BeVocal VoiceXML implementation platform.
<pros>                 VoiceXML 1.0 only. Java Speech Markup Language element that changes the prosody of speech output.
<prosody>              New in VoiceXML 2.0. Speech Synthesis Markup Language element that changes the prosody of speech output.
<record>               Records an audio sample.
<bevocal:register>     Extension. Registers a voice print that can be used to verify caller identity.
<reprompt>             Plays a field prompt when a field is re-visited after an event.
<rethrow>              Extension. Causes the event currently being handled to be rethrown.
<return>               Returns from a subdialog.
<rule>                 New in VoiceXML 2.0. XML grammar element that defines a grammar rule.
<ruleref>              New in VoiceXML 2.0. XML grammar input element that references another rule.
<s>                    New in VoiceXML 2.0. Speech Synthesis Markup Language element that classifies a region of text as a sentence.
<say-as>               New in VoiceXML 2.0. Speech Synthesis Markup Language element that modifies how the enclosed word or phrase is spoken.
<sayas>                VoiceXML 1.0 only. Java Speech Markup Language element that modifies how a word or phrase is spoken.
<script>               Specifies a block of client-side scripting logic in JavaScript.
<send>                 VoiceXML 1.0 only; Experimental Extension.
Submits values to a web server without transitioning to a new VoiceXML document.
114
Tag Descriptions
Tag <sentence> <speak> <sub> <subdialog> <submit> <tag> <throw> <token> <transfer> <value> <var> <bevocal:verify> <voice> <vxml>
Description New in VoiceXML 2.0. Speech Synthesis Markup Language element that classifies a region of text as a sentence. New in VoiceXML 2.0. The <speak> element is the root element of a standalone SSML document which contains all other SSML elements. New in VoiceXML 2.0. Attribute provides substitute text to be spoken instead of the contained text. Invokes another dialog as a subdialog of the current one. Submits values to a document server. New in VoiceXML 2.0. XML grammar element that specifies how to interpret the user input. Throws an event. New in VoiceXML 2.0. XML grammar input element that specifies words to be spoken by the user. Transfers the users call to a third party at another destination. Inserts the value of a expression into audio output. Declares a variable. Extension. Verify that the speakers voice matches a stored voice print. New in VoiceXML 2.0. Speech Synthesis Markup Language element that requests a change in speaking voice. Contains the VoiceXML code of a document.
Tag Descriptions
The remainder of this chapter contains tag descriptions in alphabetical order.
<assign>
Assigns a value to a variable.

Syntax
<assign name="string" expr="js_expression" />

Description
Attribute    Description
name
    Name of the variable. This variable must have already been declared.
expr
    JavaScript expression that evaluates to the value assigned to this variable.

Tips:
In JavaScript, + means both string concatenation and addition. By default, values are treated as strings. If you want to add two numbers represented by string variables a and b in an <assign>, use:
<assign name="x" expr="Number(a) + Number(b)"/>
Multiplication, division, and subtraction are not ambiguous in this way.
If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
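The difference between concatenation and addition can be checked in a few lines of plain JavaScript. This is a minimal sketch that runs outside the interpreter; the variable names a and b follow the tip above, and the values "2" and "3" are assumed purely for illustration.

```javascript
// Values collected from a caller typically arrive as strings.
var a = "2";
var b = "3";

// + applied to two strings concatenates them.
var concatenated = a + b;            // "23"

// Converting each operand first yields numeric addition.
var sum = Number(a) + Number(b);     // 5

// The other arithmetic operators convert strings to numbers implicitly.
var product = a * b;                 // 6
var difference = b - a;              // 1
```

Only + needs the explicit Number() conversion, which is why the tip singles it out.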
Usage
Parents: <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
Children: None

Exception
Exception           Description
error.badfetch      If the name or expr attribute is missing.

See Also
VoiceXML 2.0 Specification: <assign>
JavaScript Quick Reference
Related tag: <var>
Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="foo">
    <var name="a"/>
    <var name="b"/>
    <var name="result"/>
    <block>
      <!-- String literals need apostrophes; a bare word would be read as a variable name. -->
      <assign name="a" expr="'Pine'"/>
      <assign name="b" expr="'Apple'"/>
      <assign name="result" expr="a + b"/>
    </block>
    <block>
      <prompt>
        This is the test for the assign tag.
        If you put <value expr="a"/> and <value expr="b"/> together,
        it would make <value expr="result"/>
      </prompt>
    </block>
  </form>
</vxml>
<audio>
Plays an audio clip to the user.

Syntax
<audio
  src="URI"
  expr="js_expression"
  fetchhint="prefetch"|"safe"
  fetchtimeout="time_interval"
  maxage="time_interval"
  maxstale="time_interval"
  bevocal:ssml="URI"
  bevocal:ssmlexpr="js_expression" >
  Optional Content
</audio>

Description
The audio clip is played in its entirety unless interrupted. The Audio Library contains stored audio files with commonly used spoken prompts and other sounds that you can use in your applications.
If the bargein property is true, a user utterance can interrupt playing of the audio clip. If the bargeintype property is recognition, only an utterance that matches an active grammar can interrupt the audio clip. In the latter case, the bevocal.hotwordmin and bevocal.hotwordmax properties specify the minimum and maximum time duration, respectively, of the interrupting utterance.

Attribute    Description
src
    The URI of the audio file. Optional (as alternative to expr, bevocal:ssml, and bevocal:ssmlexpr; you must specify one of these 4 attributes).
    If not specified or invalid (that is, the interpreter was unable to perform the fetch from the specified URI), any content of the <audio> element will be played instead. The content can include text or valid child tags.
expr
    New in VoiceXML 2.0. JavaScript expression that evaluates to either a string or an array of strings, or can be the recorded audio from the input variable of a <record> item. If it evaluates to a string, the string is interpreted as a URI and the audio file at that location is fetched and played. Optional (as alternative to src, bevocal:ssml, and bevocal:ssmlexpr; you must specify one of these 4 attributes).
    If the expression evaluates to JavaScript undefined, then the element including its alternate content is ignored. For compatibility with previous releases, the content is also ignored if the expression evaluates to null.
    Extension. If it is an array, each element is treated as an audio file URI, each of which is fetched and played, in turn.
fetchhint
    Specifies whether the interpreter can attempt to optimize dialog interpretation by prefetching the resource. See Prefetching Resources on page 40. Optional.
    Note: The interpreter can prefetch an audio file specified by the src attribute, but not by the expr attribute.
fetchtimeout
    Specifies the interval to wait for the resource to be returned before throwing an error.badfetch event. See Handling Fetching Delays on page 42. Optional.
maxage
    New in VoiceXML 2.0. Specifies the maximum acceptable age, in seconds, of the cached resource. See Maximum Age on page 44. Optional.
maxstale
    New in VoiceXML 2.0. Specifies the maximum acceptable time, in seconds, during which an expired cached resource can still be used. See Maximum Stale Time on page 44. Optional.
bevocal:ssml
    Extension. A URI which refers to an SSML document. This SSML document should comply with the W3C SSML specification but may have certain extensions. See Chapter 9, Dynamic SSML for details. Optional (as alternative to src, expr, and bevocal:ssmlexpr; you must specify one of these 4 attributes).
bevocal:ssmlexpr
    Extension. A JavaScript expression which resolves to the URI expected by the bevocal:ssml attribute. Optional (as alternative to src, expr, and bevocal:ssml; you must specify one of these 4 attributes).
(VoiceXML 1.0 only) The following attributes can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.

Attribute    Description
caching
    VoiceXML 1.0 only. Specifies the caching policy for the resource being fetched. See Caching on page 45. Optional. Used in place of the VoiceXML 2.0 attributes maxage and maxstale.

For playing prompts, the BeVocal interpreter supports popular formats including Wave (.wav), Sun audio (.au) and MP3. Because we use JMF technology, you can refer to the following reference for a complete list of audio formats supported: http://java.sun.com/products/java-media/jmf/2.1.1/formats.html.

Note: If the specified audio file is an unsupported type, any alternative audio content of the <audio> element (text, prompts, and so on) is played instead.

Tips:
In production applications, you should consider using the Wave format for your audio files. The reasons for this recommendation are discussed in the Frequently Asked Questions, specifically in "How can I improve the sound quality in my audio files?" and "Echo cancellation is not working, and my prompts barge in on themselves!".
You can use <audio> within a <prompt>. If you do, it will inherit the attributes of the <prompt> element, such as bargein.
If you name your audio files consistently, you can use the expr attribute to simplify the way you construct audio file names in your VoiceXML. For example:
<prompt>
  <audio expr="'resources/prompts/hello' + sign +'.wav'"/>
</prompt>
<prompt>
  <audio expr="'resources/prompts/' + sign +'.wav'"/>
</prompt>
If you use the expr attribute in place of the src attribute, the interpreter cannot prefetch the audio file, which may cause a minor performance degradation. Once a given file is in the interpreter's cache, however, this difference is typically not noticeable.
Using JavaScript functions makes it easy to update the location of your audio files. For an expanded example, see Factorial on the VoiceXML Samples page of the BeVocal Café; as a simple example:
function female1(a) {
  return("audio/female1/en_us/" + a);
}
function common(b) {
  return(female1("common/" + b + ".wav"));
}
function number(b) {
  return(female1("number/" + b + ".wav"));
}
When calling JavaScript functions from an attribute value, use apostrophes in place of double quotes:
<audio expr="common('bevocal_chimes')"/>
If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
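The escaping rule can be applied mechanically. The helper below is a hypothetical sketch, not part of BeVocal VoiceXML; it assumes you generate VoiceXML from your own JavaScript tooling and need to place an expression inside an attribute such as expr or cond. The name escapeForAttribute and the sample expression are illustrative assumptions.

```javascript
// Hypothetical helper: escapes a JavaScript expression so that it can be
// placed inside an XML attribute value such as expr="..." or cond="...".
function escapeForAttribute(jsExpression) {
  return jsExpression
    .replace(/&/g, "&amp;")  // must run first so the escapes below are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Sample expression using all three special characters.
var cond = "count < 10 && total > 0";
var escaped = escapeForAttribute(cond);
// escaped is now: count &lt; 10 &amp;&amp; total &gt; 0
```

Double quotes are not handled here, on the assumption that string literals inside expressions use apostrophes.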
Usage
Parents: <audio> <bevocal:foreach> <bevocal:listen> <bevocal:register> <bevocal:verify> <bevocal:whisper> <block> <catch> <choice> <emphasis> <enumerate> <error> <field> <filled> <help> <if> <initial> <menu> <noinput> <nomatch> <prompt> <prosody> <record> <s> <sentence> <subdialog> <transfer> <voice>
Children: <audio> <break> <emphasis> <enumerate> <mark> <p> <paragraph> <phoneme> <prosody> <s> <say-as> <sentence> <value> <voice>

See Also
VoiceXML 2.0 Specification: <audio>

Examples
Example 1: using src
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="foo">
    <block>
      <audio>
        Welcome to BeVocal Cafe, the number One place to build and
        deploy your voice applications.
      </audio>
      <audio maxage="0" src="bevocal_cafe.wav"/>
      BeVocal Cafe.
    </block>
  </form>
</vxml>

Example 2: using expr
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <script>
    <![CDATA[
      var base = "http://cafe.bevocal.com/libraries/audio/female1/en_us/number/";
      function one() {
        return base + "6000-e.wav";
      }
      function many() {
        var result = new Array(4);
        result[0] = base + "6000-b.wav";
        result[1] = base + "300.wav";
        result[2] = base + "37_and.wav";
        result[3] = base + "1-32.wav";
        return result;
      }
    ]]>
  </script>
  <form>
    <block>
      <prompt>
        Playing result of one
        <audio expr="one()">one</audio>
        <break/>
        Playing result of many
        <audio expr="many()">many</audio>
      </prompt>
    </block>
  </form>
</vxml>
<bevocal:connect>
Extension. Reconnects the user with an outbound call that was placed on hold.

Syntax
<bevocal:connect call="js_expression" />

Description
To use this tag, the containing <vxml> tag must declare the XML namespace bevocal by including the following attribute:
xmlns:bevocal="http://www.bevocal.com/"
A <bevocal:connect> tag reconnects an outbound call that was put on hold by a <bevocal:hold> tag or by the onhold attribute of the <bevocal:dial> tag that initiated the call. This tag does nothing if the specified call is not on hold.
After the call is reconnected, the user and called third party talk to each other. The application continues to listen to the user; if a user utterance matches an active grammar, the application responds to the recognized utterance as usual. The application ignores all speech and DTMF signals from the called third party.

Note: This and other call-control tags constitute a BeVocal VoiceXML extension. Committees are currently working to standardize call-control features for VoiceXML, and their current approach is different from the BeVocal VoiceXML implementation. Because the approval of any call-control standards will be quite some time in coming, however, BeVocal VoiceXML contains this extension to allow developers to start taking advantage of call-control features. BeVocal will continue to monitor the development within these committees. If the features in the BeVocal VoiceXML extension become a part of VoiceXML or a separate call-control standard, BeVocal VoiceXML will implement that standard. We will then deprecate the current extension and provide developers with information on how to convert their applications to the new standard.

Attribute    Description
call
    JavaScript expression whose value is a JavaScript object that was initialized by <bevocal:dial>; specifies the call to be reconnected.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
Usage
Parents: <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
Children: None

The following events may be thrown by the execution of a <bevocal:connect> tag:

Event    Description
connection.far_end.disconnect
    The specified outbound call has already been disconnected.
error.semantic
    The call attribute is not a valid JavaScript expression or the expression does not evaluate to an object representing an outbound call.

See Also
Chapter 6, Controlling Outbound Calls
Related tags: <bevocal:dial>, <bevocal:disconnect>, <bevocal:hold>
<bevocal:dial>
Extension. Initiates an outbound call, allowing the user to talk to a third party at another destination.

Syntax
<bevocal:dial
  name="string"
  dest="URI"
  destexpr="js_expression"
  silent="true"|"false"
  ani="digit_string"
  aniexpr="js_expression"
  connecttimeout="time_interval"
  maxtime="time_interval"
  bevocal:maxtimeexpr="js_expression"
  bevocal:type="blind"|"bridge"|"supervised"
  type="blind"|"bridge"|"consultation"
  onhold="true"|"false"
  transferaudio="URI"
/>

Description
To use this tag, the containing <vxml> tag must declare the XML namespace bevocal by including the following attribute:
xmlns:bevocal="http://www.bevocal.com/"
The <bevocal:dial> tag may initiate a call in one of three ways:

Transfer Method    Description
Bridge transfer
    The current session with the interpreter resumes after the call with the third party completes.
Blind transfer
    The current session terminates as soon as it starts the transfer, regardless of the success of the transfer.
Supervised or consultation transfer
    The current session terminates as soon as the transfer successfully connects the outbound call. If the transfer is unsuccessful, control returns to the application.
Note: To use blind or supervised transfers, contact BeVocal Customer Support.

Only one outbound call can be in progress at a given time. The call placed by one <transfer> or <bevocal:dial> tag must be terminated before another <transfer> or <bevocal:dial> tag can be executed.
Café customers can use a <transfer> tag to place local and long-distance calls only; international calls are not allowed. (Hosting customers are allowed to make international calls.)
During execution of the <transfer> element, the user and the called third party talk to each other. The application is quiet. No universal grammars or grammars in higher scopes are active.
During a bridge transfer, if the <transfer> element includes any child grammars, the application listens to the user. If a user utterance matches a child grammar, the transfer is terminated and the input variable is set to the status near_end_disconnect. The bevocal.hotwordmin and bevocal.hotwordmax properties specify the minimum and maximum time duration, respectively, for an utterance that can terminate the call.

Note: This and other call-control tags constitute a BeVocal VoiceXML extension. Committees are currently working to standardize call-control features for VoiceXML, and their current approach is different from the BeVocal VoiceXML implementation. Because the approval of any call-control standards will be quite some time in coming, however, BeVocal VoiceXML contains this extension to allow developers to start taking advantage of call-control features. BeVocal will continue to monitor the development within these committees. If the features in the BeVocal VoiceXML extension become a part of VoiceXML or a separate call-control standard, BeVocal VoiceXML will implement that standard. We will then deprecate the current extension and provide developers with information on how to convert their applications to the new standard.

Attribute    Description
name
    Name of a variable to be set to a JavaScript object referring to the call. Optional. If no variable exists with the specified name, a new variable is defined in the current scope; for example, in the anonymous scope of the containing <block>.
dest
    URI of the destination (for example, phone, IP telephony address). Optional (as alternative to destexpr). You can specify the URI using any of the following formats:
    phone://8005551212
    800-555-1212
    phone://800-555-1212
    tel:800-555-1212
    tel:800-555-1212;postd=1234
        This format allows you to specify an extension as post-dial digits (postd). After the call is answered, the specified digits (1234 in this example) are sent to the called third party as DTMF.
    sip:+16506414924@fwd.pulver.com
        SIP URIs are specified as sip:<destination number>@<domain value>:<port>, where the default port is 5060. SIP URIs are valid only for VoIP calls.
    Note: A leading 1 on the phone number is optional and will be ignored.
    Note: You must specify one of dest or destexpr, but not both.
destexpr
    JavaScript expression that evaluates to the URI of the destination. Optional (as alternative to dest).
    Note: You must specify one of dest or destexpr, but not both.
silent
    Specifies whether the call-progress sounds are muted. Optional (default is false).
    true: The call-progress sounds and all other sounds from the outbound call are muted until the post-dial digits have finished playing.
    false: The user hears the call-progress sounds and a small snippet of the call after it is connected and before the post-dial digits are played.
ani
    A sequence of digits to be passed as the ANI or Caller ID for the outbound call. Optional (default is the value of the session.telephone.ani variable). The ANI may contain up to 32 digits in either of the following forms:
    A formatted phone number, with or without area code, for example, (201)555-1212
    A string of ASCII digits, for example, 012345647830193856
    ANI spoofing is disabled by default on Café for outbound calls placed with call-control tags; a warning is written to the VXML log. It is enabled by default on hosting servers. Contact BeVocal to enable ANI spoofing on Café.
    Note: You may specify either this attribute or aniexpr, but not both.
aniexpr
    JavaScript expression that evaluates to the sequence of digits to be passed as the ANI or Caller ID for the outbound call. Optional (default is the session.telephone.ani variable).
    Note: You may specify either this attribute or ani, but not both.
connecttimeout
    Time to wait for the outbound call to connect before throwing a connection.far_end.noanswer event. Optional (default is 20 seconds). Express the time interval as an unsigned number followed by either s for time in seconds or ms for time in milliseconds (the default).
maxtime
    How long the call is allowed to be in progress. Optional.
    The default for Café customers is 60 seconds. This is an absolute maximum. If the time in this attribute is longer than 60 seconds, the interpreter uses 60 seconds as the limit.
    The default for hosting customers is 0, signifying no limit.
    Express the time interval as an unsigned number followed by either s for time in seconds or ms for time in milliseconds (the default). If the call exceeds the maximum duration, the outbound call is disconnected and a connection.far_end.disconnect.timeout event is thrown.
bevocal:maxtimeexpr
    Extension. A JavaScript expression which evaluates to the maxtime value. Optional.
    Again, the default for Café customers is 60 seconds. This is an absolute maximum. If the time in this attribute is longer than 60 seconds, the interpreter uses 60 seconds as the limit.
    The default for hosting customers is 0, signifying no limit.
bevocal:type
    Extension. Determines how much control the platform retains over a transferred call. Optional. The default is bridge.
    If the value is blind, as soon as the transfer starts, the current VoiceXML session ends and relinquishes control to the outbound call. Even if the transfer does not connect the call, the session is over.
    If the value is bridge, then the current VoiceXML session remains active. At the end of the outbound call, the session returns control to the application to resume processing.
    If the value is supervised, an intermediate path occurs. The current VoiceXML session monitors the progress of the outbound call until it is connected. If the call cannot be connected for some reason such as no answer or line busy, the session remains active and returns control to the application. If the call is connected, then the session ends, just as for a blind transfer.
    You can specify either bevocal:type or type, but not both. If you specify both, a parse error is thrown.
    Note: To use blind or supervised transfers, please contact BeVocal Customer Support.
type
    Determines how much control the platform retains over a transferred call. Optional. The default is bridge.
    If the value is blind, as soon as the transfer starts, the current VoiceXML session ends and relinquishes control to the outbound call. Even if the transfer does not connect the call, the session is over.
    If the value is bridge, then the current VoiceXML session remains active. At the end of the outbound call, the session returns control to the application to resume processing.
    If the value is consultation, an intermediate path occurs. The current VoiceXML session monitors the progress of the outbound call until it is connected. If the call cannot be connected for some reason such as no answer or line busy, the session remains active and returns control to the application. If the call is connected, then the session ends, just as for a blind transfer.
    You can specify either bevocal:type or type, but not both. If you specify both, a parse error is thrown.
    Note: To use blind or consultation transfers, please contact BeVocal Customer Support.
onhold
    Specifies whether the outbound call is put on hold immediately after it is answered and after any post-dial digits have been sent. Optional (default is false).
    true: Put the called third party on hold until reconnected by a <bevocal:connect> tag.
    false: Leave the call active with the user connected to the called third party.
    Setting this attribute to true is equivalent to executing a <bevocal:hold> tag immediately after executing the <bevocal:dial> tag.
transferaudio
    Audio (specified by the URI) to play while the call connection attempt is in progress.
    Note: This attribute is valid only for VoIP calls.
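The time-interval format used by connecttimeout and maxtime, and the 60-second Café cap on maxtime, can be sketched in JavaScript. The names parseTimeInterval, effectiveMaxtimeMs, CAFE_LIMIT_MS, and the isCafeCustomer flag are hypothetical; this only illustrates the rules described for these attributes and is not the interpreter's actual code. Capping a Café request of 0 at 60 seconds is an assumption, since the guide does not cover that case.

```javascript
// Hypothetical helper: converts a time interval such as "20s" or "1500ms"
// to milliseconds. A bare number is treated as milliseconds, which the
// guide describes as the default unit.
function parseTimeInterval(text) {
  var match = /^(\d+(?:\.\d+)?)(ms|s)?$/.exec(text.trim());
  if (!match) {
    throw new Error("Not a valid time interval: " + text);
  }
  var amount = Number(match[1]);
  return match[2] === "s" ? amount * 1000 : amount;
}

var CAFE_LIMIT_MS = 60 * 1000;

// Hypothetical helper: returns the limit in milliseconds that would apply
// to a call, where 0 means no limit.
function effectiveMaxtimeMs(maxtime, isCafeCustomer) {
  // Attribute omitted: 60 seconds for Cafe, 0 (no limit) for hosting.
  if (maxtime === undefined) {
    return isCafeCustomer ? CAFE_LIMIT_MS : 0;
  }
  var requested = parseTimeInterval(maxtime);
  // Assumption: on Cafe, a request for "no limit" is capped like any
  // other request longer than 60 seconds.
  if (isCafeCustomer && (requested === 0 || requested > CAFE_LIMIT_MS)) {
    return CAFE_LIMIT_MS;
  }
  return requested;
}
```

For example, effectiveMaxtimeMs("90s", true) yields 60000, while the same request with isCafeCustomer set to false yields 90000.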
The following events may be thrown by the execution of a <bevocal:dial> tag:

Event    Description
connection.far_end.busy
    The call was refused by the endpoint (the number was busy).
connection.far_end.noanswer
    There was no answer within the time allowed for making the connection.
connection.far_end.network_busy
    The call was refused by an intermediate network.
connection.far_end.network_disconnect
    The call completed and was terminated by the network.
error.connection.baddestination
    The destination URI specified by dest or destexpr is not valid. New in VoiceXML 2.0.
error.telephone.baddestination
    The destination URI specified by dest or destexpr is not valid. VoiceXML 1.0 only.
error.connection.noauthorization
    A Café customer attempted to make an international call. (The outbound call is not placed.) New in VoiceXML 2.0.
error.telephone.noauthorization
    A Café customer attempted to make an international call. (The outbound call is not placed.) VoiceXML 1.0 only.
error.connection.noresource
    Another outbound call is already in progress. (A new outbound call is not placed.) New in VoiceXML 2.0.
error.telephone.noresource
    Another outbound call is already in progress. (A new outbound call is not placed.) VoiceXML 1.0 only.
telephone.disconnect.hangup
    The user hung up while the connection was being made; for example, before the outbound call was answered. VoiceXML 1.0 only.
connection.disconnect.hangup
    The user hung up while the connection was being made; for example, before the outbound call was answered. New in VoiceXML 2.0.
Usage
Parents: <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
Children: None

See Also
Chapter 6, Controlling Outbound Calls
Related tags: <bevocal:connect>, <bevocal:disconnect>, <bevocal:hold>, <bevocal:listen>, <bevocal:whisper>, <transfer>
Examples
<?xml version="1.0"?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
      xmlns:bevocal="http://www.bevocal.com/">
  <form>
    <var name="myCall"/>
    <block>
      Placing an outbound call...
      <bevocal:dial name="myCall" dest="tel:4085551212"/>
      <bevocal:whisper call="myCall">
        Please hold for a call from the President
      </bevocal:whisper>
    </block>
    <bevocal:listen name="hotword">
      <grammar type="application/x-nuance-gsl">
        <![CDATA[
          [
            (good bye) { return ("exit") }
            (one) { return ("1") }
            (two) { return ("2") }
            (three) { return ("3") }
          ]
        ]]>
      </grammar>
      <filled>
        <if cond="hotword == 'exit'">
          Good bye
          <bevocal:disconnect call="myCall"/>
        <else/>
          <bevocal:whisper call="myCall">
            <audio expr="'dtmf_' + hotword + '_100.wav'">
              <value expr="hotword"/></audio>
          </bevocal:whisper>
          <clear namelist="hotword"/>
        </if>
      </filled>
      <catch event="connection.far_end.disconnect.timeout">
        I'm sorry. The outbound call has exceeded the maximum allowed time.
        <exit/>
      </catch>
    </bevocal:listen>
    <catch event="connection.far_end.disconnect">
    </catch>
    <catch event="connection.disconnect.hangup">
      <bevocal:disconnect call="myCall"/>
    </catch>
  </form>
</vxml>
<bevocal:disconnect>
Extension. Disconnects an outbound call or the inbound call.

Syntax
<bevocal:disconnect call="js_expression" />

Description
To use this tag, the containing <vxml> tag must declare the XML namespace bevocal by including the following attribute:
xmlns:bevocal="http://www.bevocal.com/"
This extended version of <disconnect> can disconnect either an outbound call initiated by a <bevocal:dial> tag, or the user's inbound call that started the current session:
- If a call attribute is included, it specifies the outbound call to be disconnected. The inbound call remains connected, allowing the user and the application to continue the session.
- If no call attribute is included, this tag is equivalent to <disconnect>; it performs the following steps:
  - Disconnects the inbound call. If an outbound call is active or on hold, disconnects that call also.
  - Throws a hang-up event (connection.disconnect.hangup in VoiceXML 2.0; telephone.disconnect.hangup in VoiceXML 1.0). If an event handler catches the event, it can perform one last <submit> to notify the server that the call has ended. Because the call is no longer connected, any VoiceXML document returned from the server is ignored.
  - The interpreter exits following execution of any event handler (or immediately if no handler catches the hang-up event).
If the call is no longer active, <bevocal:disconnect> does nothing.

Note: This and other call-control tags constitute a BeVocal VoiceXML extension. Committees are currently working to standardize call-control features for VoiceXML, and their current approach is different from the BeVocal VoiceXML implementation. Because the approval of any call-control standards will be quite some time in coming, however, BeVocal VoiceXML contains this extension to allow developers to start taking advantage of call-control features. BeVocal will continue to monitor the development within these committees. If the features in the BeVocal VoiceXML extension become a part of VoiceXML or a separate call-control standard, BeVocal VoiceXML will implement that standard. We will then deprecate the current extension and provide developers with information on how to convert their applications to the new standard.

Attribute    Description
call
    JavaScript expression whose value is a JavaScript object that was initialized by <bevocal:dial>; specifies the call to be disconnected. Optional; if omitted, the inbound call is disconnected.
The execution of a <bevocal:disconnect> tag throws an error.semantic event if the call attribute is not a valid JavaScript expression or the expression does not evaluate to an object representing an outbound call.

See Also
Chapter 6, Controlling Outbound Calls
Related tags: <bevocal:dial>, <bevocal:connect>, <disconnect>
<bevocal:enroll>
Extension. Creates and modifies enrolled grammars.

Syntax
<bevocal:enroll
  name="string"
  expr="js_expression"
  cond="js_expression"
  grammarname="string"
  speakeridexpr="js_expression"
  phraseidexpr="js_expression"
  minconsistencies="integer"
  maxtries="integer"
  type="MIME_type"
/>
Description
To use this tag, the containing <vxml> tag must declare the XML namespace bevocal by including the following attribute:
xmlns:bevocal="http://www.bevocal.com/"

Attribute    Description
name
    Name of the input variable that will hold the enrollment result. The variable name may not be a JavaScript reserved keyword. The input variable has dialog (form) scope; its name must be unique among all VoiceXML and JavaScript variables within the form's scope.
expr
    JavaScript expression that assigns the initial value of the input variable for this enrollment. Optional (default is undefined). If you set the input variable to a value other than undefined, you'll need to clear it before the enrollment can execute.
cond
    JavaScript boolean expression that also must evaluate to true for the enrollment to execute. Optional (default is true). If not specified, the value of the input variable alone determines whether or not the field can execute.
grammarname
    The name of the enrollment grammar to which the new phrase will be added. If the grammar does not exist, it will be created.
speakerid
    Deprecated; use speakeridexpr instead.
speakeridexpr
    JavaScript expression that evaluates to the ID of the current speaker. Because enrolled grammars are speaker-dependent, recognition against the grammar only works with the same speaker (and same value for speakeridexpr).
phraseidexpr
    A unique ID for the phrase being enrolled in the grammar. When the grammar is matched during speech recognition, the phrase ID for the enrolled phrase that was recognized is returned. For example, in an address book application, each name should be enrolled with a different phrase ID. The associated telephone numbers should be stored in a database that maps from phrase ID to telephone number. When a name from the grammar is recognized, the phrase ID can be used to look up the telephone number.
minconsistencies
    The minimum number of consistent utterances that must be provided by the user for the enrollment to succeed. minconsistencies must be less than or equal to maxtries or a parse-time error.badfetch will be thrown. The default and minimum value for minconsistencies is 2.
maxtries
    The maximum number of enrollment attempts the interpreter will make. If it has not collected minconsistencies consistent utterances in maxtries attempts, an error.enrollment.max_tries event will be thrown. The developer should catch the event and may decide to clear the enroll item and force the FIA to visit it again. maxtries must be greater than or equal to minconsistencies or a parse-time error.badfetch will be thrown. The default value for maxtries is 5.
134
<bevocal:enroll>
Attribute type
Description The MIME type for the recorded utterance. Allowed types are: audio/wav (RIFF format) audio/basic (m-law encoded) The default type is audio/wav. Deprecated. Use the bevocal.security.key property instead.
keyexpr
Properties of the Input Variable
Once enrollment succeeds, the input variable is filled with the audio from one of the user's consistent utterances. Applications can send the audio to their back-end server using <submit> or <data>.

Properties of the Shadow Variable
Corresponding to the <bevocal:enroll> input variable name is a shadow variable name$. After the input variable is filled, some additional information is available in the following properties of this shadow variable:

Property  Description
clash  The number of clashes that occurred during enrollment.
clashedPhraseIds  The existing phrase IDs that clashed with the enroll utterance.
enrollAudio  The audio data from one of the consistent utterances. This is the same audio data that is stored in the input variable.
numConsistentUtterances  The number of consistent utterances collected for this enrollment.
For an enrollment item whose name is name, you access the property propName of the shadow variable with the syntax:
name$.propName
For example, you access the clash property for an enrollment item named en1 as:
en1$.clash

Usage Model
The <bevocal:enroll> tag must collect several utterances until it has a consistent statistical model of the phrase the user is trying to enroll. The first time a <bevocal:enroll> item is visited, it collects one utterance and then returns control to the FIA. Since the minimum value of minconsistencies is 2, the FIA's next iteration selects the same <bevocal:enroll> item, which then collects a second utterance. This behavior continues until a consistent enrollment is achieved, maxtries is reached, or an error occurs.
When an enroll utterance clashes with an existing phrase in the enrolled grammar, an error.enrollment.clash event is thrown. You can use the shadow variable properties name$.clash and name$.clashedPhraseIds to find out how many clashes occurred and which phrase IDs the enroll utterance clashed with.

Security Considerations
The bevocal.security.key property controls access to enrollment grammars. In this situation, a key can be thought of as a namespace that qualifies the grammarname attribute of <bevocal:enroll>. Applications using one security key cannot access enrollment grammars created by an application using a second key, because their grammars live in separate namespaces. When you
develop applications for one of BeVocal's commercial hosting services, such as Enterprise Hosting, you will need a security key in order to use enrollment. When you develop on Café, you can use enrollment without a key; however, there are limitations. First, there will be an implied key derived from your Café account number. This means that even if you use the same enrollment grammar name from two different Café accounts, you will not be able to access the same enrolled phrases. Second, attempting to enroll more than ten phrases will cause an error.noauthorization event to be thrown.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
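As a minimal sketch of the tip above, the fragment below escapes a less-than comparison and a logical AND inside a cond attribute. The variable names tries and confirmed are hypothetical and serve only as illustration.

```xml
<!-- Hypothetical variables: tries and confirmed are illustrative only. -->
<!-- The JavaScript expression is: tries < 3 && !confirmed -->
<if cond="tries &lt; 3 &amp;&amp; !confirmed">
  <prompt>Let's try that again.</prompt>
</if>
```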
Usage
Parents  <form>
Children  <audio> <catch> <enumerate> <error> <filled> <grammar> <help> <link> <noinput> <nomatch> <option> <prompt> <property> <value>

Exceptions
Exception  Description
error.enrollment.retries  The system collected maxtries utterances and still could not achieve a consistent model of the phrase being enrolled.
error.enrollment.clash  The utterances that were collected clash with those for another phrase already enrolled in this model.
error.noauthorization  The maximum number of phrases has been enrolled into the grammar. For a Café developer, the enrolled grammar is limited to 10 phrases.
noinput  No speech was heard before the time specified by the timeout attribute on the last queued prompt (or by the timeout property) elapsed.

See Also
Chapter 9, Voice Enrollment Grammars, in the Grammar Reference, especially for the description of how to refer to an enrolled grammar.
The JavaScript function bevocal.enroll.removeEnrolledPhrase
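The maxtries description suggests catching the failure event and clearing the enroll item so that the FIA visits it again. The fragment below is a minimal sketch of that idea, not a definitive implementation. It assumes an enroll item named en1 in the same form, and it lists both event names that this reference uses for the failure (error.enrollment.max_tries and error.enrollment.retries), since the attribute description and the Exceptions table differ.

```xml
<!-- Sketch only: assumes an enroll item named en1 in the same form. -->
<catch event="error.enrollment.max_tries error.enrollment.retries">
  <prompt>I could not get a consistent sample. Let's start over.</prompt>
  <!-- Clearing the input variable makes the FIA visit en1 again. -->
  <clear namelist="en1"/>
</catch>

<!-- On the second failure, give up instead of retrying indefinitely. -->
<catch event="error.enrollment.max_tries error.enrollment.retries" count="2">
  <prompt>Sorry, enrollment did not succeed.</prompt>
  <exit/>
</catch>
```

The count attribute keeps the retry bounded: the handler with the highest count that does not exceed the number of occurrences is the one chosen.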
Examples
A form that adds items to an enrolled grammar.

<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
      xmlns:bevocal="http://www.bevocal.com/">
  <form id="enroll_names">
    <block>
      Welcome to the address book demo.
      Let's add some names to the address book.
    </block>

    <catch event="error.enrollment.clash">
      Oops! There was a clash for the enrollment sample.
      <exit/>
    </catch>

    <catch event="error.enrollment.max_tries">
      Maximum tries reached. Please try again.
      <exit/>
    </catch>

    <catch event="error.noauthorization">
      Maximum phrases enrolled.
      <exit/>
    </catch>

    <catch event="noinput">
      <prompt> In the noinput handler </prompt>
      <reprompt/>
    </catch>

    <!-- Prompts for a phrase to be enrolled. Executes this item at -->
    <!-- least twice to get 2 consistent samples for the phrase;    -->
    <!-- that value is controlled by minconsistencies.              -->
    <!-- The grammarname and speakeridexpr uniquely identify an     -->
    <!-- enrollment grammar. The phraseidexpr uniquely identifies   -->
    <!-- a phrase in the enrolled grammar and is returned when      -->
    <!-- recognized against the enrollment grammar.                 -->
    <bevocal:enroll name="en1" minconsistencies="2" maxtries="4"
        grammarname="ADDRESSBOOK" speakeridexpr="'speaker10'"
        phraseidexpr="'tom'" type="audio/wav">
      <prompt count="1"> Say a name </prompt>
      <prompt count="2"> Say the name again. </prompt>
      <prompt count="3"> Please say the name again. </prompt>
      <filled>
        The enrolled phrase is <value expr="en1"/>
      </filled>
    </bevocal:enroll>

    <bevocal:enroll name="en2" minconsistencies="2" maxtries="4"
        grammarname="ADDRESSBOOK" speakeridexpr="'speaker10'"
        phraseidexpr="'jackson'" type="audio/wav">
      <prompt count="1"> Say a name </prompt>
      <prompt count="2"> Say the name again. </prompt>
      <prompt count="3"> Please say the name again. </prompt>
      <filled>
        The enrolled phrase is <value expr="en2$.enrollAudio"/>
      </filled>
    </bevocal:enroll>
  </form>
</vxml>
And a form that uses that enrolled grammar.

<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml-bevocal.dtd">
<vxml version="2.0" xmlns:bevocal="http://www.bevocal.com/">
  <link next="addressbook.vxml">
    <grammar>
      #ABNF 1.0;
      root $abook;
      $abook = address book;
    </grammar>
  </link>
  <form id="recognize_names">
    <field name="f1">
      Let's recognize the enrolled names.
      Say one of the enrolled names.
      <grammar>
        <![CDATA[
          #ABNF 1.0;
          root $call;
          $call = [call] $<enrolled:/ADDRESSBOOK?speaker=speaker10>
                  [on|at] [his|her|its|else] [home|work|cell] [phone];
        ]]>
      </grammar>
      <filled>
        Calling <value expr="f1"/>
        Thanks for using address book. Goodbye.
      </filled>
    </field>
  </form>
</vxml>
<bevocal:foreach>
Extension. Iterates over the elements of an array.
Note: The <bevocal:foreach> tag has been replaced by the <foreach> tag and may be deprecated in a future release.

Syntax
<bevocal:foreach item="string" array="string" >
  Executable Content
</bevocal:foreach>

Description
To use this tag, the containing <vxml> tag must declare the XML namespace bevocal by including the following attribute:
xmlns:bevocal="http://www.bevocal.com/"

The interpreter sets the iteration variable to each element of the array, in turn, and executes the contained elements once for each setting of the iteration variable.

Attribute  Description
item  Name of the iteration variable, which may not be a JavaScript reserved keyword. The scope of the iteration variable is the <bevocal:foreach> element.
array  The name of a variable whose value is an array.
Usage
Parents  <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
Children  <assign> <audio> <bevocal:connect> <bevocal:dial> <bevocal:disconnect> <bevocal:foreach> <bevocal:hold> <bevocal:whisper> <clear> <data> <disconnect> <exit> <goto> <if> <log> <prompt> <reprompt> <rethrow> <return> <script> <send> <subdialog> <submit> <throw> <value> <var>

See Also
None

Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
      xmlns:bevocal="http://www.bevocal.com/">
  <form id="form1">
    <block>
      <!-- Array elements are string literals, so they are quoted. -->
      <var name="serviceNames" expr="new Array('news', 'weather', 'traffic')"/>
      <prompt>Here's a list of services.</prompt>
      <bevocal:foreach item="service" array="serviceNames">
        <prompt><value expr="service"/> <break size="small"/></prompt>
      </bevocal:foreach>
    </block>
  </form>
</vxml>
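Because the note at the top of this section says that <foreach> replaces <bevocal:foreach>, the same loop might be written as in the sketch below. This assumes that the BeVocal <foreach> tag takes item and array attributes, as the VoiceXML 2.1 <foreach> element does; check the <foreach> reference page before relying on it.

```xml
<block>
  <var name="serviceNames" expr="new Array('news', 'weather', 'traffic')"/>
  <prompt>Here's a list of services.</prompt>
  <!-- Assumed attribute names: item and array, as in VoiceXML 2.1. -->
  <foreach item="service" array="serviceNames">
    <prompt><value expr="service"/> <break size="small"/></prompt>
  </foreach>
</block>
```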
<bevocal:hold>
Extension. Places an outbound call on hold.

Syntax
<bevocal:hold call="js_expression" transferaudio="URI" />

Description
To use this tag, the containing <vxml> tag must declare the XML namespace bevocal by including the following attribute:
xmlns:bevocal="http://www.bevocal.com/"

A <bevocal:hold> tag affects an outbound call that was initiated by a <bevocal:dial> tag; it disconnects the voice path between the user and the called third party, putting the third party on hold. The call remains on hold until reconnected with a <bevocal:connect> tag.

While the call is on hold:
- The user can hear and talk to the application.
- The application listens to the user and plays audio output for the user.
- The called third party hears silence.
- Neither the user nor the application can hear the called third party.

Note: This and other call-control tags constitute a BeVocal VoiceXML extension. Committees are currently working to standardize call-control features for VoiceXML, and their current approach is different from the BeVocal VoiceXML implementation. Because the approval of any call-control standards will be quite some time in coming, however, BeVocal VoiceXML contains this extension to allow developers to start taking advantage of call-control features. BeVocal will continue to monitor the development within these committees. If the features in the BeVocal VoiceXML extension become a part of VoiceXML or a separate call-control standard, BeVocal VoiceXML will implement that standard. We will then deprecate the current extension and provide developers with information on how to convert their applications to the new standard.

Attribute  Description
call  JavaScript expression whose value is a JavaScript object that was initialized by <bevocal:dial>; specifies the call to be put on hold.
transferaudio  Audio (specified by the URI) to play to the third party while on hold. Note: This attribute is valid only for VoIP calls.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
The following events may be thrown by the execution of a <bevocal:hold> tag:

Event  Description
connection.far_end.disconnect  The specified outbound call has already been disconnected.
error.semantic  The call attribute is not a valid JavaScript expression or the expression does not evaluate to an object representing an outbound call.

See Also
Chapter 6, Controlling Outbound Calls
Related tags: <bevocal:dial>, <bevocal:connect>
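The fragment below is a minimal sketch of placing a call on hold from a <block>. It uses only the attributes documented above. The variable outcall is hypothetical: it is assumed to hold the object that an earlier <bevocal:dial> tag initialized, and the containing <vxml> tag is assumed to declare the bevocal namespace.

```xml
<block>
  <!-- outcall is a hypothetical variable, assumed to hold the object
       initialized earlier by a <bevocal:dial> tag. -->
  <bevocal:hold call="outcall"/>
  <!-- Only the user hears this prompt; the third party hears silence. -->
  <prompt>Your party is now on hold.</prompt>
</block>
```

The call stays on hold until the application reconnects it with <bevocal:connect>; see that tag's reference page for its attributes.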
<bevocal:listen>
Extension. Allows the application to suspend execution and listen to the user during an outbound call.

Syntax
<bevocal:listen name="string" expr="js_expression" cond="js_expression" >
  Optional Content
</bevocal:listen>

Description
To use this tag, the containing <vxml> tag must declare the XML namespace bevocal by including the following attribute:
xmlns:bevocal="http://www.bevocal.com/"

A <bevocal:listen> element lets the application listen to the user during an active outbound call that was initiated by a <bevocal:dial> tag. During execution of the <bevocal:listen> element, the user and the called third party talk to each other. The application is quiet. No universal grammars or grammars in higher scopes are active. The application ignores all speech and DTMF signals from the called third party.

If the <bevocal:listen> element includes any child grammars, the application listens to the user. If a user utterance matches a child grammar, the <bevocal:listen> tag's input variable is set to the recognition result. The bevocal.hotwordmin and bevocal.hotwordmax properties specify the minimum and maximum time duration, respectively, for an utterance that can match a <bevocal:listen> grammar.

If a user utterance matches a child grammar, any child <filled> elements are executed in the order in which they occur. Recognition does not cause the call to terminate. If desired, a <filled> element can disconnect the call explicitly:
- Use the <disconnect> tag to end the inbound call; doing so disconnects any outbound call and terminates the session.
- Use the <bevocal:disconnect> tag to terminate the outbound call and continue with the session.
Following execution of the <bevocal:listen> element, the interpreter proceeds as usual, searching for the next form item to execute. Note: This and other call-control tags constitute a BeVocal VoiceXML extension. Committees are currently working to standardize call-control features for VoiceXML; however, the approval of call-control standards will take some time. BeVocal VoiceXML contains this extension to allow developers to take advantage of call-control features before standards are available, even though the eventual standards are likely to be nothing like the current extension. When the features in the BeVocal VoiceXML extension become a part of VoiceXML or a separate call-control standard, BeVocal VoiceXML will implement the standard. We will
then deprecate the current extension and provide developers with information on how to upgrade their applications to the new standard.

Attribute  Description
name  Name of the input variable that will hold the recognition result. The variable name may not be a JavaScript reserved keyword. The input variable has dialog (form) scope; its name must be unique among all VoiceXML and JavaScript variables within the form's scope.
expr  JavaScript expression that assigns the initial value of the input variable for this element. Optional (default is undefined). If you set the input variable to a value other than undefined, you'll need to clear it before this element can execute.
cond  JavaScript boolean expression that also must evaluate to true for this element to execute. Optional (default is true). If not specified, the value of the input variable alone determines whether or not this element can execute.

The following events may be thrown by the execution of a <bevocal:listen> element:

Event  Description
connection.disconnect.hangup  The user hung up. New in VoiceXML 2.0.
telephone.disconnect.hangup  The user hung up. VoiceXML 1.0 only.
connection.far_end.disconnect  The called third party hung up.
connection.far_end.disconnect.timeout  The outbound call was terminated due to timeout.
error.semantic  No outbound call is active.
Properties of the Shadow Variable
Corresponding to the input variable name is a shadow variable called name$. After the input variable is filled, some additional information is available in the following properties of this shadow variable:

Property  Description
confidence  The recognition confidence level (with 0.0 representing the lowest confidence and 1.0 representing the highest).
utterance  A string representation of the words actually spoken by the user.
inputmode  The mode in which input was provided, one of voice or dtmf.
duration  After the transfer is complete, the floating point duration of the call in milliseconds.

For an element whose name is name, you access the property propName of the shadow variable with the syntax:
name$.propName
For example, you access the duration property for a <bevocal:listen> element named services as:
services$.duration
Usage
Parents  <form>
Children  <audio> <catch> <error> <filled> <grammar> <help> <noinput> <nomatch> <prompt> <property> <value>

See Also
Chapter 6, Controlling Outbound Calls
Related tags: <bevocal:dial>, <bevocal:disconnect>, <bevocal:whisper>
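The following fragment is a minimal sketch of a <bevocal:listen> element with one child grammar, following the description above. The variable name command and the grammar phrase are illustrative assumptions, and an outbound call is assumed to be active already. The grammar uses the same ABNF style as the enrollment examples in this chapter.

```xml
<!-- Sketch only: the variable name and grammar phrase are illustrative. -->
<bevocal:listen name="command">
  <grammar>
    <![CDATA[
      #ABNF 1.0;
      root $hangup;
      $hangup = end this call;
    ]]>
  </grammar>
  <filled>
    <!-- The shadow variable holds the words the user actually spoke. -->
    <log>Heard: <value expr="command$.utterance"/></log>
    <!-- Ends the inbound call, which also disconnects the outbound call. -->
    <disconnect/>
  </filled>
</bevocal:listen>
```

To end only the outbound call and continue the session, the <filled> element would use <bevocal:disconnect> instead of <disconnect>.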
<bevocal:register>
Extension. Registers a voice print that can be used to verify caller identity.

Syntax
<bevocal:register name="string"
    type="boolean"|"date"|"digits"|"currency"|"number"|"phone"|"time"
    keyExpr="js_expression"
    mode="adapt"|"delete"|"skip">
  Child Elements
</bevocal:register>

Description
Input item that prompts for a value matching a particular grammar and then registers the user's response as a voice print for the specified key.

Any utterances that match a universal grammar or a grammar in document or application scope are recognized as they would be in a <field> element. However, only those utterances that match the grammar of the <bevocal:register> element itself are used to create or adapt a voice print.

Note: Unlike other input items, a <bevocal:register> element is not a possible go-back destination for the go-back facility. That is, the user cannot say "go back" to return to a <bevocal:register> element, retracting earlier input to that element. See Chapter 7, Go-Back Facility, for a description of the go-back facility, a BeVocal VoiceXML extension.

Any DTMF input is rejected and results in a nomatch event.

Attribute  Description
name  Name of the input variable that will hold the recognition result. The variable name may not be a JavaScript reserved keyword. The input variable has dialog (form) scope.
type  Specifies an internal grammar. Optional (as alternative to a child <grammar> element).
  boolean - Grammar for recognizing positive and negative responses. Returns true for yes and false for no.
  date - Grammar for recognizing dates. Returns a string with format yyyymmdd; ???? is used for an unknown year and ?? is used for an unknown month or day.
  digits - Limited grammar for recognizing a sequence of digits. Returns a string of digits.
  currency - Grammar for recognizing amounts of money, in dollars (Not Implemented: International currency designation). Returns a string with format mm.nn.
  number - More general grammar for recognizing numbers. Returns a string that could include digits, a decimal point, or a positive or negative sign.
  phone - Grammar for recognizing a telephone number adhering to the North American Dialing Plan (with no extension). Returns a sequence of digits.
  time - Grammar for recognizing a time. Returns a string with format hhmmx, where x is one of: a for AM, p for PM, h for 24-hour notation, or ? for an ambiguous time (could be AM or PM).
keyExpr  JavaScript expression whose value is the key identifying this user's voice print.
mode  Action to take if a voice print exists for the key specified by the keyExpr attribute. Optional (default is adapt).
  delete - Delete the existing voice print and start creating a new voice print.
  adapt - Refine the existing voice print with the user's response.
  skip - Skip execution of the <bevocal:register> tag.
Usage
Parents  <form>
Children  <audio> <catch> <error> <filled> <grammar> <help> <noinput> <nomatch> <prompt> <property> <value>
See Also
Related tag: <bevocal:verify>
Speaker Verification

Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
      xmlns:bevocal="http://www.bevocal.com/">
  <form id="register_user">
    <field name="account" type="digits">
      <prompt>What is your account number?</prompt>
      <filled>
        You will now be registered as the only authorized
        caller for this account.
      </filled>
    </field>

    <!-- mode="delete" discards any existing voice print for this key. -->
    <bevocal:register name="phone1" type="phone"
        keyExpr="account" mode="delete">
      <prompt>Please say your telephone number.</prompt>
    </bevocal:register>

    <bevocal:register name="phone2" type="phone" keyExpr="account">
      <prompt>Please repeat your telephone number.</prompt>
      <filled>
        <if cond="phone1 != phone2">
          <prompt>
            You did not say the same phone number both times.
            Let's try again.
          </prompt>
          <clear namelist="phone1 phone2"/>
        </if>
      </filled>
    </bevocal:register>

    <bevocal:register name="numbers" type="digits" keyExpr="account">
      <prompt>
        Please say one two three four five, one two three four five
      </prompt>
    </bevocal:register>

    <bevocal:register name="color" keyExpr="account">
      <grammar> [red blue yellow green] </grammar>
      <prompt>
        Please say one of the colors red, blue, yellow, or green.
      </prompt>
    </bevocal:register>
  </form>
</vxml>
<bevocal:verify>
Extension. Verifies that the speaker's voice matches a stored voice print.

Syntax
<bevocal:verify name="string"
    type="boolean"|"date"|"digits"|"currency"|"number"|"phone"|"time"
    keyExpr="js_expression">
  Child Elements
</bevocal:verify>

Description
Input item that prompts for a value matching a particular grammar and compares the user's voice against the stored voice print for the specified key. Throws an error.verify.keynotfound event if no voice print has been registered for that key.

Any utterances that match a universal grammar or a grammar in document or application scope are recognized as they would be in a <field> element. However, only those utterances that match the grammar of the <bevocal:verify> element itself are used to verify the speaker's identity.

Any DTMF input is rejected and results in a nomatch event.

Note: Unlike other input items, a <bevocal:verify> element is not a possible go-back destination for the go-back facility. That is, the user cannot say "go back" to return to a <bevocal:verify> element, retracting earlier input to that element. See Chapter 7, Go-Back Facility, for a description of the go-back facility, a BeVocal VoiceXML extension.

Attribute  Description
name  Name of the input variable that will hold the recognition result. The variable name may not be a JavaScript reserved keyword. The input variable has dialog (form) scope.
type  Specifies an internal grammar. Optional (as alternative to a child <grammar> element).
  boolean - Grammar for recognizing positive and negative responses. Returns true for yes and false for no.
  date - Grammar for recognizing dates. Returns a string with format yyyymmdd; ???? is used for an unknown year and ?? is used for an unknown month or day.
  digits - Limited grammar for recognizing a sequence of digits. Returns a string of digits.
  currency - Grammar for recognizing amounts of money, in dollars (Not Implemented: International currency designation). Returns a string with format mm.nn.
  number - More general grammar for recognizing numbers. Returns a string that could include digits, a decimal point, or a positive or negative sign.
  phone - Grammar for recognizing a telephone number adhering to the North American Dialing Plan (with no extension). Returns a sequence of digits.
  time - Grammar for recognizing a time. Returns a string with format hhmmx, where x is one of: a for AM, p for PM, h for 24-hour notation, or ? for an ambiguous time (could be AM or PM).
keyExpr  JavaScript expression whose value is the key identifying this user's voice print.
Properties of the Shadow Variable
Corresponding to the input variable name is a shadow variable whose name is name$. The verifier returns the result of the verification in the decision property of this shadow variable. The possible results are:

Value  Description
accepted  The speaker's voice matches the voice print.
rejected  The speaker's voice does not match the voice print.
unsure  The verifier could not determine whether the speaker's voice matches the voice print.

For a <bevocal:verify> element whose name is name, you access the decision property with the syntax:
name$.decision
Your application can use the value of the decision property to decide how to proceed. For example, if the value is unsure, it could repeat the verification step.

Usage
Parents  <form>
Children  <audio> <catch> <error> <filled> <grammar> <help> <noinput> <nomatch> <prompt> <property> <value>
See Also
Related tag: <bevocal:register>
Speaker Verification

Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
      xmlns:bevocal="http://www.bevocal.com/">
  <form id="verify_user">
    <field name="account" type="digits">
      <prompt>What is your account number?</prompt>
    </field>

    <bevocal:verify name="check1" type="phone" keyExpr="account">
      <prompt>Please say your telephone number.</prompt>
      <filled>
        <!-- The decision values are strings, so they are quoted. -->
        <if cond="check1$.decision=='accepted'">
          <prompt>
            You have been verified as the authorized caller
            for account number <value expr="account"/>.
          </prompt>
        <elseif cond="check1$.decision=='rejected'" />
          <prompt>
            Sorry. You are not the authorized caller
            for account number <value expr="account"/>.
          </prompt>
          <exit/>
        <elseif cond="check1$.decision=='unsure'" />
          <prompt>
            Unable to verify that you are the authorized caller
            for account number <value expr="account"/>.
            Let's try again.
          </prompt>
          <clear namelist="check1"/>
        </if>
      </filled>
    </bevocal:verify>
  </form>
</vxml>
<bevocal:whisper>
Extension. Interrupts an outbound call, allowing the application to play audio output to the called third party while the user is on hold.

Syntax
<bevocal:whisper call="js_expression" transferaudio="URI" >
  Optional Content
</bevocal:whisper>

Description
To use this tag, the containing <vxml> tag must declare the XML namespace bevocal by including the following attribute:
xmlns:bevocal="http://www.bevocal.com/"

A <bevocal:whisper> element lets the application interrupt an active outbound call that was initiated by a <bevocal:dial> tag. During execution of the <bevocal:whisper> element:
- The application plays audio output for the called third party.
- The application ignores all speech and DTMF signals, whether from the user or from the called third party.
- The called third party can hear the application, but cannot hear the user.
- The user hears silence.

Note: This and other call-control tags constitute a BeVocal VoiceXML extension. Committees are currently working to standardize call-control features for VoiceXML, and their current approach is different from the BeVocal VoiceXML implementation. Because the approval of any call-control standards will be quite some time in coming, however, BeVocal VoiceXML contains this extension to allow developers to start taking advantage of call-control features. BeVocal will continue to monitor the development within these committees. If the features in the BeVocal VoiceXML extension become a part of VoiceXML or a separate call-control standard, BeVocal VoiceXML will implement that standard. We will then deprecate the current extension and provide developers with information on how to convert their applications to the new standard.

Attribute  Description
call  JavaScript expression whose value is a JavaScript object that was initialized by <bevocal:dial>; specifies the call to be interrupted.
transferaudio  Audio (specified by the URI) to play to the user while the called third party hears the application. Note: This attribute is valid only for VoIP calls.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
Usage
Parents  <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
Children  <audio> <break> <emphasis> <enumerate> <mark> <p> <paragraph> <phoneme> <prosody> <s> <say-as> <sentence> <value> <voice>

The following events may be thrown by the execution of a <bevocal:whisper> element:

Event  Description
connection.far_end.disconnect  The specified outbound call has already been disconnected.
error.semantic  The call attribute is not a valid JavaScript expression or the expression does not evaluate to an object representing an active outbound call.

See Also
Chapter 6, Controlling Outbound Calls
Related tags: <bevocal:dial>, <bevocal:listen>
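As a minimal sketch of the behavior described above, the fragment below plays a short message to the called third party. The variable outcall is hypothetical and is assumed to hold the object that an earlier <bevocal:dial> tag initialized; the sketch also assumes that plain text is accepted as content alongside the listed child elements.

```xml
<block>
  <!-- outcall is hypothetical: the object initialized by <bevocal:dial>. -->
  <bevocal:whisper call="outcall">
    This call will end in one minute.
    <break size="small"/>
  </bevocal:whisper>
</block>
```

While the message plays, the user hears silence unless transferaudio is supplied on a VoIP call.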
<block>
Contains (non-interactive) executable code.

Syntax
<block name="string" expr="js_expression" cond="js_expression" >
  Executable Content
</block>

Description
Control item container of executable code. As with all form items, a block's form-item variable must have a value of undefined before the block can execute. Just before the block is entered, the interpreter sets its form-item variable to true, so a block is typically executed only once per form invocation.

Attribute  Description
name  Name of the form-item variable, which may not be a JavaScript reserved keyword. Optional (default is an unusable internal name). The form-item variable has dialog (form) scope; its name must be unique among all VoiceXML and JavaScript variables within the form's scope. Generally, you use this attribute only if you want to control block execution explicitly.
expr  JavaScript expression that assigns the initial value of the form-item variable. Optional (default is undefined). If you set the form-item variable to a value other than undefined, then you'll need to clear it before the block can execute. Note that you need to give the block a name if you want to clear it separately from other form-item variables.
cond  JavaScript boolean expression that must also evaluate to true for the block to execute. Optional (default is true). If not specified, the value of the form-item variable alone determines whether or not the block can execute.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
Usage
Parents  <form>
Children  <audio> <assign> <bevocal:connect> <bevocal:dial> <bevocal:disconnect> <bevocal:foreach> <bevocal:hold> <bevocal:whisper> <clear> <data> <disconnect> <enumerate> <exit> <goto> <if> <log> <prompt> <reprompt> <rethrow> <return> <script> <send> <subdialog> <submit> <throw> <var> <value>

See Also
VoiceXML 2.0 Specification: <block>

Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="foo">
    <block name="hello">
      <prompt>
        Welcome to BeVocal Cafe. It is the best known place
        for Voice X M L Development.
        <audio src="bevocal_chimes.wav" />
      </prompt>
    </block>
  </form>
</vxml>
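The description above notes that a block runs only while its form-item variable is undefined, and that a named block can be cleared explicitly. The following form is a minimal sketch of that mechanism; the form, block, and field names are illustrative only.

```xml
<form id="replay_intro">
  <!-- Naming the block lets it be cleared on its own. -->
  <block name="intro">
    <prompt>Say yes to hear this message again.</prompt>
  </block>
  <field name="again" type="boolean">
    <filled>
      <if cond="again">
        <!-- Resetting both variables to undefined makes the FIA
             revisit the block and then the field. -->
        <clear namelist="intro again"/>
      </if>
    </filled>
  </field>
</form>
```

Without the name attribute, the block's internal variable could not be listed in namelist, so only a <clear> with no namelist (which resets every form item) would rerun it.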
<break>
Speech Synthesis Markup Language element that inserts a pause in audio output.

Syntax
<break time="time_interval" size="none"|"small"|"medium"|"large" />

Description
Attribute  Description
time  New in VoiceXML 2.0. Amount of time to pause. Optional (default is 1 second). Express the time interval as an unsigned number followed by s for time in seconds or ms for time in milliseconds (the default unit).
size  How long to pause, specified qualitatively. Optional (as alternative to msecs).
  none - No pause.
  small - Short pause (500 milliseconds).
  medium - Longer pause (1 second).
  large - Long pause (2 seconds).

(VoiceXML 1.0 only) The following attribute can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.

Attribute  Description
msecs  VoiceXML 1.0 only. Amount of time to pause. Optional (default is 1 second). Express the time interval as an unsigned number followed by s for time in seconds or ms for time in milliseconds (the default unit). Used in place of the VoiceXML 2.0 time attribute.

Usage
Parents  <audio> <bevocal:whisper> <choice> <emphasis> <enumerate> <p> <paragraph> <prompt> <prosody> <s> <sentence> <voice>
Children  None.
See Also
VoiceXML 2.0 Specification: <break>
Related tags: <emphasis>, <mark>, <paragraph>, <phoneme>, <prosody>, <say-as>, <sentence>, <voice>
Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="demo-break">
    <block>
      <prompt>
        Voice X M L allows the programming of silence
        <break size="medium"/>
        with the break tag.
      </prompt>
    </block>
  </form>
</vxml>
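The size attribute is shown above; the time attribute can be sketched in the same way. The prompt wording and form id below are made up for illustration, and the two intervals use the s and ms suffixes described in the attribute table.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="demo-break-time">
    <block>
      <prompt>
        Please hold
        <!-- Explicit interval in milliseconds -->
        <break time="500ms"/>
        while I look that up.
        <!-- Explicit interval in seconds -->
        <break time="2s"/>
        Thank you for waiting.
      </prompt>
    </block>
  </form>
</vxml>
```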
<catch>
Catches an event.
Syntax
<catch event="event1 ..." count="integer" cond="js_expression" > Executable Content </catch>
Description
Container for event handling code. As with <block>, you can put non-interactive executable code (procedural logic) in a <catch> element to handle an event.
Attribute    Description
event    Name of the event(s) to catch. Optional. New in VoiceXML 2.0: you can catch all events by omitting this attribute.
count    Minimum number of times the event must have occurred during a form or menu invocation. Lets you handle different occurrences of the same event differently. Optional (default is 1).
cond    JavaScript boolean expression that must also evaluate to true for an event to be caught. Optional (default is true).
Although you can define your own events, there is a set of predefined events. The VoiceXML interpreter provides a standard set of default event handlers for the predefined events. If multiple handlers for a given event are defined in, or inherited by, the element in which the event occurs, one handler is chosen based on count, scope, and document order. See Chapter 3, Event Handling.
Tips:
Because you can throw events from within a <catch>, be sure to avoid infinite loops. For example, the following handler would result in an infinite loop:
<catch event="foobar"> <throw event="foobar"/> </catch>
You can use a <submit> within a <catch> for a hang up event to notify the server that the call has ended (connection.disconnect.hangup in VoiceXML 2.0; telephone.disconnect.hangup in VoiceXML 1.0). Because the call is no longer connected, any VoiceXML document returned from the server will be ignored and the interpreter will exit. Similarly, if you use a <goto> within a <catch> for this event, it will be ignored and the interpreter will exit.
Within an event handler, the _event variable contains the name of the event currently being handled; the _message variable contains the message string that provides additional information about the event. If no message was supplied when the event was thrown, the _message variable is undefined.
The method by which event handlers are inherited from ancestor elements is called "as if by copy" semantics in the VoiceXML 2.0 specification. It helps to think of the appropriate event handler as literally being copied into the scope where the event was thrown. Variable references are therefore resolved relative to the scope of the element where the event was thrown, and URL references are resolved relative to the document from which the event was thrown. For example, if a <catch> handler is defined in an application root document that is in a different directory than the main document that threw the event, URLs in the handler are resolved relative to the directory of the main document. Resolving URLs relative to the originating document is considered 2.0 behavior and applies only when the <vxml> tag's version attribute is set to "2.0" or greater.
In a VoiceXML 2.0 document, <catch> with an empty event attribute (<catch event="">) is no longer allowed. To catch all events, use <catch> or <catch event=".">. If the version is set to 1.0, the empty event attribute can catch all events; however, we suggest that you use the new <catch> syntax because it works in both versions of the interpreter.
If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
Usage
Parents    <bevocal:listen> <bevocal:register> <bevocal:verify> <field> <form> <initial> <menu> <record> <subdialog> <transfer> <vxml>
Children    <audio> <assign> <bevocal:connect> <bevocal:dial> <bevocal:disconnect> <bevocal:foreach> <bevocal:hold> <bevocal:whisper> <clear> <data> <disconnect> <enumerate> <exit> <goto> <if> <log> <prompt> <reprompt> <rethrow> <return> <script> <send> <subdialog> <submit> <throw> <var> <value>
See Also
VoiceXML 2.0 Specification: <catch>
Related variables: _event, _message
Related tags: <error>, <help>, <noinput>, <nomatch>, <rethrow>, <throw>
Examples
See other catch and throw examples under <throw> and <rethrow>.
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <property name="universals" value="help" />
  <form id="stars">
    <catch event="nomatch">
      <prompt> Sorry, I did not hear a number more than ten. </prompt>
      <reprompt/>
    </catch>
    <block name="numbergame">
      You can create games with numbers by using JavaScript
    </block>
    <field name="mynumber" type="number">
      <prompt>
        Tell me a number and I can repeat it for you.
        You can say help for information.
      </prompt>
      <catch event="help">
        <prompt> Please say a number more than 10 and less than infinity. </prompt>
      </catch>
      <filled>
        <if cond="mynumber &gt; 10">
          <prompt> The number you said is <value expr="mynumber"/> </prompt>
        <else/>
          <clear namelist="mynumber"/>
          <throw event="nomatch"/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
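The count attribute and the _event and _message variables can be sketched as follows. This is a minimal illustration, not a BeVocal sample: the form and field names and the prompt wording are invented, and the handler selection relies on the count, scope, and document order rules summarized in the Description.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="pin_entry">
    <!-- Form-scope handler for two specific error events -->
    <catch event="error.badfetch error.semantic">
      <!-- _message is undefined if no message accompanied the event -->
      <log> Caught <value expr="_event"/>: <value expr="_message"/> </log>
      <prompt> Sorry, something went wrong. </prompt>
      <exit/>
    </catch>

    <field name="pin" type="digits">
      <prompt> Please say your pin. </prompt>

      <!-- Chosen for the first and second nomatch (count defaults to 1) -->
      <catch event="nomatch">
        <prompt> Sorry, I did not understand. </prompt>
        <reprompt/>
      </catch>

      <!-- Chosen from the third nomatch onward -->
      <catch event="nomatch" count="3">
        <prompt> I am still having trouble. Goodbye. </prompt>
        <exit/>
      </catch>

      <filled>
        <prompt> Thank you. </prompt>
      </filled>
    </field>
  </form>
</vxml>
```

Neither nomatch handler clears the field, so the event counter keeps increasing and the count="3" handler eventually takes over.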
<choice>
Defines a menu item.
Syntax
<choice accept="exact"|"approximate" next="URI" event="event" expr="js_expression" dtmf="dtmf_sequence" eventexpr="js_expression" message="String" messageexpr="js_expression" fetchhint="prefetch"|"safe" fetchtimeout="time_interval" fetchaudio="URI" maxage="time_interval" maxstale="time_interval" > Choice Text </choice>
Description
Attribute    Description
accept    New in VoiceXML 2.0. Specifies whether the default grammar generated for this <choice> element requires all words or accepts a subset of the words; overrides the accept attribute of the parent <menu> element. Optional.
    exact: Requires the user to say the exact phrase that appears in the <choice> element.
    approximate: Allows the user to say a subset of the words in the <choice> element.
    Note: The default is exact if the version attribute of the containing <vxml> element specifies 2.0 or greater. For backward compatibility, the default is approximate if the version attribute is less than 2.0 or unspecified.
next    URI of the dialog or document to visit when this choice is selected. Optional (as an alternative to event or expr).
event    An event to throw when this choice is selected. Optional (as an alternative to next or expr).
expr    JavaScript expression that evaluates to a URI of the dialog or document to visit when this choice is selected. Optional (as an alternative to next or event).
dtmf    DTMF sequence that can be used to select this choice. Optional (default is no DTMF value).
eventexpr    New in VoiceXML 2.0. A JavaScript expression evaluating to the name of the event to be thrown.
message    New in VoiceXML 2.0. A message string providing additional context about the event being thrown. The message is available as the value of the _message variable within the scope of the <catch> element.
messageexpr    New in VoiceXML 2.0. A JavaScript expression evaluating to the message string.
fetchhint    Specifies whether the interpreter can attempt to optimize dialog interpretation by prefetching the resource. See Prefetching Resources on page 40. Optional.
fetchtimeout    Specifies the interval to wait for the resource to be returned before throwing an error.badfetch event. See Handling Fetching Delays on page 42. Optional.
fetchaudio    Specifies the URI of background audio to be played during fetching. Note that this attribute and related properties affect whether queued prompts are played first. See Background Audio on page 42 for important details. Optional.
maxage    New in VoiceXML 2.0. Specifies the maximum acceptable age, in seconds, of the cached resource. See Maximum Age on page 44. Optional.
maxstale    New in VoiceXML 2.0. Specifies the maximum acceptable time, in seconds, during which an expired cached resource can still be used. See Maximum Stale Time on page 44. Optional.
One and only one of the next, expr, or event attributes must be specified.
(VoiceXML 1.0 only) The following attributes can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.
Attribute    Description
caching    VoiceXML 1.0 only. Specifies the caching policy for the resource being fetched. See Caching on page 45. Optional. Used in place of the VoiceXML 2.0 attributes maxage and maxstale.
Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
Usage
Parents    <menu>
Children    <grammar>
Exceptions
error.badfetch - If the dtmf attribute of <menu> is true and any of the menu's <choice> tags have specified their own DTMF sequences to be something other than "*", "#" or "0".
error.semantic - If not previously declared using JavaScript var within <script> or VoiceXML <var> or via being named a form item, like a <field name="foo"/>.
See Also
VoiceXML 2.0 Specification: <choice>
Related tags: <enumerate>, <menu>
Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <menu>
    <prompt>
      Welcome to BeVocal.
      <enumerate>
        For the <value expr="_prompt"/> service, say <value expr="_prompt"/>
      </enumerate>
    </prompt>
    <choice accept="approximate" next="movies.vxml">movie finder</choice>
    <choice next="horoscopes.vxml">horoscopes</choice>
    <choice accept="approximate" next="news.vxml">news desk</choice>
  </menu>
</vxml>
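The dtmf, event, and message attributes are not exercised by the example above. The sketch below shows one way they might be combined; the document names sales.vxml and support.vxml and the event name operator.requested are hypothetical. The menu does not set dtmf="true", so each choice is free to declare its own key.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <menu id="main">
    <prompt>
      For sales, say sales or press 1.
      For support, say support or press 2.
      For an operator, say operator or press 0.
    </prompt>

    <!-- Each choice specifies exactly one of next, expr, or event -->
    <choice dtmf="1" next="sales.vxml">sales</choice>
    <choice dtmf="2" next="support.vxml">support</choice>
    <choice dtmf="0" event="operator.requested"
            message="caller asked for an operator">operator</choice>

    <!-- The message string is available here as _message -->
    <catch event="operator.requested">
      <log> <value expr="_message"/> </log>
      <prompt> Please wait while I find an operator. </prompt>
      <exit/>
    </catch>
  </menu>
</vxml>
```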
<clear>
Clears one or more form-item variables.
Syntax
<clear namelist="variable1 ..." />
Description
Attribute    Description
namelist    Space-separated list of variables to reset. Optional (default behavior clears all form-item variables in the current form). This attribute can specify any variable currently in scope, both VoiceXML variables and JavaScript variables, including variables that have not been explicitly declared.
When the interpreter resets a form item, it:
Sets the form-item variable's value to undefined.
Resets the form item's prompt and event counters to 0.
Usage
Parents    <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
Children    None.
See Also
VoiceXML 2.0 Specification: <clear>
Related tags: <field>, <record>, <subdialog>, <transfer>, <block>, <initial>
Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="form1">
    <field name="ssn" type="digits">
      <prompt> Please say your S S N number. </prompt>
    </field>
    <field name="passcode" type="digits">
      <prompt> Please say your passcode number </prompt>
    </field>
    <field name="choice" type="boolean">
      <prompt>
        Your S S N is
        <say-as type="number:digits"> <value expr="ssn"/> </say-as>
        and your passcode is
        <say-as type="number:digits"> <value expr="passcode"/> </say-as>.
        Are the values right?
      </prompt>
      <filled>
        <if cond="choice">
          <prompt> That is good. </prompt>
        <else/>
          <clear/>
          <prompt> Let's do it again </prompt>
        </if>
      </filled>
    </field>
  </form>
</vxml>
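The example above clears every form item. As a contrasting sketch, the namelist attribute can limit the reset to selected variables. The form below is a hypothetical variation on the same scenario in which only the passcode is asked again and the ssn value is kept.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="form2">
    <field name="ssn" type="digits">
      <prompt> Please say your S S N number. </prompt>
    </field>
    <field name="passcode" type="digits">
      <prompt> Please say your passcode number. </prompt>
    </field>
    <field name="confirmed" type="boolean">
      <prompt>
        Your passcode is
        <say-as type="number:digits"> <value expr="passcode"/> </say-as>.
        Is that right?
      </prompt>
      <filled>
        <if cond="!confirmed">
          <!-- Reset only these two items; ssn keeps its value -->
          <clear namelist="passcode confirmed"/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```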
<data>
Experimental Extension. Fetches arbitrary XML data from an HTTP server, or submits values to a server.
Syntax
<data src="URI" name="string" srcexpr="js_expression" expr="js_expression" method="get"|"post" namelist="variable1 ..." enctype="MIME_type" fetchhint="prefetch"|"safe" fetchtimeout="time_interval" fetchaudio="URI" maxage="time_interval" maxstale="time_interval" />
Description
The <data> tag fetches or submits data without transitioning to a new VoiceXML document. The XML data fetched by the <data> element is returned in a read-only JavaScript variable via an object model as specified in the W3C Document Object Model (DOM), described in http://www.w3c.org/TR/DOM-Level-2-Core. Appendix E of the specification describes the JavaScript language binding for the DOM.
Note: The current BeVocal VoiceXML implementation of <data> precedes finalization of the standard to give developers the opportunity to use the tag and provide feedback, which we can pass on to the W3C. If <data> is standardized, the BeVocal VoiceXML implementation will change as necessary to match the VoiceXML standard. If such changes occur, we will attempt to maintain backwards compatibility with the current implementation.
Attribute    Description
src    URI specifying the location of the XML data. Optional (as alternative to expr).
name    Variable name, which must be a valid JavaScript identifier and may not be a reserved keyword in either JavaScript or Java. If the name attribute is omitted, the HTTP request is submitted, but the retrieved data is ignored.
srcexpr    JavaScript expression that evaluates to the URI of the XML data. Optional (as alternative to src). If you specify this attribute, the element cannot have content. You can specify either srcexpr or expr, but not both. If you specify both, a parse error is thrown.
expr    Extension. JavaScript expression that evaluates to the URI of the XML data. Optional (as alternative to src). Note: The expr attribute has been replaced by the srcexpr attribute and may be deprecated in a future release. If you specify this attribute, the element cannot have content. You can specify either srcexpr or expr, but not both. If you specify both, a parse error is thrown.
method    The query request method, either get or post. Optional (default is get).
enctype    MIME encoding used when submitting data with the post method. Optional (default is application/x-www-form-urlencoded). The supported types are:
    application/x-www-form-urlencoded
    multipart/form-data
    The type multipart/form-data is more efficient when submitting large amounts of binary data.
namelist    Space-separated list of variables to submit to the server. Optional (default is to submit no variables). This attribute can specify any variable currently in scope, both VoiceXML variables and JavaScript variables, including shadow variables and other variables that have not been explicitly declared. A variable set to a JavaScript object is submitted as the individual component values; see Submitting Complex JavaScript Objects on page 46.
fetchhint    Specifies whether the interpreter can attempt to optimize dialog interpretation by prefetching the resource. See Prefetching Resources on page 40. Optional.
fetchtimeout    Specifies the interval to wait for the resource to be returned before throwing an error.badfetch event. See Handling Fetching Delays on page 42. Optional.
fetchaudio    Specifies the URI of an audio clip to play while a resource is being fetched. See Background Audio on page 42. Optional. The specified audio clip is played only if the bevocal.dtmf.flushbuffer property is set to true.
maxage    Specifies the maximum acceptable age, in seconds, of the cached resource. See Maximum Age on page 44. Optional.
maxstale    Specifies the maximum acceptable time, in seconds, during which an expired cached resource can still be used. See Maximum Stale Time on page 44. Optional.
If a <data> element names a variable that is already in scope, it does not declare a new variable with the same name, but simply assigns a value to the existing variable: the variable is assigned a reference to the DOM returned from the server.
(VoiceXML 1.0 only) The following attributes can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.
Attribute    Description
caching    VoiceXML 1.0 only. Specifies the caching policy for the resource being fetched. See Caching on page 45. Optional. Used in place of the VoiceXML 2.0 attributes maxage and maxstale.
Usage
Parents    <bevocal:foreach> <block> <catch> <error> <filled> <foreach> <form> <help> <if> <noinput> <nomatch> <vxml>
Children    None.
See Also
None
Example
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <var name="quote"/>
  <var name="ticker" expr="'f'"/>
  <form id="get_quote">
    <block>
      <data name="quote" src="quote.xml"/>
      <assign name="document.quote" expr="quote.documentElement"/>
      <goto next="#play_quote"/>
    </block>
  </form>
  <form id="play_quote">
    <script>
      <![CDATA[
        // retrieve the value contained in the node t from the DOM exposed by d
        function GetData(d, t, nodata) {
          try {
            return d.getElementsByTagName(t).item(0).firstChild.data;
          } catch(e) {
            // the value could not be retrieved, so return this instead
            return nodata;
          }
        }
      ]]>
    </script>
    <block>
      <var name="change" expr="GetData(quote, 'change', 0)"/>
      <prompt> The stock for <value expr="GetData(quote, 'name', 'unknown')"/> is </prompt>
      <if cond="change == 0">
        <prompt> unchanged at </prompt>
      <else/>
        <if cond="change &gt; 0">
          <prompt> up by </prompt>
        <else/>
          <prompt> down by </prompt>
        </if>
        <prompt> <value expr="Math.abs(change)"/> to </prompt>
      </if>
      <prompt> <value expr="GetData(quote, 'last', 0)"/> </prompt>
    </block>
  </form>
</vxml>
The data file quote.xml follows:
<?xml version="1.0" ?>
<xml>
  <quote>
    <ticker>F</ticker>
    <name>Ford Motor Company</name>
    <change>1.0</change>
    <last>30.00</last>
  </quote>
</xml>
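The example above fetches data. The submitting side of <data> can be sketched as follows, assuming a server-side script at the hypothetical URL http://www.example.com/record_rating that accepts a posted rating value. Because the name attribute is omitted, whatever the server returns is ignored, as described in the attribute table.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="survey">
    <field name="rating" type="digits">
      <prompt> On a scale of one to five, how was your call? </prompt>
      <filled>
        <!-- Posts the rating variable; no name attribute, so the
             response is discarded and the dialog continues here -->
        <data src="http://www.example.com/record_rating"
              method="post" namelist="rating"/>
        <prompt> Thank you for your feedback. </prompt>
      </filled>
    </field>
  </form>
</vxml>
```

Unlike <submit>, this keeps the caller in the current document, so the prompt after the <data> element still plays.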
<disconnect>
Disconnects a telephone session.
Syntax
<disconnect namelist="string" />
Description
Forces the execution environment to disconnect the user's inbound telephone call. If an outbound call is active or on hold, that call is also disconnected.
This element throws a hang up event (connection.disconnect.hangup in VoiceXML 2.0; telephone.disconnect.hangup in VoiceXML 1.0). If an event handler catches the event, it can perform one last <submit> to notify the server that the call has ended. Because the call is no longer connected, any VoiceXML document returned from the server is ignored. The interpreter exits following execution of any event handler (or immediately if no handler catches the hang up event).
Attribute    Description
namelist    Allows the application to return data to the execution context. This attribute is currently ignored.
Usage
Parents    <bevocal:foreach> <block> <catch> <error> <filled> <foreach> <help> <if> <noinput> <nomatch>
Children    None.
See Also
VoiceXML 2.0 Specification: <disconnect>
Related tags: <bevocal:disconnect>, <transfer>
Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="foo">
    <error>
      <prompt> That code is bad! <value expr="code"/> </prompt>
      <disconnect/>
    </error>
    <field name="code" type="digits">
      <prompt> Say your passcode now. </prompt>
      <filled>
        <if cond="code &lt; 100">
          <throw event="error"/>
        <else/>
          <prompt> Good passcode! </prompt>
        </if>
      </filled>
    </field>
    <block>
      <prompt> That was the last form item. </prompt>
      <disconnect/>
    </block>
  </form>
</vxml>
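The Description mentions that a handler for the hang up event can perform one last <submit>. A minimal sketch of that pattern follows; the URL http://www.example.com/call_ended and the variable call_status are assumptions made for illustration.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Hypothetical status value reported to the server -->
  <var name="call_status" expr="'completed'"/>

  <!-- Document-scope handler for the VoiceXML 2.0 hang up event -->
  <catch event="connection.disconnect.hangup">
    <!-- Any document returned by the server is ignored,
         because the call is no longer connected -->
    <submit next="http://www.example.com/call_ended"
            namelist="call_status"/>
  </catch>

  <form id="goodbye">
    <block>
      <prompt> Thank you for calling. Goodbye. </prompt>
      <disconnect/>
    </block>
  </form>
</vxml>
```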
<div>
VoiceXML 1.0 only. Java Speech Markup Language element that classifies a region of text as a particular type.
Syntax
<div type="sentence"|"paragraph" > Text </div>
Description
Identifies enclosed text as a particular type for interpretive purposes. The contained text is spoken normally; <div> has no effect.
Note: In VoiceXML 2.0, this tag is replaced by the <paragraph> and <sentence> tags.
Usage
Parents    <audio> <bevocal:whisper> <choice> <div> <emp> <enumerate> <prompt> <pros>
Children    <audio> <break> <div> <emp> <enumerate> <pros> <sayas> <say-as> <value>
See Also
VoiceXML 1.0 Specification: <div>
Related tags: <break>, <emp>, <pros>, <sayas>
<dtmf>
VoiceXML 1.0 only. Specifies a touch-tone key grammar.
Syntax
<dtmf scope="document"|"dialog" src="URI" expr="js_expression" type="MIME_type" caching="safe"|"fast" fetchhint="prefetch"|"safe" fetchtimeout="time_interval" universal="string" > Optional Inline DTMF Grammar </dtmf>
Description
Defines a grammar for telephone key press sequences.
Note: In VoiceXML 2.0, this tag is replaced by the <grammar> tag with the mode attribute set to dtmf.
Attribute    Description
scope    Sets the scope of the DTMF grammar. Optional (default is dialog).
    document: The grammar is active throughout the current document. If the document is the application root document, the grammar is active throughout the application (application scope).
    dialog: The grammar is active throughout the current form.
    Note: A <dtmf> element can include a scope attribute only if its parent is a <form> element. The scope of any other <dtmf> element is determined by its parent:
    If the parent is an input item, the grammar has field scope.
    If the parent is a link, the scope is the element that contains the link.
    If the parent is a menu choice, the grammar scope is specified by the scope property of the containing <menu> element (or dialog scope by default).
src    URI of the DTMF grammar specification, when it is contained in an external file. Optional (as an alternative to an inline DTMF grammar).
expr    Extension. JavaScript expression that evaluates to grammar file URI. Optional (as alternative to src).
type    MIME type of the DTMF grammar. Optional.
    For external grammars, the default type is taken from the Content-type header of the returned file. If not present, the type is inferred from the URL's extension or from the contents of the grammar (for example, a file beginning with <?xml maps to application/srgs+xml). The recognized extensions are:
    .grxml, .xml: XML Speech Grammar
    .gram: ABNF Speech Grammar
    .gsl, .grammar: Nuance GSL
    .ngo: Nuance Grammar Object
    .jsgf: Java Speech Grammar Format
    For internal grammars, if the grammar definition specifies the grammar type (either directly with a declaration or indirectly by containing XML elements), the interpreter uses that type. If the grammar definition doesn't indicate the type, the interpreter uses the value of the type attribute, if present. Otherwise, the interpreter assumes that the grammar is in GSL format.
    The currently supported types are:
    application/srgs+xml: XML Speech Grammar
    application/grammar+xml: XML Speech Grammar (Deprecated; support for this value will be removed from a future release.)
    application/srgs: ABNF Speech Grammar
    application/grammar: ABNF Speech Grammar (Deprecated; support for this value will be removed from a future release.)
    application/x-nuance-gsl: Nuance GSL
    application/x-gsl: Nuance GSL (Deprecated; support for this value will be removed from a future release.)
    application/x-nuance-dynagram-binary: Nuance Grammar Object
    If you specify an unsupported type, an error is thrown. This value is used only if the web server returns an unsupported grammar type.
caching    Specifies the caching policy for the resource being fetched. See Caching on page 45. Optional.
fetchhint    Specifies whether the interpreter can attempt to optimize dialog interpretation by prefetching the resource. See Prefetching Resources on page 40. Optional.
fetchtimeout    Specifies the interval to wait for the resource to be returned before throwing an error.badfetch event. See Handling Fetching Delays on page 42. Optional.
universal    Extension. Makes this grammar a universal grammar with the specified name so that it can be activated and deactivated using the universals property. This attribute does not affect the scope of the grammar; it simply assigns it to a universal category.
Usage
Parents    <bevocal:listen> <field> <form> <link> <transfer>
Children    None.
See Also
VoiceXML 1.0 Specification: <dtmf>
Related tag: <grammar>
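For VoiceXML 2.0 documents, the Note above points to <grammar> with mode set to dtmf. The sketch below shows one possible form of that replacement, using an inline XML Speech Grammar that accepts a single key press of 1, 2, or 3. The form, field, and rule names are invented for illustration.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="key_demo">
    <field name="key">
      <prompt> Press 1, 2, or 3. </prompt>

      <!-- VoiceXML 2.0 counterpart of a 1.0 <dtmf> element -->
      <grammar mode="dtmf" version="1.0" root="digit"
               type="application/srgs+xml">
        <rule id="digit" scope="public">
          <one-of>
            <item>1</item>
            <item>2</item>
            <item>3</item>
          </one-of>
        </rule>
      </grammar>

      <filled>
        <prompt> You pressed <value expr="key"/>. </prompt>
      </filled>
    </field>
  </form>
</vxml>
```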
<else>
Marks the beginning of an else clause within an <if> element.
Syntax
<else/>
Description
Empty tag that marks an else clause within an <if> element.
Usage
Parents    <if>
Children    None.
See Also
VoiceXML 2.0 Specification: <else>
Related tags: <if>, <elseif>
Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="foo">
    <field name="color">
      <grammar type="application/x-nuance-gsl">
        [black white green blue purple yellow red]
      </grammar>
      <prompt>
        If you say your favorite color, I shall tell you its
        hexa decimal color code. What <emphasis>color? </emphasis>
      </prompt>
      <filled>
        <var name="color_code"/>
        <if cond="color == 'black'">
          <assign name="color_code" expr="'000000'"/>
        <elseif cond="color == 'white'"/>
          <assign name="color_code" expr="'FFFFFF'"/>
        <elseif cond="color == 'green'"/>
          <assign name="color_code" expr="'00FF00'"/>
        <elseif cond="color == 'blue'"/>
          <assign name="color_code" expr="'0000FF'"/>
        <elseif cond="color == 'purple'"/>
          <assign name="color_code" expr="'7D26CD'"/>
        <elseif cond="color == 'yellow'"/>
          <assign name="color_code" expr="'8B8B00'"/>
        <elseif cond="color == 'red'"/>
          <assign name="color_code" expr="'CD0000'"/>
        <else/>
          <assign name="color_code" expr="'?'"/>
        </if>
        <prompt>
          The code for <value expr="color"/> is <value expr="color_code"/>
        </prompt>
        <clear namelist="color color_code"/>
      </filled>
    </field>
  </form>
</vxml>
<elseif>
Marks the beginning of an else-if clause within an <if> element.
Syntax
<elseif cond="js_expression" />
Description
Attribute    Description
cond    JavaScript boolean expression that must evaluate to true for the clause to execute.
Usage
Parents    <if>
Children    None.
See Also
VoiceXML 2.0 Specification: <elseif>
Related tags: <if>, <else>
Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="foo">
    <field name="color">
      <grammar type="application/x-nuance-gsl">
        [black white green blue purple yellow red]
      </grammar>
      <prompt>
        If you say your favorite color, I shall tell you its
        hexa decimal color code. What <emphasis>color? </emphasis>
      </prompt>
      <filled>
        <var name="color_code"/>
        <if cond="color == 'black'">
          <assign name="color_code" expr="'000000'"/>
        <elseif cond="color == 'white'"/>
          <assign name="color_code" expr="'FFFFFF'"/>
        <elseif cond="color == 'green'"/>
          <assign name="color_code" expr="'00FF00'"/>
        <elseif cond="color == 'blue'"/>
          <assign name="color_code" expr="'0000FF'"/>
        <elseif cond="color == 'purple'"/>
          <assign name="color_code" expr="'7D26CD'"/>
        <elseif cond="color == 'yellow'"/>
          <assign name="color_code" expr="'8B8B00'"/>
        <elseif cond="color == 'red'"/>
          <assign name="color_code" expr="'CD0000'"/>
        <else/>
          <assign name="color_code" expr="'?'"/>
        </if>
        <prompt>
          The code for <value expr="color"/> is <value expr="color_code"/>
        </prompt>
        <clear namelist="color color_code"/>
      </filled>
    </field>
  </form>
</vxml>
<emp>
VoiceXML 1.0 only. Java Speech Markup Language element that changes the emphasis of speech output.
Syntax
<emp level="strong"|"moderate"|"none"|"reduced" > Text </emp>
Description
Any attribute is ignored; the contained text is spoken normally.
Note: In VoiceXML 2.0, this tag is replaced by the <emphasis> tag.
Attribute    Description
level    Level of emphasis to use when speaking the enclosed text. Optional (default is moderate). Possible values are:
    strong
    moderate
    none
    reduced
Usage
Parents    <audio> <bevocal:whisper> <choice> <div> <emp> <enumerate> <prompt> <pros>
Children    <audio> <enumerate> <value> <break> <div> <emp> <pros> <sayas> <say-as>
See Also
VoiceXML 1.0 Specification: <emp>
Related tags: <break>, <div>, <pros>, <sayas>
<emphasis>
New in VoiceXML 2.0. Speech Synthesis Markup Language element that changes the emphasis of speech output.
Syntax
<emphasis level="strong"|"moderate"|"none"|"reduced" > Text </emphasis>
Description
Any attribute is ignored; the contained text is spoken normally.
Attribute    Description
level    Level of emphasis to use when speaking the enclosed text. Optional (default is moderate). Possible values are:
    strong
    moderate
    none
    reduced
Usage
Parents    <audio> <bevocal:whisper> <choice> <emphasis> <enumerate> <p> <paragraph> <prompt> <prosody> <s> <sentence> <voice>
Children    <audio> <break> <emphasis> <enumerate> <mark> <phoneme> <prosody> <say-as> <value> <voice>
See Also
VoiceXML 2.0 Specification: <emphasis>
Related tags: <break>, <mark>, <paragraph>, <phoneme>, <prosody>, <say-as>, <sentence>, <voice>
<enumerate>
Generates audio output that enumerates the options in a field or the choices in a menu.
Syntax
<enumerate> Optional Template Content </enumerate>
Description
Automatically generates a description of acceptable input based on the template you provide. An <enumerate> element may be used in prompts and event handlers within <menu> elements and within <field> elements that contain <option> elements; an error.semantic event is thrown if it is used elsewhere.
When this tag is in a <menu> element or a prompt or event handler that is executed while a menu is active, it enumerates all <choice> elements within the active menu.
When this tag is in a <field> element or a prompt or event handler that is executed while a field is active, it enumerates all <option> elements within the active field.
Two special variables are available for use in template content for auto-generated text.
Variable    Meaning
_prompt    Prompt of current choice.
_dtmf    DTMF sequence assigned to current choice.
If this tag has no content, the generated text simply lists the prompts from the option or choice elements in the field or menu.
Usage
Parents    <audio> <bevocal:whisper> <block> <catch> <choice> <emphasis> <enumerate> <error> <field> <filled> <help> <if> <initial> <menu> <noinput> <nomatch> <p> <paragraph> <prompt> <prosody> <record> <s> <sentence> <subdialog> <transfer> <voice>
Children    <audio> <break> <emphasis> <enumerate> <mark> <p> <paragraph> <phoneme> <prosody> <s> <say-as> <sentence> <value> <voice>
See Also
VoiceXML 2.0 Specification: <enumerate>
Related tags: <menu>, <field>
Examples
If you run this example, the menu's prompt will be: Welcome to BeVocal. For the movie finder service, say movie finder. For the horoscopes service, say horoscopes. For the news desk service, say news desk.
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <menu>
    <prompt>
      Welcome to BeVocal.
      <enumerate>
        For the <value expr="_prompt"/> service, say <value expr="_prompt"/>
      </enumerate>
    </prompt>
    <choice accept="approximate" next="movies.vxml">movie finder</choice>
    <choice next="horoscopes.vxml">horoscopes</choice>
    <choice accept="approximate" next="news.vxml">news desk</choice>
  </menu>
</vxml>
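The _dtmf variable can be used in a template in the same way as _prompt. The sketch below assumes the menu's dtmf attribute is set to true, so that the interpreter assigns a key to each choice that has none of its own; the choice targets are the same placeholder documents used above.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- dtmf="true" asks the interpreter to assign keys to the choices -->
  <menu dtmf="true">
    <prompt>
      Welcome to BeVocal.
      <enumerate>
        <!-- _prompt is the choice text; _dtmf is its assigned key -->
        For <value expr="_prompt"/>, press <value expr="_dtmf"/>.
      </enumerate>
    </prompt>
    <choice next="movies.vxml">movie finder</choice>
    <choice next="horoscopes.vxml">horoscopes</choice>
    <choice next="news.vxml">news desk</choice>
  </menu>
</vxml>
```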
<error>
Catches an error event.
Syntax
<error count="integer" cond="js_expression" > Executable Content </error>
Description
Shorthand for <catch event="error">. Catches error events of all kinds.
Attribute    Description
count    Minimum number of times an error must have occurred during a form or menu invocation. Optional (default is 1).
cond    JavaScript expression that must also evaluate to true for an event to be caught. Optional (default is true).
If multiple error handlers are defined in, or inherited by, the element in which the error occurs, one handler is chosen based on event count, scope, and document order. See Chapter 3, Event Handling.
Tips:
Within an event handler, the _event variable contains the name of the event currently being handled; the _message variable contains the message string that provides additional information about the event. If no message was supplied when the event was thrown, the _message variable is undefined.
If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
See Also
VoiceXML 2.0 Specification: <error>
Related variables: _event, _message
Related tags: <catch>, <help>, <noinput>, <nomatch>, <rethrow>, <throw>
Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="foo">
    <error>
      <prompt> That code is bad! <value expr="code"/></prompt>
      <disconnect/>
    </error>
    <field name="code" type="digits">
      <prompt> Say your passcode now. </prompt>
      <filled>
        <!-- The < operator must be escaped as &lt; inside an attribute value. -->
        <if cond="code &lt; 100">
          <throw event="error"/>
        <else/>
          <prompt>Good passcode!</prompt>
        </if>
      </filled>
    </field>
    <block>
      <prompt>That was the last form item.</prompt>
      <disconnect/>
    </block>
  </form>
</vxml>
<example>
New in VoiceXML 2.0. XML grammar element with an example phrase that matches the containing grammar rule.

Syntax
<example>
  Example Input
</example>

Description
This tag is used to define a grammar in the XML form of the W3C Speech Recognition Grammar Format. An <example> element encloses a sequence of tokens corresponding to user input that matches the containing rule. It is illustrative, for the benefit of a developer reading the grammar; the speech-recognition engine ignores the element.

Usage
Parents     <rule>
Children    None.

See Also
Speech Recognition Grammar Specification: <example>
Chapter 4, XML Speech Grammar Format in the Grammar Reference.

Examples
<rule id="snapper" scope="public">
  <example>red snapper</example>
  <example>mutton snapper</example>
  <ruleref uri="#snapperType"/>
  <token>Snapper</token>
</rule>
<exit>
Exits a session.

Syntax
<exit
    expr
    namelist />

Description
Unloads all documents and returns control to the interpreter's execution environment.

Attribute   Description
expr        Not implemented. JavaScript expression that evaluates to the value to return to the
            execution environment. Optional (as alternative to namelist).
namelist    Not implemented. Space-separated list of variable names to return to the interpreter
            execution environment. Optional (default is to return nothing).

The expr and namelist attributes are not meaningful because the BeVocal interpreter execution context does not accept return values from the execution of a VoiceXML document.

Usage
Parents     <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
Children    None.

See Also
VoiceXML 2.0 Specification: <exit>
Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="foo">
    <catch event="connection">
      <!-- expr takes a JavaScript expression, so the text is a quoted string. -->
      <log expr="'user said disconnect'"/>
    </catch>
    <field name="choose">
      <grammar type="application/x-nuance-gsl">
        [ exit disconnect ]
      </grammar>
      <prompt>Please say exit</prompt>
    </field>
    <block>
      <if cond="choose=='exit'">
        <exit/>
        <prompt> you should NOT hear this prompt! </prompt>
      </if>
      <disconnect/>
    </block>
  </form>
</vxml>
<field>
Declares an input field in a form.

Syntax
<field
    name="string"
    expr="js_expression"
    cond="js_expression"
    type="boolean"|"currency"|"date"|"digits"|"number"|"phone"|"time"
    slot="string"
    modal="true"|"false"
    bevocal:urlexpr="js_expression" >
  Child Elements
</field>

Description
Input item that prompts the user for a value that matches a particular grammar.

Attribute        Description
name             Name of the input variable that will hold the recognition result. The variable
                 name may not be a JavaScript reserved keyword. The input variable has dialog
                 (form) scope; its name must be unique among all VoiceXML and JavaScript
                 variables within the form's scope.
expr             JavaScript expression that assigns the initial value of the input variable for
                 this field. Optional (default is undefined). If you set the input variable to a
                 value other than undefined, you'll need to clear it before the field can execute.
cond             JavaScript boolean expression that also must evaluate to true for the field to
                 execute. Optional (default is true). If not specified, the value of the input
                 variable alone determines whether or not the field can execute.
type             Specifies a built-in grammar. Optional (as alternative to <grammar> element).
                 boolean: Grammar for recognizing positive and negative responses. Returns true
                   for yes and false for no.
                 currency: Grammar for recognizing amounts of money, in dollars (Not implemented:
                   International currency designation). Returns a string with format mm.nn.
                 date: Grammar for recognizing dates. Returns a string with format yyyymmdd;
                   ???? is used for an unknown year and ?? is used for an unknown month or day.
                 digits: Limited grammar for recognizing a sequence of digits. Returns a string
                   of digits.
                 number: More general grammar for recognizing numbers. Returns a string that
                   could include digits, a decimal point, or a positive or negative sign.
                 phone: Grammar for recognizing a telephone number adhering to the North American
                   Dialing Plan (with no extension). Returns a sequence of digits.
                 time: Grammar for recognizing a time. Returns a string with format hhmmx, where
                   x is one of: a for AM, p for PM, h for 24 hour notation, or ? for an ambiguous
                   time (could be AM or PM).
slot             If this field is part of a mixed-initiative dialog, the name of the grammar slot
                 that will be used to assign a value to the input variable for this field.
                 Optional (defaults to variable name). This attribute is ignored by a field-level
                 grammar. See the Grammar Reference for more information on grammar slots.
modal            Boolean value that must be true to temporarily turn off higher level grammars.
                 Optional (default is false). Lets you alter default behavior so that only this
                 field's grammars are active while the field executes.
bevocal:urlexpr  JavaScript expression that evaluates to the URI of an audio file or a local audio
                 variable. The local audio variable can be from a <record> tag or can be a shadow
                 variable that contains audio. Optional. Lets you perform recognition against
                 input from the specified audio file or from a local audio variable.
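As a minimal sketch of how cond and modal shape a dialog, the fragment below asks a second question only when the first was answered yes. The form and field names (order, wants_receipt, zip) and the prompts are hypothetical, chosen only for illustration.

```xml
<form id="order">
  <field name="wants_receipt" type="boolean">
    <prompt>Would you like a receipt?</prompt>
  </field>
  <!-- cond: the field executes only if the caller answered yes.
       modal: only this field's grammars are active while it executes. -->
  <field name="zip" type="digits" cond="wants_receipt == true" modal="true">
    <prompt>Say the five digits of your zip code.</prompt>
  </field>
</form>
```

If the caller answers no, the guard on zip is never satisfied, so the form completes without visiting that field.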
Properties of the Shadow Variable. Corresponding to the input variable name is a shadow variable called name$. After the input variable is filled, some additional information is available in the following properties of this shadow variable:

Property        Description
confidence      The recognition confidence level (with 0.0 representing the lowest confidence and
                1.0 representing the highest).
utterance       A string representation of the words actually spoken by the user.
inputmode       The mode in which input was provided, one of voice or dtmf.
audio           If the bevocal.audio.capture or recordutterance property is set to true and the
                user's speech matched the field grammar, the audio property contains an audio
                capture of the user's speech. You can send the captured audio to a server using
                a <data> element. Doing so is useful if you need a record of the user's speech
                for legal reasons.
recording       If the recordutterance property is set to true and the user's speech matched the
                field grammar, the recording property contains an audio capture of the user's
                speech. If no audio is collected, this variable is undefined. You can send the
                captured audio to a server using a <data> element. Doing so is useful if you need
                a record of the user's speech for legal reasons.
recordingsize   The size of the recording in bytes, or undefined if no audio is collected.
recordduration  The duration of the recording in milliseconds, or undefined if no audio is
                collected.
For a field whose name is name, you access the property propName of the shadow variable with the syntax:

    name$.propName

For example, you access the confidence property for the color field as:

    color$.confidence

Built-in Types: Parameters. Some built-in field types can be parameterized to affect the built-in type's behavior. The following table shows the types that can be parameterized and indicates which input parameters can be specified.

Field Type  Input Parameters
boolean     y: The DTMF keypress for an affirmative response.
            n: The DTMF keypress for a negative answer.
digits      minlength: The minimum number of digits in a valid response.
            maxlength: The maximum number of digits in a valid response.
            length: The exact number of digits in a valid response.
            Note: If you do not specify length or maxlength, the built-in grammar accepts any
            number of digits. You should use length or maxlength whenever you can; doing so
            usually results in more accurate speech recognition.

A parameter is specified within the type attribute with syntax of the form:

    typeName?parameter=value

For example, the following element specifies a field of type digits that must contain exactly five digits:

    <field name="mydigits" type="digits?length=5">

More than one parameter may be specified; the parameters are separated by semicolons.

Note: The interpreter throws an error.badfetch event if it retrieves a VoiceXML file that contains a field type with inconsistent parameters, such as:

    <field ... type="digits?minlength=5;maxlength=1">

Built-in Types: Output. In VoiceXML 2.0, the field's type attribute no longer defines an implicit <say-as> type to output the field's value. Instead, the interpreter plays the value in normal TTS. You must now use the type attribute of <say-as> for a type-specific readout of the value.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
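The pieces described above (a parameterized built-in type, the shadow variable, and an explicit <say-as> readout) can be combined as in the following sketch. The form and field names (zipcode, zip) and the 0.5 confidence threshold are assumptions made for illustration, not recommended values.

```xml
<form id="zipcode">
  <field name="zip" type="digits?length=5">
    <prompt>Please say your five digit zip code.</prompt>
    <filled>
      <!-- zip$ is the shadow variable of the zip field;
           the less-than sign is escaped as &lt;. -->
      <if cond="zip$.confidence &lt; 0.5">
        <prompt>I am not sure I heard that correctly.</prompt>
        <clear namelist="zip"/>
      <else/>
        <prompt>
          You entered
          <say-as type="number:digits"><value expr="zip"/></say-as>
          using <value expr="zip$.inputmode"/> input.
        </prompt>
      </if>
    </filled>
  </field>
</form>
```

Clearing the input variable in the low-confidence branch makes the field eligible to execute again, so the caller is asked a second time.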
Usage
Parents     <form>
Children    <audio> <catch> <enumerate> <error> <filled> <grammar> <help> <link> <noinput>
            <nomatch> <option> <prompt> <property> <value>

Exceptions
error.semantic - Thrown if no grammars are active.

See Also
VoiceXML 2.0 Specification: <field>
Grammar Reference
Related tags: <filled>, <grammar>
Examples
Example 1: no type

<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="foo">
    <field name="name">
      <prompt> What is your favorite color? </prompt>
      <grammar type="application/x-nuance-gsl">
        [ red green yellow blue orange ]
      </grammar>
      <filled>
        <prompt> Your favorite color is <value expr="name"/> </prompt>
      </filled>
    </field>
  </form>
</vxml>

Example 2: boolean type

<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="form1">
    <field name="new_file" type="boolean">
      <prompt> Do you want to play the game again? </prompt>
      <filled>
        <prompt> Your answer is <value expr="new_file"/> </prompt>
      </filled>
    </field>
  </form>
</vxml>

Example 3: date type

<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="form1">
    <field name="today" type="date">
      <prompt>
        What date is today? Please say or enter month day and year.
      </prompt>
      <filled>
        <prompt>
          Your answer is
          <say-as type="date"><value expr="today"/></say-as>
        </prompt>
      </filled>
    </field>
  </form>
</vxml>

Example 4: digits type

<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="form1">
    <field name="card_num" type="digits">
      <prompt>
        Please say or enter the last four digits of your credit card.
      </prompt>
      <filled>
        <if cond="card_num.length != 4">
          <prompt> Sorry, I didn't hear exactly four digits. </prompt>
          <clear/>
          <reprompt/>
        <else/>
          <prompt>
            The number you entered is
            <say-as type="number:digits">
              <value expr="card_num"/>
            </say-as>
          </prompt>
        </if>
      </filled>
    </field>
  </form>
</vxml>

Example 5: currency type

<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="form1">
    <field name="ticket_cost" type="currency">
      <prompt> Please say the cost of your ticket. </prompt>
      <filled>
        <prompt>
          The cost you entered is
          <say-as type="currency"><value expr="ticket_cost"/></say-as>
        </prompt>
      </filled>
    </field>
  </form>
</vxml>

Example 6: number type

<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="form1">
    <field name="num_count" type="number">
      <prompt>
        Please say the number of computers you want to order.
      </prompt>
      <filled>
        <prompt>The number you entered is <value expr="num_count"/></prompt>
      </filled>
    </field>
  </form>
</vxml>

Example 7: phone type

<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="form1">
    <field name="phone_num" type="phone">
      <prompt> Please say or enter your phone number. </prompt>
      <filled>
        <prompt>
          The phone number you entered is
          <say-as type="telephone"><value expr="phone_num"/></say-as>
        </prompt>
      </filled>
    </field>
  </form>
</vxml>

Example 8: time type

<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="form1">
    <field name="meeting_time" type="time">
      <prompt> Please say the meeting time. </prompt>
      <filled>
        <prompt>
          The meeting time is
          <say-as type="time"><value expr="meeting_time"/></say-as>
        </prompt>
      </filled>
    </field>
  </form>
</vxml>
<filled>
Contains actions to be executed when fields are filled.

Syntax
<filled
    mode="any"|"all"
    namelist="variable1 ..." >
  Child Elements
</filled>

Description
A <filled> element can be either the child of an input item or the child of a form. When used as the child of an input item, the <filled> element has no attributes; its action is taken when the last user input fills the input variable of the containing element. When used as the child of a form, the <filled> element may have attributes that specify when its action is taken.

Attribute   Description
mode        A value of any causes execution of this element when the last user input fills any
            one of the specified input variables. A value of all causes execution of this
            element when all specified input variables are filled; the last user input must have
            filled at least one of the fields. Optional (default value is all).
namelist    Space-separated list of names of the input variables whose filling can trigger this
            element. Note that names of control items are not permitted in this list. Optional
            (default is all input variables in the form).

It is an error to specify attributes in a <filled> element within an input item.
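A form-level <filled> element with attributes might look like the sketch below, which checks two fields against each other once both hold a value. The form name, field names, city list, and prompts are hypothetical.

```xml
<form id="trip">
  <field name="origin">
    <grammar type="application/x-nuance-gsl"> [ boston chicago denver ] </grammar>
    <prompt>Where are you leaving from?</prompt>
  </field>
  <field name="destination">
    <grammar type="application/x-nuance-gsl"> [ boston chicago denver ] </grammar>
    <prompt>Where are you going?</prompt>
  </field>
  <!-- Child of the form, so attributes are allowed. With mode="all" the
       action runs only when both listed input variables are filled. -->
  <filled mode="all" namelist="origin destination">
    <if cond="origin == destination">
      <prompt>The two cities must be different.</prompt>
      <clear namelist="origin destination"/>
    </if>
  </filled>
</form>
```

With mode="any" instead, the same action would run as soon as either field was filled, which is rarely what a cross-field check wants.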
Usage
Parents     <bevocal:listen> <bevocal:register> <bevocal:verify> <field> <form> <record>
            <subdialog> <transfer>
Children    <audio> <assign> <bevocal:connect> <bevocal:dial> <bevocal:disconnect>
            <bevocal:foreach> <bevocal:hold> <bevocal:whisper> <clear> <data> <disconnect>
            <enumerate> <exit> <goto> <if> <log> <prompt> <reprompt> <rethrow> <return> <script>
            <send> <subdialog> <submit> <throw> <var> <value>

See Also
VoiceXML 2.0 Specification: <filled>
Related tag: <field>

Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="foo">
    <field name="name">
      <prompt> What is your favorite color? </prompt>
      <grammar type="application/x-nuance-gsl">
        [ red green yellow blue orange ]
      </grammar>
      <filled>
        <prompt> Your favorite color is <value expr="name"/> </prompt>
      </filled>
    </field>
  </form>
</vxml>
<foreach>
Iterates over the elements of an array.

Syntax
<foreach
    item="string"
    array="string" >
  Executable Content
</foreach>

Description
The interpreter sets the iteration variable to each element of the array, in turn, and executes the contained elements once for each setting of the iteration variable.

Attribute   Description
item        Name of the iteration variable, which may not be a JavaScript reserved keyword. The
            scope of the iteration variable is the <foreach> element.
array       The name of a variable whose value is an array.

Usage
Parents     <block> <catch> <error> <filled> <foreach> <help> <if> <noinput> <nomatch>
Children    <assign> <audio> <bevocal:connect> <bevocal:dial> <bevocal:disconnect> <bevocal:hold>
            <bevocal:whisper> <clear> <data> <disconnect> <exit> <foreach> <goto> <if> <log>
            <prompt> <reprompt> <rethrow> <return> <script> <send> <subdialog> <submit> <throw>
            <value> <var>
See Also
None

Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="form1">
    <block>
      <!-- Array elements are JavaScript string literals, so they are quoted. -->
      <var name="serviceNames" expr="new Array('news', 'weather', 'traffic')"/>
      <prompt>Here's a list of services </prompt>
      <foreach item="service" array="serviceNames">
        <prompt><value expr="service"/> <break size="small"/></prompt>
      </foreach>
    </block>
  </form>
</vxml>
<form>
Presents information and collects data.

Syntax
<form
    id="string"
    scope="document"|"dialog" >
  Child Elements
</form>

Description
A dialog for collecting user input. Note that when you reach the end of a form, execution does not proceed to the next form on the page. If you do not explicitly transition to another dialog using <goto> or <submit>, then the application terminates when the form completes.

When execution leaves a form, all variables whose scope is that form go away. If you want to retain the value of a form-level variable after leaving the form, you must copy the value into a variable with document or application scope.

Attribute   Description
id          The name of the form.
scope       Sets the default scope of the form's grammars. Optional (default is dialog).
            document: The form's grammars are active throughout the current document. If the
              document is the application root document, then they are active throughout the
              application (application scope).
            dialog: By default, the form's grammars have dialog scope, which means they will be
              active only in this form.
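One way to keep a form-level value after leaving the form is sketched below: the field's value is copied into a document-scope variable before the transition. The names savedColor, ask, and confirm are hypothetical.

```xml
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Declared as a child of <vxml>, so it has document scope and
       survives the transition between the two forms. -->
  <var name="savedColor"/>
  <form id="ask">
    <field name="color">
      <grammar type="application/x-nuance-gsl"> [ red green blue ] </grammar>
      <prompt>What is your favorite color?</prompt>
      <filled>
        <!-- Copy the form-scoped value before leaving the form. -->
        <assign name="savedColor" expr="color"/>
        <goto next="#confirm"/>
      </filled>
    </field>
  </form>
  <form id="confirm">
    <block>
      <prompt>You chose <value expr="savedColor"/>.</prompt>
    </block>
  </form>
</vxml>
```

Without the <assign>, the color variable would no longer exist by the time the confirm form runs.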
Usage
Parents     <vxml>
Children    <bevocal:listen> <bevocal:register> <bevocal:verify> <block> <catch> <data> <error>
            <field> <filled> <grammar> <help> <initial> <link> <noinput> <nomatch> <property>
            <record> <script> <subdialog> <transfer> <var>

Exceptions
error.semantic - Thrown if no grammars are active when input is expected in a form.

See Also
VoiceXML 2.0 Specification: <form>
Related tags: <field>, <block>, <record>, <transfer>, <subdialog>, <initial>, <grammar>
Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="welcome">
    <block>You are in Form One. Welcome back home! </block>
    <field name="hello">
      <grammar type="application/x-nuance-gsl">
        [next dtmf-1 start]
      </grammar>
      <prompt>
        Say next or press one to go to Form 2 or start to start over again.
      </prompt>
      <filled>
        <!-- Recognition results are strings, so they are compared
             with quoted literals. -->
        <if cond="hello=='next' || hello=='dtmf-1'">
          <goto next="#comeagain"/>
        <else/>
          <goto next="#welcome"/>
        </if>
      </filled>
    </field>
  </form>
  <form id="comeagain">
    <block>You are now in Form 2.</block>
    <field name="goback" type="boolean">
      <grammar type="application/x-nuance-gsl">
        [back dtmf-2 continue]
      </grammar>
      <prompt>
        Say back or press two to go to Form 1 or say continue to enter this form again.
      </prompt>
      <filled>
        <if cond="goback=='back' || goback=='dtmf-2'">
          <prompt>Thanks for stopping by Form 2. Please come again.</prompt>
          <goto next="#welcome"/>
        <else/>
          <goto next="#comeagain"/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
<goto>
Goes to another location in the same or a different document.

Syntax
<goto
    next="URI"
    expr="js_expression"
    expritem="js_expression"
    nextitem="string"
    fetchhint="prefetch"|"safe"
    fetchtimeout="time_interval"
    fetchaudio="URI"
    maxage="time_interval"
    maxstale="time_interval" />

Description
Transitions to another item in the same form, to another dialog in the same document, or to a different document.

Except when the transition is to another item in the same form, the transition will cause all values stored in the dialog's variables to be lost. This is true even if you transition into the same dialog as you were in before.

The VoiceXML interpreter clears its prompt queue when transitioning to another form item. You will not be able to barge in during prompts played in the execution of a <goto> tag.

Attribute     Description
next          The URI to go to next. Optional (as alternative to expr, expritem, nextitem).
expr          JavaScript expression that evaluates to the URI to go to next. Optional (as
              alternative to next, expritem, nextitem).
expritem      JavaScript expression that evaluates to the name of the item in the current form
              to visit next. Optional (as alternative to next, expr, nextitem).
nextitem      Name of the item in the current form to visit next. Optional (as alternative to
              next, expr, expritem).
fetchhint     Specifies whether the interpreter can attempt to optimize dialog interpretation by
              prefetching the resource. See Prefetching Resources on page 40. Optional.
fetchtimeout  Specifies the interval to wait for the resource to be returned before throwing an
              error.badfetch event. See Handling Fetching Delays on page 42. Optional.
fetchaudio    Specifies the URI of background audio to be played during fetching. Note that this
              attribute and related properties affect whether queued prompts are played first.
              See Background Audio on page 42 for important details. Optional.
maxage        New in VoiceXML 2.0. Specifies the maximum acceptable age, in seconds, of the
              cached resource. See Maximum Age on page 44. Optional.
maxstale      New in VoiceXML 2.0. Specifies the maximum acceptable time, in seconds, during
              which an expired cached resource can still be used. See Maximum Stale Time on
              page 44. Optional.

One and only one of the attributes next, expr, nextitem, and expritem must be specified.
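A transition to another item in the same form, which keeps the form's variables, can be sketched with nextitem as follows. The form and item names (survey, has_pet, pet_kind, goodbye) are hypothetical.

```xml
<form id="survey">
  <field name="has_pet" type="boolean">
    <prompt>Do you have a pet?</prompt>
    <filled>
      <if cond="!has_pet">
        <!-- Stay in this form but skip ahead to the item named goodbye;
             form variables keep their values. -->
        <goto nextitem="goodbye"/>
      </if>
    </filled>
  </field>
  <field name="pet_kind">
    <grammar type="application/x-nuance-gsl"> [ dog cat bird ] </grammar>
    <prompt>What kind of pet do you have?</prompt>
  </field>
  <block name="goodbye">
    <prompt>Thank you for your answers.</prompt>
    <!-- End the call here; otherwise the still-unfilled pet_kind field
         would be selected after this block completes. -->
    <disconnect/>
  </block>
</form>
```

Exactly one of the four target attributes is used here, as the rule above requires.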
Usage
Parents     <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
Children    None.

See Also
VoiceXML 2.0 Specification: <goto>
Related tag: <submit>

Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <var name="something" expr="500"/>
  <form id="form1">
    <block>
      <var name="something" expr="6000"/>
      <prompt>
        Something is <value expr="something"/>
        Going to the second dialog.
      </prompt>
      <goto next="#form2"/>
      <prompt>goto failed</prompt>
    </block>
  </form>
  <form id="form2">
    <block>
      <prompt>
        You are now in the second dialog.
        Something is now <value expr="something"/>
        Thank you.
      </prompt>
    </block>
  </form>
</vxml>
<grammar>
Specifies a speech-recognition grammar.

Syntax
<grammar
    src="URI"
    srcexpr="js_expression"
    expr="js_expression"
    mode="voice"|"dtmf"
    root="string"
    tag-format="URI"
    version="version_number"
    xml:base="URI"
    xml:lang="lang"
    xmlns="URI"
    xmlns:xsi="URI"
    xsi:schemaLocation="URI"
    scope="document"|"dialog"
    type="MIME_type"
    universal="string"
    fetchhint="prefetch"|"safe"
    fetchtimeout="time_interval"
    maxage="time_interval"
    maxstale="time_interval"
    weight="N" >
  Optional Inline Grammar
</grammar>

Attributes                                          Where used
src, srcexpr, expr                                  Only to point to an external or built-in grammar
mode, root, tag-format, version, xml:base,          Only for an XML grammar; allowed inline and in a
  xml:lang                                            grammar file
xmlns, xmlns:xsi, xsi:schemaLocation                Only in an external XML grammar file
scope, type, universal                              Allowed for all grammars, except references to
                                                      built-in grammars
fetchhint, fetchtimeout, maxage, maxstale           Allowed for external grammars
weight                                              Not implemented

Description
Specifies a grammar to be used within some VoiceXML tag such as <field>, <form>, or <link>. When the speech-recognition engine detects a match with a grammar, it may cause a transition to another dialog (if the grammar is in a menu item or link) or assign a return value to an input variable (if the grammar is in a form or input item).

The <grammar> element serves two primary purposes:

To define the grammar or point to a predefined grammar. The <grammar> element can either contain the entire grammar definition directly, point to an external grammar file in one of several formats, or point to a built-in grammar. If the grammar definition is inline and is in the XML format, then the <grammar> element allows extra attributes to support the grammar definition; these attributes are ignored for an inline grammar in any other format and they are ignored for all external grammars.
To specify aspects of how to use the grammar in the particular containing VoiceXML element. For this purpose, the <grammar> element supports a set of attributes you can use with any application grammar. You can use these attributes for any grammar format and for both external and internal grammars. You cannot use these attributes for built-in grammars.

Attribute           Description
src                 URI of the grammar specification, when it is contained in an external file.
                    Optional (as an alternative to an inline grammar or expr). If you specify
                    this attribute, the element cannot have content. You specify a grammar with
                    a URI of one of the following forms:
                    Grammar in any grammar file, specifying a rule to start recognition from:
                      GrammarFileURI#RuleName
                    Grammar in any grammar file, using the grammar's root rule:
                      GrammarFileURI
                    Grammar in a GSL compiled grammar file (the file identifies its root rule):
                      compiled:grammar/key
                    Built-in grammar:
                      builtin:grammar/type
                    Built-in grammar with parameters:
                      builtin:grammar/type?parameters
                    An error.badfetch event is thrown if the specified file cannot be found or
                    if an inline grammar is also specified.
srcexpr             JavaScript expression that evaluates to the grammar file URI. Optional (as
                    alternative to src or an inline grammar). If you specify this attribute, the
                    element cannot have content. You can specify either srcexpr or expr, but not
                    both. If you specify both, a parse error is thrown.
expr                Extension. JavaScript expression that evaluates to the grammar file URI.
                    Optional (as alternative to src or an inline grammar).
                    Note: The expr attribute has been replaced by the srcexpr attribute and may
                    be deprecated in a future release.
                    If you specify this attribute, the element cannot have content. You can
                    specify either srcexpr or expr, but not both. If you specify both, a parse
                    error is thrown.
mode                New in VoiceXML 2.0. Whether the grammar is for speech or DTMF input.
                    Optional (default is voice). Valid values are:
                    voice: Spoken input
                    dtmf: DTMF input
                    Used to define a grammar in the XML grammar format. For an inline XML
                    grammar, the attribute occurs in the VoiceXML document. For an external XML
                    grammar file, the attribute occurs in the grammar file and is ignored if
                    present in the VoiceXML document. For a grammar of any other format, this
                    attribute is always ignored.
root                New in VoiceXML 2.0. Specifies the name of the root grammar rule. Required
                    for an inline XML grammar; optional for an external XML grammar.
                    Used to define a grammar in the XML grammar format. For an inline XML
                    grammar, the attribute occurs in the VoiceXML document. For an external XML
                    grammar file, the attribute occurs in the grammar file and is ignored if
                    present in the VoiceXML document. For a grammar of any other format, this
                    attribute is always ignored.
tag-format          New in VoiceXML 2.0. Indicates the content type of enclosed <tag> elements.
                    Optional. The only valid value is semantics/1.0. Currently, this attribute
                    is ignored.
                    Used to define a grammar in the XML grammar format. For an inline XML
                    grammar, the attribute occurs in the VoiceXML document. For an external XML
                    grammar file, the attribute occurs in the grammar file and is ignored if
                    present in the VoiceXML document. For a grammar of any other format, this
                    attribute is always ignored.
version             New in VoiceXML 2.0. Version of the grammar format. Required for any XML
                    grammar. The only allowed version is 1.0.
                    Used to define a grammar in the XML grammar format. For an inline XML
                    grammar, the attribute occurs in the VoiceXML document. For an external XML
                    grammar file, the attribute occurs in the grammar file and is ignored if
                    present in the VoiceXML document. For a grammar of any other format, this
                    attribute is always ignored.
xml:base            New in VoiceXML 2.0. Base URI. Optional.
                    Used to define a grammar in the XML grammar format. For an inline XML
                    grammar, the attribute occurs in the VoiceXML document. For an external XML
                    grammar file, the attribute occurs in the grammar file and is ignored if
                    present in the VoiceXML document. For a grammar of any other format, this
                    attribute is always ignored.
xml:lang            New in VoiceXML 2.0. The language and optional country locale identifier for
                    the grammar. Optional (default is en-US). The accepted language identifiers
                    are:
                    en: English
                    en-US: United States English
                    es: Spanish
                    es-US: United States Spanish
                    fr-ca: French Canadian
                    If an unsupported language is specified, an error.unsupported.language event
                    is thrown.
                    Used to define a grammar in the XML grammar format. For an inline XML
                    grammar, the attribute occurs in the VoiceXML document. For an external XML
                    grammar file, the attribute occurs in the grammar file and is ignored if
                    present in the VoiceXML document. For a grammar of any other format, this
                    attribute is always ignored.
xmlns               New in VoiceXML 2.0. Indicates the grammar namespace. Required in an
                    external XML grammar file. Must never occur in a VoiceXML document. The
                    value must be http://www.w3.org/2001/06/grammar.
xmlns:xsi           New in VoiceXML 2.0. Indicates the XML Schema instance namespace. Optional
                    in an external XML grammar file. Must never occur in a VoiceXML document.
                    The only legal value is http://www.w3.org/2001/XMLSchema-instance.
xsi:schemaLocation  New in VoiceXML 2.0. Indicates the location of the grammar schema. Optional
                    in an external XML grammar file. Must never occur in a VoiceXML document.
                    The only legal value is
                    "http://www.w3.org/2001/06/grammar
                    http://www.w3.org/TR/speech-grammar/grammar.xsd"
scope               Sets the scope of the grammar. Optional (default is dialog).
                    document: The grammar will be active throughout the current document. If the
                      document is the application root document, then it will be active
                      throughout the application (application scope).
                    dialog: The grammar is active throughout the current form.
                    Note: A <grammar> element can include a scope attribute only if its parent
                    is a <form> element. The scope of any other <grammar> element is determined
                    by its parent:
                    If the parent is an input item, the grammar has field scope.
                    If the parent is a link, the scope is the element that contains the link.
                    If the parent is a menu choice, the grammar scope is specified by the scope
                      property of the containing <menu> element (or dialog scope by default).
                    Allowed on a <grammar> element in a VoiceXML document; not allowed in an
                    external XML grammar file.
type                MIME type of the grammar. Optional. The currently supported types are:
                    application/srgs+xml: XML Speech Grammar
                    application/grammar+xml: XML Speech Grammar (Deprecated; support for this
                      value will be removed from a future release.)
                    application/srgs: ABNF Speech Grammar
                    application/grammar: ABNF Speech Grammar (Deprecated; support for this value
                      will be removed from a future release.)
                    application/x-nuance-gsl: Nuance GSL
                    application/x-gsl: Nuance GSL (Deprecated; support for this value will be
                      removed from a future release.)
                    application/x-nuance-dynagram-binary: Nuance Grammar Object
                    If you specify an unsupported type, an error is thrown.
                    For external grammars, the default type is taken from the Content-type
                    header of the returned file. If not present, the type is inferred from the
                    URL's extension or from the contents of the grammar (for example, a file
                    beginning with <?xml maps to application/srgs+xml). The recognized
                    extensions are:
                    .grxml, .xml: XML Speech Grammar
                    .gram: ABNF Speech Grammar
                    .gsl, .grammar: Nuance GSL
                    .ngo: Nuance Grammar Object
                    .jsgf: Java Speech Grammar Format
                    For internal grammars, if the grammar definition specifies the grammar type
                    (either directly with a declaration or indirectly by containing XML
                    elements), the interpreter uses that type. If the grammar definition doesn't
                    indicate the type, the interpreter uses the value of the type attribute, if
                    present. Otherwise, the interpreter assumes that the grammar is in GSL
                    format.
                    Allowed on any <grammar> element in a VoiceXML document; not allowed in an
                    external XML grammar file.
universal           Extension. Makes this grammar a universal grammar with the specified name so
                    that it can be activated and deactivated using the universals property. This
                    attribute does not affect the scope of the grammar; it simply assigns it to
                    a universal category.
                    Allowed on any <grammar> element in a VoiceXML document; not allowed in an
                    external XML grammar file.
fetchhint
Specifies whether the interpreter can attempt to optimize dialog interpretation by prefetching the resource. See Prefetching Resources on page 40. Optional. Allowed only on a <grammar> element pointing to an external grammar.
fetchtimeout
Specifies the interval to wait for the resource to be returned before throwing an error.badfetch event. See Handling Fetching Delays on page 42. Optional. Allowed only on a <grammar> element pointing to an external grammar.
maxage
New in VoiceXML 2.0. Specifies the maximum acceptable age, in seconds, of the cached resource. See Maximum Age on page 44. Optional. Allowed only on a <grammar> element pointing to an external grammar.
maxstale
New in VoiceXML 2.0. Specifies the maximum acceptable time, in seconds, during which an expired cached resource can still be used. See Maximum Stale Time on page 44. Optional. Allowed only on a <grammar> element pointing to an external grammar.
weight
New in VoiceXML 2.0. Not implemented. The weight of the grammar.
(VoiceXML 1.0 only) The following attributes can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.
Attribute  Description
caching
VoiceXML 1.0 only. Specifies the caching policy for the resource being fetched. See Caching on page 45. Optional. Used in place of the VoiceXML 2.0 attributes maxage and maxstale.
Multiple grammars may be active at the same time. When the speech-recognition engine detects a match from a higher level, control is passed to that grammar's parent element.
Grammar Formats. BeVocal VoiceXML supports the W3C specification, Speech Recognition Grammar Format, which defines an XML Grammar and an ABNF Grammar. The VoiceXML 2.0 specification defines the XML grammar as a "must support" and the ABNF Grammar as a "should support". It is expected that many VoiceXML developers will use ABNF because it is more readable and concise. BeVocal VoiceXML additionally supports the Java Speech Grammar Format (JSGF, used in the examples of the VoiceXML 1.0 draft) and the Nuance GSL Grammar format. See the following for details:
  Examples on page 219 has examples of every format.
  The Grammar Reference details syntax and usage of the various supported grammar formats.
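As a rough sketch of how the scope and fetching attributes described above fit together, the following document gives a form-level grammar document scope and points it at an external XML grammar. The file name commands.grxml and the timing values are hypothetical and chosen only for illustration.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="main">
    <!-- scope is allowed here because the parent is a <form>;
         "document" keeps the grammar active throughout this document. -->
    <!-- commands.grxml is a hypothetical external grammar file. The fetch
         attributes are allowed because the grammar is external. -->
    <grammar src="commands.grxml"
             type="application/srgs+xml"
             scope="document"
             fetchtimeout="10s"
             maxage="60"
             maxstale="0"/>
    <field name="pin">
      <!-- No scope attribute: a grammar whose parent is an input item
           has field scope. -->
      <grammar src="builtin:grammar/digits?length=4"/>
      <prompt>What is your pin?</prompt>
    </field>
  </form>
</vxml>
```

Because the type attribute is given explicitly, this sketch does not depend on the Content-type header or the file extension to identify the grammar format.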
Compiled Grammars. You may refer to a compiled GSL grammar using the src attribute of the <grammar> tag; for example:
  <grammar src="compiled:/key"/>
However, if you want to combine a compiled grammar in a more complex way with another grammar, you can refer to a compiled grammar from within an ABNF grammar:
  <grammar>
    <![CDATA[
      #ABNF 1.0 en-US;
      root $script;
      $script = prescription $<compiled:/key>;
    ]]>
  </grammar>
Using the Grammar Compiler tool on the BeVocal Café, you can compile a grammar and receive an associated key. For information on compiling grammars, see Chapter 7, Grammar Compiler in Using the BeVocal Café Tools.
Built-in Grammars. The VoiceXML 2.0 specification (Appendix P) includes a set of built-in grammars as a convenience to enable developers to get started writing more complex VoiceXML applications quickly. Some of the basic built-in grammars, such as date and time, are difficult to write and tune by hand but are very useful in many applications. The basic built-in grammars are:
Grammar Type  Description
boolean       Recognizes a positive or negative response.
currency      Recognizes an amount of money, in dollars.
date          Recognizes a calendar date.
digits        Recognizes a sequence of digits.
number        Recognizes a number.
phone         Recognizes a telephone number adhering to the North American Dialing Plan (with no extension).
time          Recognizes a clock time.
Note: All standard built-in grammars are supported in the Spanish language. Currently there are no extended built-in grammars for Spanish. Neither standard nor extended built-in grammars are currently supported in French Canadian. All extended built-in grammars are supported only in English. If you specify a language other than English and refer to an unsupported built-in grammar, a parse error error.unsupported.builtin is thrown.
In addition, BeVocal VoiceXML contains a set of extended built-in grammars, so that VoiceXML developers can reference these quite complex grammars, which have been tuned over the years by caller usage. The extended grammars are:
Grammar Type   Description
airport        Recognizes an airport name or code, such as DFW or Dallas-Fort Worth.
airline        Recognizes an airline name or code, such as AA or American Airlines.
datetime       Recognizes a date and time. Please contact your BeVocal sales representative or sales@bevocal.com for further information on pricing and availability.
equity         Recognizes a company symbol, such as IBM or CSCO.
citystate      Recognizes US city and state names, for example, Sunnyvale, California.
stockindex     Recognizes the names of the major US stock indexes, such as Nasdaq.
street         Recognizes a street name (with or without street number).
streetaddress  Recognizes a street name and street number.
Note that built-in grammars can be used wherever other grammars are used, commonly either in fields or in forms. A built-in grammar is specified using the syntax:
  <grammar src="builtin:grammar/type"/>
For example, the phone type is specified using:
  <grammar src="builtin:grammar/phone"/>
Most built-in grammars, such as phone, fill a single slot whose name is the same as the grammar name. For example, the date grammar fills a slot named date. However, some of the more complex grammars fill multiple slots; for example, airline fills code and name.
Built-in Grammars: Fields and Forms. If the built-in grammar fills a single slot and is used as a field grammar, the input variable is set to the slot value. If the built-in grammar fills multiple slots:
  If the built-in grammar is used as a form grammar, any field whose name or slot attribute matches one of the grammar's slot names has its input variable set to the slot value.
  If the built-in grammar is used as a field grammar, the slot names and values returned are converted into properties and values on the field's input variable. More technically, the input variable is set to a JavaScript object with properties that can be accessed once the value of the field has been filled.
For more details, see Chapter 1, Using VoiceXML Grammars in the Grammar Reference.
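The field-grammar cases described above can be sketched as follows. This fragment assumes only what the text states: that airline fills the slots code and name, and that date fills a single slot. The form and field names (booking, carrier, travelDate) are hypothetical.

```xml
<form id="booking">
  <field name="carrier">
    <grammar src="builtin:grammar/airline"/>
    <prompt>Which airline are you flying?</prompt>
    <filled>
      <!-- airline fills two slots, so the input variable is an object
           whose properties are named after the slots. -->
      <prompt>
        You chose <value expr="carrier.name"/>,
        code <value expr="carrier.code"/>.
      </prompt>
    </filled>
  </field>
  <field name="travelDate">
    <!-- date fills a single slot, so the input variable holds the
         slot value itself. -->
    <grammar src="builtin:grammar/date"/>
    <prompt>What day are you leaving?</prompt>
    <filled>
      <prompt>Leaving on <value expr="travelDate"/>.</prompt>
    </filled>
  </field>
</form>
```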
Built-in Grammars: Multiple Slots. The following table lists the slots for the extended built-in grammars that return multiple slots. When a built-in grammar is used in a <field> element, these are the properties of the field's item variable that are assigned the slot values.
Field Type: airport
Slots (or Properties):
  code: Code of the airport, such as DFW. For a list of known airport codes, see Airport & Airline Codes on the Resources page of the BeVocal Café web site.
  spokencity: The city that was spoken, such as dallas. This slot is only returned if a domestic (US) airport is being recognized.
  airportcity: The airport city for the recognized airport, such as ft. worth.
  state: The two-letter abbreviation for the state, such as tx.
  country: The country where the airport is located, such as united states of america.
Field Type: airline
Slots (or Properties):
  code: The airline's identifying code, such as AS. For a list of known airline codes, see Airport & Airline Codes.
  name: Full name of the airline, such as Alaska Airlines.
Field Type: datetime
Slots (or Properties):
  month: The month for this date/time.
  day: The day for this date/time.
  time_hours: The hour value of the time.
  time_minutes: The minute value of the time.
  ampm: Whether the date/time is AM or PM.
  timezone: The timezone associated with this date/time.
  duration_hours: The number of hours to the specified time.
  duration_minutes: The number of minutes to the specified time.
  dayofweek: The day of the week associated with this date/time.
The following examples show utterances and the corresponding slots that are filled:
  "tomorrow"
    day = tomorrow
  "next monday"
    dayofweek = monday
  "six o'clock central time"
    time_hours = 6
    time_minutes = 0
    timezone = ct
  "four twenty-two p m on november thirtieth"
    ampm = pm
    day = 30
    month = november
    time_hours = 4
    time_minutes = 22
  "in one hour and fifteen minutes"
    duration_hours = 1
    duration_minutes = 15
Field Type: equity
Slots (or Properties):
  name: The name of the company, such as Cisco Systems.
  symbol: Stock symbol, such as CSCO.
Field Type: stockindex
Slots (or Properties):
  name: The name of the index, such as Nasdaq Composite Index.
  symbol: Symbol for the index, such as COMP.
Field Type: street
Slots (or Properties):
  (none)
Field Type: streetaddress
Slots (or Properties):
  streetname: The name of the street, such as Bordeaux Drive.
  streetnumber: The number of the street, such as 1380.
Field Type: citystate
Slots (or Properties):
  city: The city, such as Sunnyvale.
  county: The county in which the city is located, such as Santa Clara.
  state: The two-letter abbreviation for the state, such as CA.
  datacity: If the city is unlikely to appear in data feeds for traffic, weather, and so on (for example, because it is tiny or unincorporated), the name of an adjacent city that is more likely to work with data feeds. In all other cases, this property is identical to the city property.
A property is accessed with an expression of the form:
  fieldName.propertyName
For example, the city property of a citystate field is accessed as follows:
  <field name="mycity">
    <grammar src="builtin:grammar/citystate"/>
    <filled>
      <prompt>The city is <value expr="mycity.city"/></prompt>
    </filled>
  </field>
Built-in Grammars: Parameters. Two standard built-in grammars, digits and boolean, can be parameterized. Specifically, you can set limits on the length of a digit string, and you can set DTMF key presses to mean yes or no. In addition, you must specify the city and state with the street and streetaddress built-in grammars; you can optionally specify the county to disambiguate a case in which the state contains two cities with the same name. The following table shows the built-in grammars that can be parameterized and indicates which parameters are required.
Grammar: boolean
Parameters:
  y: The DTMF key press for an affirmative answer.
  n: The DTMF key press for a negative answer.
Grammar: digits
Parameters:
  minlength: The minimum number of digits in a valid utterance.
  maxlength: The maximum number of digits in a valid utterance.
  length: The exact number of digits in a valid utterance.
  Note: If you do not specify length or maxlength, the built-in grammar accepts an unlimited number of digits. You should use length or maxlength whenever you can; doing so usually results in more accurate speech recognition.
Grammar: equity
Parameters:
  symbol: When the value is true, the grammar expects the input to be the stock symbol spelled out, as in "eye" "bee" "emm" for IBM.
Grammar: street
Parameters:
  city: The city in which the street is located (required).
  state: The state in which the specified city is located (required).
  county: The county in which the specified city is located. If you omit the county and the city name is ambiguous, an error.badfetch event is thrown with a list of the possible counties in the error message.
  Note: An error.badfetch event is thrown if a required parameter is not specified, if the city is not a recognized city of the specified state, or if there is no street grammar for the specified city.
Grammar: streetaddress
Parameters:
  city: The city in which the street is located (required).
  state: The state in which the specified city is located (required).
  county: The county in which the specified city is located. If you omit the county and the city name is ambiguous, an error.badfetch event is thrown with a list of the possible counties in the error message.
  Note: An error.badfetch event is thrown if a required parameter is not specified, if the city is not a recognized city of the specified state, or if there is no street grammar for the specified city.
Grammar: airport
Parameters:
  domestic: If true, the grammar recognizes only the major domestic (US) airports. If false, it recognizes only the major international airports. If no parameter is specified, this property defaults to true.
You express parameter information using URI-style query syntax of the form:
  builtin:grammar/typeName?parameter=value
For example, the following grammar matches a sequence of exactly five digits:
  <grammar src="builtin:grammar/digits?length=5"/>
You can specify more than one parameter, separated by semicolons. For example, the following grammar allows a user to press 7 for an affirmative answer and 9 for a negative answer:
  <grammar src="builtin:grammar/boolean?y=7;n=9"/>
Note: The interpreter throws an error.badfetch event if it loads a VoiceXML file that contains a built-in grammar with an unrecognized parameter or with inconsistent parameters, such as:
  <grammar src="builtin:grammar/digits?minlength=5;maxlength=1"/>
Built-in Extended Grammars: Input. The following table shows example user inputs for each extended built-in grammar.
Grammar Type   Example Inputs
airport        San Jose; DFW
airline        American; UA
equity         Cisco; ORCL
stockindex     Nasdaq
street         Bordeaux Drive
streetaddress  1380 Bordeaux Drive
citystate      Sunnyvale, California
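The query syntax above might be combined with the multi-slot properties roughly as follows. This is a sketch under stated assumptions: the exact spelling the interpreter expects for the city and state values (here Sunnyvale and CA) is not given in this guide and is assumed, and the form and field names are hypothetical.

```xml
<form id="delivery">
  <field name="zip">
    <!-- Accept between five and nine digits. -->
    <grammar src="builtin:grammar/digits?minlength=5;maxlength=9"/>
    <prompt>What is your zip code?</prompt>
  </field>
  <field name="addr">
    <!-- city and state are required for streetaddress; the value
         formats used here are assumptions. -->
    <grammar src="builtin:grammar/streetaddress?city=Sunnyvale;state=CA"/>
    <prompt>What is the street address?</prompt>
    <filled>
      <prompt>
        Delivering to <value expr="addr.streetnumber"/>
        <value expr="addr.streetname"/>.
      </prompt>
    </filled>
  </field>
  <field name="origin">
    <!-- domestic=false switches the grammar to international airports. -->
    <grammar src="builtin:grammar/airport?domestic=false"/>
    <prompt>Which airport are you flying from?</prompt>
  </field>
</form>
```

If a required parameter were omitted from the streetaddress grammar, the text above indicates that an error.badfetch event would be thrown, so a real application would probably also want a handler for that event.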
Built-in Extended Grammars: Output. In VoiceXML 2.0 the field's type attribute no longer defines an implicit <say-as> type to output the field's value. Instead it plays the value in normal TTS. You must now use the type attribute of <say-as> for a type-specific read-out of the value (in TTS). The bevocal:mode attribute of <say-as> can be used to define recorded output for some types. See the <say-as> tag for details.
Grammar Type   Say-As Type  Audio Output            String Result
airport        airport      Airport name            Dallas Fort Worth International Airport (DFW)
airline        airline      Airline name            American Airlines (AA)
equity         equity       Company name            ORCL
stockindex     stockindex   Index name              Nasdaq Composite Index
street         street       Street name             Bordeaux Drive
streetaddress  address      Street number and name  1380 Bordeaux Drive
citystate      citystate    City State              Sunnyvale, CA
Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
Usage
Parents
<bevocal:listen> <bevocal:register> <bevocal:verify> <choice> <field> <form> <link> <record> <transfer>
Children
<lexicon> <metadata> <rule>
See Also
VoiceXML 2.0 Specification: <grammar>
Grammar Reference
<say-as>
Examples
Different grammar formats.
XML Grammar - DTMF Recognition
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <field name="pin">
      What is your pin?
      <grammar mode="dtmf" root="pin" version="1.0">
        <rule id="digit">
          <one-of>
            <item> 0 </item>
            <item> 1 </item>
            <item> 2 </item>
            <item> 3 </item>
            <item> 4 </item>
            <item> 5 </item>
            <item> 6 </item>
            <item> 7 </item>
            <item> 8 </item>
            <item> 9 </item>
          </one-of>
        </rule>
        <rule id="pin" scope="public">
          <item repeat="4"><ruleref uri="#digit"/></item>
        </rule>
      </grammar>
      <filled>
        You entered <value expr="pin"/>
      </filled>
    </field>
  </form>
</vxml>
XML Grammar - Voice Recognition
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <field name="pin">
      What is your pin?
      <grammar mode="voice" root="pin" version="1.0">
        <rule id="digit">
          <one-of>
            <item> 0 </item>
            <item> 1 </item>
            <item> 2 </item>
            <item> 3 </item>
            <item> 4 </item>
            <item> 5 </item>
            <item> 6 </item>
            <item> 7 </item>
            <item> 8 </item>
            <item> 9 </item>
          </one-of>
        </rule>
        <rule id="pin" scope="public">
          <item repeat="4"><ruleref uri="#digit"/></item>
        </rule>
      </grammar>
      <filled>
        You said <value expr="pin"/>
      </filled>
    </field>
  </form>
</vxml>
ABNF Grammar - DTMF Recognition
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <field name="pin">
      What is your pin?
      <grammar type="application/grammar">
        <![CDATA[
          #ABNF 1.0;
          mode dtmf;
          root $pin;
          $digit = ( 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9);
          $pin = $digit<4> ;
        ]]>
      </grammar>
      <filled>
        You said <value expr="pin"/>
      </filled>
    </field>
  </form>
</vxml>
ABNF Grammar - Voice Recognition
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <field name="pin">
      What is your pin?
      <grammar type="application/grammar">
        <![CDATA[
          #ABNF 1.0;
          root $pin;
          $digit = ( 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9);
          $pin = $digit<4> ;
        ]]>
      </grammar>
      <filled>
        You said <value expr="pin"/>
      </filled>
    </field>
  </form>
</vxml>
GSL Grammar - DTMF Recognition
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <field name="pin">
      What is your pin?
      <grammar type="application/x-nuance-gsl">
        FourDigits (Digit Digit Digit Digit)
        Digit [dtmf-1 dtmf-2 dtmf-3 dtmf-4 dtmf-5 dtmf-6 dtmf-7 dtmf-8 dtmf-9 dtmf-0]
      </grammar>
      <filled>
        You said <value expr="pin"/>
      </filled>
    </field>
  </form>
</vxml>
GSL Grammar - Voice Recognition
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <field name="pin">
      What is your pin?
      <grammar type="application/x-nuance-gsl">
        FourDigits (Digit Digit Digit Digit)
        Digit [1 2 3 4 5 6 7 8 9 0]
      </grammar>
      <filled>
        You said <value expr="pin"/>
      </filled>
    </field>
  </form>
</vxml>
JSGF Grammar - Voice Recognition
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <field name="pin">
      What is your pin?
      <grammar>
        <![CDATA[
          #JSGF 1.0;
          grammar pin;
          <digit> = ( 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9);
          public <pin> = <digit> <digit> <digit> <digit>;
        ]]>
      </grammar>
      <filled>
        You said <value expr="pin"/>
      </filled>
    </field>
  </form>
</vxml>
<help>
Catches a help event.
Syntax
<help count="integer" cond="js_expression" >
  Executable Content
</help>
Description
Attribute  Description
count
Minimum number of times the help event must have occurred during a form or menu invocation. Optional (default is 1).
cond
JavaScript expression that must also evaluate to true for an event to be caught. Optional (default is true).
If multiple handlers for help events are defined in, or inherited by, the element in which the event occurs, one handler is chosen based on event count, scope, and document order.
A help event is thrown whenever user input matches the predefined help universal grammar.
When the <vxml> tag's version attribute is 2.0 or greater, all universal grammars are deactivated by default. You can activate the help grammar by setting the universals property. The following tag activates all universal grammars, including help:
  <property name="universals" value="all" />
The following tag activates the help grammar and deactivates the other predefined universal grammars:
  <property name="universals" value="help"/>
When the <vxml> tag's version attribute is 1.0, all universal grammars are active by default. You can deactivate the help grammar by setting the universals property. The following tag deactivates all universal grammars, including help:
  <property name="universals" value="none" />
The following tag deactivates the help grammar and activates the other predefined universal grammars:
  <property name="universals" value="exit cancel goback"/>
For additional information about universal grammars, see Universal Commands and Grammars on page 14.
Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
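To illustrate how the count attribute can select among several handlers, here is a minimal sketch with two help handlers in one field. The form name, grammar words, and prompt wording are hypothetical; the <reprompt/> element is standard VoiceXML 2.0 executable content that asks the interpreter to queue the field prompt again after the handler runs.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="drinks">
    <!-- Universal grammars are off by default in version 2.0,
         so the help grammar is activated explicitly. -->
    <property name="universals" value="help"/>
    <field name="drink">
      <grammar type="application/x-nuance-gsl">
        [ coffee tea milk ]
      </grammar>
      <prompt>What would you like to drink?</prompt>
      <!-- Chosen the first time the caller asks for help. -->
      <help count="1">
        Say coffee, tea, or milk.
        <reprompt/>
      </help>
      <!-- Chosen from the second help request onward. -->
      <help count="2">
        This service offers three drinks. Please say exactly one of
        coffee, tea, or milk.
      </help>
      <filled>
        <prompt>One <value expr="drink"/> coming up.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```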
See Also VoiceXML 2.0 Specification: <help> Related tags: <catch>, <error>, <noinput>, <nomatch>, <rethrow>, <throw>
Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="foo">
    <property name="universals" value="help"/>
    <field name="sports">
      <grammar type="application/x-nuance-gsl">
        [ football basketball tennis skiing ]
      </grammar>
      <help>
        Please say one of football, basketball, tennis or skiing
      </help>
      <prompt> What is your favorite sport? </prompt>
      <filled>
        <prompt>
          Looks like <value expr="sports"/> is your favorite sport.
        </prompt>
      </filled>
    </field>
  </form>
</vxml>
<if>
Executes actions conditionally.
Syntax
<if cond="js_expression" >
  Executable Content
</if>
Description
Attribute  Description
cond
JavaScript expression that evaluates to a boolean value that must be true for the if clause to execute. Optional (default is true).
Usage
Parents
<bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
Children
<audio> <assign> <bevocal:connect> <bevocal:dial> <bevocal:disconnect> <bevocal:foreach> <bevocal:hold> <bevocal:whisper> <clear> <data> <disconnect> <else> <elseif> <enumerate> <exit> <goto> <if> <log> <prompt> <reprompt> <rethrow> <return> <script> <send> <subdialog> <submit> <throw> <var> <value>
See Also
VoiceXML 2.0 Specification: <if>
Related tags: <else>, <elseif>
Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="form">
    <field name="hello">
      <grammar type="application/x-nuance-gsl">
        [one dtmf-1 goodbye]
      </grammar>
      <prompt>Say one or press one to continue or say goodbye to exit.</prompt>
      <nomatch>Sorry, I did not get it.<reprompt/></nomatch>
      <filled>
        <!-- The recognized values are strings, so they are quoted. -->
        <if cond="hello=='one' || hello=='dtmf-1'">
          <prompt> Welcome to this part of the world. </prompt>
        <else/>
          <prompt> Sorry you could not have much fun. Goodbye </prompt>
        </if>
      </filled>
    </field>
  </form>
</vxml>
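As a further sketch, the following fragment chains <elseif> and <else> inside an <if> and shows the escape sequence needed when a condition uses the < character. The form name, field name, and age thresholds are hypothetical; the number built-in grammar is one of the standard types listed in the <grammar> section.

```xml
<form id="age_check">
  <field name="age">
    <grammar src="builtin:grammar/number"/>
    <prompt>How old are you?</prompt>
    <filled>
      <!-- "<" must be written as &lt; inside the attribute value. -->
      <if cond="age &lt; 13">
        <prompt>You qualify for the child fare.</prompt>
      <elseif cond="age &lt; 18"/>
        <prompt>You qualify for the youth fare.</prompt>
      <else/>
        <prompt>You pay the adult fare.</prompt>
      </if>
    </filled>
  </field>
</form>
```

The conditions are tested in document order, so the second branch is only reached when the first condition is false.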
<initial>
Declares initial logic upon entry into a (mixed-initiative) form.
Syntax
<initial name="string" expr="js_expression" cond="js_expression" >
  Child Elements
</initial>
Description
Control item that controls the initial interaction in a mixed-initiative form. An <initial> element can request user input or perform other non-interactive initialization tasks at the beginning of a mixed-initiative form.
As with all form items, an <initial>'s form-item variable must have a value of undefined in order to be visited. If any of the form's fields are filled as a result of user input, the interpreter sets all <initial> form-item variables to true before performing any <filled> actions. After that, it requests user input in a directed mode based on the prompts associated with the fields that are still unfilled.
Attribute  Description
name
Name of the form-item variable, which may not be a JavaScript reserved keyword. Optional (default is an unusable internal name). The form-item variable has dialog (form) scope; its name must be unique among all VoiceXML and JavaScript variables within the form's scope. Generally, you use this attribute only if you want to control <initial> execution explicitly.
expr
JavaScript expression that assigns the initial value of the form-item variable. Optional (default is undefined). If you set the form-item variable to a value other than undefined, you'll need to clear it before the <initial> element can execute. Note that you need to give the form-item variable a name if you want to clear it separately from other form-item variables.
cond
JavaScript boolean expression that must also evaluate to true for the <initial> element to execute. Optional (default is true). If not specified, the value of the form-item variable alone determines whether or not the <initial> element can execute.
Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
Usage Parents <form> Children <audio> <catch> <enumerate> <error> <help> <link> <noinput> <nomatch> <prompt> <property> <value>
Exceptions
error.semantic - Thrown if no grammars are active.
See Also
VoiceXML 2.0 Specification: <initial>
Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="foo">
    <grammar src="foo.grammar#Main"/>
    <nomatch>
      Okay, let me ask you the questions separately.
      <reprompt/>
      <assign name="selection" expr="true"/>
    </nomatch>
    <initial name="selection">
      Please say how many apples or oranges you want.
    </initial>
    <field name="fruit">
      <grammar src="foo.grammar#Fruit"/>
      Do you want apples or oranges?
    </field>
    <field name="quantity">
      <grammar src="foo.grammar#Quantity"/>
      How many <value expr="fruit"/> do you want?
    </field>
    <block>
      <prompt>
        You said that you wanted <value expr="quantity"/> <value expr="fruit"/>.
      </prompt>
    </block>
  </form>
</vxml>
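A variation on the example above, offered as a sketch rather than a recommended pattern, places the <nomatch> handlers inside the <initial> element and uses count so that the caller gets one retry before the form falls back to directed mode. It reuses the foo.grammar references from the example; the variable name start and the prompt wording are hypothetical.

```xml
<form id="order">
  <grammar src="foo.grammar#Main"/>
  <initial name="start">
    <prompt>Please say how many apples or oranges you want.</prompt>
    <!-- First failure: give a hint and ask again. -->
    <nomatch count="1">
      Sorry, I did not understand. For example, say three apples.
      <reprompt/>
    </nomatch>
    <!-- Second failure: set the form-item variable so that <initial>
         is no longer visited and the fields are prompted one by one. -->
    <nomatch count="2">
      Okay, let me ask you the questions separately.
      <assign name="start" expr="true"/>
      <reprompt/>
    </nomatch>
  </initial>
  <field name="fruit">
    <grammar src="foo.grammar#Fruit"/>
    <prompt>Do you want apples or oranges?</prompt>
  </field>
  <field name="quantity">
    <grammar src="foo.grammar#Quantity"/>
    <prompt>How many <value expr="fruit"/> do you want?</prompt>
  </field>
</form>
```

Naming the form-item variable is what makes the explicit <assign> possible, as noted in the description of the name attribute.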
<item>
New in VoiceXML 2.0. XML grammar input element that groups tokens or elements in a rule. Identifies one of a set of alternatives and serves as an easy way to associate attributes with a set of rule pieces.
Syntax
<item repeat="M"|"M-N"|"0-1"|"M-" repeat-prob="P" weight="N.M" xml:lang="lang" >
  Content
</item>
Description
This tag is used to define grammars in the XML form of the W3C Speech Recognition Grammar Format.
An <item> element can contain any number of input elements and <tag> elements. The input elements indicate a sequence that must be matched in order. The repeat attribute applies to the entire sequence, indicating that it is optional or that it may be repeated. Any contained <tag> elements apply to the entire sequence. If the user input matches the input elements and repeat attribute, the interpreter uses the <tag> elements to assign values to input variables.
Attribute  Description
repeat
Indicates the number of times that the contained expansion may be repeated. Optional (if omitted, the sequence of input elements must occur exactly once).
  M: The contained expansion is repeated exactly M times. M must be 0 or a positive integer.
  M-N: The contained expansion is repeated between M and N times (inclusive). M must be 0 or a positive integer; N must be a positive integer larger than M. For example, "3-5" declares that the contained expansion can occur exactly three, four, or five times. "0-1" is a special case, indicating that the contained expansion is optional.
  M-: The contained expansion is repeated M times or more. M must be 0 or a positive integer. For example, "3-" declares that the contained expansion can occur 3, 4, 5, or more times.
repeat-prob
The probability of successive repetition of the repeated expression. This attribute is ignored if the repeat attribute is not specified. Optional. Must be in the range between 0.0 and 1.0; note that this is different from a weight attached to the entire item.
A simple example is an optional item (zero or one occurrences) with a probability, for example, of 0.6. This indicates that the chance that the item will be matched is 60% and that the chance that it will not be present is 40%.
weight
Indicates how likely the item is. Optional. Must be a positive floating point number, such as 2, 2.5, 0.8, or .4. A weight of 1 is equivalent to not specifying a weight. A weight larger than 1 indicates greater likelihood; a weight less than 1 indicates less likelihood.
xml:lang
The language and optional country locale identifier for the item. Optional (default is the language of the enclosing element). The accepted language identifiers are:
  en: English
  en-US: United States English
  es: Spanish
  es-US: United States Spanish
  fr-ca: French Canadian
This attribute allows you to mix multiple languages in the same rule. If an unsupported language is specified, an error.unsupported.language event is thrown.
Usage Parents <rule> <item> <one-of> Children <token> <ruleref> <item> <one-of> <tag>
See Also
Speech Recognition Grammar Specification: <item>
Chapter 4, XML Speech Grammar Format in the Grammar Reference.
Examples
<!-- The word "angel" is optional and is not very likely to occur. -->
<item>
  <item repeat="0-1" repeat-prob="0.25">angel</item>
  <token>fish</token>
</item>
<!-- The rule reference to digit must occur between 2 and 4 times -->
<!-- and it is very likely it will occur 3 or 4 times. -->
<item repeat="2-4" repeat-prob=".8">
  <ruleref uri="#digit"/>
</item>
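The repeat and weight attributes might be combined in one inline grammar along the following lines. This is an illustrative sketch: the rule names and vocabulary are hypothetical, and the <grammar> start tag follows the inline XML grammar examples in the <grammar> section.

```xml
<grammar mode="voice" root="command" version="1.0">
  <rule id="digit">
    <one-of>
      <item> 0 </item> <item> 1 </item> <item> 2 </item>
      <item> 3 </item> <item> 4 </item> <item> 5 </item>
      <item> 6 </item> <item> 7 </item> <item> 8 </item>
      <item> 9 </item>
    </one-of>
  </rule>
  <rule id="command" scope="public">
    <!-- "please" may be omitted (zero or one occurrence). -->
    <item repeat="0-1"> please </item>
    <one-of>
      <!-- "call" is declared more likely than "page". -->
      <item weight="2"> call </item>
      <item weight="0.5"> page </item>
    </one-of>
    <!-- An extension of three, four, or five digits. -->
    <item repeat="3-5"><ruleref uri="#digit"/></item>
  </rule>
</grammar>
```

With this grammar, utterances such as "call one two three" and "please page four five six seven eight" would match, while "call one two" would not, because the digit sequence is shorter than the minimum of three.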
<lexicon>
New in VoiceXML 2.0. XML grammar element that references an external pronunciation lexicon document. Currently, the BeVocal VoiceXML interpreter ignores the <lexicon> element.
Syntax
<lexicon uri="URI" type="mediaType" />
Description
Attribute  Description
uri
The location of the pronunciation lexicon. Required.
type
Media type of the lexicon document. Optional.
All <lexicon> elements must occur before any <rule> elements in the grammar.
Usage
Parents
<grammar>
Children
None
See Also
Speech Recognition Grammar Specification: <lexicon>
Chapter 4, XML Speech Grammar Format in the Grammar Reference.
<link>
Specifies a transition common to all dialogs in the link's scope.
Syntax
<link next="URI"
      expr="js_expression"
      event="event"
      dtmf="DTMF_sequence"
      eventexpr="js_expression"
      message="String"
      messageexpr="js_expression"
      fetchhint="prefetch"|"safe"
      fetchtimeout="time_interval"
      fetchaudio="URI"
      maxage="time_interval"
      maxstale="time_interval" >
  Link Grammar
</link>
Description
Transitions to another dialog or throws an event when user input matches one of the link grammars or the DTMF sequence specified in the dtmf attribute. If the link transitions to a different dialog, execution jumps to the link's destination; if it throws an event, execution resumes in the current dialog after the event is handled.
The VoiceXML interpreter clears the prompt queue when going to another form. You will not be able to barge in during prompts played in the execution of a <link> tag that goes to another form.
Attribute  Description
next
The URI to go to when user input matches one of the link grammars. Optional (as alternative to expr, event).
expr
JavaScript expression that evaluates to the URI to go to when user input matches one of the link grammars. The JavaScript expression is evaluated in the context of the JavaScript scope containing the link, not in the context of the currently active form item. Optional (as alternative to next, event).
event
The event to throw when user input matches one of the link grammars. Optional (as alternative to next, expr).
dtmf
New in VoiceXML 2.0. A DTMF sequence to activate this link. The specified sequence is equivalent to a simple DTMF grammar; the link is activated when user input matches this DTMF sequence. This attribute can be used at the same time as child grammars.
eventexpr
New in VoiceXML 2.0. A JavaScript expression evaluating to the name of the event to be thrown.
message
New in VoiceXML 2.0. A message string providing additional context about the event being thrown. The message is available as the value of a variable within the scope of the <catch> element.
messageexpr
New in VoiceXML 2.0. A JavaScript expression evaluating to the message string.
fetchhint
Specifies whether the interpreter can attempt to optimize dialog interpretation by prefetching the resource. See Prefetching Resources on page 40. Optional.
fetchtimeout
Specifies the interval to wait for the resource to be returned before throwing an error.badfetch event. See Handling Fetching Delays on page 42. Optional.
fetchaudio
Specifies the URI of background audio to be played during fetching. Note that this attribute and related properties affect whether queued prompts are played first. See Background Audio on page 42 for important details. Optional.
maxage
New in VoiceXML 2.0. Specifies the maximum acceptable age, in seconds, of the cached resource. See Maximum Age on page 44. Optional.
maxstale
New in VoiceXML 2.0. Specifies the maximum acceptable time, in seconds, during which an expired cached resource can still be used. See Maximum Stale Time on page 44. Optional.
One and only one of the next, expr, or event attributes must be specified.

(VoiceXML 1.0 only) The following attributes can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.

Attribute      Description
caching        VoiceXML 1.0 only. Specifies the caching policy for the resource being fetched. See Caching on page 45. Optional. Used in place of the VoiceXML 2.0 attributes maxage and maxstale.

Tips:
During application development, put a link like the following in your application root document, so that while you're calling your application you can say "BeVocal reload" or press *** to start the application again.

    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
      <link next="http://www.mysite.com/myapp.vxml">
        <grammar type="application/x-nuance-gsl">
          [ (bevocal reload) (dtmf-star dtmf-star dtmf-star) ]
        </grammar>
      </link>
      ...
    </vxml>

If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
Usage
Parents        <field> <form> <initial> <vxml>
Children       <grammar>
See Also
VoiceXML 2.0 Specification: <link>

Examples
    <?xml version="1.0" ?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
      "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
      <link maxage="0" next="target.vxml">
        <grammar type="application/x-nuance-gsl">
          [ (go elsewhere) ]
        </grammar>
      </link>
      <form id="form">
        <field name="welcome">
          This is a test for link tag.
          You can say go elsewhere to load the next page of this application.
        </field>
      </form>
    </vxml>
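A link can also throw an event instead of transitioning. The following is a minimal sketch of that case, not a BeVocal-supplied sample: the event name com.example.operator, the message text, and the form contents are hypothetical. It combines the dtmf attribute with a child grammar and reads the message through the _message variable that VoiceXML 2.0 defines inside a catch handler.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Hypothetical event name. Saying "operator" or pressing 0 throws it. -->
  <link event="com.example.operator" dtmf="0"
        message="caller asked for an operator">
    <grammar type="application/x-nuance-gsl">
      [ operator ]
    </grammar>
  </link>

  <!-- Handles the event thrown by the link; _message holds the message string. -->
  <catch event="com.example.operator">
    <log>Link fired: <value expr="_message"/></log>
    <prompt>One moment while I find an operator.</prompt>
  </catch>

  <form id="main">
    <field name="answer" type="boolean">
      <prompt>Would you like to hear the news?</prompt>
    </field>
  </form>
</vxml>
```

Because this link throws an event rather than naming a destination, execution resumes in the current dialog once the catch handler finishes.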
<log>
New in VoiceXML 2.0. Writes debugging information to the BeVocal Café call log, which you can view on the Café web site.

Syntax
    <log label="string"
      expr="js_expression" >
      Debugging Text
    </log>

Description
Similar functionality is available within a JavaScript script using the bevocal.log function.

Attribute      Description
label          A string that is added as a label to messages produced by this <log> element. Optional (default is no label on the messages).
expr           JavaScript expression that evaluates to a string to be added to the call log as a separate message. Optional.

A <log> element may write one or two messages to the call log. It writes one message corresponding to the expr attribute, if any, and one message corresponding to the contained debugging text, if any. If the element has a label attribute, the specified label precedes each message.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage
Parents        <bevocal:foreach> <block> <catch> <error> <filled> <if> <help> <noinput> <nomatch>
Children       <value>

See Also
VoiceXML 2.0 Specification: <log>
Examples
    <?xml version="1.0" ?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
      "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
      <form id="foo">
        <block name="dbg">
          <log>
            num: <value expr="num"/>
            fruit: <value expr="fruit"/>
          </log>
        </block>
        <field name="num" type="number">
          <prompt>Say a number.</prompt>
        </field>
        <field name="fruit">
          <grammar type="application/x-nuance-gsl">
            [ apples oranges ]
          </grammar>
          <prompt>Do you want apples or oranges?</prompt>
        </field>
        <filled mode="any" namelist="num fruit">
          <clear namelist="dbg"/>
        </filled>
        <block>
          <log> End of form foo reached </log>
        </block>
      </form>
    </vxml>
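The interaction between the label attribute, the expr attribute, and the contained text can be sketched as follows. This is an illustrative fragment only; the form id, the variable total, and the label checkout are hypothetical names. The first <log> writes two labeled messages, and the second shows a < character escaped inside a JavaScript expression.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="checkout">
    <var name="total" expr="42"/>
    <block>
      <!-- Two messages, each preceded by the label "checkout":
           one for the expr value, one for the contained text. -->
      <log label="checkout" expr="'total is ' + total">
        Entering payment step
      </log>

      <!-- One message only. The < in the expression is written as &lt; -->
      <log label="checkout"
           expr="total &lt; 50 ? 'small order' : 'large order'"/>
    </block>
  </form>
</vxml>
```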
<mark>
New in VoiceXML 2.0. Speech Synthesis Markup Language element that places a marker into the output stream for asynchronous notification.

Syntax
    <mark name="string" />

Description
This element does not affect the speech output process.
Note: Currently, this element is ignored.

Attribute      Description
name           The name of the mark. When producing audio output for the containing element, the speech synthesizer throws an event that includes this name.

Usage
Parents        <audio> <bevocal:whisper> <choice> <emphasis> <enumerate> <p> <paragraph> <prompt> <prosody> <s> <sentence> <voice>
Children       None

See Also
VoiceXML 2.0 Specification: <mark>
Related tags: <break>, <emphasis>, <paragraph>, <phoneme>, <prosody>, <say-as>, <sentence>, <voice>
<menu>
Allows the user to choose among alternative destinations.

Syntax
    <menu id="string"
      scope="document"|"dialog"
      dtmf="true"|"false"
      accept="exact"|"approximate" >
      Child Elements
    </menu>

Description
Dialog that transitions to <choice> destinations based on user input.

Attribute      Description
id             Menu identifier. Optional. Lets you specify this menu as the target for a <goto> or <submit>.
scope          Sets the scope of the menu's grammar. Optional (default is dialog).
                 document: The menu's grammar is active throughout the current document. If the document is the application root document, then it is active throughout the application (application scope).
                 dialog: The menu's grammar is active only in the current menu.
dtmf           Enables DTMF selection for all choices in this menu. Optional (default is false).
                 true: The interpreter assigns DTMF selectors of 1, 2, and so on to the first 9 or fewer <choice> elements in document order that do not explicitly specify a DTMF sequence using the dtmf attribute.
                 false: The interpreter does not make implicit DTMF assignments to menu choices with no DTMF sequences.
               Note that at most 9 choices can be assigned DTMF selectors with this attribute. If the menu contains more than 9 choices, you can explicitly assign selectors to the additional choices using the dtmf attribute of their <choice> tags.
               If you set this attribute to true, any values you specify for dtmf attributes of individual choices must be different from the numbers that will be assigned automatically. For example, if the menu contains 4 choices with no dtmf attribute, those choices will be assigned the selectors 1 through 4. The other choices should have their dtmf attributes set to 0, *, #, or a number greater than 4. If the menu tries to assign the same DTMF selector to more than one choice, an error.badfetch event is thrown when the containing document is fetched.
accept         New in VoiceXML 2.0. Specifies whether the default grammars generated for <choice> elements require all words or accept a subset of the words.
                 exact: Requires the user to say the exact phrase that appears in the <choice> element.
                 approximate: Allows the user to say a subset of the words in the <choice> element.
               Note: The default is exact if the version attribute of the containing <vxml> element is 2.0 or greater. For backward compatibility, the default is approximate if the version attribute is less than 2.0.
Usage
Parents        <vxml>
Children       <audio> <catch> <choice> <enumerate> <error> <help> <noinput> <nomatch> <prompt> <property> <script> <value>

Exceptions
error.semantic - Thrown if no grammars are active.

See Also
VoiceXML 2.0 Specification: <menu>
Related tags: <choice>, <enumerate>
Examples <?xml version="1.0" ?> <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd"> <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"> <menu id="mainMenu" > <prompt> This is the main menu. Where do you want to start to complete your order? <enumerate/> </prompt> <choice next="#dateForm"> date </choice> <choice next="#quantityForm"> quantity </choice> <choice next="#phoneForm"> phone </choice> </menu> <form id="dateForm"> <field name="date" type="date"> <prompt> When do you want to pick up your order?</prompt> <filled> <prompt> Just to confirm, I heard you say you will pick it up on <say-as type="date"> <value expr="date"/> </say-as> </prompt> </filled> </field> <block> <goto next="#quantityForm" /> </block> </form> <form id="quantityForm"> <field name="quantity" type="number"> <prompt> How many bags of candy do you want?</prompt> <filled> <prompt> Okay, you have ordered <value expr="quantity"/> bags of candy. </prompt> </filled> </field> <block> <goto next="#phoneForm" /> </block> </form> <form id="phoneForm"> <field name="phone" type="phone"> <prompt> At what number can I reach you when the order is ready? </prompt> <filled> <prompt> I guess <say-as type="telephone"> <value expr="phone"/> </say-as> is your home number. </prompt> </filled> </field> </form> </vxml>
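The rules for automatic DTMF selectors can be sketched with a smaller menu. This is a hedged illustration rather than a tested BeVocal sample; the choice phrases are hypothetical, and the destination forms are stubs that reuse the form ids from the example above. The first three choices have no dtmf attribute and so receive 1, 2, and 3 in document order, while the fourth declares a selector that cannot collide with them.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <menu id="mainMenu" dtmf="true" accept="approximate">
    <prompt> Where do you want to start? <enumerate/> </prompt>

    <!-- No dtmf attribute: assigned 1, 2, and 3 in document order. -->
    <choice next="#dateForm"> pickup date </choice>
    <choice next="#quantityForm"> order quantity </choice>
    <choice next="#phoneForm"> phone number </choice>

    <!-- Explicit selector; * is outside the automatically assigned 1-3. -->
    <choice dtmf="*" next="#mainMenu"> start over </choice>
  </menu>

  <!-- Stub destinations so the sketch is a complete document. -->
  <form id="dateForm"><block>Date form.</block></form>
  <form id="quantityForm"><block>Quantity form.</block></form>
  <form id="phoneForm"><block>Phone form.</block></form>
</vxml>
```

With accept set to approximate, a caller can say a subset of a phrase, such as "date" for "pickup date".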
<meta>
Defines a meta-data item as a name/value pair. Can be used either in defining a grammar in the XML grammar or in defining a VoiceXML document.

Syntax
    <meta name="string"
      http-equiv="string"
      content="string" />

Description

Attribute      Description
name           Name of the meta-data property. Optional (as an alternative to http-equiv). This form lets you provide general document information such as author, copyright, description, keywords, and so on.
               In particular, you can set the name attribute to maintainer to have the trace log emailed to the user specified in the content attribute after the call. For a description of the log file, see Log Access Service.
http-equiv     Name of an HTTP response header. Optional (as an alternative to name). This form lets you provide information to use in HTTP response headers. For example:
                 <meta http-equiv="Cache-control" content="max-age 10"/>
                 <meta http-equiv="Expires" content="02 Feb 2002 23:59:59 GMT"/>
               While you are allowed to use this form of the <meta> tag to provide caching information about VoiceXML documents and grammars, we do not recommend you do this if you have a choice. If possible, you should place this information directly in your HTTP response headers.
content        Value of the meta-data property.

Usage
Parents        <vxml> <grammar>
Children       None.

Allowed as a child of the <grammar> tag only when defining a grammar using the XML grammar format.

Exceptions
error.badfetch - If both the name and http-equiv attributes are specified.

See Also
VoiceXML 2.0 Specification: <meta>
Chapter 4, XML Speech Grammar Format in the Grammar Reference
Examples
    <?xml version="1.0" ?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
      "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
      <meta name="author" content="bevocal"/>
      <meta name="maintainer" content="myName@myCompany" />
      <form>
        <block>
          <prompt>
            Welcome to the BeVocal Cafe. The client log is emailed to the
            maintainer. Please check your email.
          </prompt>
        </block>
      </form>
    </vxml>
<metadata>
The <metadata> element is a container in which information about the document can be placed using a metadata schema. Can be used either in defining a grammar in the XML grammar or in defining a VoiceXML document.
This element is intended for including RDF metadata in formats like Dublin Core, but this is not supported by the W3C DTD in the Last Call Working Draft, which only supports an empty <metadata> element. When the BeVocal interpreter supports schemas, then metadata schemas will be supported.

Syntax
    <metadata/>

Description
Currently this tag has no effect.

Usage
Parents        <vxml> <grammar>
Children       None.

Allowed as a child of the <grammar> tag only when defining a grammar using the XML grammar format.

Exceptions
error.badfetch - Currently thrown if the <metadata> element has content.

See Also
VoiceXML 2.0 Specification: <metadata>
Chapter 4, XML Speech Grammar Format in the Grammar Reference
<noinput>
Catches a no-input event.

Syntax
    <noinput count="integer"
      cond="js_expression" >
      Executable Content
    </noinput>

Description

Attribute      Description
count          Minimum number of times the event must have occurred during a form or menu invocation. Optional (default is 1).
cond           JavaScript expression that must also evaluate to true for an event to be caught. Optional (default is true).

If multiple handlers for no-input events are defined in, or inherited by, the element in which the event occurs, one handler is chosen based on event count, scope, and document order. See Chapter 3, Event Handling.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
Usage
Parents        <bevocal:listen> <bevocal:register> <bevocal:verify> <form> <initial> <menu> <record> <subdialog> <transfer> <vxml>
Children       <audio> <assign> <bevocal:connect> <bevocal:dial> <bevocal:disconnect> <bevocal:foreach> <bevocal:hold> <bevocal:whisper> <clear> <data> <disconnect> <enumerate> <exit> <goto> <if> <log> <prompt> <reprompt> <rethrow> <return> <script> <send> <subdialog> <submit> <throw> <var> <value>
See Also
VoiceXML 2.0 Specification: <noinput>
Related tags: <catch>, <error>, <help>, <nomatch>, <rethrow>, <throw>

Examples
    <?xml version="1.0" ?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
      "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
      <noinput>
        Sorry I did not hear what you said
        <reprompt/>
      </noinput>
      <form id="foo">
        <field name="name">
          <prompt> What is your favorite color ? </prompt>
          <grammar type="application/x-nuance-gsl">
            [ red green yellow blue orange ]
          </grammar>
          <filled>
            <prompt> Your favorite color is <value expr="name"/> </prompt>
          </filled>
        </field>
      </form>
    </vxml>
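The count attribute lets the handler vary as the caller stays silent repeatedly. The sketch below is illustrative only; the form id, field name, and prompt wording are hypothetical. It assumes the usual VoiceXML 2.0 selection rule, in which the handler with the highest count not exceeding the current event count is chosen.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="color">
    <field name="shade">
      <grammar type="application/x-nuance-gsl">
        [ red green blue ]
      </grammar>
      <prompt>What is your favorite color?</prompt>

      <!-- First silence: short apology, then repeat the question. -->
      <noinput count="1">
        Sorry, I did not hear you.
        <reprompt/>
      </noinput>

      <!-- Second silence: list the valid answers. -->
      <noinput count="2">
        Please say red, green, or blue.
        <reprompt/>
      </noinput>

      <!-- Third and later silences: give up. -->
      <noinput count="3">
        I still cannot hear you. Goodbye.
        <exit/>
      </noinput>

      <filled>
        <prompt>Your favorite color is <value expr="shade"/></prompt>
      </filled>
    </field>
  </form>
</vxml>
```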
<nomatch>
Catches a no-match event.

Syntax
    <nomatch count="integer"
      cond="js_expression" >
      Executable Content
    </nomatch>

Description

Attribute      Description
count          Minimum number of times the event must have occurred during a form or menu invocation. Optional (default is 1).
cond           JavaScript expression that must also evaluate to true for an event to be caught. Optional (default is true).

If multiple handlers for no-match events are defined in, or inherited by, the element in which the event occurs, one handler is chosen based on event count, scope, and document order. See Chapter 3, Event Handling.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
Usage
Parents        <bevocal:listen> <field> <form> <initial> <menu> <record> <bevocal:register> <subdialog> <transfer> <bevocal:verify> <vxml>
Children       <audio> <assign> <bevocal:connect> <bevocal:dial> <bevocal:disconnect> <bevocal:foreach> <bevocal:hold> <bevocal:whisper> <clear> <data> <disconnect> <enumerate> <exit> <goto> <if> <log> <prompt> <reprompt> <rethrow> <return> <script> <send> <subdialog> <submit> <throw> <var> <value>

See Also
VoiceXML 2.0 Specification: <nomatch>
Related tags: <catch>, <error>, <help>, <noinput>, <rethrow>, <throw>
Examples
    <?xml version="1.0" ?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
      "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
      <nomatch>
        Please say one of red green yellow blue or orange
      </nomatch>
      <form id="foo">
        <field name="name">
          <prompt> What is your favorite color ? </prompt>
          <grammar type="application/x-nuance-gsl">
            [ red green yellow blue orange ]
          </grammar>
          <filled>
            <prompt> Your favorite color is <value expr="name"/> </prompt>
          </filled>
        </field>
      </form>
    </vxml>
<object>
Throws an unsupported object exception.

Syntax
    <object>
      Child Elements
    </object>

Description
This tag allows a platform to expose platform-specific functionality. The BeVocal VoiceXML interpreter does not expose such functionality.
VoiceXML 1.0 only. When it encounters the <object> tag, the interpreter always throws an error.unsupported.object exception.
New in VoiceXML 2.0. When it encounters the <object> tag, the interpreter throws an error.unsupported.object.objectname exception, where objectname is the value of the class attribute of the <object> tag.

Usage
Parents        <form>
Children       -
<one-of>
New in VoiceXML 2.0. XML grammar input element that indicates alternative user inputs.

Syntax
    <one-of xml:lang="lang" >
      Alternatives
    </one-of>

Description
This tag is used to define grammars in the XML form of the W3C Speech Recognition Grammar Format. The contained items are alternatives; any one of them may be matched by the user input. Each alternative is an <item> element.

Attribute      Description
xml:lang       The language and optional country locale identifier for the set of alternatives. Optional (default is the language of the enclosing element). The accepted language identifiers are:
                 en       English
                 en-US    United States English
                 es       Spanish
                 es-US    United States Spanish
                 fr-ca    French Canadian
               This attribute allows you to mix multiple languages in the same rule. If an unsupported language is specified, an error.unsupported.language event is thrown.

Usage
Parents        <rule> <item>
Children       <item>

See Also
Speech Recognition Grammar Specification: <one-of>
Chapter 4, XML Speech Grammar Format in the Grammar Reference

Examples
    <grammar ...>
      <rule id="coloredObject">
        <ruleref id="color"/>
        <ruleref id="object"/>
      </rule>
      <rule id="color">
        <one-of>
          <item> red <tag> color="red" </tag> </item>
          <item> pink <tag> color="red" </tag> </item>
          <item> yellow <tag> color="yellow" </tag> </item>
          <item> canary <tag> color="yellow" </tag> </item>
          <item> green <tag> color="green" </tag> </item>
          <item> khaki <tag> color="green" </tag> </item>
        </one-of>
      </rule>
      <rule id="object">
        <one-of>
          <item>
            <tag> object="vehicle" </tag>
            <one-of><item>truck</item><item>car</item></one-of>
          </item>
          <item>
            <tag> object="toy" </tag>
            <one-of><item>ball</item><item>block</item></one-of>
          </item>
          <item>
            <tag> object="clothing" </tag>
            <one-of><item>shirt</item><item>blouse</item></one-of>
          </item>
        </one-of>
      </rule>
    </grammar>
<option>
Specifies an option in a <field>.

Syntax
    <option accept="exact"|"approximate"
      value="string"
      dtmf="dtmf_sequence" >
      Option Text
    </option>

Description
Provides one of a simple set of alternatives within a field without specifying a grammar. A grammar for the field is generated automatically, based on the option list. You can use <enumerate> to generate prompts automatically based on option lists as well.

Attribute      Description
accept         New in VoiceXML 2.0. Specifies whether the default grammar generated for this <option> element requires all words or accepts a subset of the words.
                 exact: Requires the user to say the exact phrase that appears in the <option> element.
                 approximate: Allows the user to say a subset of the words in the <option> element.
               Note: The default is exact if the version attribute of the containing <vxml> element specifies 2.0 or greater. For backward compatibility, the default is approximate if the version attribute is less than 2.0 or unspecified.
value          String to assign to the input variable when this item is selected. Optional (default is the value of the dtmf attribute, if any; otherwise, the option text itself with leading and trailing white space removed).
dtmf           DTMF sequence to assign to this option. Optional.

Usage
Parents        <field>
Children       None.

See Also
VoiceXML 2.0 Specification: <option>
Related tags: <field>, <choice>, <enumerate>
Examples
    <?xml version="1.0" ?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
      "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
      <form id="mainform">
        <block name="welcome">
          This is a test of the option tag.
        </block>
        <catch event="nomatch noinput">
          <prompt> Sorry, I didn't understand. </prompt>
          <reprompt/>
        </catch>
        <field name="mainmenu">
          <prompt>
            Please select an application from the list of <enumerate/>
          </prompt>
          <option value="Directions">Driving Directions </option>
          <option value="Portfolio">Stock Quotes </option>
          <option value="Stores">Business Finder </option>
          <filled>
            <prompt>You chose <value expr="mainmenu"/> </prompt>
          </filled>
        </field>
      </form>
    </vxml>
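The default rules for the value attribute can be sketched as follows. This is a minimal illustration under the defaults described for the value attribute; the form id and field name are hypothetical, and the comments state what the field variable would hold for each option.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="services">
    <field name="service">
      <prompt>Please choose one of <enumerate/></prompt>

      <!-- value given: service is "Directions" -->
      <option value="Directions" dtmf="1"> Driving Directions </option>

      <!-- no value, dtmf given: service is "2" -->
      <option dtmf="2"> Stock Quotes </option>

      <!-- neither given: service is "Business Finder" (trimmed text) -->
      <option> Business Finder </option>

      <filled>
        <prompt>You chose <value expr="service"/></prompt>
      </filled>
    </field>
  </form>
</vxml>
```

Giving every option an explicit value keeps the field variable independent of both the spoken phrase and the key assignment.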
<p>
New in VoiceXML 2.0. Speech Synthesis Markup Language element that classifies a region of text as a paragraph.

Syntax
    <p xml:lang="lang" >
      Text
    </p>

Description
Identifies enclosed text as a paragraph for interpretive purposes. This tag is included for brevity; it is the exact equivalent of <paragraph>.

Attribute      Description
xml:lang       Not Supported. The language and optional country locale identifier for the paragraph. Optional.

Usage
Parents        <audio> <bevocal:whisper> <choice> <enumerate> <prompt> <prosody> <voice>
Children       <audio> <break> <emphasis> <enumerate> <mark> <phoneme> <prosody> <s> <say-as> <sentence> <value> <voice>

See Also
VoiceXML 2.0 Specification: <paragraph>
Related tags: <break>, <emphasis>, <mark>, <paragraph>, <phoneme>, <prosody>, <s>, <say-as>, <sentence>, <voice>
<paragraph>
New in VoiceXML 2.0. Speech Synthesis Markup Language element that classifies a region of text as a paragraph.

Syntax
    <paragraph xml:lang="lang" >
      Text
    </paragraph>

Description
Identifies enclosed text as a paragraph for interpretive purposes. For brevity, the tag <p> is an exact equivalent of this tag.

Attribute      Description
xml:lang       Not Supported. The language and optional country locale identifier for the paragraph. Optional.

Usage
Parents        <audio> <bevocal:whisper> <choice> <enumerate> <prompt> <prosody> <voice>
Children       <audio> <break> <emphasis> <enumerate> <mark> <phoneme> <prosody> <s> <say-as> <sentence> <value> <voice>

See Also
VoiceXML 2.0 Specification: <paragraph>
Related tags: <break>, <emphasis>, <mark>, <phoneme>, <prosody>, <say-as>, <sentence>, <voice>
<param>
Specifies a parameter in a <subdialog> element.

Syntax
    <param name="string"
      expr="js_expression"
      value="string"
      valuetype
      type="MIME_type"
      index="integer" >
      Nested Param Element
    </param>

Description
Passes values to subdialogs. When a <param> element is used to pass a parameter to a subdialog, the subdialog must contain a <var> declaration for the parameter. If the <var> contains an initializing expr attribute, that initializing value is ignored and the value passed with the <param> element is used instead.

Attribute      Description
name           Parameter name.
expr           JavaScript expression that evaluates to the value of this parameter. Optional (as alternative to value).
value          String to assign as the value of this parameter. Optional (as alternative to expr).
valuetype      Not implemented.
type           Not implemented. Type of the value attribute (following Nuance precedent). Optional.
index          Extension. Index of current <param> element. Optional. Use when parent <param> element is a vector (java.lang.Vector) or an array.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage
Parents        <param> <subdialog>
Children       <param>

See Also
VoiceXML 2.0 Specification: <param>
Related tag: <subdialog>
Examples
    <?xml version="1.0" ?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
      "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
      <form id="main">
        <block>We're about to call the sub dialog</block>
        <subdialog name="result" src="#subbie">
          <param name="hello" value="goodbye"/>
          <param name="goodbye" value="goodbye"/>
        </subdialog>
        <block>
          We're back from the sub dialog.
          Result dot hello equals <value expr="result.hello"/>
          Result dot goodbye equals <value expr="result.goodbye"/>
        </block>
      </form>
      <form id="subbie">
        <var name="hello" expr="'hello'"/>
        <var name="goodbye"/>
        <block>
          This is the sub dialog.
          hello equals <value expr="hello"/>
          goodbye equals <value expr="goodbye"/>
          <return namelist="hello goodbye"/>
        </block>
      </form>
    </vxml>
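The expr attribute passes a computed value rather than a literal string. The following sketch is illustrative only; the form ids and the variables basePrice, amount, and currency are hypothetical. It also shows a <var> initializer in the subdialog being overridden by the passed parameter, as described for <param> above.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="main">
    <var name="basePrice" expr="20"/>
    <subdialog name="quote" src="#pricing">
      <!-- expr is a JavaScript expression evaluated by the caller. -->
      <param name="amount" expr="basePrice * 2"/>
      <!-- value passes the literal string. -->
      <param name="currency" value="dollars"/>
      <filled>
        <prompt>
          The price is <value expr="quote.amount"/>
          <value expr="quote.currency"/>
        </prompt>
      </filled>
    </subdialog>
  </form>

  <form id="pricing">
    <!-- The initializer 0 is ignored because the caller passes amount. -->
    <var name="amount" expr="0"/>
    <var name="currency"/>
    <block>
      <return namelist="amount currency"/>
    </block>
  </form>
</vxml>
```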
<phoneme>
New in VoiceXML 2.0. Speech Synthesis Markup Language element that provides a phonetic pronunciation for the contained text.

Syntax
    <phoneme alphabet="ipa"|"worldbet"|"xsampa"|"darpa"
      ph="phoneme_string" >
      Text
    </phoneme>

Description
A phoneme element may be empty. However, your application code will be easier to understand and maintain if the element contains human-readable text that approximates the specified phoneme string.
Note: Any attribute is ignored; the contained text is spoken normally.

Attribute      Description
alphabet       The phonetic alphabet used in the phoneme string. Optional (default is darpa). Possible values are:
                 darpa: DARPA phonetic alphabet
                 ipa: International Phonetic Alphabet (IPA)
                 worldbet: Worldbet (Postscript) phonetic alphabet
                 xsampa: X-SAMPA phonetic alphabet
               Note: This attribute is passed directly to the TTS engine. Currently, the TTS engine only supports DARPA and ignores the other values.
ph             The phoneme string in the specified phonetic alphabet.

Usage
Parents        <audio> <choice> <emphasis> <enumerate> <p> <paragraph> <prompt> <prosody> <s> <sentence> <voice>
Children       None

See Also
VoiceXML 2.0 Specification: <phoneme>
Related tags: <break>, <emphasis>, <mark>, <paragraph>, <prosody>, <say-as>, <sentence>, <voice>
Example
    <?xml version="1.0" ?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
      "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:bevocal="http://www.bevocal.com/">
      <form>
        <block>
          <prompt>
            <voice name="jennifer">
              in good <phoneme ph="m eh 1 zh er 0"/>
            </voice>
          </prompt>
        </block>
      </form>
    </vxml>
<prompt>
Queues TTS and audio output to the user.

Syntax
    <prompt bargein="true"|"false"
      bargeintype="hotword"|"speech"
      cond="js_expression"
      count="integer"
      timeout="time_interval"
      xml:lang >
      Prompt Text
    </prompt>

Description
Uses TTS to convey information to the user. If the prompt consists of simple text only, you can omit the prompt tags and the text will be interpreted as if they were present.
The prompt is played in its entirety unless interrupted. If the bargein property is true, a user utterance can interrupt the prompt. If the bargeintype property is hotword, only an utterance that matches an active grammar can interrupt the prompt. In the latter case, the bevocal.hotwordmin and bevocal.hotwordmax properties specify the minimum and maximum time duration, respectively, of the interrupting utterance.

Attribute      Description
bargein        Determines whether user input will be recognized during the prompt. Optional (if not specified, the current value of the bargein property is used).
bargeintype    New in VoiceXML 2.0. Determines what kind of input can interrupt the prompt. Optional (if not specified, the current value of the bargeintype property is used). Possible values are:
                 hotword: User input that doesn't match the grammar is ignored and only speech that matches a grammar can interrupt the prompt.
                 speech: Any user utterance can interrupt the prompt.
               This attribute is relevant only when bargein is true. If the <prompt> occurs inside a bridge transfer, the value of this bargeintype attribute is ignored. For a bridge transfer, the bargein type is always hotword.
cond           JavaScript boolean expression that must evaluate to true for the prompt to be spoken.
count          Minimum number of times the user must have visited the form item containing the prompt for the prompt to be spoken. Optional (default is 1). Lets you vary prompts if the user is having problems and revisits the same form item several times. Form item prompt counters are reset with each invocation of the form.
Not implemented
Attribute      Description
timeout        Time to wait before throwing a no-input event. Optional (if not specified, the current value of the timeout property is used). Express the time interval as an unsigned number followed by s for time in seconds or ms for time in milliseconds.
xml:lang       Not implemented.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
Usage
Parents        <bevocal:foreach> <bevocal:listen> <bevocal:register> <bevocal:verify> <block> <catch> <error> <field> <filled> <help> <if> <initial> <menu> <noinput> <nomatch> <record> <transfer> <subdialog>
Children       <audio> <break> <emphasis> <enumerate> <mark> <p> <paragraph> <prosody> <s> <say-as> <sentence> <value> <voice>

See Also
VoiceXML 2.0 Specification: <prompt>
Related tags: <reprompt>, <audio>, <transfer>
Examples
Example 1. Tapered prompts:
    <?xml version="1.0" ?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
      "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
      <property name="universals" value="help" />
      <form id="tapered">
        <block>
          <prompt bargein="false"> This is question number 1. </prompt>
        </block>
        <field name="color">
          <noinput> <reprompt/> </noinput>
          <nomatch> <reprompt/> </nomatch>
          <grammar type="application/x-nuance-gsl">
            [blue red green yellow]
          </grammar>
          <prompt count="1">What is the color of the Sky?</prompt>
          <prompt count="2">Choose a color.</prompt>
          <prompt count="3">Choose from red blue green or yellow.</prompt>
          <help>
            The color of the sky is usually blue except during sunset and sunrise.
          </help>
          <filled>
            <if cond="color=='blue'">
              <prompt>That's correct. The sky is <value expr="color"/> </prompt>
            <else/>
              <prompt> That's not correct. The sky is blue in color. </prompt>
            </if>
          </filled>
        </field>
      </form>
    </vxml>
Example 2: <?xml version="1.0" ?> <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd"> <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"> <form id="mynum"> <var name="phone" expr="5555555555"/> <field name="get_num" type="boolean"> <prompt> My phone number is <say-as type="telephone"> <value expr="phone"/> </say-as> Do you want to tell me your number? </prompt> <filled> <if cond="get_num"> <goto nextitem="your_num"/> <else/> <prompt> Ok, do not tell me. </prompt> </if> </filled> </field> <field name="your_num" type="phone" cond="get_num"> <prompt> What is your phone number </prompt> <filled> <prompt> So your number is <say-as type="telephone"> <value expr="your_num"/> </say-as> </prompt> </filled> </field> </form> </vxml>
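The bargein, bargeintype, and timeout attributes can be combined on individual prompts. The sketch below is an illustration under the attribute descriptions above, not a tested BeVocal sample; the form id, field name, and prompt wording are hypothetical.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="survey">
    <block>
      <!-- The caller cannot interrupt this notice. -->
      <prompt bargein="false">This call may be recorded.</prompt>
    </block>

    <field name="rating" type="digits">
      <!-- First visit: any speech interrupts; wait 5 seconds for input. -->
      <prompt count="1" bargein="true" bargeintype="speech" timeout="5s">
        On a scale of one to five, how was your visit?
      </prompt>

      <!-- Later visits: only a grammar match interrupts; wait longer. -->
      <prompt count="2" bargein="true" bargeintype="hotword" timeout="8000ms">
        Please say or press a number from one to five.
      </prompt>

      <noinput>
        I did not hear a rating.
        <reprompt/>
      </noinput>
    </field>
  </form>
</vxml>
```

Setting these attributes on a prompt overrides the current bargein, bargeintype, and timeout property values for that prompt only.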
<property>
Controls settings specific to the BeVocal VoiceXML implementation platform.

Syntax
    <property name="string"
      value="string"
      bevocal:expr="js_expression" />

Description
Sets values that affect platform behavior and/or represent default attribute values.

Attribute      Description
name           Property name. The name can be any of the supported properties described in Chapter 12, Properties.
value          Property value. The allowable values depend on the property specified in the name attribute.
bevocal:expr   Property value. The JavaScript expression is evaluated whenever the <property>'s scope is reinitialized.

The property settings apply to the parent element and all descendants. However, property values set at lower levels take precedence.
The value of time-related properties must be an unsigned number, optionally followed by s for seconds or ms for milliseconds. If there is no suffix, seconds are assumed.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage
Parents        <bevocal:listen> <bevocal:register> <bevocal:verify> <field> <form> <initial> <menu> <record> <subdialog> <transfer> <vxml>
Children       None.

See Also
VoiceXML 2.0 Specification: <property>
Examples
Example 1: bargein property
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <nomatch><reprompt/></nomatch>
  <property name="bargein" value="true"/>
  <form id="form_property">
    <property name="bargein" value="false"/>
    <block>
      <prompt> This is a prompt where you can not barge in </prompt>
      <goto next="#form2"/>
    </block>
  </form>
  <form id="form2">
    <field name="option" type="boolean">
      <prompt>
        Try barging in by saying either yes or no as I speak.
        This is a prompt where you can interrupt me.
      </prompt>
      <filled>
        <if cond="option">
          <prompt>You want to go. Goodbye.</prompt>
          <disconnect/>
        <else/>
          <prompt> Continuing till you say yes. </prompt>
          <clear/>
        </if>
      </filled>
    </field>
  </form>
</vxml>

Example 2: timing properties
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <property name="timeout" value="10s"/>
    <noinput>
      <prompt>The caller said nothing.</prompt>
      <reprompt/>
    </noinput>
    <property name="incompletetimeout" value="5s"/>
    <nomatch>
      <prompt>The caller did not say enough to match the grammar.</prompt>
      <reprompt/>
    </nomatch>
    <property name="completetimeout" value="0.1s"/>
    <filled>
      <prompt>The caller said the correct thing.</prompt>
      <exit/>
    </filled>
    <field name="one">
      Say one two three four
      <grammar type="application/x-nuance-gsl">
        (one two three four)
      </grammar>
    </field>
    <block> Finished. This should never play. </block>
  </form>
</vxml>

Example 3: maximum error properties
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <property name="bevocal.maxerrors" value="9"/>
  <property name="bevocal.maxdialogerrors" value="4" />
  <property name="timeout" value="1.5s" />
  <noinput> i did not hear anything </noinput>
  <form id="firstform">
    <catch event="error.bevocal.maxdialogerrors_exceeded">
      <prompt> max dialog errors event detected. going to next form. </prompt>
      <goto next="#secondform"/>
    </catch>
    <catch event="error.bevocal.maxerrors_exceeded">
      <prompt> total max errors event detected. this should not happen. exiting test. </prompt>
      <exit/>
    </catch>
    <field name="first" type="boolean">
      please do not say anything. you should hear the no input
      message 3 times before moving to the next field.
    </field>
  </form>
  <form id="secondform">
    <property name="bevocal.maxdialogerrors" value="3" />
    <catch event="error.bevocal.maxdialogerrors_exceeded">
      <prompt> max dialog errors event detected. going to next form. </prompt>
      <goto next="#thirdform"/>
    </catch>
    <catch event="error.bevocal.maxerrors_exceeded">
      <prompt> total max errors event detected. this should not happen. exiting test. </prompt>
      <exit/>
    </catch>
    <field name="first" type="boolean">
      please do not say anything. you should hear the no input
      message 2 times before moving to the next field.
    </field>
  </form>
  <form id="thirdform">
    <catch event="error.bevocal.maxdialogerrors_exceeded">
      <prompt> max dialog errors event detected. this should not happen. </prompt>
      <exit/>
    </catch>
    <catch event="error.bevocal.maxerrors_exceeded">
      <prompt> total max errors event detected. the test was successful. now exiting. </prompt>
      <exit/>
    </catch>
    <field name="first" type="boolean">
      please do not say anything. you should hear the no input
      message 1 time before you get a max total errors event.
    </field>
  </form>
</vxml>
<pros>
VoiceXML 1.0 only. Java Speech Markup Language element that changes the prosody of speech output.
Syntax
<pros rate="string" vol="string" pitch="string" range="string" >
  Text
</pros>
Description
Controls how the enclosed text is spoken. Currently all attributes are ignored and text is spoken normally.
Note: In VoiceXML 2.0, this tag is replaced by the <prosody> tag.
Usage
Parents    <audio> <bevocal:whisper> <choice> <div> <emp> <enumerate> <prompt> <pros>
Children   <audio> <break> <div> <emp> <enumerate> <pros> <sayas> <say-as> <value>
See Also
VoiceXML 1.0 Specification: <pros>
Related tags: <break>, <div>, <emp>, <sayas>
<prosody>
New in VoiceXML 2.0. Speech Synthesis Markup Language element that changes the prosody of speech output.
Syntax
<prosody
  pitch="high"|"medium"|"low"|"default"|"number_of_Hertz"|"relativeChange"
  contour="targets_at_intervals"
  range="high"|"medium"|"low"|"default"|"number_of_Hertz"|"relativeValue"
  rate="fast"|"medium"|"slow"|"default"|"relativeChange"
  duration="time_interval"
  volume="silent"|"soft"|"medium"|"loud"|"default"|"percent_volume"|"relativeChange" >
  Text
</prosody>
Description
Controls how the enclosed text is spoken. Currently only the rate and volume attributes are implemented. For the other attributes, the text is spoken normally.
Attribute   Description
pitch       Not implemented. The baseline pitch for the contained text in Hertz, a relative change, or one of the values high, medium, low, default.
contour     Not implemented. Sets the actual pitch contour for the contained text. The pitch contour is defined as a set of targets at specified intervals in the speech output. In each pair of the form (interval,target), the first value is a percentage of the period of the contained text and the second value is the value of the pitch attribute (absolute, relative, relative semitone, or descriptive values are all permitted). Interval values outside 0% to 100% are ignored. If a value is not defined for 0% or 100%, the nearest pitch target is copied.
range       Not implemented. The pitch range (variability) for the contained text in Hertz, a relative change, or one of the values high, medium, low, default.
rate        The speaking rate for the contained text: a relative change or one of the values fast, medium, slow, default.
duration    Not implemented. The desired time to take to read the element contents. Express the time interval as an unsigned number followed by s for seconds or ms for milliseconds.
volume      The volume for the contained text: a percentage (in the range 0.0 to 100.0), a relative change, or one of the values silent, soft, medium, loud, default.
Usage
Parents    <audio> <bevocal:whisper> <choice> <emphasis> <enumerate> <p> <paragraph> <prompt> <prosody> <s> <sentence> <voice>
Children   <audio> <break> <emphasis> <enumerate> <mark> <p> <paragraph> <phoneme> <prosody> <s> <say-as> <sentence> <value> <voice>
See Also
VoiceXML 2.0 Specification: <prosody>
Related tags: <break>, <emphasis>, <mark>, <paragraph>, <phoneme>, <say-as>, <sentence>, <voice>
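A minimal sketch of the two implemented attributes, rate and volume, might look like the following. The prompt wording is invented for illustration; only descriptive attribute values listed in the syntax above are used.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="prosody_demo">
    <block>
      <prompt>
        Your confirmation number is
        <!-- Slow down and raise the volume for the important digits. -->
        <prosody rate="slow" volume="loud"> 4 7 1 9 </prosody>
        <!-- Speed up and soften the closing remark. -->
        <prosody rate="fast" volume="soft"> Thank you for calling. </prosody>
      </prompt>
    </block>
  </form>
</vxml>
```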
<record>
Records an audio sample.
Syntax
<record name="string" expr="js_expression" cond="js_expression"
  maxtime="time_interval" finalsilence="time_interval" type="MIME_type"
  beep="true"|"false" modal dtmfterm="true"|"false" >
  Child Elements
</record>
Description
Input item that collects a recording from the user and stores the result in the input variable.
Attribute      Description
name           Name of the input variable used to store the recording. The variable name may not be a JavaScript reserved keyword. The input variable has dialog (form) scope; its name must be unique among all VoiceXML and JavaScript variables within the form's scope.
expr           JavaScript expression that assigns the initial value of the input variable. Optional (default is undefined). If you set the input variable to a value other than undefined, you'll need to clear it before the <record> can execute.
cond           JavaScript boolean expression that also must evaluate to true for the <record> to execute. Optional (default is true). If not specified, the value of the input variable alone determines whether or not the <record> can execute.
maxtime        Maximum duration of the recording. Optional (default is 10s). Express a time interval as an unsigned number followed by s for time in seconds; ms for time in milliseconds (the default).
finalsilence   Duration of silence that will terminate the recording. Optional (default is 1.5s). The minimum allowed value is 0.2 seconds. The maximum allowed value is 10 seconds. Express a time interval as an unsigned number followed by s for time in seconds; ms for time in milliseconds (the default).
               Note: If the value of the finalsilence attribute is 0s, the interpreter does a fixed-length recording for maxtime seconds. The recording does not wait for speech input and does not throw a noinput event. The recording can still be terminated using any DTMF key if the dtmfterm attribute is set to true.
Attribute      Description
type           MIME encoding of the submitted document. Optional (default is audio/wav). The supported types and the format of the resulting recording are:
               audio/wav: WAV (RIFF header) 8 KHz 8-bit mono mu-law [PCM] single channel
               audio/basic: WAV (RIFF header) 8 KHz 8-bit mono mu-law [PCM] single channel
               audio/vnd.wave;codec=1: WAV (RIFF header) 8 KHz 16-bit mono Linear [PCM] single channel
beep           Specifies whether to emit a beep just prior to recording. Optional (default is false).
modal          Not implemented.
dtmfterm       Specifies whether a DTMF keypress terminates the recording. Optional (default is true). When set to true, the DTMF keypress is not part of the recording.
If no audio is collected during the execution of <record>, its input variable is unfilled. This allows the form item to be visited again by the FIA.
Properties of the Shadow Variable. Corresponding to the input variable name is a shadow variable called name$. After the recording is made, additional information is available in the following properties of this shadow variable:
Property         Description
duration         The duration of the recording in milliseconds.
size             The size of the recording in bytes.
termchar         If the dtmfterm attribute is true, and the user terminates the recording by pressing a DTMF key, then this property is the key pressed (for example, #). Otherwise, the property is null.
maxtime          true if the recording was terminated because the maximum duration (specified by the maxtime attribute) was reached; false if the recording terminated within the maximum time.
recording        If the recordutterance or bevocal.audio.capture property is set to true and the user's speech matched the field grammar, the recording property contains an audio capture of the user's speech. If no audio is collected, this variable is undefined. You can send the captured audio to a server using a <data> element. Doing so is useful if you need a record of the user's speech for legal reasons.
recordingsize    The size of the recording in bytes, or undefined if no audio is collected.
recordduration   The duration of the recording in milliseconds, or undefined if no audio is collected.
For a <record> element whose name is name, you access the property propName of the shadow variable with the syntax:
name$.propName
For example, you access the duration property for the <record> named greeting as:
greeting$.duration
Note: If the user hangs up during the recording, the interpreter throws a hang-up event (connection.disconnect.hangup in VoiceXML 2.0; telephone.disconnect.hangup in VoiceXML 1.0). Any audio data recorded before the hang-up is available in the input variable, and the properties of the shadow variable are set as described above.
Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
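As a rough illustration of reading the shadow variable after a recording completes, the following sketch checks how the recording ended and reports its length. The form and variable names (leave_message, msg) and the prompt wording are hypothetical; only the duration, termchar, and maxtime properties described above are used.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="leave_message">
    <record name="msg" beep="true" maxtime="30s" finalsilence="3s" dtmfterm="true">
      <prompt>Record your message after the beep. Press any key when done.</prompt>
      <filled>
        <!-- msg$.maxtime is true only if the 30 second limit was reached. -->
        <if cond="msg$.maxtime">
          <prompt>You reached the thirty second limit.</prompt>
        <!-- msg$.termchar is null unless a DTMF key ended the recording. -->
        <elseif cond="msg$.termchar != null"/>
          <prompt>You ended the recording with a key press.</prompt>
        </if>
        <prompt>Your message is <value expr="msg$.duration"/> milliseconds long.</prompt>
      </filled>
    </record>
  </form>
</vxml>
```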
Usage
Parents    <form>
Children   <audio> <catch> <enumerate> <error> <filled> <grammar> <help> <noinput> <nomatch> <prompt> <property> <value>
See Also
VoiceXML 2.0 Specification: <record>
Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="form-record">
    <record name="greeting" beep="true" maxtime="10s" finalsilence="2000ms">
      <prompt> Please record your greeting after the tone. </prompt>
      <noinput>
        i did not hear anything.
        <reprompt/>
      </noinput>
    </record>
    <field name="confirm" type="boolean">
      <prompt>
        your greeting is <audio expr="greeting"/>
        to keep it say yes, to discard it say no.
      </prompt>
      <filled>
        <if cond="confirm">
          <prompt> ok, i will save your greeting </prompt>
          <submit method="post" namelist="greeting" next="greetingstore.jsp"/>
        <else/>
          <prompt> ok, let's try again </prompt>
          <clear namelist="greeting confirm"/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
<reprompt>
Plays a field prompt when a field is re-visited after an event.
Syntax
<reprompt/>
Description
Normally the interpreter suppresses prompts when it selects the next form item after executing a <catch> element or other event handler. By placing a <reprompt> element in the event handler, you can cause normal prompting to occur and prompt counters to increment when the next form item is executed. This tag affects both the explicit fields of a form and the single implicit field of a menu.
Usage
Parents    <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
Children   None.
See Also
VoiceXML 2.0 Specification: <reprompt>
Related tag: <prompt>
Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="form-record">
    <field name="mycolor">
      <prompt> What is your favorite color? </prompt>
      <prompt count="2">
        Please choose red green yellow blue or orange.
      </prompt>
      <grammar type="application/x-nuance-gsl">
        [ red green yellow blue orange ]
      </grammar>
      <noinput>
        I did not quite get your message.
        <reprompt/>
      </noinput>
    </field>
    <block>
      <prompt> You chose <value expr="mycolor"/> </prompt>
    </block>
  </form>
</vxml>
<rethrow>
Extension. Causes the event currently being handled to be rethrown.
Syntax
<rethrow />
Description
When executed within an event handler, this tag causes the event currently being handled to be rethrown. The execution environment searches for a new handler for the event, starting in the scope above the one containing the current handler.
Rethrowing an event allows you to handle the same event at different levels. For example, an event handler in a form could perform a certain amount of cleanup and then rethrow the event so that a document-level event handler could perform further cleanup. For more information, see Chapter 3, Event Handling.
Usage
Parents    <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
Children   None.
See Also
Related tags: <catch>, <error>, <help>, <noinput>, <nomatch>, <throw>
Examples
See other catch and throw examples under <throw> and <catch>.
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <catch event="myEvent">
    <prompt>Catch of my event at document scope.</prompt>
    <goto next="#nonumbers" />
  </catch>
  <form id="numbers">
    <catch event="myEvent">
      <prompt>
        Catch of my event at form scope. This will test the rethrow tag.
      </prompt>
      <rethrow/>
    </catch>
    <block name="numbergame"> This is a test of the throw tag. </block>
    <field name="mynumber" type="number">
      <prompt> Tell me a number greater than ten. </prompt>
      <filled>
        <prompt> The number you said is <value expr="mynumber"/> </prompt>
        <if cond="mynumber &lt; 10">
          <throw event="myEvent"/>
        </if>
      </filled>
    </field>
  </form>
  <form id="nonumbers">
    <block>You will hear no numbers here. Goodbye.</block>
  </form>
</vxml>
<return>
Returns from a subdialog.
Syntax
<return namelist="variable1 ..." event="event" eventexpr="js_expression"
  message="String" messageexpr="js_expression" />
Description
Causes the subdialog's execution context to terminate. You can use <return> to pass back variable values from the subdialog's execution context and to propagate an event back to the calling dialog (for example, a no-match event).
To propagate an event back to the calling dialog, use the event attribute. For example:
<return event="nomatch"/>
If <return> is encountered in a dialog that is not executing as a subdialog, an error.semantic event is thrown.
Attribute   Description
namelist    Space-separated list of variables to return to the calling dialog. Optional (default is to return no variables). This attribute can specify any variable currently in scope, both VoiceXML variables and JavaScript variables, including shadow variables and other variables that have not been explicitly declared. A variable set to a JavaScript object is returned as the entire object; it is not broken into its individual components as happens with variable submission.
event       Return to the calling dialog and throw this event.
The values returned with the namelist attribute are available to the calling dialog as properties of the <subdialog> input variable. They can be accessed using the following notation:
subdialogName.namelistVariable
For example, if a subdialog is invoked with:
<subdialog name="sub" .../>
and exited with:
<return namelist="a"/>
then the return value can be accessed as:
"sub.a"
Usage
Parents    <bevocal:foreach> <block> <catch> <error> <help> <noinput> <nomatch> <if> <filled>
Children   None.
Exceptions
error.badfetch - Thrown if both event and namelist attributes are specified.
See Also
VoiceXML 2.0 Specification: <return>
Related tag: <subdialog>
Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="main">
    <subdialog name="result" src="#subbie">
      <param name="goodbye" value="goodbye"/>
    </subdialog>
    <block>
      We're back from the sub dialog.
      Result dot hello equals <value expr="result.hello"/>
      Result dot goodbye equals <value expr="result.goodbye"/>
    </block>
  </form>
  <form id="subbie">
    <var name="hello" expr="'hello'"/>
    <var name="goodbye"/>
    <block>
      This is the sub dialog.
      <return namelist="hello goodbye"/>
    </block>
  </form>
</vxml>
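The event attribute described above can be sketched in the same way. In this hypothetical document (the names main, askpin, pin, and digits are invented for illustration), the subdialog gives up after a second no-match and propagates the event to the caller, which handles it in a <catch> on the <subdialog> element. Note that the successful path uses namelist alone, since specifying both event and namelist throws error.badfetch.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="main">
    <subdialog name="pin" src="#askpin">
      <!-- Handles the event propagated by the subdialog's <return>. -->
      <catch event="nomatch">
        <prompt>The PIN was not recognized.</prompt>
        <exit/>
      </catch>
      <filled>
        <!-- Returned variables are properties of the subdialog variable. -->
        <prompt>You entered <value expr="pin.digits"/>.</prompt>
      </filled>
    </subdialog>
  </form>
  <form id="askpin">
    <field name="digits" type="digits">
      <prompt>Please say your PIN.</prompt>
      <!-- On the second no-match, hand the event back to the caller. -->
      <nomatch count="2">
        <return event="nomatch"/>
      </nomatch>
      <filled>
        <return namelist="digits"/>
      </filled>
    </field>
  </form>
</vxml>
```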
<rule>
New in VoiceXML 2.0. XML grammar element that defines a grammar rule.
Syntax
<rule id="string" scope="private"|"public" >
  Other Content
</rule>
Description
This tag is used to define grammars in the XML form of the W3C Speech Recognition Grammar Format.
A <rule> element can contain any number of input elements, examples, and <tag> elements. The input elements indicate a sequence that must be matched in order. Any contained <tag> elements apply to the entire sequence. If the user input matches the input elements, the <tag> elements are interpreted to assign values to input variables. Any <example> elements are ignored by the speech-recognition engine.
Attribute   Description
id          The name of the rule; must be unique within the containing grammar. "NULL", "VOID", and "GARBAGE" are reserved for special rules; id must not be one of these values.
scope       The scope in which this rule can be used. Optional (default is "private").
            private: The rule can be used only by the containing grammar.
            public: The rule can be referenced by another grammar.
Note: Do not confuse the scope of a rule with the scope of a containing grammar. The scope of the grammar indicates where in the VoiceXML application the grammar is active.
Usage
Parents    <grammar>
Children   <example> <item> <one-of> <ruleref> <tag> <token>
See Also
Speech Recognition Grammar Specification: <rule>
Chapter 4, XML Speech Grammar Format in the Grammar Reference.
Examples
This rule matches one of the 3 phrases "Garibaldi", "Nassau Grouper", or "Trout":
<rule id="fish">
  <one-of>
    <item>Garibaldi</item>
    <item>"Nassau Grouper"</item>
    <item>Trout</item>
  </one-of>
</rule>
This rule matches any input that ends with "trigger fish", such as "queen trigger fish", "picasso trigger fish", and "mean old trigger fish", as well as simply "trigger fish":
<rule id="trigger">
  <ruleref special="GARBAGE"/>
  <token>Trigger Fish</token>
</rule>
The following set of rules creates one public rule named snapper and two private rules named snapperType and fishColors:
<rule id="snapper" scope="public">
  <ruleref uri="#snapperType"/>
  <token>Snapper</token>
</rule>
<rule id="snapperType">
  <token>Mutton</token>
  <ruleref uri="#fishColors"/>
</rule>
<rule id="fishColors" scope="private">
  <one-of>
    <item>Black</item>
    <item>Gray</item>
    <item>Red</item>
  </one-of>
</rule>
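The rules above are fragments; to be used, a rule has to sit inside a <grammar> element. The following sketch shows one plausible way to embed the fish rule inline in a field. It assumes the standard SRGS version and root attributes on <grammar>, which this section does not itself describe, and the field name and prompts are invented for illustration.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="sightings">
    <field name="fish">
      <prompt>Which fish did you see?</prompt>
      <!-- Assumption: root names the rule that is matched first. -->
      <grammar type="application/srgs+xml" version="1.0" root="fish" xml:lang="en-US">
        <rule id="fish" scope="public">
          <one-of>
            <item>Garibaldi</item>
            <item>Nassau Grouper</item>
            <item>Trout</item>
          </one-of>
        </rule>
      </grammar>
      <filled>
        <prompt>You said <value expr="fish"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```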
<ruleref>
New in VoiceXML 2.0. XML grammar input element that references another rule.
Syntax
<ruleref uri="URI" type="MIME_type" xml:lang="lang" special="NULL"|"VOID"|"GARBAGE" >
  Optional Tags
</ruleref>
Description
This tag is used to define grammars in the XML form of the W3C Speech Recognition Grammar Format.
The referenced rule specifies user input to be matched. If the user input matches the referenced rule, any contained <tag> elements are interpreted to assign values to input variables.
You can refer to a rule from within another rule, either in the same or in a different grammar definition. In summary, the formats for doing so are:
Reference to a                                                        Format
Local rule                                                            <ruleref uri="#rulename"/>
Named rule of a grammar identified by a URI                           <ruleref uri="grammarURI#rulename"/>
Root rule of a grammar identified by a URI                            <ruleref uri="grammarURI"/>
Rule of a grammar identified by a URI and specifying a media type     <ruleref uri="grammarURI" type="mediaType"/>
Rule of a grammar identified by a URI and specifying a language       <ruleref uri="grammarURI" xml:lang="lang"/>
Special rule definition                                               <ruleref special="NULL"/>
                                                                      <ruleref special="VOID"/>
                                                                      <ruleref special="GARBAGE"/>
Attribute   Description
uri         The URI of the referenced rule. Optional (must provide uri or special, but not both). May be one of the following:
            #ruleName: References the local rule ruleName in the containing grammar.
            grammarFileURI#ruleName: References the public rule ruleName in the grammar defined in the grammar file whose URI is grammarFileURI. The grammar file can be in any grammar format, not just XML.
            grammarFileURI: References the default rule of the grammar defined in the grammar file whose URI is grammarFileURI. The grammar file can be in any grammar format.
type        MIME type of the grammar. Optional; only relevant for a reference to a rule in another grammar file. For such a reference, the default type is taken from the Content-type header of the returned file. If not present, the type is inferred from the URL's extension or from the contents of the grammar (for example, a file beginning with <?xml maps to application/srgs+xml).
            The recognized extensions are:
            .grxml, .xml: XML Speech Grammar
            .gram: ABNF Speech Grammar
            .gsl, .grammar: Nuance GSL
            .ngo: Nuance Grammar Object
            .jsgf: Java Speech Grammar Format
            The currently supported types are:
            application/srgs+xml: XML Speech Grammar
            application/grammar+xml: XML Speech Grammar (Deprecated; support for this value will be removed from a future release.)
            application/srgs: ABNF Speech Grammar
            application/grammar: ABNF Speech Grammar (Deprecated; support for this value will be removed from a future release.)
            application/x-nuance-gsl: Nuance GSL
            application/x-gsl: Nuance GSL (Deprecated; support for this value will be removed from a future release.)
            application/x-nuance-dynagram-binary: Nuance Grammar Object
            If you specify an unsupported type, an error is thrown. This value is used only if the web server returns an unsupported grammar type.
Attribute   Description
xml:lang    The language and optional country locale identifier for the rule reference. Optional (default is the language of the enclosing element). The accepted language identifiers are:
            en: English
            en-US: United States English
            es: Spanish
            es-US: United States Spanish
            fr-ca: French Canadian
            This attribute is only useful if the rule referred to is in the GSL format. In all other cases, the interpreter ignores this attribute. If an unsupported language is specified, an error.unsupported.language event is thrown.
special     The referenced special rule. Optional (must provide uri or special, but not both).
            NULL: Rule that is automatically matched, that is, matched without the user speaking.
            VOID: Rule that can never be matched. Inserting VOID into a sequence automatically makes that sequence unspeakable.
            GARBAGE: Rule that matches any speech up until the next rule match, the next token, or the end of spoken input.
Usage
Parents    <rule> <item>
Children   None.
See Also
Speech Recognition Grammar Specification: <ruleref>
Chapter 4, XML Speech Grammar Format in the Grammar Reference.
Examples
<!-- Reference to a specific rule of an external grammar -->
<ruleref uri="../fish.gram#butterflies" />
<!-- Reference to the root rule of an external grammar -->
<ruleref uri="http://www.myCompany.com/grammars/fish.gram" />
<!-- References with associated media types -->
<ruleref uri="http://www.myCompany.com/grammars/fish#butterflies" type="application/srgs" />
<ruleref uri="../fish" type="application/srgs" />
<!-- Reference with an associated media type and language -->
<ruleref uri="http://www.myCompany.com/grammars/animals#butterflies"
  type="application/x-nuance-gsl" xml:lang="en-US" />
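To show a local reference and a special rule working together, here is a small sketch of a complete grammar. The rule names sighting and fishColors are illustrative, and the version and root attributes on <grammar> are assumed from the SRGS specification rather than taken from this section.

```xml
<grammar type="application/srgs+xml" version="1.0" root="sighting" xml:lang="en-US">
  <rule id="sighting" scope="public">
    <!-- Absorbs any leading speech, such as "I think I saw a". -->
    <ruleref special="GARBAGE"/>
    <!-- Local reference to the private rule defined below. -->
    <ruleref uri="#fishColors"/>
    <token>Snapper</token>
  </rule>
  <rule id="fishColors">
    <one-of>
      <item>Black</item>
      <item>Gray</item>
      <item>Red</item>
    </one-of>
  </rule>
</grammar>
```

Under these assumptions, the grammar should accept inputs such as "red snapper" or "I think I saw a gray snapper".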
<s>
New in VoiceXML 2.0. Speech Synthesis Markup Language element that classifies a region of text as a sentence.
Syntax
<s xml:lang="lang" >
  Text
</s>
Description
Identifies enclosed text as a sentence for interpretive purposes. This tag is included for brevity; it is the exact equivalent of <sentence>.
Attribute   Description
xml:lang    Not supported. The language and optional country locale identifier for the sentence. Optional.
Usage
Parents    <audio> <bevocal:whisper> <choice> <enumerate> <p> <paragraph> <prompt> <prosody> <voice>
Children   <audio> <break> <emphasis> <enumerate> <mark> <phoneme> <prosody> <say-as> <value> <voice>
See Also
VoiceXML 2.0 Specification: <sentence>
Related tags: <break>, <emphasis>, <mark>, <p>, <paragraph>, <phoneme>, <prosody>, <say-as>, <sentence>, <voice>
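A brief sketch of sentence markup inside a paragraph follows; the prompt text is invented for illustration.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="status">
    <block>
      <prompt>
        <p>
          <!-- Each <s> marks one sentence for the speech synthesizer. -->
          <s>Your order has shipped.</s>
          <s>It should arrive on Friday.</s>
        </p>
      </prompt>
    </block>
  </form>
</vxml>
```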
<say-as>
New in VoiceXML 2.0. Speech Synthesis Markup Language element that modifies how the enclosed word or phrase is spoken.
Syntax
<say-as bevocal:mode="mode" type="typeName"|"typeName:format" >
  Text
</say-as>
Description
Attribute      Description
bevocal:mode   Extension. When the value is recorded, the output will be recorded audio. If no recorded audio is available, TTS is used to play out the content. Optional.
               Note: Recorded prompts are not available for Spanish or French Canadian.
type           Speak enclosed text in the given style. The allowed values are listed under Usage below.
sub            Deprecated; use the <sub> element instead. This attribute was supported in an earlier release; if you use it, the interpreter throws a parse error.
The following events may be thrown during the execution of a <say-as> element:
error.badfetch - Thrown in VoiceXML 1.0; the <say-as> tag is a version 2.0 element.
error.badfetch - Thrown when this element contains a class attribute. The class attribute is not valid in version 2.0.
error.semantic - Thrown when the type attribute contains a value that is not one of the allowed values listed under Usage. Note that some values have changed between version 1.0 and 2.0. For example, type="phone" (from 1.0) would cause this error to be thrown.
Usage
Some <say-as> types require a format. For others, it is optional. The type values break down into 3 categories: the type corresponding to a built-in grammar, other standard SSML <say-as> types, and extended <say-as> types available with the BeVocal interpreter.
The sole type value that corresponds to a built-in grammar is:
Type   Description
vxml   Contained text is a field's input variable. The allowed formats correspond exactly to the available built-in grammars: boolean, currency, date, digits, number, phone, and time. The format is required.
The other standard <say-as> types and their formats are:
Type        Description
acronym     Contained text is an acronym. For example, <say-as type="acronym">IBM</say-as> is spoken as "eye bee em".
spell-out   Characters in the contained text string are spoken as individual characters. For example, <say-as type="spell-out">smith</say-as> is spoken as "ess em eye tea aitch".
number      Contained text can be interpreted as a number. The allowed number formats are ordinal, cardinal, and digits. For example, <say-as type="number:ordinal">12</say-as> is spoken as "twelfth", but <say-as type="number:digits">12</say-as> is spoken as "one two".
date        Contained text is a date. The allowed date formats indicate the form of the input and are dmy, mdy, ymd, ym, my, md, y, m, and d. For example, <say-as type="date:my">12/2003</say-as> is spoken as "December, two thousand and three".
time        Contained text is a time of day. The allowed time formats indicate the form of the input and are hms, hm, and h. For example, <say-as type="time:h">3</say-as> is spoken as "3 o'clock".
duration    Contained text is a temporal duration. The allowed duration formats indicate the form of the input and are hms, hm, ms, h, m, and s. For example, <say-as type="duration:h">3</say-as> is spoken as "3 hours".
currency    Contained text is a currency amount. For example, <say-as type="currency">$3</say-as> is spoken as "3 dollars".
measure     Contained text is a measurement. For example, <say-as type="measure">3ft</say-as> is spoken as "3 feet".
telephone   Contained text is a telephone number. For example, <say-as type="telephone">5555555555</say-as> is spoken as "five five five <pause> five five five <pause> five five <pause> five five".
name        Contained text is a proper name of a person, company, and so on.
net         Contained text is an internet identifier. The allowed net formats indicate the form of the input and are email and uri. For example, <say-as type="net:email">you@company.com</say-as> is spoken as "you at company dot com".
Extensions. The extended <say-as> types are:
Type        Description
airport     Contained text is an airport code, such as DFW. For a list of the allowed airport codes, see Airport & Airline Codes on the Resources page of the BeVocal Café web site.
airline     Contained text is an airline code, such as AA. For a list of the allowed airline codes, see Airport & Airline Codes.
equity      Contained text is a company symbol, such as ibm or csco.
street      Contained text is a street name (with or without street number), such as bordeaux drive or 1380 bordeaux drive.
city        Contained text is a city name, such as sunnyvale.
state       Contained text is a state name, such as california or ca.
citystate   Contained text is a city name and state name separated by a comma, such as sunnyvale, california.
address     Contained text is a street, city, state, and zip code separated by commas, such as 1380 bordeaux drive, sunnyvale, california, 94089.
Parents    <audio> <bevocal:whisper> <choice> <emphasis> <enumerate> <p> <paragraph> <prompt> <prosody> <s> <sentence> <voice>
Children   <value>
See Also
VoiceXML 2.0 Specification: <say-as>
Related tags: <break>, <emphasis>, <mark>, <paragraph>, <phoneme>, <prosody>, <sentence>, <sub>, <voice>
Examples
To hear an example of each supported type, see the Say-As Types example in the VoiceXML Samples section of the BeVocal Café.
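For a written illustration, the following sketch combines three of the type values documented above, reusing the sample inputs from the type descriptions. The surrounding prompt wording and form name are invented.

```xml
<?xml version="1.0" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="sayas_demo">
    <block>
      <prompt>
        <!-- number:ordinal is documented above to speak 12 as "twelfth". -->
        You are the <say-as type="number:ordinal">12</say-as> caller.
        <!-- date:my takes month/year input. -->
        Your card expires in <say-as type="date:my">12/2003</say-as>.
        <!-- spell-out speaks each character individually. -->
        Your last name is spelled <say-as type="spell-out">smith</say-as>.
      </prompt>
    </block>
  </form>
</vxml>
```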
<sayas>
VoiceXML 1.0 only. Java Speech Markup Language element that modifies how a word or phrase is spoken.
Syntax
<sayas sub="string"
  class="phone"|"date"|"time"|"digits"|"literal"|"currency"|"number"|"airport"|"airline"|"equity"|"street"|"city"|"state"|"citystate"
  type="phone"|"date"|"time"|"digits"|"literal"|"currency"|"number"|"airport"|"airline"|"equity"|"street"|"city"|"state"|"citystate"
  phon (Not implemented) >
  Text
</sayas>
Description
Note: In VoiceXML 2.0, this tag is replaced by the <say-as> tag. The <say-as> tag is from the Speech Synthesis Markup Language.
Attribute    Description
sub          Substitute text to be spoken instead of enclosed text. Optional.
Attribute    Description
type, class  Speak enclosed text in the given style. Optional. Possible values with enclosed text formats are:
             phone: For a telephone number adhering to the North American Dialing Plan, such as 408-907-3200
             date: For dates, in yyyymmdd or yyyymm format, such as 20001210
             time: For times, such as 11:45 PM
             digits: For digits, such as 123456
             literal: For literals
             currency: For currency in dollars and cents, such as $123.45
             number: For numbers, such as 10.5 or 10
             Extensions:
             airport: For airport codes, such as DFW. For a list of the allowed airport codes, see Airport & Airline Codes on the Resources page of the BeVocal Café web site.
             airline: For airline codes, such as AA. For a list of the allowed airline codes, see Airport & Airline Codes.
             equity: For a company symbol, such as ibm or csco
             street: For a street name (with or without street number), such as bordeaux drive or 1380 bordeaux drive
             city: For a city name, such as sunnyvale
             state: For a state name, such as california or ca
             citystate: For a city name and state name separated by a comma, such as sunnyvale, california
             Note: The type attribute is the VoiceXML 2.0 standard; the class attribute is an extension. These two attributes are identical; only one of them should be used.
phon         Not implemented. Representation of the Unicode International Phonetic Alphabet (IPA) characters to be spoken instead of enclosed text. Optional.
The following events may be thrown during the execution of a <sayas> element:
error.badfetch - Thrown in VoiceXML 2.0.
error.semantic - Thrown when the type attribute contains a value that is not one of those allowed in the enumerated list of values above. Note that some values have changed between version 1.0 and 2.0.
Usage
Parents    <audio> <bevocal:whisper> <choice> <div> <emp> <enumerate> <prompt> <pros>
Children   None.
See Also
VoiceXML 1.0 Specification: <sayas>
Related tags: <break>, <div>, <emp>, <pros>
Examples

  <?xml version="1.0" ?>
  <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 1.0//EN"
    "http://cafe.bevocal.com/libraries/dtd/vxml1-0-bevocal.dtd">
  <vxml version="1.0">
    <form>
      <block>
        <prompt>
          Here is a currency amount: <sayas class="currency"> $123.45 </sayas>
          Here is a company: <sayas class="equity"> csco </sayas>
          Here is a number: <sayas class="number"> 10.05 </sayas>
          Here is an integer number. You should not hear any decimal content:
          <sayas class="number"> 10 </sayas>
          Here is an airport: <sayas class="airport"> DFW </sayas>
          Here is an airline: <sayas class="airline"> AA </sayas>
          Here is a phone number: <sayas class="phone"> 512-301-0691 </sayas>
          Here is a date: <sayas class="date"> 20001210 </sayas>
          Here is a street: <sayas class="street"> heiden lane </sayas>
          Here is a city: <sayas class="city"> austin </sayas>
          Here is a state: <sayas class="state"> texas </sayas>
          Here is a citystate: <sayas class="citystate"> austin, texas </sayas>
          Here is a digit string: <sayas class="digits"> 123456 </sayas>
        </prompt>
      </block>
    </form>
  </vxml>
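The date type above expects text in yyyymmdd (or yyyymm) form. As a minimal sketch, a script could build that string from a JavaScript Date before it is spoken. The helper name toSayAsDate is hypothetical and is not part of BeVocal VoiceXML.

```javascript
// Hypothetical helper (not a BeVocal API): formats a Date as the
// yyyymmdd string described for the "date" type.
function toSayAsDate(date) {
  // Left-pad a number with zeros to the requested width.
  function pad(value, width) {
    var text = String(value);
    while (text.length < width) {
      text = "0" + text;
    }
    return text;
  }
  // getMonth() is zero-based, so add 1 to get the calendar month.
  return pad(date.getFullYear(), 4) +
         pad(date.getMonth() + 1, 2) +
         pad(date.getDate(), 2);
}
```

The returned string could then be used as the enclosed text of a date-styled element, for example through a <value> tag.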
<script>
Specifies a block of client-side scripting logic in JavaScript.

Syntax

  <script
    src="URI"
    charset="string"
    fetchhint="prefetch"|"safe"
    fetchtimeout="time_interval"
    maxage="time_interval"
    maxstale="time_interval"
    srcexpr="js_expression"
  >
    Script Text
  </script>

Description

Scripts do not have their own scope, but are executed in the scope of the containing element.

Attribute    Description

src
  URI of the script. Optional (as alternative to inline).

charset
  Character encoding of the script if src is used. Optional.

fetchhint
  Specifies whether the interpreter can attempt to optimize dialog interpretation by prefetching the resource. See Prefetching Resources on page 40. Optional.

fetchtimeout
  Specifies the interval to wait for the resource to be returned before throwing an error.badfetch event. See Handling Fetching Delays on page 42. Optional.

maxage
  New in VoiceXML 2.0. Specifies the maximum acceptable age, in seconds, of the cached resource. See Maximum Age on page 44. Optional.

maxstale
  New in VoiceXML 2.0. Specifies the maximum acceptable time, in seconds, during which an expired cached resource can still be used. See Maximum Stale Time on page 44. Optional.

srcexpr
  JavaScript expression that evaluates to the script URI. Optional (as alternative to src or an inline script). If you specify this attribute, the element cannot have content.

(VoiceXML 1.0 only) The following attributes can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.

Attribute    Description

caching
  VoiceXML 1.0 only. Specifies the caching policy for the resource being fetched. See Caching on page 45. Optional. Used in place of the VoiceXML 2.0 attributes maxage and maxstale.
Tips:

Inside a <script> element, if your JavaScript script contains any of the characters <, >, or &, that character must be escaped. The easiest way to do this is to place the entire script inside a CDATA section, as in:

  <script>
    <![CDATA[
      function factorial(n) {
        return (n <= 1) ? 1 : n * factorial(n-1);
      }
    ]]>
  </script>

Remember that any variables or functions declared in a script are valid only within the scope that contains the <script> element. Define the script in document scope (in the <vxml> element) if it defines items that you want to use in several dialogs or blocks. Any items defined by a script inside a block are accessible only within that block.

In the following example, the first block contains a script that defines the function foo. That function can be used in a JavaScript expression later in the same block. However, it is illegal to use the function in a different block. The <value> tag in the second block will fail because the function foo goes out of scope when the interpreter leaves the first block.

  <block>
    <script>
      <![CDATA[
        function foo() {
          return 1;
        }
      ]]>
    </script>
    Foo is <value expr="foo()"/>
  </block>
  <block>
    Foo is <value expr="foo()"/>
  </block>

Usage

  Parents: <bevocal:foreach> <block> <catch> <error> <filled> <form> <help> <if> <menu> <noinput> <nomatch> <vxml>
  Children: None.

See Also

  VoiceXML 2.0 Specification: <script>
  JavaScript Quick Reference
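When a JavaScript expression appears outside a CDATA section (for example, in a cond or expr attribute), the characters <, >, and & have to be written as XML escape sequences. The following is a minimal sketch of that substitution for pages that are generated by a program; the function name escapeForVxml is an assumption made for illustration, not part of any BeVocal library.

```javascript
// Hypothetical helper: rewrites the three characters that must be escaped
// when a JavaScript expression is embedded in VoiceXML markup.
function escapeForVxml(expression) {
  // Replace "&" first so the ampersands introduced by the other
  // substitutions are not escaped a second time.
  return expression
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}
```

For instance, the expression fact < max would be written in an attribute as the value returned by escapeForVxml("fact < max").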
Examples

Example 1: script to get the current time:

  <?xml version="1.0" ?>
  <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
    "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
  <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
    <form id="form">
      <block name="time">
        <var name="hours"/>
        <var name="minutes"/>
        <var name="seconds"/>
        <script>
          var d=new Date();
          hours=d.getHours();
          minutes=d.getMinutes();
          seconds=d.getSeconds();
        </script>
        <prompt>
          The time is now <value expr="hours"/> hours
          <value expr="minutes"/> minutes, and
          <value expr="seconds"/> seconds.
        </prompt>
      </block>
    </form>
  </vxml>

Example 2: script that defines a factorial function:

  <?xml version="1.0" ?>
  <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
    "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
  <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
    <property name="universals" value="help" />
    <script>
      <![CDATA[
        function factorial(n) {
          return (n <= 1) ? 1 : n * factorial(n-1);
        }
      ]]>
    </script>
    <var name="max" expr="10"/>
    <form id="form-factorial">
      <field name="fact" slot="num" type="number">
        <prompt>
          Tell me a number and I'll tell you its factorial.
        </prompt>
        <catch event="help">
          <prompt>
            Please say a number greater than zero and less than <value expr="max"/>
          </prompt>
        </catch>
        <filled>
          <if cond="fact &lt; max">
            <prompt>
              <value expr="fact" /> factorial is <value expr="factorial(fact)"/>
            </prompt>
          <else/>
            <prompt>
              Please choose a number less than <value expr="max"/>.
            </prompt>
            <clear namelist="fact" />
          </if>
        </filled>
      </field>
    </form>
  </vxml>
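The script logic in Example 2 can be exercised outside the interpreter before it is embedded in a document. The sketch below reuses the factorial function from the example and adds a hypothetical factorialReply function that mirrors the <if>/<else/> branches of the <filled> element; the returned strings stand in for the prompts and are not interpreter output.

```javascript
// Same function as in Example 2.
function factorial(n) {
  return (n <= 1) ? 1 : n * factorial(n - 1);
}

// Hypothetical stand-in for the <filled> logic: returns the text that the
// matching <prompt> would speak for a given field value and limit.
function factorialReply(fact, max) {
  if (fact < max) {
    return fact + " factorial is " + factorial(fact);
  }
  // Corresponds to the <else/> branch, which also clears the field.
  return "Please choose a number less than " + max + ".";
}
```

Calling factorialReply(3, 10) follows the first branch, while any value of 10 or more follows the second.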
<send>
VoiceXML 1.0 only; Experimental Extension. Submits values to a web server without transitioning to a new VoiceXML document.

Syntax

  <send
    url="URI"
    expr="js_expression"
    method="get"|"post"
    enctype=MIME_type
    namelist="variable1 ..."
    fetchtimeout="time_interval"
    fetchaudio="URI"
  />

Description

Note: VoiceXML 2.0 applications should use the BeVocal VoiceXML extension <data> tag instead of this tag. This tag is obsolete and will be removed in a future release of BeVocal VoiceXML.

This tag submits the specified data to a web server; when the web server returns a successful HTTP result code, execution of the current document continues with the tag following the <send>. This tag is useful when you need to save data (for example, the result of a <record>) in a database but do not need to transfer to a new document.

The servlet or CGI script to which <send> submits data must return a valid HTTP reply, including headers and at least one blank line indicating the end of the headers. If the returned HTTP result code does not indicate success, an error.badfetch is thrown to signal a server error to the VoiceXML application. That application can catch the error and play an error message or take some other action to alert the user.

Attribute    Description

url
  URI to which to submit the values. Optional (as alternative to expr).

expr
  A JavaScript expression that evaluates to the URI to which to submit the values. Optional (as alternative to url).

method
  The query request method. Optional (default is get).

enctype
  MIME encoding of the submitted document. Optional (default is application/x-www-form-urlencoded). The supported types are:
    application/x-www-form-urlencoded
    multipart/form-data
  The type multipart/form-data is more efficient when submitting large amounts of binary data.

namelist
  Space-separated list of variables to submit. Optional (default is to submit all input variables that have been given explicit names with the name attribute of <field>, <record>, <transfer>, or <subdialog>). This attribute can specify any variable currently in scope, both VoiceXML variables and JavaScript variables, including shadow variables and other variables that have not been explicitly declared. A variable set to a JavaScript object is submitted as the individual component values; see Submitting Complex JavaScript Objects on page 46.

fetchtimeout
  Specifies the interval to wait for the resource to be returned before throwing an error.badfetch event. See Handling Fetching Delays on page 42. Optional.

fetchaudio
  Specifies the URI of an audio clip to play while a resource is being fetched. See Background Audio on page 42. Optional.

Usage

  Parents: <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
  Children: None.

See Also

  Related tags: <submit>
Example

  <?xml version="1.0" ?>
  <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 1.0//EN"
    "http://cafe.bevocal.com/libraries/dtd/vxml1-0-bevocal.dtd">
  <vxml version="1.0">
    <form id="form-record">
      <field name="name" type="boolean">
        <prompt>
          Say yes to send your data to Snoop Servlet or no to exit
        </prompt>
        <filled>
          <if cond="name">
            <send method="post" url="http://www.yoursite.com/SomeServlet"/>
            <prompt>Data sent successfully!</prompt>
          </if>
          <prompt> Good bye! </prompt>
          <exit/>
        </filled>
      </field>
    </form>
  </vxml>
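To make the namelist and default enctype attributes more concrete, the following sketch shows roughly what a request body of type application/x-www-form-urlencoded looks like for a space-separated namelist. It is an illustration only: the function names encodeNamelist and encodeFormValue are hypothetical, the scope argument stands in for the variables currently in scope, and the encoding is a simplification that does not claim to match the interpreter byte for byte.

```javascript
// Hypothetical helper: percent-encodes one name or value and writes
// spaces as "+", as is conventional for form-urlencoded data.
function encodeFormValue(text) {
  return encodeURIComponent(text).replace(/%20/g, "+");
}

// Hypothetical helper: builds "name=value&name=value" for every variable
// named in the space-separated namelist. `scope` is a plain object that
// stands in for the variables visible to the <send> tag.
function encodeNamelist(namelist, scope) {
  var names = namelist.split(/\s+/).filter(function (name) {
    return name.length > 0;
  });
  return names.map(function (name) {
    return encodeFormValue(name) + "=" + encodeFormValue(String(scope[name]));
  }).join("&");
}
```

A server-side servlet or CGI script receiving such a body would read the submitted values as ordinary form parameters.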
<sentence>
New in VoiceXML 2.0. Speech Synthesis Markup Language element that classifies a region of text as a sentence.

Syntax

  <sentence xml:lang="lang">
    Text
  </sentence>

Description

Identifies enclosed text as a sentence for interpretive purposes. For brevity, the tag <s> is an exact equivalent of this tag.

Attribute    Description

xml:lang
  Not Supported. The language and optional country locale identifier for the sentence. Optional.

Usage

  Parents: <audio> <bevocal:whisper> <choice> <enumerate> <p> <paragraph> <prompt> <prosody> <voice>
  Children: <audio> <break> <emphasis> <enumerate> <mark> <phoneme> <prosody> <say-as> <value> <voice>

See Also

  VoiceXML 2.0 Specification: <sentence>
  Related tags: <break>, <emphasis>, <mark>, <paragraph>, <phoneme>, <prosody>, <say-as>, <voice>
<speak>
The <speak> element is the root element of a standalone SSML document and contains all other SSML elements. The <speak> element is not a VoiceXML 2.0 element.

Syntax

  <speak version="1.0" xml:lang="lang">
    Text
  </speak>

Description

Through the BeVocal VoiceXML extension of <audio> with the attributes bevocal:ssml and bevocal:ssmlexpr, you can refer to standalone SSML documents. Each such document must contain the <speak> element as its root element.

Attribute    Description

xml:lang
  Not Supported. The language and optional country locale identifier for the SSML document. Optional.

Usage

  Parents: None.
  Children: All SSML elements.

See Also

  SSML Specification: <speak>
  <audio> tag
  Chapter 9, Dynamic SSML
<sub>
New in VoiceXML 2.0. Speech Synthesis Markup Language element whose alias attribute provides substitute text to be spoken instead of the contained text. This allows the document to contain both a written and a spoken form for a string.

Syntax

  <sub alias="substituteText">
    OriginalText
  </sub>

Attribute    Description

alias
  The string to be substituted for the OriginalText string.

Usage

  Parents: <speak>
  Children: None.

See Also

  VoiceXML 2.0 Specification: <sub>
  Related tags: <say-as>, <speak>
<subdialog>
Invokes another dialog as a subdialog of the current one.

Syntax

  <subdialog
    name="string"
    expr="js_expression"
    cond="js_expression"
    src="URI"
    srcexpr="js_expression"
    method="get"|"post"
    enctype=MIME_type
    namelist="variable1 ..."
    fetchhint="prefetch"|"safe"
    fetchtimeout="time_interval"
    fetchaudio="URI"
    maxage="time_interval"
    maxstale="time_interval"
  >
    Child Elements
  </subdialog>

Description

Input item that invokes a subdialog, which is a reusable dialog that is specially coded to pass back values with a <return> element. When control returns from the subdialog, the returned values are available as properties of the <subdialog> input variable.

You pass values into the subdialog by including <param> child elements. The subdialog must contain a <var> declaration for each parameter passed to it. If the <var> contains an initializing expr attribute, that initializing value is ignored and the value passed with the <param> element is used instead.

When a subdialog is invoked, it runs in a new execution context. This means that all variables and state information in the subdialog's document are reinitialized (including the application root document, if one is used). Any changes that the subdialog makes to application-scoped variables apply only in the subdialog context. When the subdialog returns, its context is destroyed and the context of the calling dialog is in the same state it was in before the subdialog call. The only way for the subdialog to return information to its calling dialog is with a <return> element.

If the subdialog invokes other dialogs, those dialogs are also run in the new execution context. The new context terminates and the old context resumes only when the subdialog or another dialog it has invoked calls <return> to pass back the results.

Note: As a BeVocal extension, the BeVocal VXML Interpreter treats <subdialog> as executable content. This allows a subdialog to be invoked from within any tag that allows executable content, such as a <block> form item. If the <subdialog> is directly inside a <form> element, then upon return, execution proceeds to any applicable <filled> elements in the calling context. If you use <subdialog> in an executable context, such as within an <if> element, the <filled> element does not work. Instead, you should use <assign> directly after the subdialog call.
Attribute    Description

name
  Name of the input variable used to store the results of the subdialog. The input variable has dialog (form) scope; its name must be unique among all VoiceXML and JavaScript variables within the form's scope. You can access the return values as properties of the input variable using the syntax:
    name.returnVariableName

expr
  JavaScript expression that assigns the initial value of the input variable. Optional (default is undefined). If you set the input variable to a value other than undefined, you'll need to clear it before the <subdialog> can execute.

cond
  JavaScript boolean expression that also must evaluate to true for the <subdialog> to execute. Optional (default is true). If not specified, the value of the input variable alone determines whether or not the <subdialog> can execute.

src
  URI of the subdialog. You can use the #DialogName syntax to refer to another dialog in the current document. Even in this case, the subdialog is run in a new context. Note that the method, enctype, namelist, caching, fetchtimeout, and fetchaudio parameters apply only if src points to a different document (as opposed to starting with # to invoke a subdialog in the current document).

srcexpr
  A JavaScript expression yielding the URI of the subdialog.

method
  The query request method, either get or post. Optional (default is get).

enctype
  MIME encoding of the submitted document. Optional (default is application/x-www-form-urlencoded). The supported types are:
    application/x-www-form-urlencoded
    multipart/form-data
  The type multipart/form-data is more efficient when submitting large amounts of binary data.

namelist
  Space-separated list of variables to submit. Optional (default is to submit nothing). This attribute can specify any variable currently in scope, both VoiceXML variables and JavaScript variables, including shadow variables and other variables that have not been explicitly declared. A variable set to a JavaScript object is submitted as the individual component values; see Submitting Complex JavaScript Objects on page 46.

fetchhint
  Specifies whether the interpreter can attempt to optimize dialog interpretation by prefetching the resource. See Prefetching Resources on page 40. Optional.

fetchtimeout
  Specifies the interval to wait for the resource to be returned before throwing an error.badfetch event. See Handling Fetching Delays on page 42. Optional.

fetchaudio
  Specifies the URI of background audio to be played during fetching. Note that this attribute and related properties affect whether queued prompts are played first. See Background Audio on page 42 for important details. Optional.
maxage
  New in VoiceXML 2.0. Specifies the maximum acceptable age, in seconds, of the cached resource. See Maximum Age on page 44. Optional.

maxstale
  New in VoiceXML 2.0. Specifies the maximum acceptable time, in seconds, during which an expired cached resource can still be used. See Maximum Stale Time on page 44. Optional.

(VoiceXML 1.0 only) The following attributes can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.

Attribute    Description

caching
  VoiceXML 1.0 only. Specifies the caching policy for the resource being fetched. See Caching on page 45. Optional. Used in place of the VoiceXML 2.0 attributes maxage and maxstale.

modal
  VoiceXML 1.0 only. Boolean value that must be false to enable <link> grammars from the calling context. Optional (default is true). Lets you alter default behavior so that <link> grammars from the calling context can be active while the subdialog executes.
  Note: In VoiceXML 2.0, all subdialogs are modal. To ensure portability, your VoiceXML applications should avoid using this attribute.

By default, no grammars from the calling dialog's context are active, except any default grammars defined by the VoiceXML interpreter. However, if the modal attribute is false, <link> elements in the calling dialog's context are active. When a grammar from the calling context is triggered during execution of a subdialog, the subdialog context terminates and control returns to the calling context.

If an event is thrown during execution of a subdialog and no event handler for the event is found in the subdialog context, the interpreter's response depends on whether the subdialog is modal:
  If the subdialog is modal, a fatal error occurs, causing the interpreter to exit.
  If the subdialog is non-modal, the interpreter causes the subdialog's context to return. It then rethrows the event in the calling context and starts its search for the event handler in that context.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
Usage

  Parents: <bevocal:foreach> <block> <catch> <error> <filled> <form> <help> <if> <noinput> <nomatch>
  Children: <audio> <catch> <enumerate> <error> <filled> <help> <noinput> <nomatch> <param> <prompt> <property> <value>

See Also

  VoiceXML 2.0 Specification: <subdialog>
  Related tags: <param>, <return>

Examples

  <?xml version="1.0" ?>
  <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
    "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
  <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
    <form id="main">
      <block>We're about to call the sub dialog</block>
      <subdialog name="result" src="#subbie">
        <param name="hello" value="goodbye"/>
        <param name="goodbye" value="goodbye"/>
      </subdialog>
      <block>
        We're back from the sub dialog.
        Result dot hello equals <value expr="result.hello"/>
        Result dot goodbye equals <value expr="result.goodbye"/>
      </block>
    </form>
    <form id="subbie">
      <var name="hello" expr="hello"/>
      <var name="goodbye"/>
      <block>
        This is the sub dialog.
        hello equals <value expr="hello"/>
        goodbye equals <value expr="goodbye"/>
        <return namelist="hello goodbye"/>
      </block>
    </form>
  </vxml>
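The parameter and return rules described for <subdialog> can be modelled in a few lines of plain JavaScript. This is only a conceptual sketch of the data flow, not how the interpreter is implemented: the function invokeSubdialog and its arguments are hypothetical, params stands in for the <param> elements, varDefaults for the <var> declarations with their initializers, and returnNamelist for the namelist of <return>.

```javascript
// Conceptual model only: shows that <param> values override <var>
// initializers and that only the <return> namelist reaches the caller.
function invokeSubdialog(params, varDefaults, returnNamelist) {
  // A fresh object models the new execution context of the subdialog.
  var context = {};
  Object.keys(varDefaults).forEach(function (name) {
    var passed = Object.prototype.hasOwnProperty.call(params, name);
    // A value passed with <param> wins over the <var> initializer.
    context[name] = passed ? params[name] : varDefaults[name];
  });
  // Only variables listed in the <return> namelist are passed back; they
  // become properties of the <subdialog> input variable.
  var result = {};
  returnNamelist.split(/\s+/).forEach(function (name) {
    if (name.length > 0) {
      result[name] = context[name];
    }
  });
  return result;
}

// Mirrors the example: both parameters are passed the string "goodbye".
var subdialogResult = invokeSubdialog(
  { hello: "goodbye", goodbye: "goodbye" },
  { hello: "ignored initializer", goodbye: undefined },
  "hello goodbye"
);
```

In this model, subdialogResult.hello corresponds to result.hello in the calling form of the example.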
<submit>
Submits values to a document server. The <submit> element is used to submit information to the origin web server and then transition to the document sent back in the response.

Syntax

  <submit
    next="URI"
    expr="js_expression"
    method="get"|"post"
    enctype=MIME_type
    namelist="variable1 ..."
    fetchtimeout="time_interval"
    fetchaudio="URI"
    maxage="time_interval"
    maxstale="time_interval"
  />

Attribute    Description

next
  URI to which to submit the values. Optional (as alternative to expr).

expr
  JavaScript expression that evaluates to the URI to which to submit the values. Optional (as alternative to next).

method
  The query request method. Optional (default is get).

enctype
  MIME encoding of the submitted document. Optional (default is application/x-www-form-urlencoded). The supported types are:
    application/x-www-form-urlencoded
    multipart/form-data
  The type multipart/form-data is more efficient when submitting large amounts of binary data.

namelist
  Space-separated list of variables to submit. Optional (default is to submit all input variables that have been given explicit names with the name attribute of <field>, <record>, <transfer>, or <subdialog>). This attribute can specify any variable currently in scope, both VoiceXML variables and JavaScript variables, including shadow variables and other variables that have not been explicitly declared. A variable set to a JavaScript object is submitted as the individual component values; see Submitting Complex JavaScript Objects on page 46.

fetchtimeout
  Specifies the interval to wait for the resource to be returned before throwing an error.badfetch event. See Handling Fetching Delays on page 42. Optional.

fetchaudio
  Specifies the URI of background audio to be played during fetching. Note that this attribute and related properties affect whether queued prompts are played first. See Background Audio on page 42 for important details. Optional.

maxage
  New in VoiceXML 2.0. Specifies the maximum acceptable age, in seconds, of the cached resource. See Maximum Age on page 44. Optional.

maxstale
  New in VoiceXML 2.0. Specifies the maximum acceptable time, in seconds, during which an expired cached resource can still be used. See Maximum Stale Time on page 44. Optional.

(VoiceXML 1.0 only) The following attributes can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.

Attribute    Description

caching
  VoiceXML 1.0 only. Specifies the caching policy for the resource being fetched. See Fetch Hints on page 41. Optional. Used in place of the VoiceXML 2.0 attributes maxage and maxstale.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

  Parents: <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
  Children: None.

See Also

  VoiceXML 2.0 Specification: <submit>
  Related tags: <goto>, <send>
Examples

  <?xml version="1.0" ?>
  <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
    "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
  <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
    <form id="form-record">
      <field name="name" type="boolean">
        <prompt>
          Say yes or no to submit a request to Snoop Servlet
        </prompt>
        <filled>
          <if cond="name">
            <submit maxage="0" method="post"
              next="http://www.yoursite.com:8080/docroot/SomeServlet"/>
          <else/>
            <prompt> Good bye! </prompt>
            <exit/>
          </if>
        </filled>
      </field>
    </form>
  </vxml>
<tag>
New in VoiceXML 2.0. XML grammar element that specifies how to interpret the user input.

Syntax

  <tag>
    slotName="value"
  </tag>

Description

This tag is used to define grammars in the XML form of the W3C Speech Recognition Grammar Format.

In general, a tag is an arbitrary string that may be included within any rule expansion. You can include as many tags as you want within a single expansion. Tags do not affect what constitutes a legal utterance for a rule, nor do they affect how the recognition proceeds.

You use tags to return information (a semantic interpretation) about a recognition to the element that invoked the grammar. Upon successful recognition, the BeVocal VoiceXML interpreter creates a JavaScript object whose properties (slotName) and values (value) are determined by the tags occurring in the matched rule. If you want the BeVocal VoiceXML interpreter to make use of your tags for this purpose, they must be of the format specified above. See Chapter 1, Using VoiceXML Grammars in the Grammar Reference for information on how the interpreter will use the semantic interpretation.

Usage

  Parents: <rule> <item>
  Children: None.

Example

The following grammar sets two different tags:

  <grammar ...>
    <rule id="coloredObject">
      <ruleref uri="#color"/>
      <ruleref uri="#object"/>
    </rule>
    <rule id="color">
      <one-of>
        <item> red    <tag> color="red"    </tag> </item>
        <item> pink   <tag> color="red"    </tag> </item>
        <item> yellow <tag> color="yellow" </tag> </item>
        <item> canary <tag> color="yellow" </tag> </item>
        <item> green  <tag> color="green"  </tag> </item>
        <item> khaki  <tag> color="green"  </tag> </item>
      </one-of>
    </rule>
    <rule id="object">
      <one-of>
        <item>
          <tag> object="vehicle" </tag>
          <one-of>
            <item>truck</item>
            <item>car</item>
          </one-of>
        </item>
        <item>
          <tag> object="toy" </tag>
          <one-of>
            <item>ball</item>
            <item>block</item>
          </one-of>
        </item>
        <item>
          <tag> object="clothing" </tag>
          <one-of>
            <item>shirt</item>
            <item>blouse</item>
          </one-of>
        </item>
      </one-of>
    </rule>
  </grammar>

This grammar recognizes phrases such as "green truck" or "khaki car". For both of those phrases, it will return the same semantic interpretation:

  { color: green; object: vehicle; }

See Also

  Speech Recognition Grammar Specification: <tag>
  Chapter 1, Using VoiceXML Grammars and Chapter 4, XML Speech Grammar Format in the Grammar Reference
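The way several spoken words collapse onto the same slot values can be imitated in plain JavaScript. The sketch below is a simplified stand-in for the example grammar, not a description of the recognizer: the lookup tables copy the tag values from the grammar, and the function name interpretColoredObject is hypothetical.

```javascript
// Slot values copied from the <tag> elements of the example grammar.
var colorSlots = {
  red: "red", pink: "red",
  yellow: "yellow", canary: "yellow",
  green: "green", khaki: "green"
};
var objectSlots = {
  truck: "vehicle", car: "vehicle",
  ball: "toy", block: "toy",
  shirt: "clothing", blouse: "clothing"
};

// Hypothetical stand-in for recognition: returns the interpretation
// object for a "<color> <object>" phrase, or null if nothing matches.
function interpretColoredObject(utterance) {
  var words = utterance.trim().toLowerCase().split(/\s+/);
  var has = Object.prototype.hasOwnProperty;
  if (words.length !== 2 ||
      !has.call(colorSlots, words[0]) ||
      !has.call(objectSlots, words[1])) {
    return null;
  }
  return { color: colorSlots[words[0]], object: objectSlots[words[1]] };
}
```

Both "green truck" and "khaki car" produce an object with color set to green and object set to vehicle, matching the interpretation described above.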
<throw>
Throws an event.

Syntax

  <throw
    event="event"
    eventexpr="js_expression"
    message="string"
    messageexpr="js_expression"
  />

Description

You can throw either a predefined event or an application-specific event. For more information, see Chapter 3, Event Handling.

Attribute    Description

event
  Event to throw. Optional (as alternative to eventexpr).

eventexpr
  New in VoiceXML 2.0. JavaScript expression that evaluates to the event to throw. Optional (as alternative to event).

message
  New in VoiceXML 2.0. Message string providing additional context about the event being thrown. Optional.

messageexpr
  New in VoiceXML 2.0. JavaScript expression that evaluates to the message string. Optional.

Exactly one of the attributes event or eventexpr must be specified. At most one of the attributes message or messageexpr may be specified. Within an event handler that catches the thrown event, the variable _message contains the message string.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

  Parents: <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
  Children: None.

See Also

  VoiceXML 2.0 Specification: <throw>
  Related tags: <catch>, <error>, <help>, <noinput>, <nomatch>, <rethrow>
Examples

See other catch and throw examples under <catch> and <rethrow>.

  <?xml version="1.0" ?>
  <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
    "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
  <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
    <catch event="myEvent">
      <prompt>
        Catch of my event at document scope.
        But here the value of my number has gone away.
        It sounds like <value expr="mynumber"/>
      </prompt>
      <reprompt/>
    </catch>
    <catch event="nomatch noinput" >
      <prompt> Catch at document scope. </prompt>
      <reprompt/>
    </catch>
    <property name="universals" value="help" />
    <help>
      <prompt> Catch for help at document scope. </prompt>
      <reprompt/>
    </help>
    <form id="numbers">
      <catch event="myEvent">
        <prompt>
          Catch of my event at form scope.
          <value expr="_message"/>
          The value of my number has not gone away.
          It sounds like <value expr="mynumber"/>
        </prompt>
        <clear/>
        <rethrow/>
      </catch>
      <block name="numbergame">
        This is a test of the throw tag.
      </block>
      <field name="mynumber" type="number">
        <prompt> Tell me a number greater than ten. </prompt>
        <filled>
          <if cond="mynumber &lt; 10">
            <throw event="myEvent"
              messageexpr="mynumber + ' is less than ten'"/>
          <elseif cond="mynumber &gt; 10" />
            <prompt> You said <value expr="mynumber"/> </prompt>
          <else/>
            <throw event="myEvent"
              message="Ten is not greater than ten"/>
          </if>
        </filled>
      </field>
    </form>
  </vxml>
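The branching in the <filled> element of this example, including the string concatenation used for messageexpr, can be checked as ordinary JavaScript. The following sketch is illustrative only: the function name handleNumber and the shape of the returned objects are assumptions made for this sketch, standing in for the <throw> and <prompt> actions.

```javascript
// Hypothetical stand-in for the <filled> logic of the example. Returns an
// object describing which action the document would take.
function handleNumber(mynumber) {
  if (mynumber < 10) {
    // Same expression as the messageexpr attribute: number + string literal.
    return { event: "myEvent", message: mynumber + " is less than ten" };
  } else if (mynumber > 10) {
    return { prompt: "You said " + mynumber };
  }
  // Exactly ten: the example throws with a fixed message attribute.
  return { event: "myEvent", message: "Ten is not greater than ten" };
}
```

A catch handler for myEvent would see the message text through the _message variable, as the form-scope handler in the example does.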
<token>
New in VoiceXML 2.0. XML grammar input element that specifies actual words to be spoken or DTMF keys to be pressed.

Syntax

  <token xml:lang="lang">
    Content
  </token>

Description

This tag is used to define grammars in the XML form of the W3C Speech Recognition Grammar Format.

In a voice grammar, a token consists of CDATA and cannot be enclosed in double quotes. In a DTMF grammar, a token must be one of the following characters:

  0 1 2 3 4 5 6 7 8 9 * # A B C D

Attribute    Description

xml:lang
  The language and optional country locale identifier for the token. Optional (default is the language of the enclosing element). The accepted language identifiers are:
    en: English
    en-US: United States English
    es: Spanish
    es-US: United States Spanish
    fr-ca: French Canadian
  If an unsupported language is specified, an error.unsupported.language event is thrown.

Usage

  Parents: <rule> <item>
  Children: None.

See Also

  Speech Recognition Grammar Specification: <token>
  Chapter 4, XML Speech Grammar Format in the Grammar Reference

Example

  <rule id="trigger">
    <ruleref special="GARBAGE"/>
    <token>Trigger Fish</token>
  </rule>
  <rule id="affirmative">
    <token>yes</token>
    <token xml:lang="es">si</token>
  </rule>
<transfer>
Transfers the user's call to a third party at another destination.

Syntax

  <transfer
    name="string"
    expr="js_expression"
    cond="js_expression"
    dest="URI"
    destexpr="js_expression"
    connecttimeout="time_interval"
    maxtime="time_interval"
    bevocal:maxtimeexpr="js_expression"
    bridge="boolean"
    bevocal:type="blind"|"bridge"|"supervised"
    type="blind"|"bridge"|"consultation"
    transferaudio="URI"
    aai="string"
    aaiexpr="js_expression"
  >
    Child Elements
  </transfer>

Description

Input item for transferring the user to another phone number. The transfer may be done in one of three ways:

Transfer Method    Description

Bridge transfer
  The current session with the interpreter resumes after the call with the third party completes.

Blind transfer
  The current session terminates as soon as it starts the transfer, regardless of the success of the transfer.

Supervised or consultation transfer
  The current session terminates as soon as the transfer successfully connects the outbound call. If the transfer is unsuccessful, control returns to the application.

Note: To use blind or supervised transfers, contact BeVocal Customer Support.

Only one outbound call can be in progress at a given time. The call placed by one <transfer> or <bevocal:dial> tag must be terminated before another <transfer> or <bevocal:dial> tag can be executed.

Café customers can use a <transfer> tag to place local and long-distance calls only; international calls are not allowed. (Hosting customers are allowed to make international calls.)

During execution of the <transfer> element, the user and the called third party talk to each other. The application is quiet. No universal grammars or grammars in higher scopes are active.

During a bridge transfer, if the <transfer> element includes any child grammars, the application listens to the user. If a user utterance matches a child grammar, the transfer is terminated and the input variable is set to the status near_end_disconnect. The bevocal.hotwordmin and bevocal.hotwordmax properties specify the minimum and maximum time duration, respectively, for an utterance that can terminate the call.

During execution of the <transfer> element, the bargein type is always hotword. That is, the interpreter ignores the value of the bargeintype property and the value of the bargeintype attribute of any enclosed <prompt> element.

The application ignores all speech and DTMF signals from the called third party.

Attribute    Description

name
  Name of the input variable used to store the result of the transfer. The variable name may not be a JavaScript reserved keyword. The input variable has dialog (form) scope; its name must be unique among all VoiceXML and JavaScript variables within the form's scope. Possible values for the variable are:
    busy: The call was refused by the endpoint (the number was busy).
    noanswer: There was no answer within the time allowed for making the connection.
    network_busy: The call was refused by an intermediate network.
    near_end_disconnect: The call completed because the user hung up or said something that matched a child grammar.
    far_end_disconnect: The call completed because the called third party hung up.
    network_disconnect: The call completed and was terminated by the network.
    maxtime_disconnect: The call was terminated because the maximum time (specified by the maxtime attribute) was exceeded.

expr
  JavaScript expression that assigns the initial value of the input variable. Optional (default is undefined). If you set the input variable to a value other than undefined, you'll need to clear it before the <transfer> can execute.

cond
  JavaScript boolean expression that also must evaluate to true for the <transfer> to execute. Optional (default is true). If not specified, the value of the input variable alone determines whether or not the <transfer> can execute.
dest
    URI of the destination (for example, phone, IP telephony address). Optional (as alternative to destexpr). You can specify the URI using any of the following formats:
      phone://8005551212
      800-555-1212
      phone://800-555-1212
      tel:800-555-1212
      tel:800-555-1212;postd=1234
          This format allows you to specify an extension as post-dial digits (postd). After the call is answered, the specified digits (1234 in this example) are sent to the called third party as DTMF.
      sip:+16506414924@fwd.pulver.com
          SIP URIs are specified as sip:<destination number>@<domain value>:<port>, where the default port is 5060. SIP URIs are valid only for VoIP calls.
    Note: A leading 1 on the phone number is optional and will be ignored.
    Note: You must specify one of dest or destexpr, but not both.

destexpr
    JavaScript expression that evaluates to the URI of the destination. Optional (as alternative to dest).
    Note: You must specify one of dest or destexpr, but not both.

connecttimeout
    Time to wait for the outbound call to connect before returning a value of noanswer. Optional (default is 20 seconds). Express the time interval as an unsigned number followed by s for time in seconds; ms for time in milliseconds (the default).

maxtime
    How long the outbound call is allowed to last. Optional:
      The default for Café customers is 60 seconds. This is an absolute maximum. If the time specified with this attribute is longer than 60 seconds, the interpreter sets the limit to 60 seconds.
      The default for hosting customers is 0, signifying no limit.
    Express a time interval as an unsigned number followed by s for time in seconds; ms for time in milliseconds (the default).

bevocal:maxtimeexpr
    Extension. A JavaScript expression that resolves to the maxtime value. Optional. As with maxtime:
      The default for Café customers is 60 seconds. This is an absolute maximum. If the time specified with this attribute is longer than 60 seconds, the interpreter sets the limit to 60 seconds.
      The default for hosting customers is 0, signifying no limit.
bridge
    Determines whether the current session will resume after the transferred call completes. For more flexibility, you can instead specify a value for the bevocal:type attribute. Optional. The default value is false. However, to use blind transfers, you must make special arrangements with BeVocal Customer Support.
    If you set this attribute to false and arrangements have not been made to support blind transfers, an error.unsupported.transfer.blind event is generated.
    If you set this attribute to false and arrangements have been made to support blind transfers, the current session terminates by throwing a transfer event (connection.disconnect.transfer in VoiceXML 2.0, telephone.disconnect.transfer in VoiceXML 1.0) when the transfer is made.

bevocal:type
    Extension. Determines how much control the platform retains over a transferred call. Optional; if not specified, the platform uses the value of the bridge attribute.
      If the value is blind, as soon as the transfer starts, the current VoiceXML session ends and relinquishes control to the outbound call. Even if the transfer does not connect the call, the session is over.
      If the value is bridge, then the current VoiceXML session remains active. At the end of the outbound call, the session returns control to the application to resume processing.
      If the value is supervised, an intermediate path occurs. The current VoiceXML session monitors the progress of the outbound call until it is connected. If the call cannot be connected for some reason such as no answer or line busy, the session remains active and returns control to the application. If the call is connected, then the session ends, just as for a blind transfer.
    You cannot specify both the bridge and bevocal:type attributes. You can specify either bevocal:type or type, but not both. If you specify both, a parse error is thrown.
    Note: To use blind or supervised transfers, please contact BeVocal Customer Support.
type
    Determines how much control the platform retains over a transferred call. Optional. The default is bridge.
      If the value is blind, as soon as the transfer starts, the current VoiceXML session ends and relinquishes control to the outbound call. Even if the transfer does not connect the call, the session is over.
      If the value is bridge, then the current VoiceXML session remains active. At the end of the outbound call, the session returns control to the application to resume processing.
      If the value is consultation, an intermediate path occurs. The current VoiceXML session monitors the progress of the outbound call until it is connected. If the call cannot be connected for some reason such as no answer or line busy, the session remains active and returns control to the application. If the call is connected, then the session ends, just as for a blind transfer.
    You cannot specify both the bridge and type attributes. You can specify either bevocal:type or type, but not both. If you specify both, a parse error is thrown.
    Note: To use blind or consultation transfers, please contact BeVocal Customer Support.

transferaudio
    The audio specified by the URI is played while the call connection attempt is in progress.
    Note: This attribute is available only for VoIP calls.

aai
    Application-to-application information. A string containing data sent to an application on the far end, available in the session variable session.connection.aai. The transmission of aai data may depend upon signaling network gateways and data translation (for example, ISDN to SIP); the status of data sent to a remote site is not known or reported. On the BeVocal platform, on ISDN calls, the value transmitted is UUI (User-to-User Information).

aaiexpr
    A JavaScript expression yielding the AAI data.

Exactly one of dest or destexpr must be specified.
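The attributes above can be combined as in the following minimal sketch of a bridge transfer. The form id callSupport, the variable supportNumber, the item name support, and the prompt wording are hypothetical names chosen for illustration; the destination format and the status values are the ones listed in the attribute table.

```xml
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="callSupport">
    <!-- Hypothetical variable holding the destination URI. -->
    <var name="supportNumber" expr="'tel:800-555-1212'"/>

    <!-- destexpr is used instead of dest; only one of the two is given.
         Time values carry an explicit unit to avoid the ms default. -->
    <transfer name="support" destexpr="supportNumber" type="bridge"
              connecttimeout="20s" maxtime="60s">
      <prompt>Connecting you now.</prompt>
      <filled>
        <!-- After a bridge transfer the session resumes here. -->
        <if cond="support == 'busy'">
          <prompt>The line is busy.</prompt>
        <elseif cond="support == 'noanswer'"/>
          <prompt>Nobody answered.</prompt>
        <else/>
          <prompt>Welcome back.</prompt>
        </if>
      </filled>
    </transfer>
  </form>
</vxml>
```

A bridge transfer is used here because blind, supervised, and consultation transfers require arrangements with BeVocal Customer Support.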
Properties of the Shadow Variable. Corresponding to the input variable name is a shadow variable called name$. After the input variable is filled, some additional information is available in the following properties of this shadow variable:

Property
    Description

utterance
    A string representation of the words actually spoken by the user.

inputmode
    The mode in which input was provided, one of voice or dtmf.

duration
    After the transfer is complete, the floating point duration of the call in milliseconds.

recording
    If the recordutterance or bevocal.audio.capture property is set to true and the user's speech matched the field grammar, the recording property contains an audio capture of the user's speech. If no audio is collected, this variable is undefined. You can send the captured audio to a server using a <data> element. Doing so is useful if you need a record of the user's speech for legal reasons.

recordingsize
    The size of the recording in bytes, or undefined if no audio is collected.

recordduration
    The duration of the recording in milliseconds, or undefined if no audio is collected.

For an input item whose name is name, you access the property propName of the shadow variable with the syntax:

    name$.propName

For example, you access the duration property for a <transfer> item named services as:

    services$.duration
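As an illustration of the shadow variable, the sketch below reads services$.duration after a bridge transfer returns. It assumes, as stated in the property table, that duration is reported in milliseconds; the prompt wording and the rounding are illustrative choices, not part of the platform.

```xml
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="foo">
    <transfer name="services" bridge="true" dest="phone://14088502255">
      <filled>
        <!-- services holds the status; services$ holds the extra details. -->
        <log>Transfer status: <value expr="services"/></log>
        <prompt>
          Your call lasted about
          <value expr="Math.round(services$.duration / 1000)"/>
          seconds.
        </prompt>
      </filled>
    </transfer>
  </form>
</vxml>
```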
Events. The following events may be thrown during the execution of a <transfer> element:

Event
    Description

connection.disconnect.hangup
    The user hung up. New in VoiceXML 2.0.

connection.disconnect.transfer
    The user was transferred unconditionally to another line and will not return. New in VoiceXML 2.0.

telephone.disconnect.hangup
    The user hung up. VoiceXML 1.0 only.

telephone.disconnect.transfer
    The user was transferred unconditionally to another line and will not return. VoiceXML 1.0 only.

error.badfetch
    The application specified a value for both bridge and bevocal:type.

error.connection.baddestination
    The destination URI specified by dest or destexpr is not valid. New in VoiceXML 2.0.

error.telephone.baddestination
    The destination URI specified by dest or destexpr is not valid. VoiceXML 1.0 only.

error.connection.noauthorization
    A Café customer attempted to make an international call. (The outbound call is not placed.) New in VoiceXML 2.0.

error.telephone.noauthorization
    A Café customer attempted to make an international call. (The outbound call is not placed.) VoiceXML 1.0 only.

error.connection.noresource
    Another outbound call is already in progress. (A new outbound call is not placed.) New in VoiceXML 2.0.

error.telephone.noresource
    Another outbound call is already in progress. (A new outbound call is not placed.) VoiceXML 1.0 only.
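A sketch of how an application might handle some of these events with <catch> children of the <transfer> element is shown below. The form id, item name, destination, and prompt wording are hypothetical; only the VoiceXML 2.0 event names are taken from the table.

```xml
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="guardedTransfer">
    <transfer name="call" dest="tel:800-555-1212" bridge="true">
      <!-- The destination URI was rejected as invalid. -->
      <catch event="error.connection.baddestination">
        <prompt>Sorry, that number cannot be dialed.</prompt>
      </catch>
      <!-- Another outbound call is still in progress. -->
      <catch event="error.connection.noresource">
        <prompt>Sorry, the line is in use. Please try again later.</prompt>
      </catch>
      <!-- The user hung up; nothing more can be played to the caller. -->
      <catch event="connection.disconnect.hangup">
        <log>Caller hung up during the transfer.</log>
        <exit/>
      </catch>
    </transfer>
  </form>
</vxml>
```

An application written for version 1.0 would presumably catch the corresponding telephone.* and error.telephone.* events instead.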
Usage

Parents
    <form>

Children
    <audio> <catch> <enumerate> <error> <filled> <grammar> <help> <noinput> <nomatch> <prompt> <value>

See Also

    VoiceXML 2.0 Specification: <transfer>
    Related tags: <bevocal:dial>, <disconnect>

Examples
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="foo">
    <block>
      Now taking you to BeVocal Consumer Services.
    </block>
    <transfer name="services" bridge="true" connecttimeout="300"
              dest="phone://14088502255" />
    <block>
      Welcome back to Cafe!
    </block>
  </form>
</vxml>
<value>
Inserts the value of an expression into audio output.

Syntax

<value expr="js_expression" />

Description

Attribute
    Description

expr
    JavaScript expression to evaluate and insert in the prompt.
    Note: In VoiceXML 2.0, the recorded audio stored in the input variable of a <record> item can now be played with <audio expr="...">. The expr attribute of <value expr="..."> is deprecated for playing recorded audio, and in the next release will no longer be able to play it.

(VoiceXML 1.0 only) The following attributes can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.

Attribute
    Description

class, type
    VoiceXML 1.0 only. The <say-as> type of the variable for interpretive purposes. Optional. The type attribute is the VoiceXML 1.0 standard. These two attributes are identical; only one of them should be used.
    Note: A VoiceXML 2.0 application should not use either of these attributes, but should instead enclose the <value> tag within a <say-as> element. There is no implicit <say-as> type invoked when you use <value>. See <say-as> for details.

The following event may be thrown during the execution of a <value> element:

    error.badfetch: Thrown in VoiceXML 2.0 when any of the following attributes are present: class, mode, or recsrc. These attributes are not valid for version 2.0.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
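The escaping rule in the tip can be illustrated with a short sketch. The variables a and b and the prompt wording are hypothetical; the point is only that a less-than comparison inside an expr or cond attribute is written with the &lt; escape sequence.

```xml
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="smaller">
    <var name="a" expr="3"/>
    <var name="b" expr="7"/>
    <block>
      <!-- The JavaScript expression a < b is written as a &lt; b. -->
      <if cond="a &lt; b">
        <prompt>
          The smaller number is <value expr="a &lt; b ? a : b"/>.
        </prompt>
      </if>
    </block>
  </form>
</vxml>
```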
Usage

Parents
    <audio> <bevocal:foreach> <bevocal:listen> <bevocal:register> <bevocal:verify> <bevocal:whisper> <block> <catch> <choice> <emphasis> <enumerate> <error> <field> <filled> <help> <if> <initial> <log> <menu> <noinput> <nomatch> <p> <paragraph> <prompt> <prosody> <record> <s> <say-as> <sentence> <subdialog> <transfer> <voice>

Children
    None.

See Also

    VoiceXML 2.0 Specification: <value>
    Related tags: <assign>, <var>, <field>, <record>, <subdialog>, <transfer>, <block>, <initial>, <say-as>
Examples

Example 1: with input variable

<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <field name="state">
      <prompt> Do you want texas or california? </prompt>
      <grammar type="application/x-nuance-gsl">
        [ texas california ]
      </grammar>
    </field>
    <block>
      <prompt> Your answer was <value expr="state"/> </prompt>
    </block>
  </form>
</vxml>

Example 2: using assignment

<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <property name="universals" value="help" />
  <form id="myCalculator">
    <var name="result"/>
    <field name="op">
      <prompt>
        BeVocal calculator. Choose add, subtract, multiply, or divide.
      </prompt>
      <grammar type="application/x-nuance-gsl">
        [add subtract multiply divide]
      </grammar>
      <help>Say add, subtract, multiply, or divide.</help>
      <filled>
        <prompt>Okay, let's <value expr="op"/> two numbers.</prompt>
      </filled>
    </field>
    <field name="a" type="number">
      <prompt>What's the first number?</prompt>
      <help>
        Please say a number. This number will be used as the first operand.
      </help>
      <filled>
        <prompt><value expr="a"/></prompt>
      </filled>
    </field>
    <field name="b" type="number">
      <prompt>and the second number?</prompt>
      <help>
        Please say a number. This number will be used as the second operand.
      </help>
      <filled>
        <prompt><value expr="b"/> Okay.</prompt>
        <!-- op holds a string, so it is compared against quoted literals. -->
        <if cond="op=='add'">
          <assign name="result" expr="Number(a) + Number(b)"/>
          <prompt>
            <value expr="a"/> plus <value expr="b"/> equals <value expr="result"/>
          </prompt>
        <elseif cond="op=='subtract'"/>
          <assign name="result" expr="a - b"/>
          <prompt>
            <value expr="a"/> minus <value expr="b"/> equals <value expr="result"/>
          </prompt>
        <elseif cond="op=='multiply'"/>
          <assign name="result" expr="a * b"/>
          <prompt>
            <value expr="a"/> times <value expr="b"/> equals <value expr="result"/>
          </prompt>
        <elseif cond="op=='divide'"/>
          <assign name="result" expr="a / b"/>
          <prompt>
            <value expr="a"/> divided by <value expr="b"/> equals <value expr="result"/>
          </prompt>
        </if>
        <clear/>
      </filled>
    </field>
  </form>
</vxml>
<var>
Declares a variable.

Syntax

<var name="string" expr="js_expression" />

Description

Declaration of a variable in the scope of the enclosing element.

Attribute
    Description

name
    Variable name, which must be a valid JavaScript identifier that does not begin with the underscore character (_) or end with the dollar-sign character ($); it may not be a reserved keyword in either JavaScript or Java.

expr
    JavaScript expression that assigns the initial value of the variable. Optional (default is either undefined or the current value of the variable, if already declared in this scope).

If a <var> element names a variable that is already in scope, it declares a new variable with the same name. If the <var> element has an expr attribute, the variable is assigned the specified value; otherwise, the variable is assigned the value undefined.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
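The following sketch declares variables at three levels (document, form, and block) to show how the enclosing element determines scope. All variable names, the form id, and the prompt wording are hypothetical and chosen only for illustration.

```xml
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Declared under <vxml>: visible to every dialog in this document. -->
  <var name="greeting" expr="'Hello'"/>

  <form id="scopes">
    <!-- Declared under <form>: visible only inside this form. -->
    <var name="total" expr="2 + 3"/>

    <block>
      <!-- Declared without expr, so it starts out undefined. -->
      <var name="note"/>
      <assign name="note" expr="'five'"/>
      <prompt>
        <value expr="greeting"/>. The total is <value expr="total"/>,
        also known as <value expr="note"/>.
      </prompt>
    </block>
  </form>
</vxml>
```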
Usage

Parents
    <bevocal:foreach> <block> <catch> <error> <filled> <form> <help> <if> <noinput> <nomatch> <vxml>

Children
    None.

See Also

    VoiceXML 2.0 Specification: <var>
    Related tags: <assign>, <value>
Examples

<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="foo">
    <var name="a"/>
    <var name="b"/>
    <var name="result"/>
    <block>
      <assign name="a" expr="10"/>
      <assign name="b" expr="35"/>
      <assign name="result" expr="a * b"/>
    </block>
    <block>
      <prompt>
        <value expr="a"/> multiplied by <value expr="b"/> equals <value expr="result"/>
      </prompt>
    </block>
  </form>
</vxml>
<voice>
New in VoiceXML 2.0. Speech Synthesis Markup Language element that requests a change in speaking voice.

Syntax

<voice
  gender="male"|"female"|"neutral"
  age="integer"
  variant="integer"
  name="string"
  xml:lang="string" >
  Text
</voice>

Description

Attributes describe the preferred characteristics of the voice to speak the contained text.

Attribute
    Description

gender
    The preferred gender. Optional. Possible values are:
      male
      female
      neutral

age
    Not Implemented. The preferred age. Optional.

variant
    Not Implemented. A preferred variant of the other voice characteristics (for example, the second or next male child voice). Optional.

name
    A platform-specific voice name to speak the contained text or a space-separated list of names ordered by decreasing preference. Currently limited to one of the following:
      jennifer: female, American English
      laurie: female, Vocalizer English
      katarina: female, RealSpeak German
      maria: female, Vocalizer Spanish
      mark: male, American English
      reed: male, Vocalizer American English
    Optional (default is current value of the bevocal.voice.name property).

xml:lang
    New in VoiceXML 2.0. Language and locale for the content. Optional (default is en-US).

Errors

An error.noresource event is thrown when an invalid TTS voice is specified in the name attribute.
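A minimal sketch of switching voices inside a prompt follows. The voice name mark is taken from the list above; the form id and the prompt wording are hypothetical, and the <catch> handler is only one possible way to react to an invalid voice name.

```xml
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="voices">
    <!-- Thrown if the name attribute does not identify a valid TTS voice. -->
    <catch event="error.noresource">
      <log>The requested voice is not available.</log>
    </catch>
    <block>
      <prompt>
        This sentence uses the current default voice.
        <voice name="mark">
          This sentence is spoken by the voice named mark.
        </voice>
        <voice gender="female">
          This sentence asks for any female voice.
        </voice>
      </prompt>
    </block>
  </form>
</vxml>
```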
Usage

Parents
    <audio> <bevocal:whisper> <choice> <emphasis> <enumerate> <p> <paragraph> <prompt> <prosody> <s> <sentence> <voice>

Children
    <audio> <break> <emphasis> <enumerate> <mark> <p> <paragraph> <phoneme> <prosody> <s> <say-as> <sentence> <value> <voice>

See Also

    VoiceXML 2.0 Specification: <voice>
    Related tags: <break>, <emphasis>, <mark>, <paragraph>, <phoneme>, <prosody>, <say-as>, <sentence>
<vxml>
Contains the VoiceXML code of a document.

Syntax

<vxml
  application="URI"
  xml:base="URI"
  xml:lang="lang"
  version="version_number"
  xmlns="namespace"
  xmlns:bevocal="bvnamespace" >
  Child Elements
</vxml>

Description

Top-level element in each VoiceXML document. Note that in VoiceXML 2.0 the xmlns attribute has been added as a required attribute.

Attribute
    Description

application
    URI of this application's root document. Optional (default is not to have an application root document). If the specified document also has an application attribute, an error.semantic event is thrown.

xml:base
    Base URI. Optional.

xml:lang
    New in VoiceXML 2.0. Language and locale for this document. Optional. The language identifier for this document. A language identifier labels information content as being of a particular human language variant. A legal language identifier is identified by an RFC 3066 code. BeVocal currently supports English, Spanish, and French Canadian. Valid values are:
      en: English
      en-US: United States English
      es: Spanish
      es-US: United States Spanish
      fr-ca: French Canadian
    The default value is en-US. If an unsupported language is specified, an error.unsupported.language event is thrown. If your application consists of multiple documents and is using a language other than the default language, each document must specify the xml:lang attribute.

version
    VoiceXML version used in this document. The version number must be either 1.0 or 2.0.

xmlns
    New in VoiceXML 2.0. The designated namespace for VoiceXML 2.0. The namespace for VoiceXML is defined as http://www.w3.org/2001/vxml; this must be the value of this attribute. Required.

xmlns:bevocal
    Extension. Defines the XML namespace bevocal. Optional (default is not to define the namespace). The only valid value for this attribute is http://www.bevocal.com/. You must specify this attribute if this document includes any BeVocal VoiceXML extension tags in the bevocal namespace, such as <bevocal:dial>.
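As a sketch of the attributes described above, the root element below declares both namespaces and selects United States Spanish. The form id and the prompt text are hypothetical; the namespace URIs and the language code are the values given in the attribute table.

```xml
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<!-- xmlns is required in version 2.0. xmlns:bevocal is needed only when
     the document uses extension tags such as <bevocal:dial>. -->
<vxml version="2.0"
      xmlns="http://www.w3.org/2001/vxml"
      xmlns:bevocal="http://www.bevocal.com/"
      xml:lang="es-US">
  <form id="bienvenida">
    <block>
      <prompt>Bienvenido.</prompt>
    </block>
  </form>
</vxml>
```

Because es-US is not the default language, every document in a multi-document application of this kind would need to repeat the xml:lang attribute.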
(VoiceXML 1.0 only) The following attributes can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.

Attribute
    Description

lang
    VoiceXML 1.0 only. Language and locale for this document. Optional. This attribute is ignored. Used in place of the VoiceXML 2.0 xml:lang attribute.

base
    VoiceXML 1.0 only. Base URI. Optional.

Usage

Parents
    None.

Children
    <catch> <data> <error> <form> <help> <link> <menu> <meta> <metadata> <noinput> <nomatch> <property> <script> <var>

See Also

    VoiceXML 2.0 Specification: <vxml>

Examples

<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="foo">
    <block>
      <prompt>
        Hello Developers! You are the VXML Gurus.
        Please keep using our services.
      </prompt>
    </block>
  </form>
</vxml>
12
Properties
This chapter describes the VoiceXML properties that can be set with the <property> tag. See:

    Property Summary on page 340 for an overview of the properties, grouped by function
    Property Index on page 341 for an alphabetical list of all properties

Each property description explains what the property does, what possible values the property may have, and what default value is used if the application does not set the property.

In the cases where the BeVocal VoiceXML interpreter deviates from the VoiceXML 2.0 Specification, the difference is clearly marked below in the following ways:

    Not implemented: Functionality not currently available.
    Extension: Added functionality.
    Deprecated: Non-standard or superseded feature that was supported by an earlier version but has been replaced by a new feature.
    VoiceXML 1.0 only: Property is part of the VoiceXML 1.0 standard, but has been removed from VoiceXML 2.0.
PROPERTIES
Property Summary
Properties control the following facets of the interpreter.

Facet: User Input
    bargein
    bargeintype (New in VoiceXML 2.0)
    bevocal.hotwordmax (Extension)
    bevocal.hotwordmin (Extension)
    bevocal.incrementErrorOnNSP (Extension)
    inputmodes
    timeout

Facet: Speech Recognition
    bevocal.audio.capture (Extension)
    bevocal.audio.outputvolume (Extension)
    bevocal.finaltimeout (Extension)
    bevocal.grammar.interpretationtype (Extension)
    bevocal.grammar.phoneticpruning (Extension)
    bevocal.grammar.weightfactor (Extension)
    bevocal.grammar.wordtransitionpenalty (Extension)
    bevocal.maxdialogerrors (Extension)
    bevocal.maxerrors (Extension)
    bevocal.maxinterpretations (Extension)
    bevocal.sounds.listening (Extension)
    bevocal.sounds.maskrecognitionlatency (Extension)
    bevocal.sounds.recognition (Extension)
    bevocal.utterance.prefix (Extension)
    bevocal.vxml.maxrecognitionlatency (Extension)
    completetimeout
    confidencelevel
    incompletetimeout
    maxnbest (New in VoiceXML 2.0)
    maxspeechtimeout
    recordutterance
    recordutterancetype
    sensitivity
    speedvsaccuracy

Facet: Text-to-Speech Output
    bevocal.voice.name (Extension)

Facet: DTMF Recognition
    bevocal.dtmf.flushbuffer (Extension)
    bevocal.transfer.terminatetones (Extension)
    interdigittimeout
    termchar
    termtimeout

Facet: Background Audio
    bevocal.fetchaudio.allfetches (Extension)
    bevocal.fetchaudio.extend (Extension)
    bevocal.fetchaudio.flushqueue (Extension)
    bevocal.fetchaudio.sounds (Extension)
    fetchaudio
    fetchaudiodelay (New in VoiceXML 2.0)
    fetchaudiominimum (New in VoiceXML 2.0)

Facet: Fetching Resources
    audiofetchhint
    audiomaxage (New in VoiceXML 2.0)
    audiomaxstale (New in VoiceXML 2.0)
    caching (VoiceXML 1.0 only)
    datafetchhint (Extension)
    datamaxage (Extension)
    datamaxstale (Extension)
    documentfetchhint
    documentmaxage (New in VoiceXML 2.0)
    documentmaxstale (New in VoiceXML 2.0)
    fetchtimeout
    grammarfetchhint
    grammarmaxage (New in VoiceXML 2.0)
    grammarmaxstale (New in VoiceXML 2.0)
    scriptfetchhint
    scriptmaxage (New in VoiceXML 2.0)
    scriptmaxstale (New in VoiceXML 2.0)
    ssmlfetchhint (Extension)
    ssmlmaxage (Extension)
    ssmlmaxstale (Extension)

Facet: Universal Grammars
    universals (New in VoiceXML 2.0)

Facet: Logging Calls
    bevocal.logging (Extension)
    bevocal.securelogging.enabled (Extension)
    bevocal.securelogging.key (Extension)

Facet: Go-Back Facility
    bevocal.goback (Extension)
    bevocal.mingoback (Extension)

Facet: Access to BeVocal Platform services
    bevocal.security.key (Extension)

Facet: Language/Locale
    bevocal.locale
Property Index
The following table lists the properties in alphabetical order.

Property                                Status                Controls
audiofetchhint                                                Fetching
audiomaxage                             New in VoiceXML 2.0   Fetching
audiomaxstale                           New in VoiceXML 2.0   Fetching
bargein                                                       User Input
bargeintype                             New in VoiceXML 2.0   User Input
bevocal.audio.capture                   Extension             Speech recognition
bevocal.audio.outputvolume              Extension             Speech recognition
bevocal.dtmf.flushbuffer                Extension             DTMF recognition
bevocal.fetchaudio.allfetches           Extension             Background audio
bevocal.fetchaudio.extend               Extension             Background audio
bevocal.fetchaudio.flushqueue           Extension             Background audio
bevocal.fetchaudio.sounds               Extension             Background audio
bevocal.finaltimeout                    Extension             Speech recognition
bevocal.grammar.interpretationtype      Extension             Speech recognition
bevocal.grammar.phoneticpruning         Extension             Speech recognition
bevocal.grammar.weightfactor            Extension             Speech recognition
bevocal.grammar.wordtransitionpenalty   Extension             Speech recognition
bevocal.goback                          Extension             Go-back facility
bevocal.hotwordmax                      Extension             User Input
bevocal.hotwordmin                      Extension             User Input
bevocal.incrementErrorOnNSP             Extension             User Input
bevocal.locale                                                Language/Locale
bevocal.logging                         Extension             Logging calls
bevocal.maxdialogerrors                 Extension             Speech errors
bevocal.maxerrors                       Extension             Speech errors
bevocal.maxinterpretations              Extension             Speech errors
bevocal.mingoback                       Extension             Go-back facility
bevocal.securelogging.enabled           Extension             Logging calls
bevocal.securelogging.key               Extension             Logging calls
bevocal.security.key                    Extension             Access to BeVocal Platform services
bevocal.sounds.listening                Extension             Speech recognition
bevocal.sounds.maskrecognitionlatency   Extension             Speech recognition
bevocal.sounds.recognition              Extension             Speech recognition
bevocal.transfer.terminatetones         Extension             DTMF recognition
bevocal.utterance.prefix                Extension             Speech recognition
bevocal.voice.name                      Extension             Text-to-speech output
bevocal.vxml.maxrecognitionlatency      Extension             Speech recognition
caching                                                       Fetching
completetimeout                                               Speech recognition
confidencelevel                                               Speech recognition
datafetchhint                           Extension             Fetching
datamaxage                              Extension             Fetching
datamaxstale                            Extension             Fetching
documentfetchhint                                             Fetching
documentmaxage                          New in VoiceXML 2.0   Fetching
documentmaxstale                        New in VoiceXML 2.0   Fetching
fetchaudio                                                    Background audio
fetchaudiodelay                         New in VoiceXML 2.0   Background audio
fetchaudiominimum                       New in VoiceXML 2.0   Background audio
fetchtimeout                                                  Fetching
grammarfetchhint                                              Fetching
grammarmaxage                           New in VoiceXML 2.0   Fetching
grammarmaxstale                         New in VoiceXML 2.0   Fetching
incompletetimeout                                             Speech recognition
inputmodes                                                    User Input
interdigittimeout                                             DTMF recognition
maxnbest                                New in VoiceXML 2.0   Speech recognition
maxspeechtimeout                                              Speech recognition
recordutterance                                               Speech recognition
recordutterancetype                                           Speech recognition
scriptfetchhint                                               Fetching
scriptmaxage                            New in VoiceXML 2.0   Fetching
scriptmaxstale                          New in VoiceXML 2.0   Fetching
sensitivity                                                   Speech recognition
speedvsaccuracy                                               Speech recognition
ssmlfetchhint                           Extension             Fetching
ssmlmaxage                              Extension             Fetching
ssmlmaxstale                            Extension             Fetching
termchar                                                      DTMF recognition
termtimeout                                                   DTMF recognition
timeout                                                       User Input
universals                              New in VoiceXML 2.0   Universal grammars

Property Descriptions
This section contains property descriptions in alphabetical order.

audiofetchhint

The audiofetchhint property tells the interpreter whether it can attempt to optimize dialog interpretation by prefetching audio files. The value is one of:

    safe: Fetch audio files only when they are needed, never before.
    prefetch: Permit, but do not require, the interpreter to prefetch audio files.

The default value is prefetch.
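For instance, a document that should never prefetch audio could set the property at document level, as in this sketch. The audio file name welcome.wav and the fallback text are hypothetical.

```xml
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Applies to every dialog in this document: fetch audio only on use. -->
  <property name="audiofetchhint" value="safe"/>
  <form id="welcome">
    <block>
      <prompt>
        <!-- welcome.wav is a placeholder; the text is read if it is missing. -->
        <audio src="welcome.wav">Welcome.</audio>
      </prompt>
    </block>
  </form>
</vxml>
```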
audiomaxage

New in VoiceXML 2.0. The audiomaxage property specifies the maximum acceptable age, in seconds, of cached audio resources. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default); ms for time in milliseconds. An unexpired cached file that does not exceed the maximum age will be used; a cached file that exceeds the maximum will be fetched again.

When no value is set for this property:

    If the <vxml> tag's version attribute is 2.0 or greater, unexpired cached audio files are used.
    If the <vxml> tag's version attribute is 1.0, the caching property controls whether unexpired cached audio files are used.

By default, no value is set for this property.

audiomaxstale

New in VoiceXML 2.0. The audiomaxstale property specifies the maximum acceptable time, in seconds, during which expired cached audio resources can still be used. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default); ms for time in milliseconds. A cached file that has been expired for more than the maximum stale time will be refetched; one that has been stale for less than or equal to the maximum stale time will be used.

The default value is 300 seconds (5 minutes). This default results in much faster performance for most applications, since it greatly reduces the number of times the interpreter must send Get-If-Modified requests to the HTTP server. If you want the behavior defined in the VoiceXML specification (always do a Get-If-Modified request if there was no Expires header), set audiomaxstale to 0.

bargein

The bargein property indicates whether the user can barge in during prompts and audio output from the application. Set this property to true to allow barge-in; set it to false to disallow barge-in. When this property is set to false, any DTMF input buffered in a transition state is deleted from the buffer. It is not saved for use in the next recognition state.

The default value is true.

bargeintype

New in VoiceXML 2.0. The bargeintype property indicates what kind of input can interrupt the prompt or audio output. This property is relevant only when bargein is true. The value is one of:

    hotword: User input that doesn't match the grammar is ignored and only speech that matches a grammar can interrupt the prompt.
    speech: Any user utterance can interrupt the prompt.

Inside a bridge transfer, the value of the bargeintype property is ignored. For a bridge transfer, the bargein type is always hotword.

If this property is set to hotword, the bevocal.hotwordmax and bevocal.hotwordmin properties determine how long a hotword utterance can be.

The default value is speech.
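The sketch below sets bargein and bargeintype locally in one field so that only an utterance matching the field grammar interrupts a long prompt. The field name, the grammar contents, and the prompt wording are hypothetical.

```xml
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="pickCity">
    <field name="city">
      <!-- Properties set here apply only while this field is active. -->
      <property name="bargein" value="true"/>
      <property name="bargeintype" value="hotword"/>
      <prompt>
        We have weather reports for many cities. You can interrupt this
        message at any time by saying boston or denver.
      </prompt>
      <grammar type="application/x-nuance-gsl">
        [ boston denver ]
      </grammar>
      <filled>
        <prompt>You chose <value expr="city"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

With bargeintype left at its default of speech, background talk or a cough could cut the prompt short; hotword trades that sensitivity for stricter matching.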
bevocal.audio.capture

Extension. The value of the bevocal.audio.capture property controls whether the interpreter captures spoken audio for each recognition. If this property is true, the interpreter captures spoken audio; if this property is false, the interpreter does not capture audio. The captured audio is available to the application in variables:

    When a successful recognition occurs for a field, the audio capture is available in the audio property of the field's shadow variable (fieldname$.audio) and in the application variable application.lastaudio$.
    When a no-match event occurs in a field, the audio capture is available in application.lastaudio$ only; fieldname$.audio is cleared.
    When a no-input event occurs in a field, both application.lastaudio$ and fieldname$.audio are cleared.
    When the user's speech successfully matches a form grammar, a menu choice, or a link grammar, the audio capture is available in application.lastaudio$.
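A sketch of enabling capture for a single field and checking the shadow variable afterwards follows. The field name consent, the grammar, and the log messages are hypothetical; the variable names consent$.audio and application.lastaudio$ follow the naming described in the list above.

```xml
<?xml version="1.0" ?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
  "http://cafe.bevocal.com/libraries/dtd/vxml2-0-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="terms">
    <field name="consent">
      <!-- Capture is switched on for this field only. -->
      <property name="bevocal.audio.capture" value="true"/>
      <prompt>Do you accept the terms? Please say yes or no.</prompt>
      <grammar type="application/x-nuance-gsl">
        [ yes no ]
      </grammar>
      <filled>
        <!-- After a successful recognition the capture is in consent$.audio. -->
        <if cond="typeof consent$.audio != 'undefined'">
          <log>Audio was captured for the answer <value expr="consent"/>.</log>
        </if>
      </filled>
      <nomatch>
        <!-- After a no-match only application.lastaudio$ holds the capture. -->
        <log>No match; the capture is in application.lastaudio$.</log>
        <reprompt/>
      </nomatch>
    </field>
  </form>
</vxml>
```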
You can send the captured audio to a server using a <data> element. Doing so is useful if you need a record of the user's speech for legal reasons. You can also access captured utterance files for later analysis, using the Media Access Service. For this purpose, you may want to use the bevocal.utterance.prefix property to modify the filenames of the utterance files.
Note: Capturing audio causes a minor delay after each speech recognition for which bevocal.audio.capture is true. You should set this property to true only locally inside specific fields in which the capability is needed.
The default value is false.

bevocal.audio.outputvolume
The value of the bevocal.audio.outputvolume property controls the output volume from the platform. Higher values play the audio prompts louder. You may need to tune this property upward if you expect your application to be used in noisy environments.
The default value is 0.5. The minimum is 0.0 and the maximum is 1.0.

bevocal.dtmf.flushbuffer
Extension. The bevocal.dtmf.flushbuffer property controls whether DTMF key presses that are queued during the transitioning state are available for the next recognition state. If this property is true, the DTMF buffer is cleared upon entering the next recognition state. If false, the keys in the buffer are used for recognition in the next recognition state. For more details, see Collecting Input and Playing Prompts on page 19.
The default value is true.

bevocal.fetchaudio.allfetches
Extension. The bevocal.fetchaudio.allfetches property controls whether background audio is played during the execution of a BeVocal VoiceXML extension tag that performs a fetch. If this property is true, any background audio is played during execution of a BeVocal VoiceXML extension tag. If this property is false, the background audio is not played for extension tags. Currently <data> is the only tag for which this property is relevant.
PROPERTIES
The default value is false.

bevocal.fetchaudio.extend
Extension. The bevocal.fetchaudio.extend property controls whether background audio is played until the next audio event. If this property is false, any background audio is played until the fetch operation completes. If this property is true, the background audio is continued until the next audio event, for example, a prompt played by the interpreter, or user input that triggers speech recognition. Setting bevocal.fetchaudio.extend to true can give your application a more seamless audio interface.
The default value is false.

bevocal.fetchaudio.flushqueue
Extension. The bevocal.fetchaudio.flushqueue property allows queued prompts to be played during fetches even if the fetchaudio property is not set. If bevocal.fetchaudio.flushqueue is true, queued prompts are played as if the fetchaudio property were set. That is, queued prompts are played during a <goto>, <submit>, or other operation that fetches VoiceXML data (or, in a special case, <data>). If it is false, whether queued prompts are played during fetches depends on the fetchaudio property. See Queued Prompts when Fetching on page 43.
The default value is false.

bevocal.fetchaudio.sounds
Extension. The bevocal.fetchaudio.sounds property controls whether the interpreter plays a sound before and after playing background audio. If this property is true, a short sound is played just before and after the background audio. If this property is false, background audio is played without any sounds to delimit its start and end. If you set any bevocal.sounds.* properties to true, you may wish to set this property to true also. See bevocal.sounds.recognition and bevocal.sounds.listening.
The default value is false.

bevocal.finaltimeout
Extension.
The bevocal.finaltimeout property is the amount of time to wait for additional input after the speech-recognition engine has recognized speech which matches one of the input grammars but that match is not the longest legal match. That is, the user has said something that matches the grammar, but if he continues speaking, the longer utterance might also match the grammar. This property interacts with the completetimeout and incompletetimeout properties. Consider the following very simple ABNF grammar: #ABNF 1.0; $rule = why hello [world]; Which property is in play depends upon what the user has said up until this point: If the user has said "why", the incompletetimeout property is in play but neither of the other two properties is in play, because that utterance cannot match the grammar. If the user has said "why hello", the bevocal.finaltimeout property is in play because that utterance may be a complete match to the grammar but may also be a partial match. The incompletetimeout property is not in play because "why hello" does match the grammar and the
completetimeout property is not in play because the user still has a legal path to the longest legal match to the grammar. If the user has said "why hello world", only the completetimeout property is in play because this is the longest legal match to the grammar.
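The three timeouts discussed above can be set together on the field that uses such a grammar. The following is only a sketch: the field name, prompt text, and grammar file greeting.gram are hypothetical, and the values are chosen purely for illustration.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="greet">
    <field name="greeting">
      <!-- In play after "why": the utterance does not yet match the grammar. -->
      <property name="incompletetimeout" value="2s"/>
      <!-- In play after "why hello": a match, but a longer match is still possible. -->
      <property name="bevocal.finaltimeout" value="1s"/>
      <!-- In play after "why hello world": the longest legal match. -->
      <property name="completetimeout" value="0.3s"/>
      <!-- greeting.gram is a hypothetical file holding the grammar shown above. -->
      <grammar src="greeting.gram"/>
      <prompt>Please greet the system.</prompt>
    </field>
  </form>
</vxml>
```

With these values the interpreter would wait longest while the utterance is still incomplete, somewhat less when a longer match remains possible, and least once no longer match exists.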
The default value is 0.75s; that is, three-quarters of a second.

bevocal.goback
Extension. The bevocal.goback property specifies whether the parent element is a legal go-back destination. Set this property to true to make the parent element a legal go-back destination; set it to false to disallow going back to the parent element. You can set this property to false in a form to disable the go-back facility in that form. You can set it to false in a document to disable the go-back facility in that entire document.
Note: The go-back facility is an experimental extension to BeVocal VoiceXML. See Go-Back Facility on page 79.
The default value is true.

bevocal.grammar.interpretationtype
Extension. The bevocal.grammar.interpretationtype property specifies whether to use the NL engine in standard mode (full) or in robust mode (robust). Setting this property to robust facilitates the NL engine's interpretation of more spontaneous utterances from SLM grammars.
Note: This property is relevant only when recognizing against SLM grammars. It should always be set to its default value when recognizing against conventional grammars. For information on using SLM grammars, see Chapter 8, Nuance SayAnything Grammars in the Grammar Reference.
You should set this property to robust only for recognizing against SLM grammars. You should set it back to full when recognizing against normal grammars.
The default value is full.

bevocal.grammar.phoneticpruning
Extension. The bevocal.grammar.phoneticpruning property specifies whether the recognizer should perform phonetic pruning. For SLM grammars, set this property to true except for grammars with small vocabularies.
Note: This property is relevant only when recognizing against SLM grammars. It should always be set to its default value when recognizing against conventional grammars. For information on using SLM grammars, see Chapter 8, Nuance SayAnything Grammars in the Grammar Reference.
The default value is true.
bevocal.grammar.weightfactor Extension. The bevocal.grammar.weightfactor property controls the relative weighting of acoustic and linguistic scores during recognition. As this value increases, the recognizer runs faster and hence the value of the speedvsaccuracy property should be increased to get better recognition. The corresponding speech engine property is in the range between 0 and 100. For well-trained SLM grammars, the optimum value is between 0.58 and 0.6, corresponding to the range of 9-10 in the speech engine.
Note: This property is relevant only when recognizing against SLM grammars. It should always be set to its default value when recognizing against conventional grammars. For information on using SLM grammars, see Chapter 8, Nuance SayAnything Grammars in the Grammar Reference. The default value is 0.5. This maps to a setting of 5 in the speech engine. bevocal.grammar.wordtransitionpenalty Extension. The bevocal.grammar.wordtransitionpenalty property controls the word transition weight. This is the trade-off between inserted and deleted words. For SLM based grammars, the optimal value is in the range 0 to 50. Note: This property is relevant only when recognizing against SLM grammars. It should always be set to its default value when recognizing against conventional grammars. For information on using SLM grammars, see Chapter 8, Nuance SayAnything Grammars in the Grammar Reference. The default value is -200. bevocal.hotwordmax Extension. The bevocal.hotwordmax property specifies the maximum time duration of an utterance that can be recognized as a request to interrupt an operation. Recognition of a user utterance can interrupt the following operations: Playing a prompt or audio output when the bargein property is true. An outbound call performed by a <transfer> element that contains a child grammar. An outbound call initiated by a <bevocal:dial> element that contains a child grammar.
If you want to allow users to interrupt with multisyllable words, or multiword commands, you can adjust bevocal.hotwordmax upward. The default value is 1.7 seconds. bevocal.hotwordmin Extension. The bevocal.hotwordmin property specifies the minimum time duration of an utterance that can be recognized as a request to interrupt an operation. Recognition of a user utterance can interrupt the following operations: Playing a prompt or audio output when the bargein property is true. An outbound call performed by a <transfer> element that contains a child grammar. During the execution of a <bevocal:listen> element that contains a child grammar.
If you want to allow users to interrupt only with very short, single-syllable words, you can adjust bevocal.hotwordmin downward. The default value is 0.1 seconds. bevocal.incrementErrorOnNSP Extension. The bevocal.incrementErrorOnNSP property specifies whether to count "no speech" timeouts (NSP) as an error. If the property is set to a value of false, the NSPs are not counted as errors. The default value is true. bevocal.locale Extension. The bevocal.locale property specifies the locale to use for recognition, allowing the application to switch to a different locale within a document. The value must be a legal language identifier as identified by an RFC 3066 code. BeVocal currently supports English and Spanish. Valid values are:
en: English
en-US: United States English
es: Spanish
es-US: United States Spanish
fr-ca: French Canadian
The default value is en-US. If an unsupported language is specified, an error.unsupported.language event is thrown.
Note: The xml:lang attribute of the <vxml> tag overrides the value of this property. Also, this property may be deprecated in a future release. Therefore, as a rule, you should use the xml:lang attribute of the <vxml> tag for specifying a locale.

bevocal.logging
Extension. The bevocal.logging property allows you to control whether all user input is recorded to the trace log which you see when you use the Log Browser tool. The value is one of:
true: All user input is recorded in the log.
false: User input is not recorded.
If you want to prevent recording user input in a particular field, dialog, or document for security reasons, you can set this property to false.
The default value is true.

bevocal.maxdialogerrors
Extension. The value of the bevocal.maxdialogerrors property is the maximum number of speech errors that can occur within a particular execution of a dialog. A speech error is either a recognition error, which normally results in a no-match event, or a timeout while waiting for user input, which normally results in a no-input event. If you set the bevocal.maxdialogerrors property to 5, then on the fifth error in a particular form, an error.bevocal.maxdialogerrors_exceeded event is thrown (instead of a no-match or no-input event).
The default value is 50. A value of 0 means that there is no limit on the number of errors.

bevocal.maxerrors
Extension. The value of the bevocal.maxerrors property is the maximum number of speech errors that can occur during the entire call. A speech error is either a recognition error, which normally results in a no-match event, or a timeout while waiting for user input, which normally results in a no-input event. If you set the bevocal.maxerrors property to 10, then on the tenth error in the call, an error.bevocal.maxerrors_exceeded event is thrown (instead of a no-match or no-input event).
The default value is 100. A value of 0 means that there is no limit on the number of errors.

bevocal.maxinterpretations
Extension. The value of the bevocal.maxinterpretations property is used to enable or disable multiple interpretations:
If the value is undefined or less than 1, both multiple-recognition features are controlled by the maxnbest property. If maxnbest is 1, both features are disabled; if maxnbest is greater than 1, both features are enabled.
If the value is 1, multiple interpretations are disabled (independently of whether N-best recognition is enabled).
If the value is greater than 1, multiple interpretations are enabled (independently of whether N-best recognition is enabled).
For additional information about multiple recognition results, see Chapter 5, Using Multiple-Recognition. The value of bevocal.maxinterpretations is used in combination with the value of maxnbest to determine the maximum number of results that can be returned by the speech-recognition engine. See Maximum Array Size on page 366. The default value is undefined. Tip: For better performance, if you anticipate that the spoken inputs do not sound similar but that a valid spoken input might be ambiguous, leave the value of the maxnbest property as 1 and set the bevocal.maxinterpretations property to a number greater than 1.
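The configuration suggested in the tip can be sketched as follows. This is an illustrative fragment only: the form and field names, the prompt, and the grammar file cities.grxml are hypothetical, and the value 3 is an arbitrary choice for the number of interpretations the application is prepared to handle.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="pick_city">
    <field name="city">
      <!-- Keep N-best recognition off: the inputs are not expected to sound alike. -->
      <property name="maxnbest" value="1"/>
      <!-- Still allow several interpretations of one recognized utterance, -->
      <!-- for example a city name that exists in more than one state. -->
      <property name="bevocal.maxinterpretations" value="3"/>
      <!-- cities.grxml is a hypothetical grammar file. -->
      <grammar src="cities.grxml" type="application/srgs+xml"/>
      <prompt>Which city?</prompt>
    </field>
  </form>
</vxml>
```

Scoping both properties to the field keeps the rest of the application on the default single-result behavior.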
bevocal.mingoback
Extension. The value of the bevocal.mingoback property is the minimum size of the go-back stack. The interpreter keeps at least this many entries on the stack, except at the beginning of the call when fewer steps have been executed, and after the user has said go back so many consecutive times that the stack has been depleted.
Note: The go-back facility is an experimental extension to BeVocal VoiceXML. See Chapter 7, Go-Back Facility.
The default value is 0, meaning that the go-back stack is always empty and the go-back facility is effectively disabled.

bevocal.securelogging.enabled
Extension. The bevocal.securelogging.enabled property allows you to control whether the log files and utterance files associated with the call are encrypted before being printed to the log. If secure logging is enabled and utterances are captured, then the captured utterances are also encrypted. The value is one of:
true: Log contents are encrypted.
false: Log contents are not encrypted.
The default value is false.
Note: If secure logging is enabled for a particular call, you will not be able to use VocalPlayer to listen to the call. You must download the desired files using the Media Access and Vendor Log Platform Services and decrypt them locally. For more information on secure logging, see bevocal.securelogging.key on page 350.

bevocal.securelogging.key
Extension. For secure logging, you must provide the public part of the public/private key pair using the bevocal.securelogging.key property. The public part of the key must be specified in Base64-encoded format. For example:
<property name="bevocal.securelogging.key" value="MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCDfG7T9R3YGMKlw4SnlWUSYqhDDxhWN9SoLOdGKQ1JctTQf0uvh8HJjuC8EO+a44kfS+5z7PrI9n5MM3D8Sb+6jsEgZqe6NqWzqEXhlOehfchMDkls6Mr7wLhECaZnFEjDvLKQ404McPtDEynhpvKlJKlRzcx3O/ROh1A0J9/eDwIDAQAB"/>
If you are using an implementation of Java Cryptography Extension (JCE) to generate a public/private key pair, get the encoded bytes of the public key and do a Base64 encode of the bytes.
If you are using OpenSSL to create a public/private key pair, generate the key in DER format and encode the binary bytes with Base64 encoding. Alternately, you can copy the ASCII string from a PEM-formatted version of the key, which is effectively a Base64 representation of the DER format.
The BeVocal VXML Interpreter generates a sessionKey and encrypts it with this public key using the RSA/ECB/OAEPPadding algorithm. This public/private encryption has been tested with Bouncy Castle's implementation of JCE. Using the sessionKey, the relevant parts of the log are encrypted and printed to the log. The encrypted sessionKey is printed to the log in HEX format. You decrypt the sessionKey using the private key and then decrypt the log contents using the DESEDE/ECB/PKCS5Padding algorithm.

bevocal.security.key
Extension. Access to BeVocal Platform services and to certain features of the interpreter is restricted with a 128-bit security key. To use such features, you must set the value of this property to your key. The Enrollment Facility uses a security key. In this case, applications using one security key are prevented from accessing enrollment grammars created by an application using a second key. When you develop applications for one of BeVocal's commercial hosting services such as Enterprise Hosting, you will need a security key in order to use enrollment. When you develop on Café, you can use enrollment without a key; however, there are limitations. For details, see Chapter 9, Voice Enrollment Grammars in the Grammar Reference.
By default, no value is set for this property.

bevocal.sounds.listening
Extension. The value of the bevocal.sounds.listening property controls whether the interpreter plays background music while listening for user input. If this property is true, the interpreter starts to play background music after playing a prompt; it stops when the user responds or when the timeout occurs, before throwing a no-input event.
If this property is false, the interpreter does not play music while waiting for user input. The default value is false. bevocal.sounds.maskrecognitionlatency Extension. This property indicates whether, after an utterance, a sound clip is played back until the recognition result is returned. Use this property when you expect latencies from large grammars. If the value is true, then the sound clip is played. If the value is false, then nothing is played back. The sound clip played is determined by the fetchaudio property. The fetchaudiodelay property determines the minimum time to wait before the sound clip begins. The fetchaudiominimum property determines the minimum time to play the sound clip once it is started. The default value is false. bevocal.sounds.recognition Extension. The value of the bevocal.sounds.recognition property controls whether the user is given audio feedback that speech recognition was successful. Set this property to true to play a short sound after each successful recognition. This sound lets the user know that the input has been heard, which makes the application seem more responsive to typical users. If this property is false, no sound is played after successful recognition. The default value is false.
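The way bevocal.sounds.maskrecognitionlatency draws on the fetchaudio properties can be sketched as follows. This is an assumption-laden example, not a recommended configuration: the audio file thinking.wav, the grammar file directory.grxml, and the form and field names are hypothetical, and the delay and minimum values are illustrative.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="directory">
    <field name="person">
      <!-- Play a clip while waiting for results from a large grammar. -->
      <property name="bevocal.sounds.maskrecognitionlatency" value="true"/>
      <!-- The clip to play; thinking.wav is a hypothetical file. -->
      <property name="fetchaudio" value="thinking.wav"/>
      <!-- Wait at least one second before the clip begins. -->
      <property name="fetchaudiodelay" value="1s"/>
      <!-- Once started, play the clip for at least two seconds. -->
      <property name="fetchaudiominimum" value="2s"/>
      <!-- directory.grxml is a hypothetical large grammar. -->
      <grammar src="directory.grxml" type="application/srgs+xml"/>
      <prompt>Who would you like to call?</prompt>
    </field>
  </form>
</vxml>
```

Note that fetchaudio set this way also applies to document fetches made within the same scope, so the clip should be suitable for both uses.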
bevocal.transfer.terminatetones
Extension. This property exists because DTMF grammars are not supported during transfer. This property defines the DTMF sequences which can terminate a bridged <transfer> (or <dial>). The value can be a string of DTMF tones to recognize, or a space-delimited list of such strings. The DTMF tones are identified by their numeric or string value. For example:
<property name="bevocal.transfer.terminatetones" value="12345 ***"/>
will terminate the call if either the DTMF sequence "12345" or the sequence "***" is recognized.
The default value is the empty string; that is, no tones.

bevocal.utterance.prefix
Extension. If utterance capture is enabled in the scope of this property, utterances are available both through the variables discussed with the bevocal.audio.capture property and through audio files when using the Media Access Service. The filename of a saved utterance is prefixed with the value of this property inside parentheses. For example, if the value of bevocal.utterance.prefix is restaurant, then the filename will start with (restaurant).
This property can help identify an utterance with a specific application context. For example, in an application state that recognizes against a restaurant grammar, you can set the value to restaurant. When control leaves that application state, you can remove the value (by setting it to the empty string) or set it to a new value for the new state. Later, when analyzing audio files using the Media Access Service, you can safely assume that audio files whose names start with (restaurant) contain a restaurant name spoken by a user.
If the underlying platform does not support filenames containing characters used in this property, then the utterance file is not created at all. Consequently, you should use only alphanumeric characters in your prefix.
This property is relevant only if you have bevocal.audio.capture set to true.
The default value is NONE; that is, by default filenames start with (NONE).
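The restaurant scenario above, combined with bevocal.audio.capture, might look like the following sketch. The form, field, and variable names and the grammar file restaurants.grxml are hypothetical; only the property names and the fieldname$.audio shadow variable come from this guide.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="find_restaurant">
    <!-- Hypothetical variable that holds the captured audio for later use. -->
    <var name="capturedAudio"/>
    <field name="restaurant">
      <!-- Enable capture only in this field, since capture adds a minor delay. -->
      <property name="bevocal.audio.capture" value="true"/>
      <!-- Saved utterance filenames will start with (restaurant). -->
      <property name="bevocal.utterance.prefix" value="restaurant"/>
      <!-- restaurants.grxml is a hypothetical grammar file. -->
      <grammar src="restaurants.grxml" type="application/srgs+xml"/>
      <prompt>Which restaurant?</prompt>
      <filled>
        <!-- After a successful recognition, the audio is in the shadow variable. -->
        <assign name="capturedAudio" expr="restaurant$.audio"/>
      </filled>
    </field>
  </form>
</vxml>
```

Because both properties are set inside the field, other fields in the form neither capture audio nor inherit the prefix.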
bevocal.voice.name
Extension. The name of the voice to be used for speaking prompts. In a previous BeVocal VoiceXML version, this applied only to the TTS voice used. Now the same property can be used to specify Recorded Voices. See Chapter 8, TTS and Recorded Voice Selection for details.
The default value is the empty string, requesting the interpreter to use the most appropriate voice for each type.

bevocal.vxml.maxrecognitionlatency
The bevocal.vxml.maxrecognitionlatency property controls the maximum amount of time that the platform will wait for the recognizer to recognize any one utterance. If this time is exceeded, a NOMATCH is thrown.
The default value is 15 seconds.

caching
VoiceXML 1.0 only. If the <vxml> tag's version attribute is 1.0, the caching property controls the use of unexpired cache files when the relevant maximum-age property is not set. The caching property can
be set to either safe to ignore any cached file when fetching, or fast to use any cached file instead of fetching.
Note: The caching property is ignored when the <vxml> tag's version attribute is 2.0 or greater. In that case, the application uses an unexpired cached copy of a file if the relevant maximum-age property is not set.
The default value is fast.

completetimeout
The value of the completetimeout property is the amount of time to wait for additional input after the speech-recognition engine has recognized speech matching one of the input grammars. See bevocal.finaltimeout for a description of how the completetimeout property interacts with the bevocal.finaltimeout and incompletetimeout properties. See Appendix D of the VoiceXML 2.0 specification for a description of how the timeout property interacts with the completetimeout and incompletetimeout properties.
The default value is 0.5 seconds. The minimum possible value is 0.1 seconds and the maximum possible value is 10 seconds.

confidencelevel
The value of the confidencelevel property is the confidence threshold required for the speech-recognition engine to decide whether the input speech matches a grammar. In practice, most applications use a confidencelevel very near the default value of 0.5. The minimum is 0.0 and the maximum is 1.0. The default value of 0.5 maps to 35% confidence in the Nuance speech-recognition engine. Setting this property higher will result in fewer false recognitions at the cost of many more no-match events. Setting it lower will reduce the number of no-match events at the risk of more false recognitions.

datafetchhint
Extension. The datafetchhint property tells the interpreter whether XML data files may be prefetched. The value is one of:
safe: Fetch XML data files only when they are needed, never before.
prefetch: Permit, but do not require, the interpreter to prefetch XML data files.
The default value is safe.

datamaxage
Extension. The datamaxage property specifies the maximum acceptable age, in seconds, of cached XML data files. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default) or ms for time in milliseconds. An unexpired cached file that does not exceed the maximum age will be used; a cached file that exceeds the maximum will be fetched again. When no value is set for this property:
- If the <vxml> tag's version attribute is 2.0 or greater, unexpired cached data files are used.
- If the <vxml> tag's version attribute is 1.0, the caching property controls whether unexpired cached data files are used.
By default, no value is set for this property.

datamaxstale
Extension. The datamaxstale property specifies the maximum acceptable time, in seconds, during which expired cached XML data files can still be used. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default) or ms for time in milliseconds. A cached file that has been expired for more than the maximum stale time will be refetched; one that has been stale for less than or equal to the maximum stale time will be used.
The default value is 0 seconds.

documentfetchhint
The documentfetchhint property tells the interpreter whether VoiceXML documents may be prefetched. The value is one of:
safe: Fetch document files only when they are needed, never before.
prefetch: Permit, but do not require, the interpreter to prefetch document files.
The default value is safe.

documentmaxage
New in VoiceXML 2.0. The documentmaxage property specifies the maximum acceptable age, in seconds, of cached VoiceXML document files. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default) or ms for time in milliseconds. An unexpired cached file that does not exceed the maximum age will be used; a cached file that exceeds the maximum will be fetched again. When no value is set for this property:
- If the <vxml> tag's version attribute is 2.0 or greater, unexpired cached VoiceXML document files are used.
- If the <vxml> tag's version attribute is 1.0, the caching property controls whether unexpired cached document files are used.
By default, no value is set for this property.

documentmaxstale
New in VoiceXML 2.0. The documentmaxstale property specifies the maximum acceptable time, in seconds, during which expired cached VoiceXML document files can still be used. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default) or ms for time in milliseconds. A cached file that has been expired for more than the maximum stale time will be refetched; one that has been stale for less than or equal to the maximum stale time will be used.
The default value is 0 seconds.

fetchaudio
The value of the fetchaudio property is the URI of the audio clip to play while waiting for a resource to be fetched. This property affects the fetching of some, but not all, resource types:
This property is relevant when the interpreter fetches VoiceXML documents with the tags: <choice>, <goto>, <link>, <subdialog>, and <submit>.
Extension. If the bevocal.fetchaudio.allfetches property is true, this property affects the fetching of XML data with the <data> tag.
Background audio is never played while the interpreter fetches grammar, audio, or script files.
The default value is not to play any audio.

fetchaudiodelay
New in VoiceXML 2.0. The value of the fetchaudiodelay property is the amount of time to wait after a VoiceXML download is started before its fetchaudio source is played in the background. This property is useful if you want a short period of silence before the audio starts playing. If the download completes during the period of silence, the audio will not be played at all.
The default value is 0 seconds.

fetchaudiominimum
New in VoiceXML 2.0. The value of the fetchaudiominimum property is the minimum time interval to play the fetchaudio source, once started, even if the fetch operation completes during play. If the value is greater than 0, it prevents the user from hearing a short clip of background audio which is immediately cut off.
The default value is 0 seconds. With the default value, the interpreter interrupts the audio playback as soon as the resource is fetched, and resumes normal processing.

fetchtimeout
The value of the fetchtimeout property is the timeout period for fetches. The interpreter will wait this long for the resource to be returned before throwing an error.badfetch event. The value is a time interval expressed as an unsigned number followed by s for time in seconds; ms for time in milliseconds (the default). If you set this property to 0, the interpreter waits indefinitely.
The default value is 1 minute.

grammarfetchhint
The grammarfetchhint property tells the interpreter whether grammars may be prefetched. The value is one of:
safe: Fetch grammar files only when they are needed, never before.
prefetch: Permit, but do not require, the interpreter to prefetch grammar files.
The default value is prefetch.

grammarmaxage
New in VoiceXML 2.0. The grammarmaxage property specifies the maximum acceptable age, in seconds, of cached grammar resources. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default) or ms for time in milliseconds. An unexpired cached file that does not exceed the maximum age will be used; a cached file that exceeds the maximum will be fetched again. When no value is set for this property:
- If the <vxml> tag's version attribute is 2.0 or greater, unexpired cached grammar files are used.
- If the <vxml> tag's version attribute is 1.0, the caching property controls whether unexpired cached grammar files are used.
By default, no value is set for this property.

grammarmaxstale
New in VoiceXML 2.0. The grammarmaxstale property specifies the maximum acceptable time, in seconds, during which expired cached grammar resources can still be used. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default) or ms for time in milliseconds. A cached file that has been expired for more than the maximum stale time will be fetched again; one that has been stale for less than or equal to the maximum stale time will be used.
The default value is 0 seconds.

incompletetimeout
The value of the incompletetimeout property is the amount of time to wait for additional speech input when the user has begun speaking but the input does not yet match a complete grammar.
The default value is 1.5 seconds. The minimum possible value is 0.2 seconds and the maximum possible value is 10 seconds.
See bevocal.finaltimeout for a description of how the incompletetimeout property interacts with the bevocal.finaltimeout and completetimeout properties. See Appendix D of the VoiceXML 2.0 specification for a description of how the timeout property interacts with the incompletetimeout and completetimeout properties.

inputmodes
The inputmodes property specifies which input modes to enable: dtmf and/or voice. To disable speech recognition, set inputmodes to dtmf. To disable DTMF, set it to voice. To enable both input modes again after disabling one, specify both values, separated by a space:
inputmodes="dtmf voice"
You can use this property:
- To turn off speech recognition in noisy environments.
- To conserve speech recognition resources by turning them off where all input is expected to be DTMF.
The default value is "dtmf voice"; that is, accept both DTMF and voice input. interdigittimeout The value of the interdigittimeout property is the amount of time that the speech-recognition engine waits for another DTMF digit before it decides that the user has finished entering digits and returns a recognition or a no-match event. The default value is 3 seconds.
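A DTMF-only field that combines inputmodes with the DTMF timing properties might look like the following sketch. The form and field names and the prompt are hypothetical, the use of the built-in digits field type is an assumption about what the field collects, and the timeout values are illustrative.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="login">
    <field name="pin" type="digits">
      <!-- Accept key presses only; speech recognition is off for this field. -->
      <property name="inputmodes" value="dtmf"/>
      <!-- Wait up to two seconds for the next digit. -->
      <property name="interdigittimeout" value="2s"/>
      <!-- Pressing pound ends recognition immediately. -->
      <property name="termchar" value="#"/>
      <prompt>Enter your PIN, then press pound.</prompt>
    </field>
  </form>
</vxml>
```

Setting inputmodes inside the field means that other dialogs in the application continue to accept both DTMF and voice input.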
The interpreter uses a combination of the interdigittimeout, timeout, termtimeout, and termchar properties for DTMF recognition, as described here:

Under these conditions: The user does not press any keys and timeout seconds elapse.
The interpreter does this: Generates a no-input event.

Under these conditions: The user presses the termchar character.
The interpreter does this: Recognition immediately ends and the interpreter either returns the valid input or throws a no-match event.

Under these conditions: The keys the user has pressed so far are not a valid input and interdigittimeout seconds elapse.
The interpreter does this: Throws a no-match event.

Under these conditions: The keys the user has pressed so far are a valid input, the grammar does not allow the input to be extended to make a longer valid input, and there is not an active termchar character.
The interpreter does this: Immediately returns the valid input.

Under these conditions: The keys the user has pressed so far are a valid input, the grammar does not allow the input to be extended to make a longer valid input, and there is an active termchar character.
The interpreter does this: Waits termtimeout seconds before returning the valid input.

Under these conditions: The keys the user has pressed so far are a valid input and the grammar does allow the input to be extended to make a longer valid input.
The interpreter does this: Waits interdigittimeout seconds before returning the valid input.

maxnbest
New in VoiceXML 2.0. The value of the maxnbest property is used to enable or disable N-best recognition. If the value is 1, N-best recognition is disabled; if the value is greater than 1, N-best recognition is enabled. For additional information about multiple recognition results, see Chapter 5, Using Multiple-Recognition. Depending on the value of the bevocal.maxinterpretations property, maxnbest may also enable or disable multiple interpretations: If bevocal.maxinterpretations is undefined or less than 1, the maxnbest property enables and disables both N-best recognition and multiple interpretations together. If bevocal.maxinterpretations is 1 or greater, the maxnbest property enables and disables N-best recognition alone; the value of bevocal.maxinterpretations determines whether multiple interpretations are enabled.
The value of maxnbest is used in combination with the value of bevocal.maxinterpretations to determine the maximum number of results that can be returned by the speech-recognition engine. See Maximum Array Size on page 366. Note: As the value of the maxnbest property increases, recognition can slow dramatically, because the speech-recognition engine is unable to prune lower-confidence hypotheses until much later in the recognition process. The default value is 1. Tips: Leave this property set to 1 except for those fields and mixed-initiative forms for which you anticipate that spoken inputs may sound similar to more than one expected response. When you do set this property, use a low value, namely the number of possible results that your application is prepared to handle.
Set this property at the lowest possible level. Typically, you set this property in an individual <field> element or in a particular mixed-initiative <form> element when different expected responses can sound similar. For better performance, if you anticipate that the spoken inputs do not sound similar but that a valid spoken input might be ambiguous, leave the value of the maxnbest property as 1 and set the bevocal.maxinterpretations property to a number greater than 1.
maxspeechtimeout
This property specifies the maximum duration of user speech. If this time elapses before the user stops speaking, the maxspeechtimeout event is thrown. The default duration is 6 seconds.

recordutterance
The value of the recordutterance property controls whether the interpreter captures spoken audio for each recognition. This property is equivalent to the bevocal.audio.capture property. If this property is true, the interpreter captures spoken audio; if this property is false, the interpreter does not capture audio. The captured audio is available to the application in variables:
  When a successful recognition occurs for a field, the audio capture is available in the audio property of the field's shadow variable (fieldname$.audio) and in the application variable application.lastaudio$.
  When a no-match event occurs in a field, the audio capture is available in application.lastaudio$ only; fieldname$.audio is cleared.
  When a no-input event occurs in a field, both application.lastaudio$ and fieldname$.audio are cleared.
  When the user's speech successfully matches a form grammar, a menu choice, or a link grammar, the audio capture is available in application.lastaudio$.
You can send the captured audio to a server using a <data> element. Doing so is useful if you need a record of the user's speech for legal reasons. You can also access captured utterance files for later analysis, using the Media Access Service. For this purpose, you may want to use the bevocal.utterance.prefix property to modify the filenames of the utterance files.
Note: Capturing audio causes a minor delay after each speech recognition for which recordutterance is true. You should set this property to true only locally inside specific fields in which the capability is needed.
The default value is false.

recordutterancetype
The recordutterancetype property allows an application to record an utterance in a specified MIME type. BeVocal strongly recommends that this property not be used, as it causes unnecessary delays in transcoding the audio files. Supported media type values for the recordutterancetype property are:
Value                     Description
audio/x-wav               WAV (RIFF header) 8 kHz 8-bit mono mu-law
audio/x-alaw-basic        WAV (RIFF header) 8 kHz 8-bit mono a-law
audio/vnd.wave;codec=1    WAV (RIFF header) 8 kHz 8-bit mono linear

The default value is audio/x-wav.

scriptfetchhint
The scriptfetchhint property tells the interpreter whether scripts may be prefetched or not. The value is one of:
  safe: Fetch script files only when they are needed, never before.
  prefetch: Permit, but do not require, the interpreter to prefetch script files.
The default value is prefetch.

scriptmaxage
New in VoiceXML 2.0. The scriptmaxage property specifies the maximum acceptable age, in seconds, of cached script resources. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default) or ms for time in milliseconds. An unexpired cached file that does not exceed the maximum age will be used; a cached file that exceeds the maximum will be fetched again. When no value is set for this property:
  If the <vxml> tag's version attribute is 2.0 or greater, unexpired cached script files are used.
  If the <vxml> tag's version attribute is 1.0, the caching property controls whether unexpired cached script files are used.
By default, no value is set for this property.

scriptmaxstale
New in VoiceXML 2.0. The scriptmaxstale property specifies the maximum acceptable time, in seconds, during which expired cached script resources can still be used. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default) or ms for time in milliseconds. A cached file that has been expired for more than the maximum stale time will be refetched; one that has been stale for less than or equal to the maximum stale time will be used. The default value is 0 seconds.

ssmlfetchhint
Extension. The ssmlfetchhint property tells the interpreter whether SSML documents may be prefetched. The value is one of:
  safe: Fetch SSML files only when they are needed, never before.
  prefetch: Permit, but do not require, the interpreter to prefetch SSML files.
The default value is safe.
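The maxage and maxstale properties in this chapter all take the same kind of time value and apply the same staleness rule. The following JavaScript is a minimal sketch of that rule, not interpreter code. The helper names parseTimeInterval and mayUseExpiredCopy are hypothetical, and the sketch assumes that values are written as plain digits with an optional decimal part.

```javascript
// Sketch only: parseTimeInterval and mayUseExpiredCopy are hypothetical
// helpers, not part of the BeVocal platform.

// Convert a time value such as "2s", "1500ms", or "3" to seconds.
function parseTimeInterval(value) {
  var match = /^(\d+(?:\.\d+)?)(ms|s)?$/.exec(String(value).trim());
  if (!match) {
    throw new Error("Not a valid time interval: " + value);
  }
  var amount = parseFloat(match[1]);
  // Seconds are the default unit when no suffix is given.
  return match[2] === "ms" ? amount / 1000 : amount;
}

// The maxstale rule: an expired cached file may still be used if it has been
// expired for less than or equal to the maximum stale time.
function mayUseExpiredCopy(secondsSinceExpiry, maxstale) {
  return secondsSinceExpiry <= parseTimeInterval(maxstale);
}
```

With the default maxstale of 0 seconds, mayUseExpiredCopy returns false for any file that has been expired for a positive length of time, so such files are always fetched again.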
ssmlmaxage
New in VoiceXML 2.0; Extension. The ssmlmaxage property specifies the maximum acceptable age, in seconds, of cached SSML files. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default) or ms for time in milliseconds. An unexpired cached file that does not exceed the maximum age will be used; a cached file that exceeds the maximum will be fetched again. When no value is set for this property, unexpired cached SSML files are used. By default, no value is set for this property.

ssmlmaxstale
New in VoiceXML 2.0; Extension. The ssmlmaxstale property specifies the maximum acceptable time, in seconds, during which expired cached SSML files can still be used. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default) or ms for time in milliseconds. A cached file that has been expired for more than the maximum stale time will be refetched; one that has been stale for less than or equal to the maximum stale time will be used. The default value is 0 seconds.

sensitivity
The value of the sensitivity property is the sensitivity of the speech recognition. In effect, this is the gain or input volume used for the speech input. Higher values will allow the speech-recognition engine to recognize softer speech but will also pick up more background noise. You may need to tune this property downward if you expect your application to be used in noisy environments, or upward if you expect it to be used in very quiet environments. The default value is 0.5. The minimum is 0.0 and the maximum is 1.0.

speedvsaccuracy
The speedvsaccuracy property controls the trade-off between speed and accuracy in the speech-recognition engine. Lower values cause potential results with a low probability to be pruned early in the search process, resulting in faster recognition speed. Higher values retain potential results longer, resulting in slower but more accurate recognitions.
If you have extremely large grammars that cause speech recognition to take too long, you can experiment with lower values for this property. Otherwise, you probably do not need to adjust this property. The default value of 0.5 maps to a fairly accurate setting (1200 on a scale of 0 to 1400 in the Nuance speech-recognition engine). The minimum is 0.0 and the maximum is 1.0. termchar The value of the termchar property is the (optional) terminating character for DTMF recognition. If this property is set, the speech-recognition engine will wait termtimeout seconds for the termination character before returning recognized DTMF input. See interdigittimeout for a description of how the interpreter uses a combination of the interdigittimeout, timeout, termtimeout, and termchar properties for DTMF recognition. The default value is #. termtimeout If the termchar property is set, the speech-recognition engine will wait termtimeout seconds for the termination character before returning recognized DTMF input.
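The DTMF rules tabulated under interdigittimeout can be restated as a small decision function. The following JavaScript is only a sketch of that table, not interpreter code. The function dtmfAction and the fields of its state argument (keys, termcharPressed, validInput, canExtend, termcharActive) are hypothetical names introduced for illustration.

```javascript
// Sketch only: models the DTMF decision table; all names are hypothetical.
// Returns which timer (if any) the interpreter waits on and what happens next.
function dtmfAction(state) {
  if (state.keys.length === 0) {
    // No keys pressed: a no-input event once timeout seconds elapse.
    return { waitFor: "timeout", outcome: "noinput" };
  }
  if (state.termcharPressed) {
    // Recognition ends immediately, with either the valid input or no-match.
    return { waitFor: null, outcome: state.validInput ? "return input" : "nomatch" };
  }
  if (!state.validInput) {
    // Keys so far are not valid: no-match once interdigittimeout elapses.
    return { waitFor: "interdigittimeout", outcome: "nomatch" };
  }
  if (state.canExtend) {
    // Valid, but the grammar allows a longer valid input.
    return { waitFor: "interdigittimeout", outcome: "return input" };
  }
  if (state.termcharActive) {
    // Valid and complete, but a terminating character may still arrive.
    return { waitFor: "termtimeout", outcome: "return input" };
  }
  // Valid, complete, and no active termchar: return at once.
  return { waitFor: null, outcome: "return input" };
}
```

For a fixed-length DTMF field with an active termchar, the sketch shows why a termtimeout of 0 seconds avoids a pause: the only remaining wait after the last digit is the termtimeout itself.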
See interdigittimeout for a description of how the interpreter uses a combination of the interdigittimeout, timeout, termtimeout, and termchar properties for DTMF recognition.
Note: When you have a fixed-length input field with a DTMF grammar, you may wish to set this property to 0 seconds to avoid a pause after the last digit is entered.
The default value is 0 seconds.

timeout
The value of the timeout property is the maximum time the application waits for user input. After the specified amount of time has elapsed, the interpreter throws a no-input event. See Appendix D of the VoiceXML 2.0 specification for a thorough description of the way this property interacts with the completetimeout and incompletetimeout properties. See interdigittimeout for a description of how this property interacts with the interdigittimeout, termtimeout, and termchar properties for DTMF recognition. The default value is 5 seconds.

universals
New in VoiceXML 2.0. The universals property specifies which universal grammars should be active; it deactivates all other universal grammars. The value is one of:
  all: make all universal grammars active
  none: deactivate all universal grammars
  Space-separated list of the universal grammar names: make the specified grammars active; deactivate all other universal grammars.
The predefined universal grammars are help, exit, cancel, and goback. You can define additional universal grammars by setting the universals property in <grammar> elements. The following <property> element deactivates the help grammar and activates the other predefined universal grammars:
  <property name="universals" value="exit cancel goback"/>
For additional information, see Universal Commands and Grammars on page 14. If the <vxml> tag's version attribute is 2.0 or greater, the default value is none. If the version attribute is 1.0, the default value is all.
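The effect of a universals value can be sketched as a small JavaScript function. This is an illustration under stated assumptions rather than interpreter code: activeUniversals is a hypothetical helper, and its optional second argument is an assumed way of supplying the names of any additional universal grammars an application has defined.

```javascript
// Sketch only: activeUniversals is a hypothetical helper.
var PREDEFINED_UNIVERSALS = ["help", "exit", "cancel", "goback"];

// Return the universal grammars left active by a universals property value.
// `available` defaults to the predefined grammars named in this chapter.
function activeUniversals(value, available) {
  var known = available || PREDEFINED_UNIVERSALS;
  var trimmed = String(value).trim();
  if (trimmed === "all") {
    return known.slice();
  }
  if (trimmed === "none") {
    return [];
  }
  // Space-separated list: the named grammars stay active, all others do not.
  var requested = trimmed.split(/\s+/);
  return known.filter(function (name) {
    return requested.indexOf(name) !== -1;
  });
}
```

For the <property> element shown above, activeUniversals("exit cancel goback") leaves exit, cancel, and goback active and omits help.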
13
Variables
This chapter describes variables defined by VoiceXML. See: Variable Summary on page 363 for an overview of the variables, grouped by scope Variable Index on page 364 for an alphabetical list of all variables
In the cases where the BeVocal VoiceXML interpreter deviates from the VoiceXML 2.0 Specification, the difference is clearly marked below in the following ways:
  Not Implemented: Functionality not currently available.
  Extension: Added functionality.
  Deprecated: Non-standard or superseded feature that was supported by an earlier version but has been replaced by a new feature.
Variable Summary
The following table organizes variables according to the scope in which each variable's value is available:
  Session variables are available to all applications that are executed in a particular session with the VoiceXML interpreter.
  Application variables are available throughout a particular application.
  Event-related variables are available in event handlers only.

Scope            Variable
Session          session.bevocal.timeincall (Extension)
                 session.bevocal.version (Extension)
                 session.iidigits (Not Implemented)
                 session.telephone.ani
                 session.telephone.dnis
Application      application.lastaudio$ (Extension)
                 application.lastresult$ (New in VoiceXML 2.0)
Event handler    _event (New in VoiceXML 2.0)
                 _message (New in VoiceXML 2.0)
Variable Index
The following table lists the variables in alphabetical order.

Variable                                          Scope
_event (New in VoiceXML 2.0)                      Event handler
_message (New in VoiceXML 2.0)                    Event handler
application.lastaudio$ (Extension)                Application
application.lastresult$ (New in VoiceXML 2.0)     Application
session.bevocal.timeincall (Extension)            Session
session.bevocal.version (Extension)               Session
session.iidigits (Not Implemented)                Session
session.telephone.ani (VoiceXML 1.0 only)         Session
session.telephone.dnis (VoiceXML 1.0 only)        Session
Variable Descriptions
This section contains variable descriptions in alphabetical order.

_event
New in VoiceXML 2.0. Within the anonymous scope of an event handler, the JavaScript variable _event is set to the name of the event that was thrown. For example:
  <error>
    <prompt>event is <value expr="_event"/></prompt>
  </error>
If the event is error.badfetch.http.500, this handler will say, "event is error.badfetch.http.500."

_message
New in VoiceXML 2.0. Within the anonymous scope of an event handler, the JavaScript variable _message is set to the message string that provides additional context about the event that was thrown. If no message was supplied when the event was thrown, this variable has the value undefined.

application.lastaudio$
Extension. If the bevocal.audio.capture property is set to true, the interpreter captures spoken audio for each recognition.
  When the user's speech matches a field grammar, a form grammar, a menu choice, or a link grammar, the application.lastaudio$ variable contains an audio capture of the user's speech.
  When a no-match event occurs in a field, the application.lastaudio$ variable contains an audio capture of the user's speech.
  When a no-input event occurs in a field, the application.lastaudio$ variable is cleared.
You can send the captured audio to a server using a <data> element. Doing so is useful if you need a record of the user's speech for legal reasons.
application.lastresult$
New in VoiceXML 2.0. When the user provides input during the execution of a <field> element or the <initial> element of a mixed-initiative form, the interpreter invokes the speech-recognition engine on the user's response. The most likely recognized utterance is used to set the relevant input variables. This utterance is chosen on the basis of speech-recognition engine confidence levels as well as grammar weighting and scoping rules. If this utterance matches multiple rules in an ambiguous grammar, the input variables are set according to an arbitrary one of those rules.
Additional information from the speech-recognition engine is available in the read-only variable application.lastresult$. This variable has a dual interpretation:
  It acts like a normal object whose properties describe the most likely recognized result; that is, the one that was used to set the input variables.
  It also acts like an array of objects, each describing one of the likely recognition results. This array always has at least one element. If multiple recognition is enabled, it may contain additional elements. See Maximum Array Size for further details.
The application.lastresult$ variable, and each member of the application.lastresult$ array, is a JavaScript object with the following properties:

Property          Description
confidence        The recognition confidence level of this result (with 0.0 representing the lowest confidence and 1.0 representing the highest).
utterance         The raw string of words that were recognized, for example "portland oregon".
inputmode         The mode in which user input was provided, one of:
                    dtmf - DTMF input
                    voice - spoken input
interpretation    The interpretation of this result. This property contains a JavaScript object whose properties correspond to the slots that can be set by the matched grammar rule.
If the application.lastresult$ array contains more than one element, each element has a different combination of utterance and interpretation; that is, different elements differ in the utterance, the interpretation, or both. Each element corresponds to one interpretation of one likely utterance. The same utterance may have different interpretations, and two or more different utterances may have a common interpretation.
The elements of the application.lastresult$ array for different possible utterances are sorted by descending order of confidence level; elements for the different interpretations of a given utterance or for multiple utterances with the same confidence are in an undefined order. You can use the expression application.lastresult$.length to get the number of elements in the application.lastresult$ array.
The application.lastresult$ variable holds information about the last recognition that occurred within the application. Before the interpreter enters a waiting state (a recognition, record, or transfer) the variable is set to undefined. When a nomatch event is thrown, application.lastresult$ is set to the nomatch result. When a noinput event is thrown, application.lastresult$ is not reset to undefined. An application can check the variable in the <filled> element of the field or form for which input was received.
Maximum Array Size
The maxnbest and bevocal.maxinterpretations properties determine which multiple-recognition features are enabled and the maximum number of elements in the application.lastresult$ array. The following table shows the possible combinations of values for these properties.

maxnbest          bevocal.maxinterpretations    Maximum Array Size
1                 Unset, 1, or less than 1      1 (both features are disabled)
1                 Greater than 1                bevocal.maxinterpretations (only multiple interpretations is enabled)
Greater than 1    Unset or less than 1          maxnbest (both features are enabled)
Greater than 1    1                             maxnbest (only N-best recognition is enabled)
Greater than 1    Greater than 1                maxnbest * bevocal.maxinterpretations (both features are enabled)
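The combinations in the preceding table can be checked with a short JavaScript sketch. The function recognitionLimits is a hypothetical helper written for this illustration; it takes the two property values as numbers (or undefined when bevocal.maxinterpretations is unset) and is not part of the platform.

```javascript
// Sketch only: recognitionLimits is a hypothetical helper that mirrors the
// Maximum Array Size table.
function recognitionLimits(maxnbest, maxinterpretations) {
  var nbestEnabled = maxnbest > 1;
  // Multiple interpretations are on when bevocal.maxinterpretations is
  // greater than 1, or when it is unset or less than 1 and maxnbest
  // enables both features together.
  var interpEnabled = maxinterpretations > 1 ||
    (nbestEnabled && !(maxinterpretations >= 1));
  var nbestFactor = nbestEnabled ? maxnbest : 1;
  var interpFactor = maxinterpretations > 1 ? maxinterpretations : 1;
  return {
    nbest: nbestEnabled,
    multipleInterpretations: interpEnabled,
    maxArraySize: nbestFactor * interpFactor
  };
}
```

For example, recognitionLimits(5, 3) reports both features enabled with a maximum of 15 elements, while recognitionLimits(5, 1) reports N-best recognition alone with a maximum of 5.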
If both features are disabled, the application.lastresult$ array contains a single element with index 0 and application.lastresult$[0] is identical to application.lastresult$. If one or both of the features are enabled, the array may contain multiple elements, up to the maximum specified in the table. If N results were found, the array contains N elements with indexes from 0 to N-1.

Variable Structure
The entire structure of the application.lastresult$ variable is as follows:
  application.lastresult$ {
    confidence
    utterance
    inputmode
    interpretation {
      slotname1
      slotname2
      ...
    }
  }
  application.lastresult$[0] {
    confidence
    utterance
    inputmode
    interpretation {...}
  }
  ...
  application.lastresult$[n] {
    ...
  }

Recognition Results
You typically examine the application.lastresult$ object if multiple recognition is disabled. You typically examine the application.lastresult$ array if multiple recognition is enabled.
Whether or not multiple recognition is enabled, application.lastresult$.utterance is the most likely recognized utterance and application.lastresult$.interpretation is the chosen interpretation of that utterance: an object whose properties are the slots that were filled in, corresponding to the input variables that were set.
Tip: Remember not to access properties of a particular element of the application.lastresult$ array until you have verified that the element exists. If you try to access application.lastresult$[i].utterance when i is greater than or equal to the number of results, an error.semantic event is thrown.
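The bounds check described in this tip can be sketched in plain JavaScript. In the sketch, the parameter lastresult stands in for application.lastresult$, and resultAt, distinctUtterances, and sample are hypothetical names with mock data; none of them is provided by the platform.

```javascript
// Sketch only: `lastresult` stands in for application.lastresult$;
// resultAt and distinctUtterances are hypothetical helper names.

// Return element i of the result array, or undefined if it does not exist,
// so that callers never read a property of a missing element.
function resultAt(lastresult, i) {
  return (i >= 0 && i < lastresult.length) ? lastresult[i] : undefined;
}

// List each recognized utterance once, in array order.
function distinctUtterances(lastresult) {
  var seen = [];
  for (var i = 0; i < lastresult.length; i++) {
    var utterance = lastresult[i].utterance;
    if (seen.indexOf(utterance) === -1) {
      seen.push(utterance);
    }
  }
  return seen;
}

// Mock data shaped like the array described in this chapter.
var sample = [
  { confidence: 0.37, utterance: "boston", inputmode: "voice",
    interpretation: { city: "Boston", state: "MA" } },
  { confidence: 0.36, utterance: "austin", inputmode: "voice",
    interpretation: { city: "Austin", state: "TX" } },
  { confidence: 0.36, utterance: "austin", inputmode: "voice",
    interpretation: { city: "Austin", state: "CA" } }
];
```

With this mock data, resultAt(sample, 3) yields undefined instead of failing, and distinctUtterances(sample) lists "boston" and "austin" once each.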
Multiple Recognized Utterances
If the speech-recognition engine finds multiple possible utterances, the application.lastresult$ array contains at least one element for each utterance. The elements for different utterances are ordered by speech-recognition engine confidence levels alone. When multiple utterances are recognized, the value of application.lastresult$ is always identical to application.lastresult$[0].
For example, suppose the user muttered something that sounded like "Austin" or "Boston" as the initial input to a form with an unambiguous grammar. The application.lastresult$ variable might be set as follows.

Recognition Result: Most likely utterance; chosen (and only) interpretation of the utterance
  Property                                         Value
  application.lastresult$.confidence               .38
  application.lastresult$.utterance                "austin"
  application.lastresult$.inputmode                "voice"
  application.lastresult$.interpretation.city      "Austin"
  application.lastresult$.interpretation.state     "TX"

Recognition Result: Utterance with the highest confidence level; first (and only) interpretation of the utterance
  Property                                         Value
  application.lastresult$[0].confidence            .38
  application.lastresult$[0].utterance             "austin"
  application.lastresult$[0].inputmode             "voice"
  application.lastresult$[0].interpretation.city   "Austin"
  application.lastresult$[0].interpretation.state  "TX"

Recognition Result: Utterance with the second highest confidence level; first (and only) interpretation of the utterance
  Property                                         Value
  application.lastresult$[1].confidence            .37
  application.lastresult$[1].utterance             "boston"
  application.lastresult$[1].inputmode             "voice"
  application.lastresult$[1].interpretation.city   "Boston"
  application.lastresult$[1].interpretation.state  "MA"
With the application.lastresult$ variable set as shown in the preceding table, the city input variable would be set to Austin and the state input variable would be set to TX. The application could either proceed with those settings, or examine the application.lastresult$ array to determine that another interpretation of the input is possible.

Multiple Interpretations
If a particular utterance matches a single grammar rule, the application.lastresult$ array contains a single element for that utterance. The interpretation property of this element gives the slot values set by the matched grammar rule.
If the utterance matches multiple rules in an ambiguous grammar, the application.lastresult$ array contains multiple elements for that utterance; the interpretation property of each of these elements gives the slot values set by one of those grammar rules. The order of these elements within the array is undefined.
For example, suppose the user clearly said "Portland." The chosen interpretation for this recognized result might be Portland, Oregon, and the application.lastresult$ variable might be set as follows.

Recognition Result: Most likely (and only) utterance; chosen interpretation of the utterance
  Property                                         Value
  application.lastresult$.confidence               .8
  application.lastresult$.utterance                "portland"
  application.lastresult$.inputmode                "voice"
  application.lastresult$.interpretation.city      "Portland"
  application.lastresult$.interpretation.state     "OR"

Recognition Result: Most likely (and only) utterance; first interpretation of the utterance
  Property                                         Value
  application.lastresult$[0].confidence            .8
  application.lastresult$[0].utterance             "portland"
  application.lastresult$[0].inputmode             "voice"
  application.lastresult$[0].interpretation.city   "Portland"
  application.lastresult$[0].interpretation.state  "ME"

Recognition Result: Most likely (and only) utterance; second interpretation of the utterance
  Property                                         Value
  application.lastresult$[1].confidence            .8
  application.lastresult$[1].utterance             "portland"
  application.lastresult$[1].inputmode             "voice"
  application.lastresult$[1].interpretation.city   "Portland"
  application.lastresult$[1].interpretation.state  "OR"
With the application.lastresult$ variable set as shown in the preceding table, the city input variable would be set to Portland and the state input variable would be set to OR. The application could either proceed with those settings, or examine the application.lastresult$ variable to determine that another interpretation of the input is possible.

Multiple Utterances and Interpretations
If the speech-recognition engine finds multiple possible utterances that match an ambiguous grammar, one or more of the utterances may have multiple interpretations.
For example, suppose the user muttered something that sounded like "Austin" or "Boston." If the speech-recognition engine found two interpretations for the utterance "Austin" and one for "Boston," the application.lastresult$ variable might be set as follows.

Recognition Result: Most likely utterance; chosen (and only) interpretation of the utterance
  Property                                         Value
  application.lastresult$.confidence               .37
  application.lastresult$.utterance                "boston"
  application.lastresult$.inputmode                "voice"
  application.lastresult$.interpretation.city      "Boston"
  application.lastresult$.interpretation.state     "MA"

Recognition Result: Utterance with the highest confidence level; first (and only) interpretation of the utterance
  Property                                         Value
  application.lastresult$[0].confidence            .37
  application.lastresult$[0].utterance             "boston"
  application.lastresult$[0].inputmode             "voice"
  application.lastresult$[0].interpretation.city   "Boston"
  application.lastresult$[0].interpretation.state  "MA"

Recognition Result: Utterance with the second highest confidence level; first interpretation of the utterance
  Property                                         Value
  application.lastresult$[1].confidence            .36
  application.lastresult$[1].utterance             "austin"
  application.lastresult$[1].inputmode             "voice"
  application.lastresult$[1].interpretation.city   "Austin"
  application.lastresult$[1].interpretation.state  "TX"

Recognition Result: Utterance with the second highest confidence level; second interpretation of the utterance
  Property                                         Value
  application.lastresult$[2].confidence            .36
  application.lastresult$[2].utterance             "austin"
  application.lastresult$[2].inputmode             "voice"
  application.lastresult$[2].interpretation.city   "Austin"
  application.lastresult$[2].interpretation.state  "CA"
With the application.lastresult$ variable set as shown in the preceding table, the city input variable would be set to Boston and the state input variable would be set to MA. The application could proceed with those settings, or examine the application.lastresult$ variable to determine that other interpretations of the input are possible.

session.bevocal.timeincall
Extension. The number of milliseconds since the beginning of this call.

session.bevocal.version
Extension. The version number of the VoiceXML interpreter (for example, 1.2.3).

session.iidigits
Not Implemented. Information Indicator Digits. The session.iidigits variable is set to information about the caller's location (pay phone, and so on), when available. A complete list is available in the Local Exchange Routing Guide published by Telcordia.

session.telephone.ani
Automatic Number Identification. The session.telephone.ani variable is set to the caller's telephone number, when available.

session.telephone.dnis
Dialed Number Identification Service. The session.telephone.dnis variable is set to the number the caller dialed, when available.
14
JavaScript Functions and Objects
This chapter describes JavaScript functions and objects that are available to BeVocal VoiceXML applications. These are all extensions to the VoiceXML specification. For general information on using JavaScript in BeVocal VoiceXML applications, see JavaScript Quick Reference.
JavaScript Constants
bevocal.outboundrequestid
Extension. For calls initiated using the Outbound Notification Service, the value of bevocal.outboundrequestid provides the Outbound Request ID associated with the call. This value can be used to map specific application invocations to the request used to initiate the application.

bevocal.sessionid
Extension. Provides a unique ID for each call. This value can be used to keep track of call-specific information in the server-side component of your application if cookies cannot be used for any reason.
_addHeader
Extension. Method of a JavaScript SOAP proxy object. Specifies an additional SOAP header for the SOAP service that the proxy object represents. Once the SOAP header is set on a service object, the SOAP header element is part of the SOAP message for every method executed on that service object.

Syntax
  _addHeader(
    String headerNamespace,
    String headerName,
    Object headerValue,
    String actor,
    String mustUnderstand
  )

Parameters
  Parameter          Description
  headerNamespace    The namespace component of the desired header's name.
  headerName         The local component of the desired header's name.
  headerValue        The value of the header.
  actor              The value of the actor attribute for this header. Optional.
  mustUnderstand     The value of the mustUnderstand attribute for this header. Optional.

See Also
  Chapter 10, SOAP Client Facility

Example
You can use the Call Detail Record Access Service to access information about individual calls to a VoiceXML application. You might decide to write another VoiceXML application that presents information retrieved using the CDR service. Because this is a BeVocal platform service, you must pass security information in SOAP request headers to the service. You could do so as follows:
  <script>
    var service = bevocal.soap.serviceFromWSDL(
      "http://cafe.services.bevocal.com/CDRAccessService_v2/services/CDRAccessService_v2?WSDL",
      "CDRAccessService_v2",
      "http://www.bevocal.com/soap/services",
      "CdrAccessService",
      "http://cafe.services.bevocal.com/CDRAccessService_v2/services/CDRAccessService_v2"
    );
    service._addHeader(
      "http://www.bevocal.com/soap/headers/",
      "platformServicesSessionID",
      "ACB035CB-C307-472e-99BA-3D5B8468BB76");
    ...
  </script>
After the call to _addHeader, SOAP headers using that proxy object would include this information:
  <soapenv:Header>
    <ns1:platformServicesSessionID
      xsi:type="xsd:string"
      xmlns:ns1="http://www.bevocal.com/soap/headers/" >
      ACB035CB-C307-472e-99BA-3D5B8468BB76
    </ns1:platformServicesSessionID>
  </soapenv:Header>
bevocal.cookies.addClientCookie
Extension. Sets a cookie on the VXML client. The created cookie is treated as an HTTP cookie and is passed with subsequent fetches in the application. The cookie lasts for the duration of the call or the session.

Syntax
  bevocal.cookies.addClientCookie(
    String domain,
    String key,
    String value
  )

Parameters
  Parameter    Description
  domain       A valid domain. If a valid domain is passed, the created cookie is passed as an HTTP header for subsequent fetches matching the same domain.
  key          The name of the cookie.
  value        The value of the cookie.

See Also
  bevocal.cookies.deleteClientCookie
  bevocal.cookies.getClientCookie
  bevocal.cookies.getClientCookies

Example
This example demonstrates the bevocal.cookies.addClientCookie, bevocal.cookies.deleteClientCookie, bevocal.cookies.getClientCookie, and bevocal.cookies.getClientCookies methods. Note that the first two cookies are sent as HTTP cookies because the domain parameter is set to cafe.bevocal.com.

  <?xml version="1.0" encoding="iso-8859-1"?>
  <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
        xmlns:bevocal="http://www.bevocal.com/">
    <var name="user_1"/>
    <var name="user_2"/>
    <var name="user_3"/>
    <var name="user"/>
    <form id="form1">
      <block>
        <prompt>Testing client cookies</prompt>
        <script>
          bevocal.log("setting cookies now");
          bevocal.cookies.addClientCookie("cafe.bevocal.com", "user_1", "1234");
          bevocal.cookies.addClientCookie("cafe.bevocal.com", "user_2", "hello world");
          bevocal.cookies.addClientCookie(null, "user_3", "xyz123");
          user_1 = bevocal.cookies.getClientCookie(null, "user_1");
          user_2 = bevocal.cookies.getClientCookie(null, "user_2");
          user_3 = bevocal.cookies.getClientCookie(null, "user_3");
          bevocal.log("user_1 = " + user_1 + "; user_2 = " + user_2 + "; user_3 = " + user_3);
          user = bevocal.cookies.getClientCookies(null);
          bevocal.log("The value from map is " + user.get("user_1"));
        </script>
      </block>
      <block>
        <prompt><value expr="user_1"/><value expr="user_2"/><value expr="user_3"/>
        </prompt>
        <goto next="clientcookies_2.vxml"/>
      </block>
    </form>
  </vxml>

  <?xml version="1.0" encoding="iso-8859-1"?>
  <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
        xmlns:bevocal="http://www.bevocal.com/">
    <var name="user_2"/>
    <form id="form1">
      <block>
        <prompt>Testing client cookies in second document </prompt>
        <script>
          user_2 = bevocal.cookies.getClientCookie(null, "user_2");
          bevocal.log("user_2 = " + user_2);
          bevocal.cookies.deleteClientCookie(null, "user_2");
        </script>
      </block>
      <block>
        <prompt><value expr="user_2"/></prompt>
        <goto next="clientcookies_3.vxml"/>
      </block>
    </form>
  </vxml>

  <?xml version="1.0" encoding="iso-8859-1"?>
  <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
        xmlns:bevocal="http://www.bevocal.com/">
    <var name="user_2"/>
    <var name="user_3"/>
    <form id="form1">
      <block>
        <prompt>Testing client cookies in third document </prompt>
        <script>
          user_2 = bevocal.cookies.getClientCookie(null, "user_2");
          user_3 = bevocal.cookies.getClientCookie(null, "user_3");
          bevocal.log("user_2 = " + user_2 + "; user_3 = " + user_3);
        </script>
      </block>
      <block>
bevocal.cookies.deleteClientCookie
Extension. Deletes a cookie on the VXML client.

Syntax

bevocal.cookies.deleteClientCookie(
    String domain,
    String key
)

Parameters

Parameter   Description
domain      A valid domain. If a valid domain is passed, a cookie with name key that matches the domain is deleted. Otherwise, a cookie with that name is deleted.
key         The name of the cookie.

See Also

bevocal.cookies.addClientCookie
bevocal.cookies.getClientCookie
bevocal.cookies.getClientCookies

Example

See the example for bevocal.cookies.addClientCookie.
bevocal.cookies.getClientCookie
Extension. Gets the value of a cookie.

Syntax

bevocal.cookies.getClientCookie(
    String domain,
    String key
)

Parameters

Parameter   Description
domain      A valid domain. If a valid domain is passed, a cookie with name key that matches the domain is returned. Otherwise, a cookie with that name is returned.
key         The name of the cookie.

See Also

bevocal.cookies.addClientCookie
bevocal.cookies.deleteClientCookie

Example

See the example for bevocal.cookies.addClientCookie.
bevocal.cookies.getClientCookies
Extension. Returns a HashMap of all key/value cookie pairs.

Syntax

bevocal.cookies.getClientCookies(
    String domain
)

Parameters

Parameter   Description
domain      A valid domain. If a valid domain is passed, cookies matching that domain are returned. Otherwise, all cookies are returned.

See Also

bevocal.cookies.addClientCookie
bevocal.cookies.deleteClientCookie
bevocal.cookies.getClientCookie

Example

See the example for bevocal.cookies.addClientCookie.
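
The domain rules shared by the four cookie functions can be sketched as a small in-memory model. This is only an illustration, not the interpreter's implementation: the ClientCookieStore class and its method names are hypothetical, "matches the domain" is assumed to mean exact string equality, and returning undefined for a missing cookie is also an assumption.

```javascript
// Illustrative model only; NOT BeVocal's implementation.
// Assumption: a cookie "matches the domain" when the strings are equal.
class ClientCookieStore {
  constructor() {
    this.cookies = []; // entries of the form { domain, key, value }
  }

  // Models addClientCookie: replaces an entry with the same domain and key.
  add(domain, key, value) {
    this.cookies = this.cookies.filter(
      (c) => !(c.domain === domain && c.key === key)
    );
    this.cookies.push({ domain: domain, key: key, value: value });
  }

  // A null domain means "do not filter by domain".
  matches(cookie, domain, key) {
    const domainOk = domain == null || cookie.domain === domain;
    return domainOk && cookie.key === key;
  }

  // Models getClientCookie; undefined for a missing cookie is an assumption.
  get(domain, key) {
    const hit = this.cookies.find((c) => this.matches(c, domain, key));
    return hit ? hit.value : undefined;
  }

  // Models deleteClientCookie: removes the first matching cookie.
  remove(domain, key) {
    const index = this.cookies.findIndex((c) => this.matches(c, domain, key));
    if (index !== -1) {
      this.cookies.splice(index, 1);
    }
  }

  // Models getClientCookies: a map of key/value pairs, optionally by domain.
  getAll(domain) {
    const result = new Map();
    for (const c of this.cookies) {
      if (domain == null || c.domain === domain) {
        result.set(c.key, c.value);
      }
    }
    return result;
  }
}

const store = new ClientCookieStore();
store.add("cafe.bevocal.com", "user_1", "1234");
store.add("cafe.bevocal.com", "user_2", "hello world");
store.add(null, "user_3", "xyz123");
store.remove(null, "user_2");

console.log(store.get(null, "user_1"));               // 1234
console.log(store.getAll("cafe.bevocal.com").size);   // 1
console.log(store.getAll(null).size);                 // 2
```

In this model, passing null for the domain looks a cookie up by name alone, which is how the example for bevocal.cookies.addClientCookie reads back cookies that were stored under cafe.bevocal.com.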
bevocal.enroll.removeEnrolledPhrase
Extension. Deletes enrolled phrases from grammars.

Syntax

bevocal.enroll.removeEnrolledPhrase(grammar, speakerid, phraseid, key, xml:lang)

Parameters

Parameter   Description
grammar     The name of the grammar containing the enrolled phrase to delete. Required.
speakerid   The id of the speaker who enrolled the phrase to be deleted. Required.
phraseid    The id of the phrase to be deleted. Required.
key         The security key for accessing the specified enrollment grammar. Optional when running on the BeVocal Café. (You must pass a key argument, but it can be null or an empty string.) Required when running in other environments such as Enterprise Hosting.
xml:lang    The language of the enrolled phrase to delete. Optional. The value of this attribute must be the same as the value of the xml:lang attribute of the <vxml> tag for the document.

Description

If you are using enrollment to maintain a voice address book or other dynamic lookup mechanism, you need to be able to delete phrases from the grammar in addition to adding them.
bevocal.getProperty
Extension. Gets the value of a VoiceXML property.

Syntax

bevocal.getProperty(String name)

Parameters

Parameter   Description
name        The name of the property.

Returns

The value of the specified property in the current scope.
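
Many VoiceXML properties, such as timeout, hold time designations like "7s" or "250ms". The following sketch shows one way you might turn such a value into milliseconds after reading it with bevocal.getProperty. The bevocal object defined here is a hypothetical stand-in so the sketch runs outside the interpreter, its property values are made up, and toMilliseconds is an illustrative helper, not part of the platform. The sketch also assumes the property value comes back as a string.

```javascript
// Hypothetical stand-in for the interpreter's bevocal object, so that the
// sketch runs outside the platform. The property values are made up.
const bevocal = {
  getProperty: function (name) {
    const properties = { timeout: "7s", completetimeout: "250ms" };
    return properties[name];
  }
};

// Convert a time designation ("7s", "1.5s", "250ms") to milliseconds.
// Illustrative helper; not a BeVocal function.
function toMilliseconds(designation) {
  const text = String(designation).trim();
  const match = /^(\d+(?:\.\d+)?|\.\d+)(ms|s)$/.exec(text);
  if (!match) {
    throw new Error("Not a time designation: " + designation);
  }
  const amount = parseFloat(match[1]);
  return match[2] === "s" ? amount * 1000 : amount;
}

const timeoutMs = toMilliseconds(bevocal.getProperty("timeout"));
console.log(timeoutMs); // 7000
```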
bevocal.getVersion
Extension. Returns the VoiceXML interpreter version.

Syntax

bevocal.getVersion()

Parameters

None.

Returns

The version number of the VoiceXML interpreter, as a string. Currently, this function always returns 2.4.
bevocal.log
Extension. Writes a message to the BeVocal Café website.

Syntax

bevocal.log(String message)

Parameters

Parameter   Description
message     The message to be written to the website.

Description

The log function writes the specified message to the BeVocal Café call log, which you can view on the Café website. This function performs the same operation as a <log> VoiceXML element with no label or expr attribute.
bevocal.soap.serviceFromWSDL
Extension. Method of the bevocal.soap object. Given the URL for a WSDL file, creates an object to act as a proxy for one of the services described in the file. This is the preferred way to create a service proxy object. The WSDL file allows the interpreter to preload the proxy object with information about all of the service's methods and their parameter types, making the mapping from JavaScript types to SOAP encodings much more accurate.

Syntax

bevocal.soap.serviceFromWSDL(
    String WSDLUrl,
    String port,
    String svcNamespace,
    String svcName,
    String endpointURL
)

Parameters

Parameter      Description
WSDLUrl        The URL of the WSDL file describing this service. Information on the endpoint, method names, argument types, and namespaces is retrieved from the file.
port           The port name of the service, as given in a <port name="..."> element inside the <service> element for the service.
svcNamespace   The namespace used on the <service> element for the desired service. If the <service> element does not have its own namespace (that is, it does not have its own xmlns attribute), use the value of the targetNamespace attribute from the WSDL file's <definitions> element. If neither of these is present, leave the parameter blank.
svcName        The name of the service, as given in a <service name="..."> element for the service.
endpointURL    Optional. If provided, the SOAP endpoint for this service. If this parameter is not provided, the endpoint retrieved from the WSDL is used.

Error Handling

If an error occurs while locating a SOAP service or creating a SOAP proxy object, an exception of type bevocal.soap.SoapException is thrown. Possible errors include:

- Receives a malformed URL.
- Cannot open a network connection or retrieve a resource.
- Referenced WSDL file is malformed or does not contain the requested service or port.

See Also

Chapter 10, SOAP Client Facility
bevocal.soap.serviceFromEndpoint
Extension. Method of the bevocal.soap object. Given the name and endpoint URL of a SOAP service, creates an object to act as a proxy for the service. Use this method only when the WSDL for a service is not available. The proxy it creates has no information about method names and parameter types, so it has to make more assumptions when it makes SOAP method calls.

Syntax

bevocal.soap.serviceFromEndpoint(
    String endpointURL,
    String svcNamespace,
    String svcName
)

Parameters

Parameter      Description
endpointURL    The SOAP endpoint for this service.
svcNamespace   The namespace component of the desired service's name.
svcName        The local component of the desired service's name.

Error Handling

If an error occurs while locating a SOAP service or creating a SOAP proxy object, an exception of type bevocal.soap.SoapException is thrown. Possible errors include:

- Receives a malformed URL.
- Cannot open a network connection or retrieve a resource.
- Referenced WSDL file is malformed or does not contain the requested service or port.

Example
To use the temperature service available from www.xmethods.net, first find its WSDL, which is at http://www.xmethods.net/sd/2001/TemperatureService.wsdl. The content of the WSDL is:

<?xml version="1.0"?>
<definitions name="TemperatureService"
    targetNamespace="http://www.xmethods.net/sd/TemperatureService.wsdl"
    xmlns:tns="http://www.xmethods.net/sd/TemperatureService.wsdl"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns="http://schemas.xmlsoap.org/wsdl/">
  <message name="getTempRequest">
    <part name="zipcode" type="xsd:string"/>
  </message>
  <message name="getTempResponse">
    <part name="return" type="xsd:float"/>
  </message>
  <portType name="TemperaturePortType">
    <operation name="getTemp">
      <input message="tns:getTempRequest"/>
      <output message="tns:getTempResponse"/>
    </operation>
  </portType>
  <binding name="TemperatureBinding" type="tns:TemperaturePortType">
    <soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
    <operation name="getTemp">
      <soap:operation soapAction=""/>
      <input>
        <soap:body use="encoded" namespace="urn:xmethods-Temperature"
            encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
      </input>
      <output>
        <soap:body use="encoded" namespace="urn:xmethods-Temperature"
            encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
      </output>
    </operation>
  </binding>
  <service name="TemperatureService">
    <documentation>Returns current temperature in a given U.S. zipcode</documentation>
    <port name="TemperaturePort" binding="tns:TemperatureBinding">
      <soap:address location="http://services.xmethods.net:80/soap/servlet/rpcrouter"/>
    </port>
  </service>
</definitions>
The targetNamespace attribute of the <definitions> element, the name attribute of the <service> element, the name attribute of the <port> element, and the location attribute of the <soap:address> element provide the rest of the information you need to create a proxy object for this service in your VoiceXML application:

<script>
<![CDATA[
var service = bevocal.soap.serviceFromWSDL(
    // WSDL URL
    "http://www.xmethods.net/sd/2001/TemperatureService.wsdl",
    // name attribute of port
    "TemperaturePort",
    // targetNamespace attribute of definitions
    "http://www.xmethods.net/sd/TemperatureService.wsdl",
    // name attribute of service
    "TemperatureService",
    // location attribute of soap:address child of port element
    "http://services.xmethods.net:80/soap/servlet/rpcrouter"
);
]]>
</script>
bevocal.soap.locateService
Extension. Method of the bevocal.soap object. Uses the BeVocal SOAP service registry to locate either the standard BeVocal SOAP service whose id is serviceId or a particular version of that service. Returns a proxy object that can be used to call its methods. Currently, there are no services in the BeVocal SOAP service registry.

Syntax

bevocal.soap.locateService(
    String serviceId,
    String version
)

Parameters

Parameter   Description
serviceId   The ID of the SOAP service you wish to locate and create a proxy for.
version     Optional. The desired version of the service. If not present, returns a proxy for the highest-numbered version of the service (if there are multiple versions).

Error Handling

If an error occurs while locating a SOAP service or creating a SOAP proxy object, an exception of type bevocal.soap.SoapException is thrown. Possible errors include:

- Receives a service ID or version number that is not in the registry.
- Cannot open a network connection or retrieve a resource.
- Referenced WSDL file is malformed or does not contain the requested service or port.
bevocal.soap.SoapException
Extension. Exception object. If an error occurs while creating a SOAP proxy object, an exception of type bevocal.soap.SoapException is thrown. This object has two dynamic properties and a number of constants:

Property         Description
type             String constant. bevocal.soap.SoapException.
message          String. A message describing the reason for the exception.
cause            Number. A number giving the reason for the exception. It will be one of the following constants.
BAD_SERVICE_ID   Number constant. The serviceId passed to locateService could not be found in the registry.
BAD_VERSION      Number constant. The version number passed to locateService could not be found in the registry.
INVALID_URL      Number constant. The endpoint or WSDL URL was malformed.
NETWORK_ERROR    Number constant. A network error occurred while accessing the registry or the WSDL file. This can include an error fetching the WSDL file, a timeout error, and so on.
UNKNOWN_ERROR    Number constant. The exact error could not be determined; see the message for details. Currently, errors parsing the WSDL fall into this category; in a future release we hope to split them into a separate category.

Example

<script>
var service;
try {
    service = bevocal.soap.locateService("myservice", "2.0");
} catch (error if error.type == "bevocal.soap.SoapException") {
    if (error.cause == error.BAD_VERSION) {
        service = bevocal.soap.locateService("myservice", "1.0");
    } else {
        throw error;
    }
}
</script>

If the exception is not caught by JavaScript code in the <script> element, it propagates to the VoiceXML interpreter and is rethrown as an error.semantic event in VoiceXML.

See Also

Chapter 10, SOAP Client Facility
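
When you log a SoapException, the numeric cause is easier to read as the name of the matching constant. The following is a minimal sketch of such a lookup. The causeName helper is hypothetical, and sampleError is a made-up object that stands in for a real exception: its numeric constant values are invented for illustration and are not the platform's actual values.

```javascript
// Constant names documented for bevocal.soap.SoapException.
const CAUSE_NAMES = [
  "BAD_SERVICE_ID",
  "BAD_VERSION",
  "INVALID_URL",
  "NETWORK_ERROR",
  "UNKNOWN_ERROR"
];

// Return the name of the constant whose value equals error.cause.
// Hypothetical helper; falls back to "UNKNOWN_ERROR" when nothing matches.
function causeName(error) {
  for (const name of CAUSE_NAMES) {
    if (name in error && error[name] === error.cause) {
      return name;
    }
  }
  return "UNKNOWN_ERROR";
}

// Made-up exception object; the numeric values are invented for this sketch.
const sampleError = {
  type: "bevocal.soap.SoapException",
  message: "No such version",
  BAD_SERVICE_ID: 1,
  BAD_VERSION: 2,
  INVALID_URL: 3,
  NETWORK_ERROR: 4,
  UNKNOWN_ERROR: 5,
  cause: 2
};

console.log(causeName(sampleError) + ": " + sampleError.message);
// BAD_VERSION: No such version
```

Inside a catch block like the one in the example above, the resulting string could be passed to bevocal.log.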
bevocal.soap.SoapFault
Extension. Fault object. This object has the following properties:

Property           Description
type               String constant. bevocal.soap.SoapFault.
faultString        String. A message describing the reason for the exception.
faultCode          String. A code describing the reason for the failure. See section 4.4.1 of the SOAP 1.1 specification for details.
faultActor         String. A URI identifying the party that caused the fault within the message path. See section 4.4 of the SOAP 1.1 specification.
detailString       String. A String representation of the detail element of the fault. The detail element is present whenever a fault is caused by processing of the Body element.
details            FaultDetails. An object whose properties are the detail elements; that is, the immediate child elements of the detail element in the fault element. See section 4.4 of the SOAP 1.1 specification.
MUST_UNDERSTAND    String constant. SOAP fault code: A child of the SOAP Header element that contained a mustUnderstand attribute equal to "1" was not understood.
VERSION_MISMATCH   String constant. SOAP fault code: The processor found an invalid namespace for the SOAP Envelope.
SERVER             String constant. SOAP fault code: Processing of the call failed due to an error on the server. The failure was not directly related to the contents of the message, and the message may succeed at a later point in time.
CLIENT             String constant. SOAP fault code: Processing of the call failed due to the contents of the message. This failure code is returned when a function is called with invalid parameters, bad data, and so on.

See Also

Chapter 10, SOAP Client Facility
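
The fault code constants suggest how an application might react to a fault: a SERVER fault may succeed if the same message is sent again later, while a CLIENT fault points at the contents of the message. The following sketch compares faultCode against the constants. The classifyFault and makeSampleFault helpers are hypothetical, the strings assigned to the constants are assumptions chosen for illustration, and the sketch assumes faultCode holds exactly one of the constant values.

```javascript
// Classify a fault by comparing its faultCode with the documented constants.
// Hypothetical helper; assumes faultCode equals one of the constant values.
function classifyFault(fault) {
  switch (fault.faultCode) {
    case fault.SERVER:
      return "retry-later";   // the same message may succeed later
    case fault.CLIENT:
      return "fix-request";   // the message contents caused the failure
    case fault.MUST_UNDERSTAND:
    case fault.VERSION_MISMATCH:
      return "incompatible";  // header or envelope was not accepted
    default:
      return "unknown";
  }
}

// Made-up fault object; the constant values are assumptions for this sketch.
function makeSampleFault(code) {
  return {
    type: "bevocal.soap.SoapFault",
    faultString: "Sample fault",
    faultCode: code,
    MUST_UNDERSTAND: "MustUnderstand",
    VERSION_MISMATCH: "VersionMismatch",
    SERVER: "Server",
    CLIENT: "Client"
  };
}

console.log(classifyFault(makeSampleFault("Server"))); // retry-later
console.log(classifyFault(makeSampleFault("Client"))); // fix-request
```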
bevocal.soap.FaultDetails
Extension. FaultDetails object. Essentially, bevocal.soap.FaultDetails is an object whose property names are the local components (without namespaces) of the names of the elements under the detail element, and whose property values are the text contents of those elements.

Property     Type              Description
type         String constant   bevocal.soap.FaultDetails
(String)     String            If you try to use a FaultDetails object as a String, the BeVocal interpreter returns a String representation of the detail elements. This is intended primarily for debugging.
(Number)     Number            If you try to use a FaultDetails object as a Number, the BeVocal interpreter returns the number of detail elements represented by the FaultDetails object. This is intended primarily for debugging.
element id   String            Given a property name that is the local name (without namespace) of one of the child elements of the detail element, the property value is the text contents of the child element.

See Also

Chapter 10, SOAP Client Facility
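
The behavior described in the table can be modeled in ordinary JavaScript. The sketch below is only an illustration of that behavior, not the interpreter's implementation: makeFaultDetails is a hypothetical function, the input array of name and text pairs is made up, and the format of the String representation is an assumption. It relies on Symbol.toPrimitive, which is a modern JavaScript feature.

```javascript
// Illustrative model only; not the interpreter's implementation.
// children: array of { name, text } for the immediate children of <detail>.
function makeFaultDetails(children) {
  const details = { type: "bevocal.soap.FaultDetails" };
  for (const child of children) {
    // Keep only the local part of a qualified name such as "e:code".
    const localName = child.name.slice(child.name.indexOf(":") + 1);
    details[localName] = child.text;
  }
  // Used as a Number: the count of detail elements.
  // Used as a String: a debugging representation (format is an assumption).
  Object.defineProperty(details, Symbol.toPrimitive, {
    value: function (hint) {
      if (hint === "number") {
        return children.length;
      }
      return children.map((c) => c.name + "=" + c.text).join(", ");
    }
  });
  return details;
}

const details = makeFaultDetails([
  { name: "e:code", text: "1001" },
  { name: "e:reason", text: "Unknown zip code" }
]);

console.log(details.reason);   // Unknown zip code
console.log(Number(details));  // 2
console.log(String(details));  // e:code=1001, e:reason=Unknown zip code
```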