Interaction Devices
Interesting: body-context-sensitive devices.
► Ex.: holding a PDA by your ear = answer the phone call.
Novel Devices
► Miscellaneous
Shapetape – reports 3D shape.
► Tracks limbs
► Engineer for a specific app (like a gun trigger connected to a serial port)
Pros: good affordance
Cons: limited general use, development time
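As a rough illustration of the gun-trigger example, the sketch below counts trigger pulls from a stream of bytes. The one-byte protocol ("1" = pressed, "0" = released) and the function name are assumptions made up for this example, and `io.BytesIO` stands in for a real serial port object with a `read` method.

```python
import io

def count_trigger_pulls(port, pressed=b"1", released=b"0"):
    """Count released-to-pressed transitions in a byte stream.

    `port` is any object whose read(1) returns one byte, or b"" when no
    more data is available. The protocol itself is hypothetical.
    """
    pulls = 0
    was_pressed = False
    while True:
        byte = port.read(1)
        if not byte:              # end of stream (or a read timeout)
            break
        if byte == pressed and not was_pressed:
            pulls += 1            # rising edge: a new pull
            was_pressed = True
        elif byte == released:
            was_pressed = False
        # any other byte is ignored as line noise
    return pulls

fake_port = io.BytesIO(b"0011100101")   # stand-in for the device
print(count_trigger_pulls(fake_port))
```

Even this tiny device needs its own protocol and parsing code, which is the "limited general use" cost noted above.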
Speech and Auditory Interfaces
► There’s the dream
► Then there’s reality
► Practical apps don’t really require freeform discussions with a computer
Goals:
► Low cognitive load
► Low error rates
► Smaller goals:
Speech Store and Forward (voice mail)
Speech Generation
Currently not too bad, low cost, available
Speech and Auditory Interfaces
► Bandwidth is much lower than visual displays
► Ephemeral nature of speech (tone, etc.)
► Difficulty in parsing/searching (Box 9.2)
► Types
Discrete-word recognition
Continuous speech
Voice information
Speech generation
Non-speech auditory
► If you want to do research here, there is a large body of work in audio, audio psychology, and DSP that you should understand
Discrete-Word Recognition
► Individual words spoken by a specific person
► Command and control
► 90-98% accuracy for 100-10,000 word vocabularies
► Training
Speaker speaks the vocabulary
Speaker-independent
► Still requires
Low noise operating environment
Microphones
Vocabulary choice
Clear voice (users with speech impairments or under stress are hampered)
Reduce most questions to very distinct answers (yes/no)
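To see why the 90-98% figure still leaves room for frustration, here is a small sketch of how per-word accuracy compounds over a multi-word command sequence. It assumes recognition errors are independent from word to word, which is a simplification, and the function name is invented for this example.

```python
def sequence_success_rate(word_accuracy, n_words):
    """Probability that all n_words are recognized correctly,
    assuming each word fails independently of the others."""
    return word_accuracy ** n_words

for accuracy in (0.90, 0.98):
    for n in (1, 5, 10):
        rate = sequence_success_rate(accuracy, n)
        print(f"{accuracy:.0%} per word, {n:2d} words: {rate:.1%} error-free")
```

Under this assumption, a ten-word sequence comes through error-free only about 35% of the time at 90% per-word accuracy, and about 82% of the time at 98%.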
Discrete-Word Recognition
► Helps:
Disabled
Elderly
Cognitively challenged
User is visually distracted
Mobility or space restrictions
► Apps:
Telephone-based info
► Study: much slower for cursor movement than mouse or keyboard (Christian ’00)
► Study: choosing actions (such as drawing actions) improved performance by 21% (Pausch ’91) and in word processing (Karl ’93)
However, acoustic memory requires a high cognitive load (greater than hand/eye)
► Toys are successful (dolls, robots). Accuracy isn’t as important
► Feedback is difficult
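For command-and-control use, one possible pattern is to accept only a small, distinct vocabulary and reject anything that does not match closely, so the user can be asked to repeat. The sketch below is illustrative only: it compares text strings with Python's `difflib` as a stand-in for acoustic similarity (a real recognizer scores audio, not spelling), and the vocabulary, threshold, and function name are all assumptions.

```python
from difflib import SequenceMatcher

COMMANDS = ["yes", "no", "stop", "repeat"]   # hypothetical small vocabulary

def match_command(heard, vocabulary=COMMANDS, threshold=0.75):
    """Return (command, score) for the closest vocabulary word, or
    (None, score) when the best score falls below the threshold."""
    scored = [(SequenceMatcher(None, heard.lower(), word).ratio(), word)
              for word in vocabulary]
    score, word = max(scored)                # best-scoring candidate
    return (word if score >= threshold else None), score

for heard in ("yes", "yess", "maybe"):
    command, score = match_command(heard)
    if command is None:
        print(f"{heard!r}: not understood, please repeat")
    else:
        print(f"{heard!r}: {command} (score {score:.2f})")
```

The rejection branch is where feedback has to happen; choosing a threshold trades false accepts against repeated prompts.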
Continuous Speech Recognition
► Dictation
► Error rates and error repair are still poor
► Higher cognitive load, could lower overall quality
► Why is it hard?
Recognize boundaries (normal speech blurs them)
Context sensitivity
“How to wreck a nice beach” (vs. “How to recognize speech”)
► Much training
► Specialized vocabularies (like medical or legal)
► Apps:
Dictate reports, notes, letters
Communication skills practice (virtual patient)
Automatic retrieval/transcription of audio content (like radio, closed captioning)
Security/user ID
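The boundary problem behind “How to wreck a nice beach” can be sketched with a toy example: the same unbroken sound stream can often be split into words in more than one way. The lexicon below is a tiny hand-made assumption using simplified ARPAbet-style phoneme symbols (stress marks dropped); it is not taken from any real recognizer.

```python
# Tiny hypothetical lexicon: word -> simplified phoneme sequence
LEXICON = {
    "grade": ("G", "R", "EY", "D"),
    "grey":  ("G", "R", "EY"),
    "a":     ("EY",),
    "day":   ("D", "EY"),
}

def segmentations(phonemes, lexicon=LEXICON):
    """Return every way to split a phoneme sequence into lexicon words."""
    phonemes = tuple(phonemes)
    if not phonemes:
        return [[]]                        # one way to segment nothing
    results = []
    for word, pron in lexicon.items():
        if phonemes[:len(pron)] == pron:   # word matches the front of the stream
            for rest in segmentations(phonemes[len(pron):], lexicon):
                results.append([word] + rest)
    return results

stream = ["G", "R", "EY", "D", "EY"]       # no word boundaries marked
for words in segmentations(stream):
    print(" ".join(words))
```

Both "grade a" and "grey day" fit the same stream, so the recognizer needs context to choose between them, and real vocabularies multiply the number of candidate splits.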
Voice Information Systems
► Use human voice as a source of info
► Apps:
Tourist info
Museum audio tours
Voice menus (Interactive Voice Response IVR systems)
► Use speech recognition to also cut through menus
If menus are too long, users get frustrated
Cheaper than hiring 24 hr/day reps
► Voice mail systems
Interface isn’t the best
► Get email in your car
Also helps with non-tech savvy like the elderly
► Potentially aids
Learning (engage more senses)
Cognitive load (hypothesize each sense has a limited ‘bandwidth’)
► Think ER, or fighter jets
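The idea of using speech recognition to cut through voice menus can be sketched as a search over the menu tree: instead of listening to each level, the caller says a keyword and the system jumps to the matching option. The menu contents, action names, and `find_path` function below are hypothetical and exist only for illustration.

```python
MENU = {   # hypothetical tree: option name -> submenu dict or action string
    "billing": {"balance": "read_balance", "pay": "take_payment"},
    "support": {"outage": "report_outage", "agent": "transfer_to_agent"},
}

def find_path(menu, keyword, path=()):
    """Depth-first search for an option named `keyword`. Returns the tuple
    of menu choices leading to it, or None if no option has that name."""
    for name, child in menu.items():
        here = path + (name,)
        if name == keyword:
            return here
        if isinstance(child, dict):
            found = find_path(child, keyword, here)
            if found:
                return found
    return None

print(find_path(MENU, "outage"))
```

A caller who says "outage" skips the top-level prompt entirely; an unrecognized keyword returns `None`, at which point the system would fall back to reading the menu.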
Speech Generation
► Playback speech (games)
► Combine text (navigation systems)
► Careful evaluation!
Speech isn’t always great
► Door is ajar – now just a tone
► Use flash
► Supermarket scanners