Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 17

Presented By:

Funda Singh Raajput


Prakash Choudhary
Ajay Prakash
Definition:
 A system which takes as input a sequence of words and

converts them to speech

Applications:
 Services for the hearing impaired
 Reading email aloud

Commercial TTS Systems:


 Festival
 Bell Labs TTS
TTS System

words
Text
Prosody Concatenation
Pre-processing
Text Pre-Processing
 Input
– String of characters (sentence)
 Output
– String of diphone symbols
 Objective
– Perform sentence level analysis
• Punctuation marks
• Pauses between words
– Convert all input to corresponding diphones
Text Pre-Processing (Block Diagram)

Number Acronym Word Word to


Converter Converter Segmenter Diphone MLDS
Translator
(Phonetization)

Diphone
Dictionary
Number Converter
 Replace numerals with their textual
versions
100 one hundred

 Handle fractional and decimal


numbers
0.25 point two five
Text Pre-Processing (Block Diagram)

Number Acronym Word Word to


Converter Converter Segmenter Diphone MLDS
Translator
(Phonetization)

Diphone
Dictionary
Acronym Converter
 Replace acronyms with single letter
components
A.B.C. ABC

 Change abbreviations to full textual


format
Mr. Mister
Text Pre-Processing (Block Diagram)

Number Acronym Word Word to


Converter Converter Segmenter Diphone MLDS
Translator
(Phonetization)

Diphone
Dictionary
Word Segmenter
 Divide sentence into word segments
– Special delimiter to separate segments
(i.e. ‘||’)
 Segments can be:
– A single word
– An acronym
– A numeral
 Identify punctuation marks
Text Pre-Processing (Block Diagram)

Number Acronym Word Word to


Converter Converter Segmenter Diphone MLDS
Translator
(Phonetization)

Diphone
Dictionary
Word To Diphone Converter
(Phonetization)
 Purpose
– Translate words to their diphone
representations
 Resource
– Dictionary of words and their diphones
(derived from CMU phoneme database)
– Over 175,000 words supported
W-to-D Converter Cont’d
 Implementation
– Binary Search Algorithm in C
– Start with whole dictionary as search range
start index, end index, middle index
– If target word alphabetically less then middle
word,
then ignore second half (i.e. end index =
middle index)
else ignore first half (i.e. start index = middle
index)
– Repeat until word found or range contains zero
words
W-to-D Converter Cont’d
 Advantages
– Fast search times
• Search range decreases exponentially with
each iteration (max of 1 sec currently)
– Less complicated to implement
• Compared to indexing dictionary or
• Importing the dictionary to an internal
structure
Text Pre-Processing (Block Diagram)

Number Acronym Word Word to


Converter Converter Segmenter Diphone MLDS
Translator
(Phonetization)

Diphone
Dictionary
The Multi-Level Data Structure
 Contains all necessary data for the
next sub-system:
– Word
– Diphone representation
– Prosodic parameters for each diphone
• This reflects both word-level and sentence-
level prosody
 Allows for modularization
Summary

TTS System

words
Text
Prosody Concatenation
Pre-processing

 System modularized

You might also like