TTS Exogenous Fine Standards



Acceptance criteria for the demand for TTS exogenous fine standards in small


1. Background
So that the quality of the exogenous fine standard data can be assessed, the supplier must follow the specifications below.

2. Format requirements
1. File format: The text file is a .txt file, encoded as UTF-8 without a BOM.

2. Content format: odd-numbered lines are sentence lines; even-numbered lines are phoneme lines (explained below).


00001 Why am- I doing/everything alone%?

W AY1 / AE1 M / AY1 / D UW1 . IH0 NG / EH1 V . R IY0 . TH IH2 NG / AH0 . L OW1 N

00002 If the housing boom picks- up for one last leg%, regulators will be inclined to crack down on questionable practices%.

IH1 F / DH AH0 / HH AW1 . Z IH0 NG / B UW1 M / P IH1 K S / AH1 P / F AO1 R / W AH1 N / L AE1 S T / L EH1 G / R EH1 . G Y AH0 . L EY2 . T ER0 Z / W IH1 L / B IY1 / IH0 N . K L AY1 N D / T UW1 / K R AE1 K / D AW1 N / AA1 N / K W EH1 S . CH AH0 . N AH0 . B AH0 L / P R AE1 K . T AH0 . S AH0 Z

00003 That might have even called for- a tip/for the poor telegraph boy%, yikes%, can't let that happen%!

DH AE1 T / M AY1 T / HH AE1 V / IY1 . V AH0 N / K AO1 L D / F AO1 R / AH0 / T IH1 P / F AO1 R / DH AH0 / P UW1 R / T EH1 . L AH0 . G R AE2 F / B OY1 / Y AY1 K S / K AE1 N T / L EH1 T / DH AE1 T / HH AE1 . P AH0 N

00004 He would carve each one- of them a special headstone%.

HH IY1 / W UH1 D / K AA1 R V / IY1 CH / W AH1 N / AH1 V / DH EH1 M / AH0 / S P EH1 . SH AH0 L / HH EH1 D . S T OW2 N

00005 Champs don't become champs- on the court%, they are recognized- on the court%.

CH AE1 M P S / D OW1 N T / B IH0 . K AH1 M / CH AE1 M P S / AA1 N / DH AH0 / K AO1 R T / DH EY1 / AA1 R / R EH1 . K AH0 G . N AY2 Z D / AA1 N / DH AH0 / K AO1 R T

00006 one euro%

W AH1 N / Y UW1 . R OW0

00007 Five years later%, negotiations have failed to produce a boundary accepted by both sides%.

F AY1 V / Y IH1 R Z / L EY1 . T ER0 / N IH0 . G OW2 . SH IY0 . EY1 . SH AH0 N Z / HH AE1 V / F EY1 L D / T UW1 / P R AH0 . D Y UW1 S / AH0 / B AW1 N . D AH0 . R IY0 / AE0 K . S EH1 P . T IH0 D / B AY1 / B OW1 TH / S AY1 D Z

00008 You wanna see me self defend myself%?

Y UW1 / W AA1 . N AH0 / S IY1 / M IY1 / S EH1 L F / D IH0 . F EH1 N D / M AY2 . S EH1 L F
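The file-level requirements above (UTF-8, no BOM, alternating sentence and phoneme lines) can be checked mechanically. The following is a minimal sketch, not an official validator: the function name `read_pairs` is hypothetical, and it assumes that a delivered file contains no blank lines (so that the odd/even numbering holds), that the sentence ID is separated from the sentence by a TAB, and that each phoneme line starts with a TAB.

```python
import codecs


def read_pairs(raw: bytes):
    """Return (sentence_id, sentence, phonemes) triples from raw file bytes.

    Assumptions (not stated as code in the spec): no blank lines in the
    file, TAB after the sentence ID, TAB at the start of a phoneme line.
    """
    if raw.startswith(codecs.BOM_UTF8):
        raise ValueError("file must not start with a UTF-8 BOM")
    text = raw.decode("utf-8")  # raises UnicodeDecodeError if not valid UTF-8
    lines = text.splitlines()
    if len(lines) % 2 != 0:
        raise ValueError("expected sentence/phoneme line pairs (even line count)")
    pairs = []
    for i in range(0, len(lines), 2):
        sentence_line, phoneme_line = lines[i], lines[i + 1]
        sentence_id, tab, sentence = sentence_line.partition("\t")
        if not tab:
            raise ValueError(f"line {i + 1}: no TAB after the sentence ID")
        if not phoneme_line.startswith("\t"):
            raise ValueError(f"line {i + 2}: phoneme line must start with TAB")
        pairs.append((sentence_id, sentence, phoneme_line[1:]))
    return pairs
```

Reading the file in binary mode (`open(path, "rb").read()`) before calling the function keeps the BOM visible; opening it with `encoding="utf-8-sig"` would silently strip it.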

I. Line of text
1. Format: Sentence ID + TAB + sentence content

2. Sentence ID: Keep the ID of the original text; it must not be modified

3. Prosody annotations

a. Use "-" for continuous reading (linking), "/" for prosodic phrases, and "%" for intonation phrases

b. "/" and "%" have no space immediately before or after them; "-" is attached to the preceding word with no space and, as in the examples above ("am- I"), is followed by a space before the next word

c. The sentence must end with "%", placed immediately before the sentence-final punctuation (for example, "alone%?")
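The spacing rules for the prosody marks lend themselves to a quick automated check. The sketch below is one possible way to do this, with a hypothetical function name `check_text_line`; it only covers the rules that can be tested without knowing the language (no space before a mark, no space after "/", and a final "%" before any closing punctuation), and it does not try to tell a linking "-" from a word-internal hyphen.

```python
import re


def check_text_line(sentence: str) -> list[str]:
    """Return a list of problems found in the prosody marks of one sentence."""
    problems = []
    # "-", "/" and "%" must all be attached to the preceding word.
    if re.search(r"\s[-/%]", sentence):
        problems.append("space before a prosody mark")
    # "/" must also be attached to the following word.
    if re.search(r"/\s", sentence):
        problems.append('space after "/"')
    # The sentence must end with "%", optionally followed by punctuation only.
    if not re.search(r"%[^\w\s%]*$", sentence):
        problems.append('sentence does not end with "%" before final punctuation')
    return problems
```

For the sample sentence `Why am- I doing/everything alone%?` the function returns an empty list, while `Why am - I doing / everything alone?` triggers all three checks.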

II. Phoneme line/row

1. format

a. TAB+ phoneme sequence

b. Only the phonemes of the words are labeled, not punctuation; each phoneme is separated by a space

c. Syllable boundaries use "." and word boundaries use "/"; each is separated from the preceding and following symbols by a space

d. Ensure that the word boundaries in the phoneme line correspond one-to-one with the word segmentation of the text line

e. Do not add "/" or "." after the last phoneme of the sentence
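The phoneme-line layout can be parsed into a words / syllables / phonemes structure, which also makes the one-to-one word correspondence in rule d easy to test. This is a minimal sketch under stated assumptions: the names `parse_phoneme_line` and `count_text_words` are hypothetical, the input is the phoneme sequence without its leading TAB, and words in the text line are assumed to be separated by whitespace or by "/".

```python
import re


def parse_phoneme_line(line: str) -> list[list[list[str]]]:
    """Parse a phoneme sequence into words -> syllables -> phonemes."""
    tokens = line.split()
    if not tokens:
        raise ValueError("empty phoneme line")
    if tokens[-1] in ("/", "."):
        raise ValueError('no "/" or "." is allowed after the last phoneme')
    words = [[[]]]
    for token in tokens:
        if token == "/":        # word boundary: start a new word
            words.append([[]])
        elif token == ".":      # syllable boundary: start a new syllable
            words[-1].append([])
        else:                   # ordinary phoneme
            words[-1][-1].append(token)
    if any(not syllable for word in words for syllable in word):
        raise ValueError("empty word or syllable (doubled boundary symbol?)")
    return words


def count_text_words(sentence: str) -> int:
    """Count words in a text line; whitespace and "/" both separate words."""
    return sum(1 for tok in re.split(r"[\s/]+", sentence) if re.search(r"\w", tok))
```

For sample 00001, `count_text_words("Why am- I doing/everything alone%?")` and `len(parse_phoneme_line(...))` on its phoneme line both give 6, so the pair passes the correspondence check.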

2. Accents and tones

a. stress:

i. Every segment that can carry stress has one and only one stress mark

ii. Primary stress is denoted by "1", secondary stress by "2", and no stress by "0"

iii. "1", "2", and "0" follow the phoneme directly, with no space

iv. Each word has one and only one primary stress, except for letter-by-letter spellings such as "ABC", which may have multiple primary stresses

b. tone

i. Every syllable has one and only one tone label

ii. The last phoneme of the syllable is marked with "_X", where "X" indicates the tone type (1, 2, 3, ...)

iii. "_X" follows the phoneme directly, with no space
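Both the stress and the tone rules can be screened automatically before manual review. The sketch below is illustrative only: the function names are hypothetical, the tone example in the usage note uses made-up phoneme symbols, and it assumes that phoneme symbols themselves never end in "_" plus digits. Because the sample data above contains function words with no primary stress (for example "DH AH0"), words without a primary stress are reported separately from words with more than one, as candidates for review rather than as definite errors.

```python
import re

TONE_SUFFIX = re.compile(r"_\d+$")  # assumed shape of the "_X" tone label


def primary_stress_report(line: str) -> dict[str, list[int]]:
    """Report 0-based word indices with no or with multiple primary stresses."""
    words = [[]]
    for token in line.split():
        if token == "/":
            words.append([])
        elif token != ".":
            words[-1].append(token)
    report = {"none": [], "multiple": []}
    for index, word in enumerate(words):
        # A primary stress is a phoneme ending in "1" that carries no tone label.
        count = sum(1 for p in word if p.endswith("1") and "_" not in p)
        if count == 0:
            report["none"].append(index)
        elif count > 1:
            report["multiple"].append(index)
    return report


def tone_problems(line: str) -> list[int]:
    """Return 0-based syllable indices whose tone labeling breaks the rules."""
    syllables = [[]]
    for token in line.split():
        if token in ("/", "."):
            syllables.append([])
        else:
            syllables[-1].append(token)
    bad = []
    for index, syllable in enumerate(syllables):
        last_ok = bool(syllable) and TONE_SUFFIX.search(syllable[-1]) is not None
        early_label = any(TONE_SUFFIX.search(p) for p in syllable[:-1])
        if not last_ok or early_label:
            bad.append(index)
    return bad
```

As a usage example, `primary_stress_report("HH EH1 D . S T OW1 N")` flags word 0 under "multiple", and `tone_problems("n_3 i / h au")` (made-up symbols) flags both syllables because the label is either misplaced or missing. Letter-by-letter spellings would need to be excluded from the "multiple" list by hand.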

3. Labeling standards

I. Text Proofreading
• Syntax: The speaker corrects the grammar while reading because the text contains a syntax error. For example, if the text is 'She love cat' and the speaker actually reads 'She loves cats', the text must be changed to 'She loves cats'. If the speaker does not change it, the text does not need to be changed.

• Slip of the tongue: The speaker reads one word as another. For example, if the speaker misreads 'Indus civilization' as 'indigenous civilization', the text must be changed to 'indigenous civilization'.

• Added words: Content that the speaker added must be added to the text.

• Missing words: Content that the speaker left out must be deleted from the text.

• Case:

○ A word that should be capitalized is not: the case must be corrected

○ A word that should not be capitalized is: a sentence sometimes appears as 'I want to dance With you.' Such typographical errors must be corrected according to the actual case requirements

• punctuation:

○ Illegal punctuation: If the text contains punctuation that is not legal for the language, replace it with legal punctuation that matches how the sentence is read aloud.

○ Missing punctuation at the end of a sentence: Add the punctuation (adjusted according to the language specification); no changes may be made inside the sentence.
○ space:

▪ Missing spaces: If spaces that separate words or clauses are missing from the sentence, add them.

▪ Punctuation must not have a space both before and after it

• Before alignment: 'Do you know which picture shows ''mouth'' ? '

• After alignment: Do you know which picture shows mouth?

▪ Punctuation inside a sentence is followed by a space

• Before alignment: Hey, Is he your father?

• After alignment: Hey, is he your father?

○ Hyphens: Split or merge according to the perceived prosodic break. For example, "T-shirt" forms a single prosodic word and is labeled as one word. If the hyphen joins two prosodic words, remove the hyphen, add a space, and label the two words separately in the phoneme line.


Before alignment:


After alignment:

Pick me up.

P IH1 K / M IY1 / AH1 P
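The two spacing rules for punctuation in the text proofreading section can be screened with regular expressions before a human makes the correction. The following sketch is a detector only, with a hypothetical name `spacing_issues`; it assumes a small ASCII punctuation set, and it will raise false alarms on forms such as abbreviations, so its output is a list of positions to review rather than a list of confirmed errors.

```python
import re


def spacing_issues(sentence: str) -> list[tuple[int, str]]:
    """Return (position, description) pairs for suspicious punctuation spacing."""
    issues = []
    # Rule: punctuation must not have a space both before and after it.
    for match in re.finditer(r"\s[,.;:!?]\s", sentence):
        issues.append((match.start(), "punctuation has spaces on both sides"))
    # Rule: sentence-internal punctuation is followed by a space.
    # The lookahead only fires when a letter follows directly, so digit
    # groups such as "3,000" are not flagged.
    for match in re.finditer(r"[,;:!?](?=[^\W\d_])", sentence):
        issues.append((match.start(), "no space after sentence-internal punctuation"))
    return sorted(issues)
```

Applied to `Hey, is he your father?` the function returns an empty list; applied to the "before" example `Do you know which picture shows ''mouth'' ? ` it reports the question mark that has a space on each side.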

II. Phoneme Proofreading

• Phoneme set: Label with the specified X-SAMPA symbol set and its IPA mapping table, as provided by Party A for the language. This symbol set must not be modified, extended, or reduced.

• Pitch variation: Phoneme and tone annotations do not need to reflect regular phoneme variants or connected-speech variations, but they must reflect irregular pitch changes caused by the speaker's personal pronunciation habits, including stress shift.

• Exogenous words (loanwords): Label strictly according to the actual pronunciation. If a phoneme falls outside the phoneme set of the language, for example when a foreign word is pronounced as in English, do not label it with an approximate phoneme of the language; record the problem and report it to Party A.

• Consistency: Phonemes, stress marks, syllable divisions, and tone annotations must be consistent throughout the data.
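One way to support the consistency requirement is to collect every transcription that each word receives across the data set and list the words that have more than one. The sketch below assumes that text words and phoneme words align one-to-one and that words are separated by whitespace or "/"; the function name `inconsistent_words` is hypothetical. A word can legitimately have several pronunciations, so the result is a review list, not an error list.

```python
import re
from collections import defaultdict


def inconsistent_words(pairs):
    """Map each word to its transcriptions when it has more than one.

    `pairs` is an iterable of (sentence, phoneme_sequence) strings.
    """
    seen = defaultdict(set)
    for sentence, phoneme_line in pairs:
        # Strip prosody marks and punctuation, keep apostrophes, lowercase.
        text_words = [
            re.sub(r"[^\w']", "", token).lower()
            for token in re.split(r"[\s/]+", sentence)
        ]
        text_words = [word for word in text_words if word]
        # Normalize whitespace inside each phoneme word.
        phoneme_words = [" ".join(word.split()) for word in phoneme_line.split("/")]
        if len(text_words) != len(phoneme_words):
            continue  # misaligned pair: skipped here, needs a separate check
        for word, transcription in zip(text_words, phoneme_words):
            seen[word].add(transcription)
    return {word: sorted(forms) for word, forms in seen.items() if len(forms) > 1}
```

With the made-up pair list `[("on the court%.", "AA1 N / DH AH0 / K AO1 R T"), ("the court%.", "DH IY0 / K AO1 R T")]`, the function returns `{"the": ["DH AH0", "DH IY0"]}`, pointing the reviewer at the two transcriptions of "the".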
III. Prosody Annotations
• Continuous reading: Party B must propose a reasonable and objective standard for judging continuous reading in the language, and apply it after consultation with Party A.

• Prosodic phrases: Mark prosodic phrases based on the combined characteristics of pauses and phrase-final lengthening.

• Intonation phrases: Mark intonation phrases based on the combined characteristics of pauses, phrase-final lengthening, pitch reset, and other features.

• Consistency: Prosody annotations must remain perceptually consistent throughout the data.
