PT-BR Transcription rules-0124-EN

Portuguese (Brazil) Transcription rules
Table of Contents
1. Introduction of Platform Manual
Explanation
Keyboard shortcut
2. Workflow
3. Annotation Guidelines
3.1 Discard
3.2 Segment
a) Part of gray area is unclear
b) Part of gray area is overlapped
c) Part of gray area is music, melodies, songs, animal or natural sounds
d) The selected speech should start with (and end with) up to 2 modal words
e) English in the audio
f) Do not think about the completeness of the sentence
3.3 Text transcribes
a) Spaces are needed between words. Never wrap text
b) Punctuation
c) Capitalization
d) Arabic numbers
e) The final intercepted audio must contain at least two words (≥2)
f) Half pronounced word
g) Repeated words and sentences
h) Modal words
i) Homophone
j) Simplified form / spoken language / dialects
k) Child's voice, voice change, digital sound
l) Bad language, abusing words, accelerated audio
m) Double spaces between words are ok
n) When the Portuguese words are the same with English words, but with different pronunciation
3.4 Appendix
3.5 Modal word list
1. Introduction of Platform Manual
Cut a section of clear human speech from the audio and transcribe it into text.
Explanation:
• Gray part: the default cut, a piece of audio pre-selected for you. You can ONLY work within the gray part.
• Blue part: your segmented result, which you must also transcribe into text.
• White part: the audio before and after the gray part. Do not segment or transcribe it, but you may listen to it for reference.
• Audio classes:
￮ Speech - clear human voice.
￮ Discard - audio that does not meet ASR speech requirements.
• Text box: where you enter the transcription.
• Video: helps you understand what the speaker means.
Keyboard shortcuts:
1 - continue playing from where you left off.
2 - pause.
3 - play the entire audio.
5 - play the default audio (gray area).
a - play the current cut (blue area).
s - start cut.
e - end cut.
2. Workflow
• Step 1. Listen to the default cut (gray area).
• Step 2. Select the audio category (speech or discard).
• Step 3-1. If you choose ‘discard’, submit the task directly. There is no need to change the text below.
• Step 3-2. If you choose ‘speech’, decide whether the audio needs to be cut, then transcribe it.
3. Annotation Guidelines
3.1 Discard:
Discard the audio in any of the following cases:
• The entire audio is in a language other than Portuguese.
• The entire audio is European Portuguese or African Portuguese. (Do not discard Brazilian dialects.)
• The entire audio is unclear or inaudible speech.
• The entire audio is a song with lyrics in the background, or non-human sound such as melodies, animal sounds and natural sounds.
• The audio contains only one word.
• The entire audio consists of uncountable modal words. If there are only 2 or 3 modal words, it can still be transcribed.
Note: If you select “discard”, there is no need to transcribe. Just click “submit” and go to the next audio.
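The discard conditions above can be read as a single yes/no check. The following is a minimal sketch in Python, not part of the annotation platform: the class `ClipObservation`, its field names and the function `should_discard` are hypothetical names chosen for illustration, and the values are judgements the annotator makes by listening.

```python
from dataclasses import dataclass


@dataclass
class ClipObservation:
    """What the annotator heard in one audio (hypothetical fields)."""
    brazilian_portuguese: bool         # False: other language, European or African Portuguese
    audible: bool                      # False: the entire audio is unclear or inaudible
    only_song_or_non_speech: bool      # song with lyrics, melodies, animal or natural sounds
    clear_word_count: int              # clearly audible words in the whole audio
    only_uncountable_modal_words: bool


def should_discard(clip: ClipObservation) -> bool:
    """Return True if any discard condition from section 3.1 applies."""
    return (
        not clip.brazilian_portuguese
        or not clip.audible
        or clip.only_song_or_non_speech
        or clip.clear_word_count < 2   # a single word is discarded
        or clip.only_uncountable_modal_words
    )


one_word = ClipObservation(True, True, False, 1, False)
normal = ClipObservation(True, True, False, 6, False)
print(should_discard(one_word))  # True
print(should_discard(normal))    # False
```

Any single condition is enough to discard, which is why the sketch joins them with `or`.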
3.2 Segment:
a) Part of gray area is unclear.
• A segment should always start with clear words. If the front part or the back part is unclear, cut that part off.
• If the speech is unclear in the middle, keep only one side.
For example: “Clear speech 1 + unclear + Clear speech 2”: either “Clear speech 1” or “Clear speech 2” is accepted as a segment. Do not transcribe both.
But note: if there is only noise or a pause in the middle of the speech, the segment may include it.
※Example: "speech 1 + pause/noise + speech 2". Transcription: speech 1 + speech 2.

If the noise affects the content, cut the noisy part out, keep the clear Portuguese audio and transcribe it. If the noise
does not affect the content, ignore it and transcribe the entire audio.
b) Part of gray area is overlapped (2 or more speakers talking simultaneously)

• The speakers talk about different things simultaneously and the content CAN’T be made out: cut this part out and transcribe the remaining clear part.
• The speakers say the same words simultaneously and the words sound clear: include this part in the cut and transcribe it.
• The speakers do not talk at the same time: treat the audio as a normal speech case and transcribe it.
• There is one main voice in a group conversation, the others are low or fuzzy, and the main speaker's articulation is not affected by them: transcribe the main speaker and treat the others as background sound or noise.
c) Part of gray area is music, melodies, songs, animal or natural sounds:

• If the entire audio is a song or a non-human sound (music, melodies, animal or natural sounds and so on), discard the audio.
• If the speaker is singing without background melodies, transcribe it.
• If the speaker is singing along to melodies, discard.
• If the background sound is a song with lyrics, cut this part out and keep the clear human speech, or discard the entire audio if it is hard to cut.
• If the background sound is melodies without lyrics, keep it and transcribe the entire audio.
***BUT, if the background sound does not affect the clarity of the speaker's speech, transcribe
the speaker's speech and ignore the background sound.
d) The selected speech should start with (and end with) up to 2 modal words.
• If the speech begins with a long stretch of laughter (around 10 "ha"), it is enough to keep a fraction of it in the cut (around 2 "ha").
• More transcription rules for modal words are given in 3.3 h).
e) English in the audio:

• “English + Portuguese”, “Portuguese + English” or “English + Portuguese + English”: segment the Portuguese part and transcribe it.
※For example: if you hear “xxxxxx (Portuguese) I love you, Jenny.”, cut out "I love you, Jenny." and transcribe only the Portuguese part: xxxxxx (Portuguese).
• “Portuguese A + English + Portuguese B”:
English part ≤ 3 words: segment and transcribe “Portuguese A + English + Portuguese B”. Transcribe the English part as its corresponding Portuguese word with the same pronunciation.
English part > 3 words: segment only “Portuguese A” or “Portuguese B”.
f) Do not worry about the completeness of the sentence when cutting the audio.
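The rule for English inside Portuguese audio in e) can be sketched as a small function. This is only an illustration under stated assumptions: the audio is represented as a hypothetical list of `(language, text)` spans that alternate between `"pt"` and `"en"`, and the names `candidate_segments` and `MAX_EMBEDDED_ENGLISH_WORDS` are invented for this sketch. The function returns the acceptable segments; when more than one is returned, the annotator picks one of them.

```python
MAX_EMBEDDED_ENGLISH_WORDS = 3  # threshold taken from rule e)

Span = tuple[str, str]  # (language code "pt" or "en", spoken text)


def candidate_segments(spans: list[Span]) -> list[list[Span]]:
    """Return the segments that may be cut and transcribed."""
    # English at the very start or end is always cut out.
    start, end = 0, len(spans)
    while start < end and spans[start][0] == "en":
        start += 1
    while end > start and spans[end - 1][0] == "en":
        end -= 1

    candidates: list[list[Span]] = []
    current: list[Span] = []
    for lang, text in spans[start:end]:
        if lang == "en" and len(text.split()) > MAX_EMBEDDED_ENGLISH_WORDS:
            # Long English in the middle splits the audio into separate options.
            if current:
                candidates.append(current)
            current = []
        else:
            # Portuguese, or short English between Portuguese parts, is kept.
            current.append((lang, text))
    if current:
        candidates.append(current)
    return candidates


short = [("pt", "eu quero um"), ("en", "milk shake"), ("pt", "de morango")]
long_ = [("pt", "ele falou"), ("en", "I do not know him"), ("pt", "e foi embora")]
print(len(candidate_segments(short)))  # 1: keep Portuguese A + English + Portuguese B
print(len(candidate_segments(long_)))  # 2: keep only Portuguese A or Portuguese B
```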
3.3 Text transcribes.

a) Put a space between words. Never insert line breaks in the text.
b) No punctuation in the text except the apostrophe (') and the hyphen (-).
c) Capitalization
• Do not capitalize the first letter of the text unless it belongs to a proper noun.
• Proper nouns such as city names, street names, etc. should be capitalized accordingly.
e.g. New York, Istanbul, Turkey, KFC, NBA
d) Arabic numbers must be written out as words in Portuguese. E.g. 1 --> um
e) The final cut must contain at least two words (≥2).
f) Half-pronounced word:
g) Repeated words and sentences must be transcribed exactly as many times as they are repeated.
h) Modal words need to be transcribed, e.g. "ha ha", "hi", "yeap" (refer to the modal word list in 3.5).
• If you can clearly count the modal words, transcribe them.
For repeated modal words, write down the same number of modal words as in the audio.
E.g. 3 "ha" in the audio: write "ha ha ha" in the text.
• Uncountable modal words: do not transcribe.
※ For drawn-out sounds such as “errrrr” or “emmmm”, the repetitions cannot be counted and the length varies, so it is enough to write the final letter twice, for instance "emm", "ohh".
i) Homophones:
• Listen to the default cut that follows to confirm what the whole sentence is, and write down the word that is correct in context.
※Eg1. The current cut is "The hole" (or another word that sounds like /həʊl/ that you cannot confirm), but from the following default cut you can tell that the sentence is "The whole town disagreed with the mayor." So the right transcription in this case is "the whole".
• If several homophones fit the meaning of the default cut sentence, you may write any of them.
※Eg2. The default cut is "where is my deer/dear." Both words fit the meaning of the sentence, so you may write either one.
j) Simplified forms / spoken language / dialects
Transcribe the form that the speaker actually says. Transcribe what you hear; do not correct
grammar mistakes.
※Eg1. “I'm gonna do some sports”. According to the audio, it must be written as "gonna", not "going to".
k) Child's voice, voice change, digital sound
• If a child's voice is clear but the pronunciation is not standard, transcribe it in standard pt-BR (this applies to children only).
• If speech with a voice change or digital sound can be heard clearly, transcribe it.
l) Bad language, abusive words, accelerated audio:
• Both abusive words and bad language need to be transcribed.
• If accelerated audio can be heard clearly, just transcribe it.
m) Double spaces between words are ok.
n) When Portuguese words are spelled the same as English words but pronounced differently.
e.g. Que extra que você tá que extra que você tá que extra
The word “extra” is a shop name here. It is not actually an English word. Transcribe the
word as you hear it.
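Several of the text rules in 3.3 are mechanical and can be checked automatically before submitting. The following Python sketch is an assumption-laden illustration, not a feature of the platform: the function name `check_transcript` and its messages are hypothetical, and it covers only rules a), b), d) and e). It cannot judge capitalization, repetitions or modal words, which depend on the audio.

```python
def check_transcript(text: str) -> list[str]:
    """Return a list of rule violations found in the transcript text."""
    problems = []
    # a) Never insert line breaks in the text.
    if "\n" in text or "\r" in text:
        problems.append("line break")
    # d) Arabic numbers must be written out as Portuguese words.
    if any(ch.isdigit() for ch in text):
        problems.append("digit")
    # b) Only letters, spaces, apostrophe (') and hyphen (-) are allowed.
    if any(
        not (ch.isalpha() or ch.isspace() or ch.isdigit() or ch in "'-")
        for ch in text
    ):
        problems.append("punctuation")
    # e) The final cut must contain at least two words.
    if len(text.split()) < 2:
        problems.append("fewer than two words")
    return problems


print(check_transcript("que extra que você tá"))  # []
print(check_transcript("tenho 2 gatos"))          # ['digit']
print(check_transcript("olá, tudo bem"))          # ['punctuation']
```

Double spaces pass the check, in line with rule m), because `str.split()` treats any run of whitespace as one separator.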
3.4 Appendix:
https://www.academia.org.br/sites/default/files/conteudo/o_acordo_ortogr_fico_da_lngua_portuguesa_anexoi_e_ii.pdf
3.5 Modal word list:

Emoção                             Interjeições
alegria / satisfação               Ah, Oh, Oba, Eba, Viva, Gol, Iupi
advertência                        Cuidado, Atenção, Devagar, Sentido, Calma, Alerta, Olha
agradecimento                      Grato, Obrigado, Graças a Deus
afugentamento                      Fora, Passa, Rua, Xô, Vaza, Cai fora, Vá embora
alívio                             Ufa, Uf
animação / estímulo                Ânimo, Avante, Coragem, Eia, Força, Vamos, Adiante, Firme, Toca, Vai nessa
aplauso / aprovação                Boa, Bis, Bravo, Apoiado, Muito bem
concordância                       Claro, Ótimo, Certo, Sim, Pois não, Tá, Hã-hã
desaprovação / repulsa             Credo, Irra, Ih, Livra, Safa, Fora, Abaixo, Francamente, Xi, Aff, pela fé, pelo amor de Deus
desculpa                           Perdão, Desculpe, Desculpa
desejo / intenção                  Tomara, Pudera, Oxalá, Quem me dera, Quisera eu
despedida                          Adeus, Até logo, Bai-bai, Tchau, Até amanhã, até mais, falou
dor                                Ai, Ui, Ai de mim
dúvida                             Hum, Hem, Hã, eh
espanto / admiração / surpresa     Uai, Puxa, Nossa, Céus, Caramba, Quê, Meu Deus, Uol, Vixe, Oxe, Nossa senhora, Opa, Carai, Caraca
impaciência / contrariedade        Hum, Hem, Pô, Raios, Diabo, Irra, Ora, Raios o partam
interrogação                       Hei, Hã, Como, Que, Quem, Oi, Onde, Por que, Com mil demônios, O que inferno, o quê, Como assim
aversão                            Droga, Inferno
cansaço                            Ufa
pedido de auxílio                  Socorro, Aqui, Piedade
saudação / chamamento / invocação  Salve, Viva, Adeus, Olá, Oi, Alô, Ei, Tchau, Até logo, Até a próxima, Até mais ver, Ô, Ó, Psiu, Valha-me Deus, Beleza, E aí
silêncio                           Quieto, Shh, Shiu
suspensão / cessação               Basta, Chega, Pare, Stop, Alto, Não mais, pera aí
terror / medo                      Credo, Cruzes, Jesus, Que horror, Que medo, Jesus Maria e José, Uh, Ui, Barbaridade, Socorro, Francamente, misericórdia
