
LOFT 2.0 Introduction

Outline
● How to Open LOFT 2.0 Platform
● How to Park a Question ID
● New Feature on Loft: Pinned Question
● Let’s Discover the Interface: Question ID
● Let’s Discover the Interface: Video Link
● Let’s Discover the Interface: Analytics
● Let’s Discover the Interface: Language Guidelines
● Let’s Discover the Interface: Shortcuts Guide
● Let’s Discover the Interface: Skip
● Let’s Discover the Interface: Submit
● LOFT 2.0 Useful and Important Links
● Most Common Mistakes and Significant Clarifications
● Quality Measure
● Rejected Tasks Criteria: Quality Rate is 1
How to Open LOFT 2.0 Platform
1- Open a new Google account or use an incognito window. (Do not use your
personal Google account; log out of it first!)

2- Enter the account and password provided to you.


Example:
LoFT account: mu-ca2ece@sdoperapera.com

Password: Wy6hLZA9RjhA+o9PLt9xWXT0
3- Change your password and note it down somewhere safe. (If you lose the password, it
takes 2 weeks to reset it, and a new account cannot be provided.)
4- Then, click on the LOFT 2.0 platform link:
LOFT 2.0 Platform Link (this is where you will work):

https://datacompute.google.com/w/speech_mu_worker
5- Watch the video that illustrates the above process:
https://drive.google.com/file/d/10PiBKyoMQnG26gvUe2hZP7OQzA-ZI2Fr/view?usp=sharing

6- Loft 2.0 Basics Video: This video will show you the basic features of the Loft 2.0 platform:
https://drive.google.com/file/d/1iLM30w1jbhklrIlhCLz5AM40J_O7ZUOY/view

7- Loft 2.0 User Guide: Before getting started, you must read the Loft 2.0 guidelines. This doc provides a
basic understanding of the Loft 2.0 tool as well as helpful tips to make transcription more
efficient.

https://developers.google.com/speech-data-ops/guidelines/longform/generic_v2_test_set/
How to Park a Question ID and Work in Your Parking Lot
After you open the LOFT 2.0 Platform, it is time to start the actual work!

1- To park a Question ID, click on the P icon.

2- You will see two options:

Park and go to the next Question


Park and hold the current Question on the Workbench

Please choose the second option: Park and hold the current Question on the Workbench.

3- After you choose the second option, the Question ID shown on your screen will be parked to your
account. It is now yours, and you have to complete it within 72 hours in accordance with the LOFT 2.0 and language
guidelines.
4- To double-check the Question ID, click on the Parking Lot icon, find the Question ID there, and
open it. This confirms that you are working in your Parking Lot on the correct task! Each time you open the system,
always go to your Parking Lot first!

5- When you are in your Parking Lot, the P icon is shown in bold.

6- Watch the video that illustrates the above process:


https://drive.google.com/file/d/1YA_0Xq_dwuFjydwubvVeBCQbSs0e4G7z/view

7- Parking functionality allows workers to stop and resume work on a given question at their convenience.
While the question is parked, it will not be assigned to another worker.

8- Workers are required to park the question as soon as it appears in their workbench before starting to
transcribe. Each member can only park 1 task.
9- Once you park your Question ID, please delete all pre-filled segments, which are created and written by the
machine.

If you forget to delete them, you will get an error exclamation mark and will not be able to submit after you
complete the task. Pre-filled segments are mostly wrong and would have to be corrected by you anyway. Delete them to avoid
lowering your quality and to proceed smoothly.
New Feature on Loft: Pinned Question
1. You can continue parking a task in your parking lot.

2. When you open your account, you will still see one more task on your workbench, as always (not in your parking lot).

3. If you would like to continue working on your parking lot, then you will open it and start working.

4. If you would like to work on the task that is on your workbench, now you have an option to "pin" it.

5. If you make a simple edit in the task on your workbench, it will be pinned to your account and it will save all
of your edits.

6. Even if you close and open your account, you will continue seeing it on your workbench.

7. Until you submit your "pinned" task, you will not be able to open your parked task or see any other tasks on
your workbench.

8. If you would like to unpin that task for any reason, contact me using the sdo account.
Let’s Discover the Interface: Question ID
It is mandatory to note your Question IDs so that you can double-check your work quality and your payment calculation
when payday arrives, and so that you can be sure you are working on the correct Question ID.

Sample Question ID from the transcription pool:


230feb218210e00753bd381f066c5f91+speech_mu+tr_tr_23_publicaudio+INTERNAL+en:14790783760723858319
Let’s Discover the Interface: Video Link
If you click on the Video Link, it will take you to the YouTube video that matches your Question ID.
A Question ID usually covers the first 30 minutes of that video.

The video is useful for double-checking your work as you go, distinguishing speakers, understanding the speech
better, and deciding on the right annotation.

Some Question IDs do not have a Video Link.


Let’s Discover the Interface: Analytics
1- Click on the Analytics symbol, which is located in the upper-right corner of the platform.

2- You will have two options to choose from: your contribution over the last 7 days or the last 30 days. Here you
can see when you submitted each task ID and the time you spent on it. These are the important data to compare
with your project manager in case of any issue.

3- Do not rely on this page alone to track your progress. Please also note your Question ID, its duration and the
date of submission so that you can compare your progress at the end of the project.
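A personal work log like the one recommended above can be kept in a simple CSV file. The following is only a minimal sketch: the column names, the `log_submission` helper and the sample values are hypothetical and are not part of the LOFT 2.0 platform.

```python
import csv
import io
from datetime import date

# Hypothetical column names for a personal work log.
FIELDS = ["question_id", "audio_minutes", "submitted_on"]

def log_submission(stream, question_id, audio_minutes, submitted_on, write_header=False):
    """Write one submitted task to a CSV stream (a file or an in-memory buffer)."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    if write_header:
        writer.writeheader()
    writer.writerow({
        "question_id": question_id,
        "audio_minutes": audio_minutes,
        "submitted_on": submitted_on.isoformat(),
    })

# Usage with an in-memory buffer; a real log would use open("my_log.csv", "a", newline="").
buffer = io.StringIO()
log_submission(buffer, "example-question-id", 30, date(2024, 1, 15), write_header=True)
print(buffer.getvalue())
```

At the end of the project, the rows can be compared against the Analytics page and against your project manager's records.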
Let’s Discover the Interface: Language Guidelines

If you click on the symbol shown in the picture, it will direct you to the WDC Language Guidelines,
which contain the general written rules for your language. It is advised to keep them open in a separate tab
while you're working so that you can go back to them whenever you need. This file can only be opened with the
account that is provided to you.

Example link:

https://developers.google.com/speech-data-ops/guidelines/longform/ko_kr_test_set (This link changes
according to the language that you work on and can only be opened with the account that is provided to
you.)
Let’s Discover the Interface: Shortcuts Guide

When you click on the keyboard symbol, you will reach the Shortcuts Guide, which lists the LOFT 2.0 system
shortcuts that make your working process faster. Try to use them while working to increase your productivity.
Let’s Discover the Interface: SKIP
In this system, there is one major mistake that is not forgivable, and it concerns Skip and Submit: if you "submit" or
"skip" any task for review without completing it (whether by mistake or not), your account will be
suspended for an indefinite period of time. If you do this twice, your contract will be immediately and
permanently terminated. Naturally, these tasks will not be compensated.
Always get permission from your project manager before you skip any Question ID, even one that is eligible
to be skipped!
Tasks you are allowed to skip: below is the list of valid reasons for skipping an audio:

○ No Audio: The audio doesn't load.


○ No Sound: The waveform indicates there is audio but I can't hear anything.
○ Other Locale: All of the speech is in a different language.
○ Silent Audio: The entire utterance is silent.
○ Noisy Audio: The entire utterance is too noisy.
○ Other: Extreme profanity, extreme sexual content or extreme religious hatred
○ Please note that skipped tasks are not paid for.
Let’s Discover the Interface: SUBMIT
In this system, there is one major mistake that is not forgivable, and it concerns Skip and Submit: if you "submit" or
"skip" any task for review without completing it (whether by mistake or not), your account will be
suspended for an indefinite period of time. If you do this twice, your contract will be immediately and
permanently terminated. Naturally, these tasks will not be compensated.

Click on SUBMIT only once you have fully completed your Question ID!
LOFT 2.0 Useful and Important Links
Please watch the video for each LOFT 2.0 rule. This way, you will gain a better understanding of the
LOFT 2.0 system!

Pre-Filled Error:
https://drive.google.com/file/d/1NwZQECdmS0L4pt-lZBgTTWWpS401yxpS/view?usp=sharing
Turn Creation:
https://drive.google.com/file/d/1PKQ80UacOkLIwH3e6fsvKUWCzHU92Fi-/view?usp=sharing
30-Second Rule: https://drive.google.com/file/d/1e0pGdib3ACztkjY4FKStijr1WqIKeSVr/view?usp=sharing
0.5 Second Pause and Splitting Turns 1:
https://drive.google.com/file/d/17bX7_TVreMEcPefyImYkylotlNmH1qRm/view?usp=sharing
0.5 Second Pause and Splitting Turns 2:
https://drive.google.com/file/d/11imVoSe6XWsjnBOEDFfh5UWd5H3U6we5/view?usp=sharing
Spacing Errors:
https://drive.google.com/file/d/1cvwGq8c7V9NJKDAWoz8hOxrw-GYEEHBd/view?usp=sharing
speaker [Number]:
https://drive.google.com/file/d/1DklWGAzQM0060bAnSBkZp4LFigBhPhS9/view?usp=sharing
speaker [Name]:
https://drive.google.com/file/d/1LshT49ePEC70JXsuYbHxInSslEdohOGG/view?usp=sharing
unidentifiable speaker:
https://drive.google.com/file/d/10V0jfzKk0lovrClN9HrizY95V54NJLtj/view?usp=sharing
pre recorded speaker:
https://drive.google.com/file/d/11yXCSmgXwvdmNdNCsbJUTTuNpJCHC7D3/view?usp=sharing
20+ speakers:
https://drive.google.com/file/d/1fruW0FgNPpEHbW-uZTPXlFG3CmgCgmIk/view?usp=sharing
Annotations: Applause-DTMF-Ring tone:
https://drive.google.com/file/d/1IlRr9r0msXTJygcyEClJjeKLC1n_d0sq/view?usp=sharing
Annotations: Music-Laughter-Unknown:
https://drive.google.com/file/d/1nonaJce9ysuJoxO_zFpX8NBheK2P39gy/view?usp=sharing
Speech annotations: Foreign speech-Singing-Unintelligible:
https://drive.google.com/file/d/1EJ8eUphJ32M310AIYGM0bTLfjYPbXJx1/view?usp=sharing
Noise: https://drive.google.com/file/d/1rl53wZnviIqXl7Q7ulIJrrSTT9LiIB43/view?usp=sharing
PII [Personally Identifiable Information]:
https://drive.google.com/file/d/1SL9yFbcmMsRYy4QAj4teFyvl2y_XxjfA/view?usp=sharing
MOST COMMON MISTAKES & SIGNIFICANT CLARIFICATIONS

30-second Rule: No speaker turn should last for more than 30 seconds. This covers all situations, even those
where the speaker turn is unintelligible, foreign speech or singing. When a speaker talks, sometimes the
speaker turn can’t be split exactly at the 30-second mark because it means that a word would get cut off. In
such cases, you can end the segment as close to the 30-second mark as possible, but without cutting off
words. So, a segment can end at the 28-30-second mark, but never at the 30.5-second mark. The 30-second
rule does not apply to annotations. The annotations for noise, music, and laughter can last for more than 30
seconds.
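The 30-second rule can be sketched as a small check over a list of segments. This is only an illustration: the segment dictionaries, the `kind` values and the function name are hypothetical, and the duration is assumed to be measured from the segment's start to its end.

```python
MAX_TURN_SECONDS = 30.0

# Annotation kinds that the rule above exempts from the 30-second limit.
EXEMPT_KINDS = {"noise", "music", "laughter"}

def turns_over_limit(segments, limit=MAX_TURN_SECONDS):
    """Return the segments that last longer than the limit and are not exempt."""
    return [
        seg for seg in segments
        if seg["kind"] not in EXEMPT_KINDS and seg["end"] - seg["start"] > limit
    ]

segments = [
    {"kind": "speech", "start": 0.0, "end": 29.5},    # fine: ends before the 30-second mark
    {"kind": "singing", "start": 29.5, "end": 60.0},  # 30.5 seconds: breaks the rule
    {"kind": "music", "start": 60.0, "end": 100.0},   # annotation: allowed to exceed 30 seconds
]
print(turns_over_limit(segments))
```

Only the singing segment is reported, because unintelligible speech, foreign speech and singing are still covered by the rule, while music is not.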

0.5-second Rule: There should never be a speaker turn with a pause in the speech that lasts for more than 0.5
seconds. If a speaker stops (takes a breath or makes a pause in their speech for whatever reason) for more
than 0.5 seconds, and then continues their speech – the speaker turns should reflect that. The same goes for
annotations. Noise, laughter and music annotations should be ended if there is a pause of more than 0.5 seconds in the
sounds that go under these annotations. Buffers should NOT be taken into consideration when
measuring the 0.5 seconds. The measurement should be based solely on when the sound/speech begins and ends.
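The 0.5-second rule can be illustrated as grouping stretches of continuous speech into turns. This is a minimal sketch under stated assumptions: the `(start, end)` tuples are hypothetical measurements of where one speaker's sound actually begins and ends (without buffers), and the function name is invented for this example.

```python
MAX_PAUSE_SECONDS = 0.5

def split_on_pauses(stretches, max_pause=MAX_PAUSE_SECONDS):
    """Group (start, end) stretches of sound into turns.

    A new turn starts whenever the silence since the previous stretch
    is longer than max_pause.
    """
    turns = []
    for start, end in sorted(stretches):
        if turns and start - turns[-1][1] <= max_pause:
            # Short pause: extend the current turn.
            turns[-1] = (turns[-1][0], max(turns[-1][1], end))
        else:
            # First stretch, or a pause longer than the limit: open a new turn.
            turns.append((start, end))
    return turns

# A 0.25-second pause stays inside one turn; a 1-second pause splits the turn.
print(split_on_pauses([(0.0, 2.0), (2.25, 4.0), (5.0, 6.0)]))
```

A turn produced this way may still need to be split again to respect the 30-second rule.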
PII: Personally Identifiable Information is information that is not publicly available such as first and last names,
personal phone numbers and home addresses. All first and last names should be PII. The only exception is
internationally famous figures such as Tom Hanks, Taylor Swift, Hillary Clinton, Lionel Messi, Mohamed Salah...
etc. This makes most vloggers' names PII. Fictional and historical characters are NOT PII, e.g., Sigmund Freud,
Leonardo da Vinci, Arya Stark, Buzz Lightyear. Please see the full PII list here:
https://developers.google.com/speech-data-ops/guidelines/longform/en_us_test_set/longform_generic_rules#generic_longform_pii

NEW Update on PII

“When PII is heard, create a new speaker segment that captures the audio range of the PII speech. Add the
PII label and assign it to the appropriate speaker. The transcription tool will either allow you or deny you from
transcribing PII Speech. If the tool allows it, transcribe the PII audio.”

The change applies to LOFT transcription of all languages but to some projects only. The update is not
applicable to short form transcription. LOFT projects with the PII transcription requirement will start rolling
out in the coming weeks and the tool will automatically enable or disable LOFT PII transcription where
applicable. It’s important that transcribers always follow the instructions in the guidelines and only transcribe
PII Speech when the option is enabled by the tool.
100 ms buffer: You are encouraged to start speech turns 100 ms or less before the speech actually starts so
that the initial sounds are not cut off. The same goes for the end of the speech: you can leave an
extra 100 ms or less after the speech before ending the turn. This is an optional way to ensure quality and avoid
cutting the speech off. However, adding buffers of more than 100 ms is a quality error.
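The buffer rule can be expressed as a simple check on the turn boundaries. The sketch below is an illustration only: the function name and its parameters are hypothetical, and all times are assumed to be whole milliseconds.

```python
MAX_BUFFER_MS = 100

def buffers_ok(turn_start_ms, turn_end_ms, speech_start_ms, speech_end_ms,
               max_buffer_ms=MAX_BUFFER_MS):
    """Check that the turn covers the speech with at most max_buffer_ms on each side."""
    lead = speech_start_ms - turn_start_ms  # buffer before the speech starts
    tail = turn_end_ms - speech_end_ms      # buffer after the speech ends
    # A negative value means the turn cuts into the speech itself.
    return 0 <= lead <= max_buffer_ms and 0 <= tail <= max_buffer_ms

print(buffers_ok(900, 3100, 1000, 3000))   # 100 ms on each side: acceptable
print(buffers_ok(800, 3000, 1000, 3000))   # 200 ms before the speech: quality error
```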

Annotations: Please annotate everything that you can hear, even if the line is flat. This means that all speech,
noises, music and laughter should be annotated whenever heard. The waveform is just a rough guide;
your only source for deciding on an annotation is the sound itself. Additionally, annotations of the same kind (for
example, two Noise annotations) cannot overlap. Even if you hear two different kinds of noise at the same time,
you should express this with only one Noise annotation.
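The no-overlap requirement for annotations of the same kind can be sketched as merging overlapping intervals. This is a hedged illustration: the `(kind, start, end)` tuples and the function name are hypothetical, and annotations of different kinds are left free to overlap.

```python
def merge_same_kind(annotations):
    """Merge overlapping (kind, start, end) annotations that share the same kind."""
    merged = []
    # Sorting by kind and then by start time puts candidates for merging next to each other.
    for kind, start, end in sorted(annotations):
        if merged and merged[-1][0] == kind and start <= merged[-1][2]:
            previous = merged[-1]
            merged[-1] = (kind, previous[1], max(previous[2], end))
        else:
            merged.append((kind, start, end))
    return merged

annotations = [
    ("noise", 0.0, 5.0),
    ("music", 1.0, 4.0),
    ("noise", 3.0, 8.0),  # overlaps the first noise annotation
]
print(merge_same_kind(annotations))
```

The two overlapping noise annotations collapse into one, while the music annotation is kept as it is.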
Vocalization/Voice Changing: There are multiple scenarios in this case:

○ Case 1: The speaker voices different characters with minimal effort to change the voice (it’s easy to
determine that there is only one speaker just by listening to it): speaker 1 (not pre recorded)

○ Case 2: The speaker changes the voice using software, but it’s still obvious that it’s the same speaker (short
phrases, sped up or slowed down sentences from a speech, etc.): speaker 1 (not pre recorded)

○ Case 3: The speaker has a recording of himself and both the “primary” voice and the pre-recorded voices
can be heard: speaker 1 and pre recorded speaker 1 (pre recorded only for the recording within a recording)

○ Case 4: The speaker uses advanced software and editing to change his voice so that it’s impossible to
recognize without looking at the video: as many speaker labels as there are voices or characters.

What to do with 21st Speaker in the tasks:

All Loft transcription tasks should have a maximum of 20 speakers (the total of regular speakers, pre recorded
speakers and ONE unidentifiable speaker). The moment a 21st speaker is introduced, please do not transcribe
their speech and stop working on the task then. Once the 21st speaker is introduced, the task should be
considered done and can be submitted for review.
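The 20-speaker limit can be sketched as finding the turn at which a 21st distinct speaker label first appears. The list of labels and the function name are hypothetical; the sketch assumes every distinct label counts towards the limit.

```python
MAX_SPEAKERS = 20

def first_turn_over_limit(labels, max_speakers=MAX_SPEAKERS):
    """Return the index of the first turn that introduces one speaker too many, or None."""
    seen = set()
    for index, label in enumerate(labels):
        if label not in seen:
            if len(seen) == max_speakers:
                return index  # transcription stops before this turn
            seen.add(label)
    return None

# Turns by speaker 1 .. speaker 21, in order of appearance.
labels = [f"speaker {n}" for n in range(1, 22)]
print(first_turn_over_limit(labels))
```

Everything before the returned index is transcribed; the turn at that index and anything after it is left out, and the task is submitted for review.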
Speaker Naming #1: Please label speaker names as heard in the audio, whether they are nicknames or real
names. However, nicknames should not be marked as PII. We also label speaker names as heard even if they are
fictional characters. E.g., speaker Unicon, speaker Bear, speaker Watermelon.

Speaker Naming #2: Once you find out the name of a speaker during the audio and change the speaker
label format from "speaker #" to "speaker Name", you should re-arrange the rest of the speaker numbers
accordingly. For example, suppose there are speaker 1 and speaker 2 in the audio, and at a certain point you find out
that speaker 1's name is Ali. Then you should change speaker 1 to speaker Ali, and speaker 2 to
speaker 1.
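The renumbering in Speaker Naming #2 can be sketched as follows. This is only an illustration: the function name is hypothetical, and it assumes that numbered labels have exactly the form "speaker N"; any other label (for example a pre recorded speaker) is left untouched.

```python
import re

NUMBERED_LABEL = re.compile(r"speaker (\d+)")

def rename_speaker(labels, number, name):
    """Rename 'speaker <number>' to 'speaker <name>' and shift higher numbers down by one."""
    renamed = []
    for label in labels:
        match = NUMBERED_LABEL.fullmatch(label)
        if match is None:
            renamed.append(label)  # not a plain numbered label: keep as is
            continue
        current = int(match.group(1))
        if current == number:
            renamed.append(f"speaker {name}")
        elif current > number:
            renamed.append(f"speaker {current - 1}")
        else:
            renamed.append(label)
    return renamed

# The example from the text: speaker 1 turns out to be Ali.
print(rename_speaker(["speaker 1", "speaker 2", "speaker 1"], 1, "Ali"))
```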

Must-Read Doc: How to Avoid Low-Quality Work:


https://docs.google.com/document/d/1Jjz2Rw_xAiHBL1xwq155kiIzoRsoVhjLzoLvYsuInt8/edit?usp=sharing
Quality Measure
After you complete your Question ID, you will submit it, and it will go to the reviewer pool, where a native
reviewer checks your work in detail. The reviewer writes feedback on your work and provides a
Quality Rate from 1 to 5.

Rejected Tasks Criteria: Quality Rate is 1

● Our reviewers may reject tasks that have fatal errors and completely fail to follow the
guidelines. These errors are listed below.
● If you get 2 rejected tasks for the same reason, the project manager will remove you
from the team.

If any of the following mistakes occurs the number of times stated next to it, the task is
considered rejected and will not be paid for.

● Skipping a task that should not be skipped
● PII not marked - 10 times in one task
● 30-second rule not followed for the entire task
● 0.5-second rule not followed for the entire task
● No annotations at all for the entire task
● Missing speaker turns (speech not transcribed): more than 10 seconds of missing speech
● Pre-filled speaker turns not corrected: pre-filled speaker turns are around 1 minute long. If more than one
pre-filled speaker turn is not adequately corrected, it is grounds for rejection.
● Missing speaker labels: all of the speech is transcribed under one speaker label although more speakers
are clearly heard.
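The rejection criteria above can be summarised as a checklist. The sketch below is an illustration, not the reviewers' actual procedure: the `TaskReview` fields and the function name are hypothetical, and "10 times" is assumed to mean 10 or more unmarked PII instances.

```python
from dataclasses import dataclass

@dataclass
class TaskReview:
    """Hypothetical summary of what a reviewer found in one task."""
    wrongly_skipped: bool = False
    unmarked_pii_count: int = 0
    ignored_30_second_rule: bool = False    # for the entire task
    ignored_half_second_rule: bool = False  # for the entire task
    no_annotations: bool = False            # for the entire task
    missing_speech_seconds: float = 0.0
    uncorrected_prefilled_turns: int = 0
    single_label_for_many_speakers: bool = False

def rejection_reasons(review):
    """Return the list of criteria that would make this task a rejected task."""
    checks = [
        (review.wrongly_skipped, "skipped a task that should not be skipped"),
        (review.unmarked_pii_count >= 10, "PII not marked"),
        (review.ignored_30_second_rule, "30-second rule not followed"),
        (review.ignored_half_second_rule, "0.5-second rule not followed"),
        (review.no_annotations, "no annotations"),
        (review.missing_speech_seconds > 10, "missing speaker turns"),
        (review.uncorrected_prefilled_turns > 1, "pre-filled turns not corrected"),
        (review.single_label_for_many_speakers, "missing speaker labels"),
    ]
    return [reason for failed, reason in checks if failed]

print(rejection_reasons(TaskReview(unmarked_pii_count=12, missing_speech_seconds=4.0)))
```

An empty list means that none of the listed criteria applies; it does not by itself say anything about the Quality Rate the task will receive.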
Do not forget that all data here is confidential; sharing it outside of the
organization is not allowed.
