Hindi Written Domain Conventions
https://speech.google.com/annotation/guidelines/hi_in_test_set/index.html 1/12
9/27/2018 hi-IN_TEST_SET
Transcription quality
Comply with the standard rules of the writing system.
Typo
Avoid making any typographical errors. Carefully check your work before marking items as "complete".
मैं घर जा रहा हूं।
NOT: मे घर जा रहा हूं।
Context error
Do not correct speaker's grammar if they intentionally say something, even if what they say does not follow the standard grammatical rules of the transcription language.
Do not transcribe words that are not spoken, even if they are obviously intended by the speaker. Avoid putting words in the speaker's mouth. However, do transcribe implied times and
units of currency.
Transcribe all words spoken, even if they are not intended by the speaker. For interjections and non-speech vocalizations, refer to Agreed Spelling > Interjections and Difficult Utterances
> Hesitations and Truncations.
सचिन फल फल खा रहा है?
Substitution
A substitution error occurs when another standard word is transcribed instead of what was meant by the speaker. If what the speaker said falls into another category (Context Error,
Proper Name, Media Title, etc.), see the relevant section.
Spacing
आपका नाम क्या है?
NOT: आपका नाम क्या है ?
For most types of punctuation, do not put a space between the preceding word and the punctuation.
चुप करो!
NOT: चुप करो !
नमस्ते, यह डॉ. दीपक हैं।
NOT: नमस्ते , यह डॉ. दीपक हैं ।
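The rule that most punctuation attaches directly to the preceding word can be checked mechanically. The following is a minimal sketch, not part of the guidelines: the function name and the exact set of punctuation marks are assumptions drawn from the examples above.

```python
import re

# Punctuation that attaches to the preceding word. This set is an assumption
# based on the examples (danda, comma, question mark, exclamation mark).
NO_SPACE_BEFORE = "।,?!"

def strip_space_before_punctuation(text: str) -> str:
    """Remove whitespace that directly precedes one of the listed marks."""
    pattern = r"\s+([" + re.escape(NO_SPACE_BEFORE) + r"])"
    return re.sub(pattern, r"\1", text)

print(strip_space_before_punctuation("चुप करो !"))  # चुप करो!
```

Quotation marks are deliberately left out of the set, because the guidelines treat them differently (a space before the opening mark).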
For quotation marks and similar punctuation, put a space before the opening punctuation, but not necessarily after the closing punctuation.
Punctuation
Follow the punctuation regulations of your locale. Additional conventions are outlined in this section.
Add punctuation where needed, but err on the side of keeping it minimal.
मैं तुमसे प्यार करता हूं।
NOT: मैं तुमसे प्यार करता हूं
मुझे तुम पसंद हो। Includes subject and verb. Sounds like a whole utterance rather than just a conjunction to a larger sentence.
Sometimes a phrase which is not obviously grammatically a sentence should nevertheless be treated as a sentence because of its context, e.g. if it's an answer to a specific question, or if
it's an example where dropping the subject sounds completely natural as a complete sentence.
खाने पर आ रहे हो कल? Although the subject is dropped, this still sounds completely natural and should be treated as a complete sentence.
दिल्ली का मौसम This is asking for information, but the most likely interpretation is as a sentence fragment on its own.
Interjections, greetings, and farewells said in isolation should be considered complete sentences and punctuated as such.
वाह! interjection
नमस्ते। greeting
पक्का। फिर मिलते हैं। This includes both a yes/no word and a farewell, with a long pause between.
ओह अरे हाहा ओ हो
हे भगवान हाय धत
Do not punctuate phrases that are intended to be used by the speaker as a web search, not as full sentences.
इं िडया का िच
NOT: इं िडया का िच ।
अिमताभ ब न की िफ
NOT: अिमताभ ब न की िफ ।
Capitalize sentence fragments that sound like the beginning of a sentence. Add end punctuation to sentence fragments that sound like the end of a sentence. For fragments that do not
clearly sound like the beginning or end of a sentence, leave out capitalization and punctuation. Note that sentence fragments may be a result of cut-off audio samples.
बोला कि इस बारे में संजय से बात मत करना। Audio was cut off at the beginning.
दुकान पर जा रहा हूं। तुम्हें पता है कि यह चाय कितने की है? Do not put a period, hyphen, or ellipsis, even if another sentence follows.
जा रहे थे परंतु Sounds like the middle of a sentence; beginning and end were cut off.
If an utterance is not clearly a sentence according to the above rules and examples, do not capitalize or punctuate it as a sentence.
Commas
Only use commas where required. Err on the side of minimal punctuation. Do not rely on intonation.
For complete sentences that follow a single word or phrase that focuses the meaning of a sentence, put a comma after the single word or phrase.
कद्दू, फल या सब्जी? topic-comment
Use a comma when a sentence starts with a discourse word, interjection, or yes/no word. However: If there is a long pause between a discourse word, interjection, or yes/no word and a
full sentence that follows it, treat that initial word as a separate sentence.
अच्छा दोस्त, जो भी करो सावधानी से करना। Yes/no word. Other examples of these types of items include "हाँ", "अच्छा" and others.
शायद, पर मुझे पक्का नहीं पता। Use a comma when there is no pause, or when there is a pause that isn't long.
शायद। पर मुझे पक्का नहीं पता। Use a period when there is a substantial pause after "शायद".
Use commas for non-restrictive modifiers, but do not use commas for restrictive modifiers. The basic test for this is whether the modifier can be dropped from the sentence and still
keep basically the same meaning.
इंडिया के प्रधान मंत्री, नरेंद्र मोदी, अमेरिका गए थे। Non-restrictive modifier. "नरेंद्र मोदी" does not change the core meaning of "इंडिया के प्रधान मंत्री", it just adds additional information about the Indian prime minister.
Use commas in sign-offs, such as those at the end of a message. Do not use end punctuation.
तुम्हारी दोस्त, सोनाली
Do not use commas in sentences that consist only of a greeting and an addressee. If a greeting occurs at the beginning of a sentence or fragment, place a comma after the greeting. If the
greeting includes an addressee, place the comma after the addressee.
नमस्ते।
नमस्ते विनोद।
नमस्ते नेहा। मैं पूजा बोल रही हूं। Long pause between "नमस्ते नेहा।" and "मैं पूजा बोल रही हूं।". Treat as separate sentences.
Except in greetings, sentence-initial and sentence-final addressees should be separated by a comma.
तू कैसी है, मनीषा?
The phrase "Ok Google" in isolation is transcribed without a comma or end punctuation. When the phrase appears before longer utterances, place a comma after "Google".
Ok Google
Intonation marks
Capitalize and punctuate the following as questions: 1) All queries syntactically built as questions, regardless of intonation. 2) All queries which sound like they are being used as
questions, regardless of sentence structure.
दिल्ली का मौसम Query uses rising intonation, but is most likely a web search rather than a true question.
If a speaker uses clearly exclamatory intonation, use an exclamation point. If there is any doubt, err on the side of using a period.
चुप कर!
Use a comma between reported speech verbs and direct quotations. Do not put punctuation within quotation marks unless the punctuation belongs to the reported speech.
If the text in quotation marks qualifies as a sentence, punctuate as if it were its own utterance. Do not alter its end punctuation even if the quote is within a sentence. Do not add excess
punctuation after end quotation marks.
Use a colon but no quotation marks in quotative voice actions when the quote follows the command. Use quotation marks when the quote is in the middle of the sentence.
फ्रेंच में "मुझे तुमसे प्यार है।" कैसे कहते हैं? The quote is in the middle of a sentence, so use quotation marks.
When speakers make a request for single words to be translated into another language, don't punctuate or capitalize the words, even if you'd consider the words as sentences in other
situations.
नमस्ते।
Do not use quotation marks for metalinguistic uses of words or phrases. These uses include defining the word, talking about the spelling of the word, or any other type of reference to
the word itself as a thing.
Other symbols
Apart from standard letters, you should not use any other symbol than: 0-9 äâàæÆçÇéèëêïîñÑôöŒœüûùμÿÄÂÀÉÈËÊÏÎÔÖÜÛÙŸ²³,?!'"_°:.()<>{}[]√/@#$€£₹+=%*&-.;
When two opposing teams are mentioned, include a hyphen between their names.
Use a hyphen in phrases and compounds that are typically written with one. If in doubt, use a hyphen.
Spoken punctuation
For sentence-level spoken punctuation, write out the full word or words between curly brackets. Do not add punctuation symbols after spoken punctuation. Be careful with homonyms.
(See exceptions in the next rule.)
तुम कैसे हो {प्रश्न चिह्न}
NOT: तुम कैसे हो?
"तुम कैसे हो प्रश्न चिह्न"
NOT: तुम कैसे हो प्रश्न चिह्न
NOT: तुम कैसे हो प्रश्न चिह्न?
Don't spell out internal punctuation like hyphens in web pages, email addresses, addresses, phone numbers, or other word-level punctuation.
If a word that can refer to a punctuation mark is spoken in isolation, it should be written out between curly brackets.
{पूर्ण विराम}
{अल्प विराम}
Format
Transcribe numbers, abbreviations etc. following the formatting conventions in this section.
Number
Devanagari numerals should not be used, only Western Arabic numerals should be used.
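As an illustration of this rule, Devanagari digits can be mapped onto Western Arabic digits with a simple translation table. This is a sketch only; the function name is hypothetical and not part of any annotation tool.

```python
# Map the Devanagari digits ० to ९ onto the Western Arabic digits 0 to 9.
DEVANAGARI_TO_WESTERN = str.maketrans("०१२३४५६७८९", "0123456789")

def to_western_digits(text: str) -> str:
    """Replace every Devanagari digit in text; all other characters are kept."""
    return text.translate(DEVANAGARI_TO_WESTERN)

print(to_western_digits("१२३"))  # 123
```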
Cardinals and ordinals from 0 to 9 are written with letters (except for measures and currency - see Currency and Unit). Use digits for cardinals and ordinals 10 and above, even if they are
coordinated with numbers under 10. Transcribe all decimal numbers as digits.
When two or more numbers refer to the same noun, and one number is 10 or greater, transcribe both as numerals.
वो 9 या 10 कुत्ते लेकर आये।
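The cardinal rules above (words for 0 to 9, digits from 10 upward, digits for both when coordinated numbers share a noun, digits before measures) can be sketched as a small helper. The function name, the `is_measure` flag, and the word list are assumptions made for illustration; the spellings follow the anuswara convention used elsewhere in these guidelines.

```python
# Hindi words for 0-9. The spellings are an assumption for this sketch.
HINDI_WORDS = ["शून्य", "एक", "दो", "तीन", "चार", "पांच", "छह", "सात", "आठ", "नौ"]

def format_cardinals(values, is_measure=False):
    """Format non-negative integers that refer to the same noun.

    Digits are used for every value if any value is 10 or greater, or if
    the values precede a unit or currency; otherwise words are used.
    """
    use_digits = is_measure or any(v >= 10 for v in values)
    return [str(v) if use_digits else HINDI_WORDS[v] for v in values]

print(format_cardinals([9, 10]))               # ['9', '10']
print(format_cardinals([2, 3]))                # ['दो', 'तीन']
print(format_cardinals([6], is_measure=True))  # ['6']
```

Decimals, large-number words such as "करोड़", and separators are outside the scope of this sketch.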
If a large number consists of only a number followed by "हज़ार", "लाख", "करोड़", or higher, then transcribe as a numeral plus word. Otherwise, transcribe as numerals.
For long numbers (4+ digits) indicating quantity, insert the relevant separator (comma, decimal point, or space, depending on language).
In math expressions or units & measures, transcribe fraction words using numerals and slashes.
5*6
NOT: पांच * छह "पांच गुना छह"
NOT: 5 गुना 6
For mixed numbers in math and units & measures, use numerals with "and".
When referring to items (not units or measures), write fractions out in words. With mixed numbers, write the whole number part out in words if it is under ten, otherwise write it with
numerals.
For mixed numbers that represent currency amounts, always use decimals.
तुम मुझे $2.50 दे सकते हो? "तुम मुझे ढाई डॉलर दे सकते हो"
शीतल ने यह घर ₹7.5 करोड़ का ख़रीदा है। "शीतल ने यह घर साढ़े सात करोड़ रुपये का खरीदा है।"
Transcribe percentages using numerals and the % sign. (In the unlikely case that you encounter a number of a million or greater used as a percentage, spell it out.)
1 मिलियन प्रतिशत
If a number appears in a context which calls for a certain formatting in your language, use that formatting. Otherwise, default to the general rule for transcribing numbers.
उसने 4.2 का बल्लेबाजी औसत रखा। "उसने चार दशमलव दो का बल्लेबाजी औसत रखा।"
Transcribe phone numbers using the most common format in the transcription language.
Transcribe phone numbers as you would write them down in their natural blocks. Do not use hyphens between blocks. When applicable, the STD code should be surrounded by spaces.
+91 9897 034 241 "प्लस नौ एक नौ आठ नौ सात शून्य तीन चार दो चार एक"
91 22 3988 3988 "नौ एक दो दो तीन नौ आठ आठ तीन नौ आठ आठ"
Transcribe alpha-digit sequences (product codes etc.) in their most natural way (possibly several ways accepted). Do not transcribe credit card numbers, etc.
XT 660 or XT660
If it really sounds like a math expression, then transcribe it with numbers and symbols, with spaces in between.
5 * 6 कितना होता है?
NOT: पांच बारी छह कितना होता है? "पांच गुना छह कितना होता है"
NOT: 5 गुना 6 कितना होता है?
When a speaker uses words like "dollar" without specifying a quantity, spell them out.
बस थोड़े रुपए
एक या दो ऑस्ट्रेलियाई डॉलर
मेरा परिवार 10 कि.ग्रा. आलू लेकर आया है। "मेरा परिवार दस कि आलू लेकर आया है।"
Transcribe all numeric values preceding units in numeral form, even if under 10.
वो मकान £1 करोड़ का है ।
NOT: वो मकान £1,00,00,000 का है ।
मैं वहां 6 महीने से हूं।
NOT: मैं वहां छह महीने से हूं।
If it is clear from context that a number or number sequence refers to currency or time, format it as such.
किलो
वर्ग किलोमीटर - किमी²
Write times in hh:mm format whenever possible, unless it would look unnatural to do so.
Address
Favor full spellings over abbreviations where natural, but use abbreviations when explicitly spoken.
मौसस, दिल्ली
धर्मशाला, राउरकेला
Web
Write URLs, email addresses, and Twitter hashtags as they are spoken and don't capitalize them.
http://123.com "h t t p colon slash slash one two three dot com"
Do not correct speaker errors such as transcribing a slash when the user actually says "backslash".
http:\\mail.yahoo.com "h t t p colon backslash backslash mail dot yahoo dot com"
If the speaker drops a "w" or dots and it's an obvious URL, you should correct these errors. If the speaker doesn't say the "w"s at all, do not add them.
If a URL is spelled out in individual letters, transcribe without spaces between individual letters.
Abbreviation
J. C. Penney Official brand name as seen in the privacy policy includes periods.
Agreed spelling
Spelling conventions for words where several options are possible, as well as for proper names.
Spelling out
If a word is spelled or obvious pauses are made between letters, spell it into letters as it is said (often done for foreign names or businesses, for example). Use lowercase letters for the
spelled-out portion. This rule does not apply to acronyms or initialisms, or to spelled-out web or email addresses.
Interjections
Transcribe words representing laughter or other non-speech vocalizations with up to three syllables, but no more.
heh, ha, haha, hahaha, hehe, hehehe, boo hoo, boo hoo hoo, lalala
हा हा हा "हा हा हा हा हा"
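The three-syllable cap on laughter can be expressed as a short filter. This is a hedged sketch: the function name and the set of laughter syllables are assumptions, and the cap is applied only to those syllables so that ordinary repeated words are still transcribed as uttered.

```python
# Syllables treated as laughter. This set is an assumption for the sketch.
LAUGHTER_SYLLABLES = {"हा", "हे", "ही"}

def cap_laughter(text: str, limit: int = 3) -> str:
    """Keep at most `limit` consecutive repeats of a laughter syllable."""
    kept = []
    run_length = 0
    previous = None
    for token in text.split():
        run_length = run_length + 1 if token == previous else 1
        previous = token
        if token in LAUGHTER_SYLLABLES and run_length > limit:
            continue  # drop the fourth and later repeats
        kept.append(token)
    return " ".join(kept)

print(cap_laughter("हा हा हा हा हा"))  # हा हा हा
```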
Ignore actual laughter that is included within speech. If the entire audio contains only laughter, use the [skip] tag in PeraPera or select the appropriate reason from the 'Cannot transcribe'
menu in Crowd Compute.
अरे! actual laughter followed by "अरे" with clear exclamatory intonation
आह ओह अरे अजी
हे भगवान हे राम हट हाहा
बाप रे अहा
Proper names
Use official spelling, capitalization, and punctuation for proper names. Google them and pay attention to the correct format. Official format and spelling of a proper name may supersede
the usual written transcription conventions detailed in this document.
Format proper names as they are most commonly formatted on the entity's website (especially official documents), if available, or the Wikipedia or IMDb page. In cases of ambiguity,
defer to their privacy policy. If no other sources are available, use the top Google hits.
YouTube
Burger King
Do not spell "Burger King" all in upper case as in the stylized form of the logo; stick to the official form as per the privacy policy.
NOT: BURGER KING
LEGO
NOT: Lego
The phrase "Ok Google", as well as possible derivatives such as "Ok Google Now" and "Ok Glass", require their own particular spelling of "okay". This spelling is unique to these cases.
Ok Google
Ok Google Now
Ok Google, आलू
Okay.
Okay, मोहित।
Media title
Refer to the Google Play Store for official spellings of media titles. For film/television, IMDb is also available. If an utterance is ambiguous between a media title and a sentence or web
search, use your judgment for which is more likely; if truly unclear, default to media title.
Write media titles as they are most commonly written. Movie titles and English book titles should be written in Devanagari.
दाग द फायर
NOT: Daag: The Fire
गेम ऑफ़ थ्रोन्स
NOT: Game of Thrones
गोदान
NOT: Godaan
हैरी पॉटर
NOT: Harry Potter
राजनीति
NOT: Raajneeti
िकक िफ के िदखाएं ।
NOT: "िकक" िफ के िदखाएं ।
Transcribe all media titles with original punctuation. In cases where original punctuation falls at the end of a sentence, do not transcribe sentence-level punctuation. That is, media title
punctuation trumps sentence level punctuation when in conflict. If a popular media title consists of an entire sentence but the official spelling is without punctuation, then don't
punctuate the title. If an utterance is ambiguous between a media title and a sentence or web search, use your judgment for which is more likely and treat it accordingly.
Treat foreign titles the same way as titles in the transcription language if you understand them.
Y Tu Mama Tambien
Multiple spellings
गए
"ए" instead of "ये"
NOT: गये
जाइए
"ए" instead of "ये"
NOT: जाइये
गई
"ई" instead of "यी"
NOT: गयी
छाई
"ई" instead of "यी"
NOT: छायी
गएं
"एं" instead of "यें"
NOT: गयें
आएं
"एं" instead of "यें"
NOT: आयें
गईं
"ईं" instead of "यीं"
NOT: गयीं
बाईं
"ईं" instead of "यीं"
NOT: बायीं
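The preferred spellings listed above lend themselves to a lookup table. The sketch below is illustrative only: the function name is hypothetical and the table covers just a subset of the listed pairs. A table is used instead of a general "य" rewrite rule because such a rule could wrongly change other words.

```python
import re

# Non-preferred spelling -> preferred spelling, taken from the examples above.
PREFERRED = {
    "गये": "गए",
    "जाइये": "जाइए",
    "गयी": "गई",
    "छायी": "छाई",
    "गयीं": "गईं",
    "बायीं": "बाईं",
}

# Runs of Devanagari characters, excluding the danda marks U+0964 and U+0965.
WORD = re.compile(r"[\u0900-\u0963\u0966-\u097F]+")

def normalize_spelling(text: str) -> str:
    """Replace each listed variant with its preferred spelling."""
    return WORD.sub(lambda m: PREFERRED.get(m.group(0), m.group(0)), text)

print(normalize_spelling("वो घर गये।"))  # वो घर गए।
```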
Use anuswara, ◌ं, instead of half म when the next character is any of the प-series consonants प, फ, ब, भ.
भूकंप
NOT: भूकम्प
चंबल
NOT: चम्बल
गंभीर
NOT: गम्भीर
Use anuswara, ◌ं, instead of half न or half ण when the next character is श, ष, स, or any of the क, च, ट, त series. The full set of these characters is श, ष, स, क, ख, ग, घ, च, छ, ज, झ, ट, ठ, ड, ढ, त, थ,
द, ध.
मंच
NOT: म
हिंदी
NOT: हिन्दी
नीलकंठ
NOT: नीलकण्ठ
रघुवंश
NOT: रघुवन्श
There is one exception to the above two rules. When the previous character is ◌ॉ, do not use anuswara, ◌ं .
सॉन्ग "सॉन्ग"
NOT: सॉंग
If you followed the above rules, सॉन्ग would transform into सॉंग. While the character sequence in the latter is actually स, ◌ॉ, ◌ं, ग, it looks like the sequence स, ◌ा, ◌ँ, ग, which has a different pronunciation.
सॉम्बर English word "sombre"
NOT: सॉंबर
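The two anuswara rules and the ◌ॉ exception can be combined into one regular expression. This is a minimal sketch under the assumption that the input is a single word in standard Unicode order; the function and constant names are hypothetical.

```python
import re

HALANT = "\u094d"    # ् (marks a half consonant)
ANUSWARA = "\u0902"  # ं
CANDRA_O = "\u0949"  # ॉ

LABIALS = "पफबभ"                 # consonants that trigger the half म rule
OTHERS = "शषसकखगघचछजझटठडढतथदध"  # consonants that trigger the half न / ण rule

# Half म before a labial, or half न / ण before one of OTHERS,
# unless the character just before the nasal is ॉ (the stated exception).
NASAL = re.compile(
    rf"(?<!{CANDRA_O})(?:म{HALANT}(?=[{LABIALS}])|[नण]{HALANT}(?=[{OTHERS}]))"
)

def use_anuswara(word: str) -> str:
    """Rewrite a qualifying half nasal consonant as anuswara."""
    return NASAL.sub(ANUSWARA, word)

print(use_anuswara("भूकम्प"))  # भूकंप
print(use_anuswara("सॉन्ग"))   # सॉन्ग (unchanged because of the exception)
```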
Always use anuswara ◌ं rather than chandrabindu ◌ँ, since the two are commonly interchanged.
लंगूर
NOT: लँगूर
हंसो
NOT: हँसो
आंसू
NOT: आँसू
हूं
NOT: हूँ
लड़कियां
NOT: लड़कियाँ
मुस्कुराएं
NOT: मुस्कुराएँ
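Because this rule has no exceptions, it reduces to a single character substitution. A short sketch (the function name is hypothetical):

```python
CHANDRABINDU = "\u0901"  # ँ
ANUSWARA = "\u0902"      # ं

def chandrabindu_to_anuswara(text: str) -> str:
    """Replace every chandrabindu with anuswara."""
    return text.replace(CHANDRABINDU, ANUSWARA)

print(chandrabindu_to_anuswara("हँसो"))  # हंसो
```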
If you hear a word that does not sound like a standard word of your language because there is a small sound change (e.g. accent, speech error, or speech impairment), transcribe the
intended word.
अभिषेक "अभिसेक"
Transcribe onomatopoeia when clearly spoken. Otherwise, use the [skip] tag in PeraPera or select the appropriate reason from the 'Cannot transcribe' menu in Crowd Compute.
If you hear a word that does not sound like a standard word of your language, but it is obviously based on real words, suffixes, or prefixes, transcribe as is.
If you hear a word that does not sound like a standard word of your language because it appears to be nonsense, first perform a Google search for the word. If there is a clear candidate,
transcribe that word.
रामगड User says "रामगड". This might sound like nonsense at first, but the transcriber guesses the spelling "रामगढ़" and is corrected by Google Search to "रामगड", a place in India. Transcribe रामगड.
भड़िया User says "भड़िया". Transcriber searches "बढ़िया", finds correct results. Transcribe भड़िया.
If a word appears to be nonsense and a Google search returns no clear results but it is easy to spell and articulated clearly, transcribe it anyway.
रजनाल
If a word appears to be nonsense, a Google search returns no clear results, and the word is unintelligible or there is no single obvious spelling, mark as [skip] in PeraPera or select the
appropriate reason from the 'Cannot transcribe' menu in Crowd Compute.
"कु रा"
[skip]
or similarly unintelligible word
Difficult utterances
Everything relating to problematic utterances (background noise, false starts, etc.) or different language varieties.
Skipping a prompt
The instructions in this section are for PeraPera. In Crowd Compute, instead of tagging as [skip] the utterances that cannot be transcribed, click the 'Cannot transcribe' button and
select the appropriate reason.
If the prompt is difficult to understand, listen to the audio several times to try to understand the speaker. If you can understand the speaker, transcribe the utterance. However, if after
replaying the audio you still cannot understand the speaker, skip.
Skip the utterance if it: contains at least some word(s) that cannot be understood; is in a different language typically not understood; contains no speech; contains only laughter; contains
singing; contains only synthesized speech (e.g. the voices of Google Now or Siri) and/or pre-recorded speech (e.g. TV or radio).
For utterances that contain both user-generated speech and pre-recorded or synthesized speech, transcribe user-generated speech and ignore the pre-recorded/synthesized speech.
कल का मौसम कैसा होगा? User asks, "कल का मौसम कैसा होगा?" Machine responds, "कल बारिश होगी"
If a prompt contains nonsense words, search them on the internet. If no clear results are found and the word is unintelligible (there is no single obvious spelling), [skip] it.
[skip] Speaker says, "जो भी करो सावधानी" and then says something unintelligible.
If the speaker sings, [skip]. Use the tag [music] if an entire utterance is music from an instrument, radio, TV, etc.
[skip] if audio contains only laughter. Ignore laughter that is interspersed with speech (transcribe only the speech).
Profanity should be fully transcribed. Under very rare circumstances, extremely offensive profanity can be skipped.
If the context of an alpha-digit sequence suggests it may be a password, credit card number, social security number, etc., then use [skip].
If a user repeats a sentence for the sake of the phone, format the repetition as a sentence if it's restating (as a sentence) what the person has said.
दिल्ली का मौसम दिखाओ। दिल्ली में मौसम कैसा है?
किस मशीन से बगीचे को साफ करोगे? बगीचे को साफ करोगे If the repeated phrase is part of the sentence that just happens to form a sentence on its own (possibly under a different interpretation), format it as a fragment. While "weed a garden" can be a command, it is ambiguous and is most likely a fragment in this context.
Complete words that have been truncated only if a very small portion of the word is missing (one syllable or less in a multisyllable word) and it is obvious what the word should be. In
cases of ambiguity, do not transcribe the cut-off word. Do not put punctuation at the end of truncated words.
If a truncation occurs mid-quote, use an end quotation mark even if there is possibly more intended content.
Transcribe repeated words as many times as uttered, but [skip] if a phrase is repeated more than five times.
इस घर में पांच पांच लोग रहते हैं। "इस घर में पांच अअअअ पांच लोग रहते हैं।"
For numbers, stick to what is uttered, even if you know this is not all the speaker is going to say.
Do not transcribe filler words unless intended by the speaker to be transcribed. Never lengthen them.
वो ना बहुत बदमाश है।
Only transcribe foreground speech. A user's speech may go from the foreground to the background or vice versa (determined by change in volume) and can be accompanied by change
in speaker audience.
बेंगलुरु Speaker says loudly, "बेंगलुरु" and then quietly, "आप बस बोलो और वो ढूंढेगा बेंगलुरु।"
शान होटल कहां है? हम वहीं जा रहे है ना? The speaker changes audience but not volume, so transcribe both sentences.
If one person clearly speaks in the foreground and someone speaks in the background, transcribe the main speaker and ignore the rest.
राधा को कॉल करो। Foreground speaker said, "राधा को कॉल करो।"; background speaker said, "मीरा को कॉल करो।"
If two people take turns, without overlap, and are both in the foreground at roughly the same volume, transcribe the speech of both speakers. Separate the dialogue of different speakers
with end punctuation.
तुम कहां गए थे? मैं मंदिर गया था। First speaker asked "तुम कहां गए थे?", other person answered "मैं मंदिर गया था।"
पानी पूरी। पर मुझे तो आलू टिक्की चाहिए थी। "पानी पूरी" is a fragment, but use a period to separate the speech of different speakers.
If two or more people are speaking at once with no one clearly in the foreground, tag as [overlapping]. Do this for overlaps longer than one second. Use this tag even when one person is
a bit louder than the other(s) and you can tell what they're saying.
Foreign language
Do not skip utterances that contain words in English. Most of them should be transcribed using Devanagari characters. Only use Latin characters for English words if they are
measurement units, URLs, company names or tech words.
हेलो
Use Devanagari characters for common words in English
NOT: hello
mp3, jpeg
Use Latin characters for technical words
NOT: ्३, पेग्
YouTube. Samsung. Gmail. Use Latin characters for company names that are not normally written in Devanagari
If words in a foreign language are included in a sentence of your target language, transcribe only if commonly understood by speakers of your language. Otherwise, [skip]. Foreign words
that are commonly used (and therefore should be transcribed) can include names of foreign foods or places, pop culture phrases like "capisce", and greetings or thank yous in prominent
world languages.
पिज्जा हम ओवन में बनाते हैं।
हेलो सर, क्या हाल है? In Hindi, common English words and phrases like "hello" should be transcribed if they are included in Hindi sentences.
The following tips will help you if you're using Chrome input tools extension (https://chrome.google.com/webstore/detail/google-input-tools/mclkkofklkfljcocdinagocijmpgbhab) to
generate Devanagari characters using an English keyboard. Please note that the correct characters may not be the first choice, so please choose the correct word in accordance with
these guidelines.
How to type two of the same consonants in a row using an English keyboard.
बच्चा bachcha
चम्मच chammach
छत्ता chatta
उज्ज्वल ujjval
चिट्ठी chitthi
त्याग tyaag
ज्वर jwar
प्यार pyaar
अच्छा acchaa
पर्वत parvat
मंत्र mantr
इंद्र indr
Accents
Correct non-standard pronunciations to their standard ones. Non-standard pronunciations could be from speakers of regional dialects, language learners, or speakers from different
countries.
बेंगलुरु
Person said "बंगलूर" but it should still be spelled as standard.
NOT: बंगलूर