Classify Intent of The Query
Adult Classification
Queries that have adult intent and which yield sexual or violent content should be marked as ADULT. Both of these criteria
must be met for a query to be considered ADULT. Please note that in some rare cases you may see queries that are searching
for child pornography. If you encounter one of these, please mark it as ADULT and report a technical issue as well. Queries
seeking strictly informational results on sexual or violent content, or for which search engine results are strictly or primarily
informational, do not fall into this category.
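The ADULT rule above is a conjunction with an informational exception, which can be sketched as a small decision function. This is only an illustration of the rule's logic, not a tool the guidelines provide: the inputs are the rater's own judgments, and the names `AdultJudgment`, `adult_intent`, `results_sexual_or_violent`, `results_primarily_informational`, and `is_adult` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AdultJudgment:
    """Rater judgments about one query (hypothetical field names)."""
    adult_intent: bool                      # the query has adult intent
    results_sexual_or_violent: bool         # the query yields sexual or violent content
    results_primarily_informational: bool = False  # results are strictly or primarily informational


def is_adult(judgment: AdultJudgment) -> bool:
    """Return True only when both ADULT criteria are met and the exception does not apply."""
    if judgment.results_primarily_informational:
        # Informational results do not fall into the ADULT category.
        return False
    return judgment.adult_intent and judgment.results_sexual_or_violent
```

Under this sketch, a query with adult intent whose results are not sexual or violent is not ADULT, and neither is a query whose results are primarily informational.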
Junk Classification
Queries that contain gibberish should be marked as JUNK, even if only a portion of the query is gibberish. A query that
consists entirely of legible words should still be marked as JUNK if it has no clear intent. However, minor misspellings
should NOT be marked as JUNK. Simple removal of diacritics (e.g. é, à -> e, a) should NOT be marked as JUNK. URL fragments and
email addresses should NOT be marked as JUNK as long as it is clear that they refer to actual websites or email
addresses. Keyboard errors, such as forgetting to turn on foreign typing settings, should NOT be marked as JUNK. In general,
if there is ambiguity, you should defer to the intention of the query rather than the query text. If the query has a
legitimate intention, it should NOT be marked as JUNK.
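The diacritics rule above can be checked mechanically. The following is a minimal sketch, assuming Python and its standard `unicodedata` module. The helper names `strip_diacritics` and `differs_only_by_diacritics` are hypothetical and not part of these guidelines. It only tests whether two strings differ by diacritics alone. Deciding what the user intended remains a rater judgment.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks, e.g. 'é' -> 'e' and 'à' -> 'a'."""
    # NFD splits each accented character into a base character plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def differs_only_by_diacritics(query: str, intended: str) -> bool:
    """True if the two strings are equal once diacritics and case are ignored."""
    return strip_diacritics(query).casefold() == strip_diacritics(intended).casefold()
```

For example, `differs_only_by_diacritics("hawai", "hawái")` is true, so a query typed without the accent would not be JUNK on that basis alone.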
These are JUNK:
houw do I logg in343zfto face bokasf333324g4gggggggggggggggg
- This query contains a legitimate intent, but also has many spam characters interspersed
hwawl4jhkl44jk;lhl
- This query consists only of meaningless spam characters
this book book book can be book book book
- This query consists of actual English words, but has no meaning
axabraxasmaxas123.com login
- This query searches for a URL that does not exist as typed and that, if assumed to be misspelled, has no reasonably
interpretable target for its intent.
?? Not sure abt this one
C:\user\documents\programfiles txt
- This is not a query a user would type into a search engine, and as such it should be marked as JUNK
These are NOT FOREIGN (assuming an expected language of ENGLISH, but your market might be different):
how do I write “hi” in Chinese From: Washington, United States
- Although the user expects to see the word “hi” in Chinese, he/she is almost certainly looking for English guides on
the topic.
comprar una caña de pescar en Honolulu hawái From: Hawaii, United States
- Although this is entirely in Spanish, it is from the United States, and is requesting instructions on how to purchase
a fishing pole in the state of Hawaii, which is something that belongs in the en-us market.
breaking bad season 1 From: Beijing, China
- The user is searching for an American TV show in English, so it would be best to send this to an English market
even though it was searched from China.