CS Exam GuidelinesV1.2
Satisfaction Guidelines
A guide to providing satisfaction ratings for search results
Version 1.2
Contents

Introduction
  Search Needs and Satisfaction
  The Query
  Steps in the Grading Process
  Definitions
Result Validation
  Wrong Language
  Content Unavailable
  Inappropriate
Satisfaction Principles
  Satisfaction Scale
  Degrees of Separation
  Think About the Meaning, Not Just Matching Words
  Consider User Effort
  Consider Source Quality
Overview of Result Types
  Web Results
  Apps
  Maps
  Stocks
  Dictionary
  Weather
  Sports
  News
  Web Images
  Web Video
  Answers and Knowledge
  Flights
  Movies/TV Shows/Books/Music
How to Assign Ratings
  When to Grade Highly Satisfying (HS)
  When to Grade Satisfying (S)
  When to Grade Somewhat Satisfying (SS)
  When to Grade Not Satisfying (NS)
Grading Specific Situations & Result Types
  Ambiguous Queries (Multiple Interpretations)
  Locale Sensitivity
  English Results in Non-English Locales
  Redirected Pages
  Apps
  News
  Maps
  Web Video
  Dictionary, Stocks, Weather, Knowledge / Answers, Sports
  Web Results (also called Suggested Web Sites)
  Web Images
Common Grading Mistakes
  Failing to Use Web Search
  Failing to Visit Destination Page
  Ignoring Time and Place
  Ignoring Conceptual Distance
  Ignoring Relevance Grading Principles
Examples: Satisfaction Rating
  Highly Satisfying
  Satisfying Examples
  Somewhat Satisfying Examples
  Not Satisfying Examples
Other Aspects Related to Search Satisfaction Grading
  Overall Preference Rating (OPR)
  Writing Comments
  OPR & Comment Examples
Introduction

A search service may return many different types of results. How are these graded? What is a satisfying search result? In these guidelines we talk about what constitutes a search query, the different types of results, and how to grade them. In addition, we describe some typical grading tasks that use the principles learned in satisfaction grading.

Search Needs and Satisfaction

Search engine users are trying to accomplish a task (or achieve a goal) that requires some information or quick access to some other resource, such as an app.

A user's information need or search need is defined as the information or resource that the user needs in order to accomplish their task. The user's query is an attempt to express that need to the search engine. If the search results enable the user to accomplish their task, we say that the search need is satisfied.

We say that a result is satisfying if it satisfies the search need of a query. Results can be more satisfying or less satisfying depending on how well or how completely they satisfy the need.

A search need → a search query
A search query → results returned

You may assume all searches are made on an Apple iOS mobile device.
Figure: A query and its associated information in the grading interface.

The Query

The grading interface displays each query together with additional information that provides useful context. As shown in the figure above, this includes the following components:
• The query itself
• Web Search links you will use to research the possible intents and interpretations of the query
• The language of the user. We do not want to return results in other languages

Steps in the Grading Process

The grading of results consists of the following steps.
1. Click on the Google and Bing web search links and scan the results to make sure you understand what the query is about. Keep in mind queries can have more than one meaning.
2. Validate the result to make sure it can be graded, as explained in the Result Validation section. Following step (1) is crucial for correct validation.
3. Assign the satisfaction rating per the guidelines outlined in
   • Relevance Principles
   • Assigning a Satisfaction Rating
   • Special Situations

Examples:
• Query is “fac,” result is “facebook.com”. Grade as if the query was “facebook.”
• Query is “ted cruise,” result is a wikipedia page about U.S. senator Ted Cruz. Grade as if the query was “ted cruz.”
Definitions

The following terms are used throughout these guidelines:

Term: Named Entity
Definition: A person, place, organization, business, product, service, or event whose name would normally be capitalized in English. (This includes fictional entities.)
Examples:
• Stephen Curry
• Yellowstone National Park
• Jupiter
• Médecins Sans Frontières
• Starbucks
• Post-It Notes
• Skype
• Super Bowl LI
• Boxer Rebellion
• Frodo Baggins

Term: Knowledge Term
Definition: A word or phrase describing a concept or object of study (other than a named entity) that users may wish to learn more about. Knowledge terms may come from any field of study, including: science, technology, mathematics, medicine, history, philosophy, literature, art, economics, etc. They are most often noun phrases, but may also be other parts of speech.
Examples:
• photosynthesis
• elephant
• ROC curve
• linear algebra
• cancer
• oligarchy
• veto
• existentialism
• metaphor
• impressionism
• interest rate

Term: Official Site
Definition: A website provided by a named entity (or their employer or organization) that represents how they want to be presented to the world online.
Examples:
• Microsoft (company): www.microsoft.com
• U.S. Internal Revenue Service (government organization): www.irs.gov
• Taylor Swift (performer): www.taylorswift.com
• Henry Louis Gates Jr. (professor at Harvard University): https://aaas.fas.harvard.edu/people/henry-louis-gates-jr
Wrong Language

A result is in the wrong language if it is neither in English nor in the language of the user's locale.

However, there are a few exceptions that are NOT considered wrong language results:
1. Result (e.g. amazon.co.jp) is the same country-specific site as requested by the query (“amazon.co.jp”), even if the requested site is not in your locale.
2. Query and result are in the same language, even though it's not the primary language for this locale.
3. User is visiting another country, query is for a local business or attraction, result is in the language of the visited country (i.e. where query was submitted), and there is no equivalent result in the user's own locale language.

Content Unavailable

Flag result as content unavailable in any of these situations:
• A result is a web/news or videos result but does not show a page when clicked.
• Result requires log-in or subscription to access, specifically where the user would be able to see the content of the page by logging in, but you cannot.
• The browser presents a dialog box warning of a privacy or security issue on the page.
• Required information for this result type is missing (e.g. no distance shown for Maps result).

⚠ Even if there is enough content to provide a rating but the page is behind a pay-wall/log-in, please check the Content Unavailable flag.
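The wrong-language rule and its three exceptions can be read as a short decision procedure. The sketch below is only an illustration of that reading, not part of the grading tool: the function name, the parameter names, and the use of language codes such as "en" are all assumptions, and each exception is reduced to a yes/no judgment that the grader still has to make by hand.

```python
def is_wrong_language(result_lang, locale_lang, *,
                      same_country_site_as_query=False,
                      query_lang=None,
                      local_result_while_visiting=False):
    """Return True if the result should be flagged Wrong Language.

    All names here are hypothetical; the inputs are the grader's own judgments.
    """
    # Base rule: English and the locale language are always acceptable.
    if result_lang in ("en", locale_lang):
        return False
    # Exception 1: the query asked for this country-specific site.
    if same_country_site_as_query:
        return False
    # Exception 2: query and result are in the same language.
    if query_lang is not None and query_lang == result_lang:
        return False
    # Exception 3: user is visiting another country, the query is for a local
    # business or attraction, and no equivalent result exists in their language.
    if local_result_while_visiting:
        return False
    return True
```

For instance, a Japanese page shown to a user in an English locale is flagged, unless the query itself asked for that country-specific site.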
Inappropriate

A result is considered inappropriate if it has any of the following: pornography, adult advertising/services, sex toys, illegal drugs, hate speech, gambling, spam/phishing, pirated content (including those posing as free video streaming services), or gore/shock.

In general, we want to connect users with useful content for their topic of interest while protecting them from being exposed to harmful information summarized below.

• Hateful: the result should not advocate discriminatory content that intentionally attacks someone's dignity. This can include references or commentary about religion, race, sexual orientation, gender, national/ethnic origin, or other targeted groups.
• Violent or harmful: the result should not intentionally incite imminent violent, physically dangerous, or illegal activities, nor provide information that leads to immediate harm.
• Sexually explicit: the result should not have overtly sexual or pornographic material, defined by Webster's Dictionary as "explicit descriptions or displays of sexual organs or activities that are principally intended to stimulate erotic without sufficient aesthetic or emotional feelings."
• Contradicting expert consensus on public interest topics: the result should not contradict well-established or expert consensus on a popular topic or issue. This includes misleading or inaccurate information.
• Spam: Results that are malicious, deceptive, or manipulative. Examples: pages that contain phishing schemes, install viruses, or attempt to artificially boost their relevance (e.g., link farming, keyword stuffing, etc).
• Results that do not contain original and useful content. Examples: pages with content scraped from Wikipedia or otherwise automatically-created content.
• Illegal: We also manually remove reported results in those circumstances that are required by law in the corresponding locale (e.g., images of child abuse, content related to sex trafficking, copyright infringement, etc.) and when action is required to keep people safe (e.g., involuntary posting of sensitive personal information, etc). Movie streaming sites such as those posing as free movies are also part of this category.

⚠ Content that might otherwise be considered inappropriate is acceptable if it occurs in a medical, educational, fine art, or journalistic context, and should not be flagged (e.g. Wikipedia).

Examples
• User searched for [tinyzone] and the result is https://tinyzonetv.to/ which contains pirated content.
• User searched for [sdc.com] and result is http://sdc.com/, or user searched [olga 24k gold] and the result is https://www.lelo.com/blog/olga-24k-gold-review/. Both results contain adult advertising and should be flagged. Irrespective of whether the user was searching for this, these results need to be flagged.
Satisfaction Scale

When judging how satisfying each result is, you'll use the following scale:

Highly Satisfying (HS)
Almost all users would want to see this result. It's authoritative, accurate, up-to-date, and addresses the most likely search need(s). If the user is asking a specific question, the result gives the correct answer clearly and concisely.

Satisfying (S)
Many users would be interested in seeing this result. Satisfying results often provide supplementary information that is one step away from the query topic. For example, if the query is a restaurant, it might be a review of the restaurant; if the query is a company, it might be the current stock price, or news about the company.

Somewhat Satisfying (SS)
Some users may find this result useful, but it's probably not what most searchers were looking for. It's often only indirectly related to the search need or assumes an uncommon interpretation of the query.

Not Satisfying (NS)
This result has nothing to do with the query, or provides incorrect information, and should not be shown. All results flagged as “Inappropriate”, “Content Unavailable”, or “Wrong Language” should be rated as Not Satisfying.
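The link between the validation flags and the scale can be sketched in a few lines: any flagged result ends up Not Satisfying, whatever it might otherwise have earned. This is a minimal illustration under assumed names (`SCALE`, `VALIDATION_FLAGS`, `final_rating`); the grading interface itself is not described in these terms.

```python
# The four ratings, ordered from best to worst.
SCALE = ("HS", "S", "SS", "NS")

# The three validation flags named in the scale description.
VALIDATION_FLAGS = {"Inappropriate", "Content Unavailable", "Wrong Language"}


def final_rating(rating, flags=()):
    """Return the rating to record, forcing NS when any validation flag is set."""
    if rating not in SCALE:
        raise ValueError(f"unknown rating: {rating!r}")
    if VALIDATION_FLAGS & set(flags):
        return "NS"
    return rating
```

A result that would have been HS on content alone is still recorded as NS if, for example, the Content Unavailable flag is checked.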
Degrees of Separation

Results are often associated with concepts in the real world, and different concepts are connected by their relationships.

For example, the concept of the singer Beyoncé
• is related to the concept of her album Lemonade,
• which in turn is related to a review of the album in Rolling Stone magazine,
• which is related to the author of the review, Rob Sheffield.

Each time we pass through one of these relationships, we increase the distance from the original concept.

Table: Degrees of Separation
The singer's official site and Rob Sheffield's Twitter.
Somewhat Satisfying: A Rolling Stone magazine review of the album.
Not Satisfying: The reviewer Rob Sheffield's Twitter. Random article from same issue of Rolling Stone.

We can think of these relationships as degrees of separation, so in this example, the review of the Lemonade album is two degrees of separation from Beyoncé.

When grading results, each degree of separation from the concept mentioned in the query, that is, the number of relationships you have to traverse to get to the result, lowers the grade by one level. See table above.
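The "one level per degree" rule can be written as a small lookup. The sketch below assumes that a result with zero degrees of separation starts at HS, which matches the Beyoncé example (two degrees gives SS, three gives NS) but is not stated as a formula in these guidelines; the names `SCALE` and `grade_for_distance` are hypothetical.

```python
# The four ratings, ordered from best to worst.
SCALE = ("HS", "S", "SS", "NS")


def grade_for_distance(degrees, starting_grade="HS"):
    """Lower the starting grade by one level per degree of separation.

    The grade never drops below NS, the bottom of the scale.
    """
    if degrees < 0:
        raise ValueError("degrees of separation cannot be negative")
    index = SCALE.index(starting_grade) + degrees
    return SCALE[min(index, len(SCALE) - 1)]
```

With the example above, the album review (two relationships away from the singer) comes out as SS and the reviewer's Twitter account (three away) as NS.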
Think About the Meaning, Not Just Matching Words

Note that some highly satisfying results may not contain all (or even any) of the query words; what matters is the meaning. For example:
• The result www.premierleague.com/home is highly satisfying for the query “english premier league soccer” even though that result doesn't contain the words “english” or “soccer.”
• The result https://music.apple.com/us/album/25/1544494115 is highly satisfying for the query “adele's third album,” even though it doesn't contain the word “third.”

It's also possible for a result to contain all the query words and not be

Consider Source Quality

Sources of results, including web sites and news providers, can have large differences in quality. When you are grading a result, particularly if the user's query is looking for specific information, pay attention to the quality of the source (see table “Source Quality”). For example, if you are interested in getting news about an event that happened in a certain city, a story in that city's newspaper is generally more reliable than a blog post by a random person who doesn't live there. If the source of a result is low quality, you should assign a lower grade than you would have otherwise.

Table: Source Quality (columns: High Quality, Low Quality)
Maps

These results help the user navigate to a place. Usually they have address and distance from the user. If it's a business it often has hours of operation.

Apps

Cards that take the user to the Apple app store (or open an app on the device). Usually they have an icon of the app and the star ratings.
Stocks

This card provides financial information related to stocks. It should show the ticker symbol, the company name and the stock price. When the user interacts with this card, detailed stock information such as historic price graphs is displayed.

Dictionary

This card shows the definition of a word. When the user interacts with this card it provides detailed usage.

Weather

This card shows the temperature of a location (and sometimes other weather conditions). When the user taps this card, they are shown detailed multi-day weather forecasts.

Sports

These cards are meant to display sports scores, or latest scores for a team (and dates of upcoming matches).
News

These are often types of web results that are restricted to news sites (sports, fashion, political and so on). They usually have an age-of-news indicator at the bottom. They are designed to be clicked on and take the user to the destination news site.

Web Video

The user can click on these results, which play a video (usually taken from video channels such as YouTube and Vimeo).
Web Images
Movies/TV Shows/Books/Music

Cards that provide the user a very rich experience, for example to watch movies/TV shows, learn about the cast, social media links, links to media related sites (e.g. IMDB), listen to music, get lyrics for songs, read books. They usually show a picture, popularity ratings etc.
Almost all users would want to see this result. It's authoritative, accurate, up-to-date, and addresses the most likely search need(s). If the user is asking a specific question, the result gives the correct answer clearly and concisely.

• News results can never be HS, because people have different preferences for where they get their news, so we can’t say that almost all users would want to see a given story.
• Results for advice or recommendation queries (e.g., “how to lose weight”, “chicken parmesan recipe”, “best beatles song”, “thai restaurant”) can never be HS, because we don’t know if almost all users would agree with the recommendation.

Rule 1
Query: App Query
Result: Official App
Description: Query is the name of a well-known app; result is the app with that name.
Examples:
a. Query is “facebook”, result is the Facebook app.
b. Query is “calculator,” result is the built-in Calculator app.

Rule 2
Query: Business
Result: App Regularly Used to Interact with Business
Description: Query is the name of a business; result is an app regularly used to interact with that business. See details under “Apps” in “Additional Guidance”.
Examples:
a. Query is “b of a,” result is the Bank of America mobile banking app.
b. Query is “dominos,” result is the Domino’s Pizza app, which allows users to place orders.

Rule 3
Query: Maps Query
Result: Closest Map
Description: Query is looking for a specific location / business / institution / point of interest, or the closest example of a chain business / type of business, and the result showed that location on a map. Queries with a map intent often have a distance qualifier, e.g. "nearest", "closest", "near me". Also, such queries often relate to businesses where one must physically go, e.g. gas stations, cinema halls.
Examples:
a. Query is “1234 market street sf”; result is a Map for that exact address.
b. Query is “new york public library”; result is a Map to that location.
c. Query is “larry and joe’s”; result is a Map to a restaurant with that name in the same town where user is located.
d. Query is “closest lowe’s”; result is a Map showing the Lowe’s store location closest to the user’s location.
e. Query is “starbucks”; result is a Map showing the closest Starbucks branch.

Rule 7
Query: Exact Question
Result: Explicit Correct Answer
Description: Query is asking for a specific piece of information that has a simple right answer, and the result showed that information directly without the need for further user action.
Examples:
a. Query is “when did wwi end,” result is a direct answer or info card that says “November 11, 1918”.
b. Query is “dodgers score,” result is a sports info card that shows the current score of the Dodgers’ baseball game in progress, or (if no game is in progress), the final score of the most recent game they played.
c. Query is “msft quote,” result is an info card showing the latest stock price for Microsoft (which has the stock symbol MSFT).
d. Query is “jet blue 334,” result is an info card showing the current status of that airline flight.
e. Query is “define attenuated,” result is an info card showing the definition of that word.
f. Query is “weather boston”, result is an info card showing current weather for that city.
Rule 4
Query: Creative Work
Result: Performer/Creator
Description: Query is the name of a creative work (music album, movie, etc.); result is a representation of the creator/performer (e.g., artist’s official site).
Examples:
a. Query is “fleabag,” result is https://en.wikipedia.org/wiki/Phoebe_Waller-Bridge, the wikipedia page about the creator and star of that television series.

Rule 6
Query: Named Entity
Result: News
Description: Query is a named entity, result is an authoritative page (other than official online presence) providing news about that entity.
Examples:
a. Query is “facebook,” result is news story “Facebook agrees to pay FTC $5 billion fine for various privacy violations,” dated the same day the search was performed.

Rule 7
Query: Exact Question
Result: Embedded Correct Answer
Description: Query is asking for a specific piece of information with a simple right answer, and the result contains that answer, but the user has to take an action (e.g., follow link to destination page and read it) to get the answer.
Examples:
a. Query is “barack obama age,” result is https://en.wikipedia.org/wiki/Barack_Obama.
b. Query is “cambridge library hours,” result is https://www.cambridgema.gov/cpl/hoursandlocations.
Rule 3
Query: Company/Product/Named Entity
Result: Related Site/Video/App
Description: Query is the name of the entity; result is not their official website, but is a site, page, video, or app related to their business. For example, this might be a 3rd party site about that company or its products, or a site for a competing product or service.
Examples:
a. Query is “zillow”, result is the video “Living Large in a Tiny Home” from Zillow’s YouTube channel.
b. Query is “sonicare” (brand of electric toothbrush), result is website for Oral-B (a competing brand of electric toothbrush).
c. Query is “billy idol” (singer), result is wikipedia page for Generation X, a band from the 1970s he was in before he became famous.

Rule 4
Query: Named Entity or Event
Result: Stale but Valid News Story
Description: Query is the name of an event or named entity; result is a news story about an earlier event or earlier news about the entity. The news story must still be valid.
Examples:
a. Query is “super bowl news,” result is a news story “Patriots Come from Behind to Defeat Falcons in Super Bowl LI.” The story is still accurate, but it describes something that happened in 2017, not in the most recent or upcoming Super Bowl.

Rule 5
Query: General Query
Result: Overly Specific Result
Description: Query is the name of a general concept or event (such as a TV show); result is about a specific instance of that concept or event (such as a particular episode of that show).
Examples:
a. Query is “dogs”, result is wikipedia page for the dog breed Beagle.
b. Query is “suits” (a TV show that ran for 9 seasons), result is https://www.peacocktv.com/watch-online/tv/suits/8003089882869075112/seasons/5, a page where viewers can stream the 5th season.
This result has nothing to do with the query, provides incorrect information, or fails the validation step, and should not be shown.
Dominant Interpretation Exists.
When one interpretation is much more popular than the others.

… interpretation, you should grade using the normal guidelines.
2. The query is "apple", result is a map result for the apple store near the user, but not the closest. Grade as S, since the dominant interpretation of the query is the technology company.

Secondary Interpretation: If a result would be relevant (HS/S/SS) for a secondary interpretation, you should grade it as “SS”.
2. Query is “american eagle”, result is home page of web developer americaneagle.com. Grade as SS (rather than HS), since the dominant interpretation of the query is clothing retailer American Eagle Outfitters.
3. Query is “golden retriever”, result is a song titled Golden Retriever. Grade as SS (rather than S/HS), since the song is not the dominant interpretation of the query. The dog breed is the dominant interpretation for this query.
Implicitly Locale-Sensitive.
Query does not explicitly ask for results in a particular locale, but the user need is inherently locale-specific (e.g., local law information, country-specific merchant sites, nearby real-world business).
Grading: Any results from a different locale (even if they’re in the correct language) should be automatically graded as “NS”.
Example: Query is “ticketmaster”; user is located in US. Result is ticketmaster.co.uk. Grade as NS, since user did not express any interest in UK events.

… locale, but those in other locales may be somewhat less useful.
• “S” results should be downgraded to “SS”
• “SS” results should be downgraded to “NS”
… provide different medical advice for their residents, the UK's advice would be less useful to a US resident than
English Results in Non-English Locales

English is a widely-understood second language in many countries, and all our international graders are fluent in it. For this reason, rather than simply marking an English result in a non-English locale as wrong language, graders should go ahead and grade the result, with the following locale-specific considerations. You will need to use your own knowledge of the locale to decide which guideline to apply.

Table: English Results in Non-English Locales

Scenario: The user’s locale is one where most users understand English fluently (e.g. ES-US) and would likely be interested in English-language results.
Grade: Grade the result normally, the same way you would if it were in the locale language.

Scenario: The user’s locale is one where many users understand English fluently (e.g. Western
Grade: Grade the result one level lower than you would if it were in the locale language.
⚠ Results that would have been NS should still be graded as NS.

Scenario: The user’s locale is one where relatively few users understand English fluently and would be unlikely to be interested in English-language results.
Grade: Grade the result as NS.
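The three scenarios in the table amount to a small adjustment applied to the grade the result would otherwise receive. The sketch below is one hedged way to express that; the labels "most", "many" and "few" for the locale's English fluency, and the function name, are assumptions introduced here, and deciding which label applies remains the grader's judgment.

```python
# The four ratings, ordered from best to worst.
SCALE = ("HS", "S", "SS", "NS")


def grade_english_result(normal_grade, english_fluency):
    """Adjust the grade of an English result shown in a non-English locale.

    english_fluency is "most", "many" or "few": how many users in the
    locale understand English fluently (a hypothetical label set).
    """
    index = SCALE.index(normal_grade)
    if english_fluency == "most":
        return normal_grade                            # grade normally
    if english_fluency == "many":
        return SCALE[min(index + 1, len(SCALE) - 1)]   # one level lower; NS stays NS
    if english_fluency == "few":
        return "NS"
    raise ValueError(f"unknown fluency level: {english_fluency!r}")
```

So an English result that would otherwise be HS is graded HS, S, or NS depending on which of the three scenarios fits the locale.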
Redirected Pages

If the result's displayed URL gets redirected to a different URL, then you should grade the page you're redirected to as if that were the result.
Apps

When a user clicks these results it takes them to the app store (usually the Apple app store) or opens the app if present on the device.

Table: App Rating Guidance

Rule: Rule 1 under HS refers to cases where the query is the name of a well-known app: a service that is best known as an app.
Additional Details:
Examples: Instagram, Spotify, and Candy Crush
⚠ A well-known app is not the same thing as a well-known company!

Rule: Rule 3 under HS refers to cases where the query is a business and the result is an app “regularly used to interact with that business.” Meaning, the app is a common way that customers or clients perform the ordinary tasks they need to do business with that company.
Additional Details:
1. If the query is the name of a bank, then the app should allow the user to perform mobile banking tasks.
2. If the query is the name of a restaurant chain, then the app should allow the user to order food at that restaurant.
3. If the query is the name of an airline, then the app should allow the user to make reservations, choose their seat assignment, and check flight status.
4. If the query is the name of a retail chain, then the app should allow the user to browse and purchase items sold by that chain.

⚠ Just because a company has an app does not mean that it’s regularly used to interact with that business. For example, the query “dell” refers to the name of a computer company. But their app “Dell@Retail 2019” is described as “a chance for our global retail partners to immerse themselves in the design, performance, and vision driving Dell’s innovation.” This app is NOT used regularly by Dell’s customers and should NOT be graded HS.
News
News articles usually have the word News prepended to them. They are specific web results that link to news websites.
[Figure: A news item result with the recency below the title]
• The relevance grade for a news article depends in part on the amount of time between the date the search was done and the date of the article.
• The search date is shown in the result preview itself.
• Keep in mind validity flags (Inappropriate, Wrong Language, and Content Unavailable).
Grading Time-Sensitive News Articles
Current Event | Timely Article: up to 3 months older than the search date | Either S or SS if it's about the query topic.
Current Event | Stale Article: more than 3 months older than the search date | May never be graded better than SS even if it's about the query topic.
Historical Events | Time sensitivity does not impact the relevance grade of the results for these types of queries. Examples of historical events are Notre Dame fire, Harry and Meghan wedding, Sandy Hook shooting, Pope Benedict resigns, etc.
⚠ You might see articles with dates in the future! For these rare occurrences, grade it the same way as a timely article, as long as the date is not more than 3 months newer than the search date.
⚠ News items are never HS. Why? One news organization, even one reporter, may actually write several stories about the same event. Maybe one person wants to get an overview of an event while another wants the latest updates. Or one person only likes stories from Fox News while another prefers MSNBC. For these reasons, we can't say that a given news story is one that almost everyone wants to see. So it is a mistake to rate a news result as Highly Satisfying.
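The time-sensitivity rules for news can be read as a cap on the best grade a news result may receive. The following is only a rough sketch of that reading, not part of the official guidelines. The function names (`shift_months`, `max_news_grade`) are hypothetical, "3 months" is interpreted as calendar months, and the cap of S for historical events is inferred from the rule that news items are never HS. Whether the article is actually about the query topic still has to be judged by the grader.

```python
import calendar
from datetime import date


def shift_months(d, months):
    """Return d moved by a number of calendar months (day clamped to month end)."""
    idx = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(idx, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


def max_news_grade(search_date, article_date, historical_event=False):
    """Best grade a news result may receive: 'S', 'SS', or None.

    None means the article is dated more than 3 months after the search
    date, a case the rules above do not cover.
    """
    if historical_event:
        # Time sensitivity does not apply, but news is never HS (assumed cap).
        return "S"
    if article_date > shift_months(search_date, 3):
        return None
    if article_date >= shift_months(search_date, -3):
        # Timely (or slightly future-dated) article: S or SS.
        return "S"
    # Stale article: never better than SS.
    return "SS"


if __name__ == "__main__":
    search = date(2022, 12, 16)
    print(max_news_grade(search, date(2022, 10, 1)))  # timely
    print(max_news_grade(search, date(2022, 6, 1)))   # stale
```

The function only returns an upper bound; a timely article that is weakly related to the query can still be graded SS or lower.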
Maps
The relevance of Maps results depends in part on the distance from the user. You should check to see if the info card has distance displayed. If not, this result cannot be judged.
Maps Results
Grading Maps
Business | Maps result is correct and near the user, but is not the closest one | S
Business | Maps result is correct, and is still accessible to the user but is not close. | SS
• People looking for expensive, rarely purchased items (cars, furniture, etc.) are generally willing to travel longer distances to find the right one than people looking for inexpensive, common items (e.g., a cup of coffee). So if the query is Lexus dealer, a result 30 miles away might be S (or even HS if it's the closest match), while if the query is donuts, it would be NS.
• People living in sparsely populated rural areas are generally willing to travel longer distances than people in cities. If the query restaurants is issued in Wilsall, MT (population 237), then a result 39 miles away in Bozeman (population 39,860) might be S. But if the same query were issued in New York City, a result 36 miles away in Greenwich, CT would be NS.
3. Keep in mind Intent and Distance! For some queries, users are looking for a Maps result. For other queries, they aren't. If a Maps result is shown for a non-Maps intent query, then grade it as NS. Use the distance to guide you. If a Maps result is very far away, that's often a sign that the user was not looking for a map.
• Query is "prime video" and result description is: "prime time video, 2511 springs rd ne, hickory, nc 28601 - distance: 529 mi"
• Query is "Lakers" and result description is: "great lakes brewing company, 2516 market ave, cleveland, oh 44113 - distance: 2,165 miles"
Web Video
• If a query specifically refers to a particular video (e.g., lemonade official video, stepanov elements of programming lecture), the desired result should be graded as Highly Satisfying regardless of its popularity.
• For other results, and for more general queries where many different video results could satisfy the user's need (e.g., guitar lesson), then popularity may factor into your decision; you may want to grade a video with millions of views higher than a similar one with only a handful.
• When deciding on your grade, think about whether video results are what the user is looking for when typing the query.
⚠ You are not required to watch the entire video to arrive at a rating.
Dictionary, Stocks, Weather, Knowledge / Answers, Sports
Grade these cards based on what is visible. The grader cannot click on them, but a user is provided self-contained snippets of information, which can often be interacted with to learn more (e.g., the Stock card opens up to show historic price graphs).
• Dictionary: Is the user seeking a definition or a concept? If the card precisely answers the need, this is Highly Satisfying. In all cases it must be the correct interpretation for that word.
• Stocks: check for correct stock symbol and presence of price.
• Weather: the result's location should match the location specified in the query (e.g. weather boston), or the user's location if location is not mentioned in query.
• Answers: If the query is an explicit question, see HS7. Grade on what is visible.
Web Results (also called Suggested Web Sites)
Please click on the thumbnail and grade the destination page (after redirects).
Web Images
A group of web images should be graded as a single result. Check to see if all the images have the following properties:
[Figure: example web image result set. Query: Men in Black]
1. Image displays correct subject. The image must actually show the subject of the query. For example, if the query is dodecahedron, the image must actually show that geometric figure and not some other one. Missing images (or ones that do not load) do not have this property.
2. Subject clearly shown. All images in the set must clearly show the subject of the query. The subject should not be blocked, out of focus, too far away, or otherwise difficult to see clearly.
3. Subject is focus of image. In cases where the image includes multiple people or objects, it should be clear who or what is the subject of the query. (For example, if the query is Joe Biden, it's fine to have people in the background of a picture of President Biden giving a speech, but it's not fine to have a picture of Presidents Biden and Macron shaking hands.)
4. Image shows representative version of subject. For example, if the query is the name of a currently popular actor, the image should show that person as they look today (or how their character looks in a currently popular movie), not how they looked many years ago. If the query is the name of a famous person from the past who is no longer alive, the image should show them as they were best known. For example, if the query is Richard Nixon, a picture should show him during the time he was U.S. president, not 20 years later when he was near the end of his life.
5. No duplicates. The images in the set should all be different.
If ALL the images have all of the above properties, grade the result Highly Satisfying. Otherwise, downgrade the results as shown in the table below:
If... | Then...
All images exhibit all properties | Grade as Highly Satisfying
All but 1 or 2 images in the set exhibit all properties | Grade as Satisfying
Examples:
• Query is David Beckham, result is set shown above. It has all the desired properties, so you would grade as Highly Satisfying.
• Query is dodecahedron (a geometric shape); result set is shown on the left below. Neither the second image nor the last image in this set is a dodecahedron, so they violate property #1. Therefore you would grade this Not Satisfying.
• Query is taffy brodesser-akner (an author); result set is on the right below. Two of the images in the set are problematic; one shows part of a poster for an event featuring the author, and another shows her with another person, both partly cut off. Neither of these violates property #1 because both attempt to represent the author and not something else that would confuse or mislead the user, like a picture of a different author. But
[Figure: Web image results for query “taffy brodesser-akner”]
[Figure: Web image results for query “dodecahedron”]
Failing to Use Web Search
1. Misunderstanding Query Meaning. The query may be a common word that you think you know. But the web search may show that the primary meaning is something entirely different.
• Example: Query is "canada goose"; result is the wikipedia page about that kind of bird. If you had not heard of the Canada Goose clothing brand, you might assume that the bird page is what almost all users would want to see. But by looking at the web search results, you can tell that this is not the case.
3. Falsely Assuming Dominant Interpretation. If you have heard of a result, you may assume that it's the dominant interpretation. But this is not always true.
• Example: Query is "u of m scholarships," result is a page about scholarships at the University of Michigan. A grader who knew nothing about the subject might conclude that this is a great result, and rate it Highly Satisfying. But looking at the web results shows that the query has no dominant intent. It might be referring to the University of Minnesota, or the University of Manitoba, or many other things. Therefore the grade cannot be HS.
Ignoring Conceptual Distance
• Example: Query is "dog," result is wikipedia page about the welsh corgi, a particular breed of dog. This is too specific.
• Example: Query is "new england patriots news," result is home page for a regional sports news network that covers many different sports teams in New England, not just the New England Patriots. This is too general.
2. Wrong Level of Web Page. Pages on a given web site often form a hierarchy, with a home page for the site, subpages for different topics, sub-sub-pages, and so on. A common mistake is not to notice that a page is too high or too low in the hierarchy, compared to what the user is looking for.
• Example: Query is "us passport information"; result is www.state.gov. This page is too high in the hierarchy of this web site. It is about everything the U.S. State Department does (diplomatic relations, trade policies, etc.), not just passports.
• Example: Query is "us passport information"; result is a page from the U.S. State Department about what to do if your passport is lost or stolen. This page is too low in the hierarchy of the site. The user never said anything about their passport being lost or stolen; in fact, we don't even know if the user already has a passport.
3. Ignoring Degrees of Separation. Graders often ignore the principle of degrees of separation. A result that's associated with the thing the user is looking for is not the same as the thing the user is looking for.
• Example: Query is "chez panisse," result is Yelp's page of reviews for that restaurant. This is a very useful result, but it is not Highly Satisfying, because it is one degree of separation from what the user was looking for.
Ignoring Relevance Grading Principles
1. Matching Words Instead of Meaning. Graders sometimes forget the principle "Think about meaning, not just matching words." Just because the query words appear in the result does not mean the result is a good one, and just because the query words are missing does not mean the result is a bad one.
• Example: Query is "far alone," result is a page containing the inspirational quote "If you want to go quickly, go alone. If you want to go far, go together." The result contains both query words, but they match only incidentally. It's clear that this is not what the user was looking for, and in fact the web search results show that "Far Alone" is the name of a song.
2. Rating News Results Highly Satisfying. When a news event happens, it is often reported by many different news organizations, whether it's local TV stations, newspapers, or major news networks. Furthermore, one news organization, even one reporter, may actually write several stories about the same event. Maybe one person wants to get an overview of an event while another wants the latest updates. Or one person only likes stories from Fox News while another prefers MSNBC. For these reasons, we can't say that a given news story is one that almost everyone wants to see. So it is a mistake to rate a news result as Highly Satisfying.
• Example: Query is brittney greiner sentencing and result is a timely news article about the event on the news website theguardian.com. Although this result is about the topic, it should not be Highly Satisfying because it is a news result.
3. Ignoring Basic Definitions of Grading Scale. A common mistake is to ignore the basic definitions of each grade and only look at the individual rules. The rules are meant to illustrate the definitions in different situations, not to replace them. If you're faced with a grading situation where you don't see a rule that applies, just go back to the definitions: Is this a result most users would want to see? Etc.
• Example: Query is el pais (name of several newspapers, including one in Cali, Colombia and one in Madrid, Spain); user is in Colombia but result is for a more popular one in Madrid, elpais.com. There's no rule about matching similarly-named results in different countries, and the guidance about locale-sensitivity doesn't exactly address this example. It's clear that the Spain result is not what most Colombian users are looking for, but it might be useful to some. By definition, that means it's Somewhat Satisfying.
Query | Result(s) | Rating | Explanation
facebook | facebook.com, Facebook app | HS | Since it's both a company and an app, both of these are "official" results that most users would want to see. (HS4 & HS1)
top english soccer league | Home page of the Premier League, premierleague.com | HS | The Premier League is the top English soccer league. Note that this is a result most users would want to see even though it doesn't use the words "English" or "Soccer." (HS4)
how many stomachs does a cow have | | HS | The result (knowledge card with the answer) immediately gives the user all the information they asked for. (HS6)
beat the bomb | official website: https://beatthebomb.com | HS | Almost all users searching for a business or service would want to see its official web site. (HS4)
french open highlights | https://www.youtube.com/channel/UCF3K1Jf8hjFW8qliei8fQ3A | HS | Result is the official Roland Garros (French Open) YouTube channel. Although there is no specific rule for this case, it clearly satisfies the definition of Highly Satisfying.
saw
Satisfying Examples
Query | Result(s) | Rating | Explanation
instagram.com change pass | Official instructions on how to change instagram password | S | The query is asking an implicit question (how to change Instagram password). This web page has the authoritative answer, but the user has to click on the result to visit the page in order to see the answer. (S7)
camden county college | Home page for library at the college | SS | Probably not what most users were looking for. (If they had wanted the library, they would have mentioned it in the query.)
farmers hawaii [user is in Texas] | | SS | about a different state, so is not likely what most users would want to see.
bts [searched in 2022] | 2018 video of interview with the band | SS | A very popular interview with BTS and a TV show host, but not very relevant given that it is several years old, and several newer interviews are available.
cao [user is in Florida] | Irish website about applying to undergraduate programs in Ireland. | SS | There is a grocery chain in Florida called CAO, so it's unlikely that the user had the Irish website in mind.
alica schmidt | https://hotsportsgirls.com/alica-schmidt/ | SS | Query is about a German track and field star, so the most satisfying results will be about her competitions, her athletic achievements, etc. In contrast, this result is solely about her physical appearance, which will be of interest to only some searchers.
fleeting meaning | | SS | Definition of a related word but not the word the user asked for
how many weeks has it been since march 25th [query issued in April 2021] | https://www.answers.com/Q/How_many_weeks_has_it_been_since_April_27_2009 | NS | Despite matching some words in the query, this result is for a totally different year and does not give the user any useful information. (NS6)
After providing satisfaction ratings for every result, you will be asked to choose
which side you prefer. This is called the Overall Preference Rating (OPR).
The rating scale is About the Same, Slightly Better, Better and Much Better.
OPR Criteria:
Use the following criteria to decide on the OPR:
1. Prefer the side whose results have higher satisfaction grades.
2. If there are multiple results, prefer the side where results with higher
satisfaction are ranked higher.
3. If there are multiple results, prefer the side with a more varied result set. This might be a variety of result types (maps, apps, web pages, etc.), satisfying a variety of meanings of the query.
4. Note that the side with more results is not necessarily better.
5. If you are having trouble deciding which side is better, choose About the Same.
How much these criteria affect OPR also depends on the position of the result. For example, if the satisfaction ratings of the results in position 1 are different, that should have a bigger impact on OPR than if the satisfaction ratings of the results in position 4 are different.
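To illustrate why a difference in position 1 matters more than the same difference in position 4, here is a rough sketch of a position-weighted comparison. It is an illustration only: the guidelines give no numeric weights or thresholds, so the grade values in `GRADE_VALUE`, the 1/position weighting, and the function names are all assumptions made for this example. The OPR itself remains a judgment call that also considers diversity and the other criteria above.

```python
# Hypothetical numeric values for the four satisfaction grades.
GRADE_VALUE = {"NS": 0, "SS": 1, "S": 2, "HS": 3}


def weighted_score(grades):
    """Sum grade values, weighting position p by 1/p (an assumed weighting)."""
    return sum(GRADE_VALUE[g] / pos for pos, g in enumerate(grades, start=1))


def side_difference(left, right):
    """Positive favors the left side, negative favors the right side."""
    return weighted_score(left) - weighted_score(right)


if __name__ == "__main__":
    base = ["S", "S", "S", "S"]
    better_at_1 = ["HS", "S", "S", "S"]
    better_at_4 = ["S", "S", "S", "HS"]
    # The same one-step grade difference counts more at position 1.
    print(side_difference(better_at_1, base))
    print(side_difference(better_at_4, base))
```

A weighting like this only captures criteria 1 and 2 (higher grades, ranked higher); it says nothing about result diversity or about when two sides are close enough to be About the Same.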
Writing Comments
You might be asked to leave a comment (written in English) for why you chose the OPR. These are very helpful to the clients of the grading task. Comments help explain the reasoning behind the rating for complex grading tasks, especially in locales the clients don't understand.
Poor Comment (left): I came to the conclusion that the left side offers more suitable results and therefore should be rated as better
Excellent Comment (right): The query intent is Yahoo News and is most likely to visit the main page of headlines of the queried website. The 1st and 2nd results are the same on the both sides. The rest of the results are similar on both sides showing some specific pages from sports, entertainment and weather categories on Yahoo News website and there is a little better news among them (R5) on the right than the left which is a breaking news from domestic news category. Thus the right side is slightly better due to better relevance and freshness.
The comment on the left can be improved by providing reasons why the left is "more suitable".
For the comment on the right, the writer states the presumed search need and then goes on to describe how the results help meet that need and ultimately why they chose one side over the other.
OPR & Comment Examples
Query 1: tdecu
Location: Richwood, TX
LEFT | RIGHT
Official TDECU Digital Banking App | Official TDECU Digital Banking App
TDECU Mortgage Simplified App | TDECU Mortgage Simplified App
bank) with two branches near the user. We can assume the user wants to either do a bank transaction, go to the bank, or get information about the bank.
The official app, the official website, and the map results for the nearest locations are all Highly Satisfying. The map results appear on the left but not the right, while the official website appears on the right but not the left.
The left side addresses three search needs (it satisfies people looking for the main app, the mortgage app, and the map) while the right addresses four (the main app, the mortgage app, the web page, and the Twitter feed). So the right has a slightly more diverse result set. However, the user gave no indication that they were interested in the Twitter feed, so this is a very unlikely intent.
Since we don't know whether more people are interested in the map or the official site, the two sides are About the Same.

OPR Explanation: The query could refer to a clothing store or a kind of fuel.
• Two out of three results are the same on both sides, so they aren't that different.
• The left side has a wrong language result, which is Not Satisfying to users.
• The right side ranks the diesel fuel result higher, showing both likely interpretations near the top.
• The right side has more diversity of result types (web pages and maps, instead of only web pages).
Since there are multiple reasons to prefer the right side, that side should be more than Slightly Better. But since the lists aren't that different, it's not Much Better. So we choose Better.

Query 3: apollo project
Location: Cincinnati, OH on Feb. 13, 2020.
LEFT | RIGHT
Apollo Space Program wikipedia article (en.wikipedia.org/wiki/Apollo_program) | Apollo Space Program wikipedia article (en.wikipedia.org/wiki/Apollo_program)
Project Apollo - Moonlight Richards 50 songs to the moon, an Apollo 11 space mission tribute [Apple Music result] | Apollo Global Video Project: Les Twins of Sarcelles by Apollo Theater, Harlem [YouTube video]
OPR Explanation: The query refers to the space program from the 1960s that first put a human on the moon.
• The first two results are the same on both sides.
• Both result sets have three types of search results.
• The third result on the left is only vaguely related to the Apollo space program. It seems unlikely that someone searching for apollo project would find an obscure artist's ambient music useful in satisfying their search need.
• The third result on the right is not at all related to the Apollo space program; it has something to do with a project of the Apollo Theater. Based on the web results, it's extremely unlikely that this was the user's intended interpretation of the query.
Since only the last result is different, and the last result on the left is less bad than the one on the right, we conclude that the left side is Slightly Better.
query was on Feb. 13, 2020, we assume the user wanted the most recent award winner at the time, announced at the ceremony on February 8, 2020.
• Result #1 on the left (same as #2 on right) contains the answer, but requires visiting the page and scrolling all the way to the bottom to find it. Result #1 on the right gives us the answer right away, without even having to click on it.
• Result #2 on the left is a YouTube video from a non-authoritative source (a random fan), and it's very outdated (from 2011).
• Result #3 on the left is related to best actor winners, but doesn't actually contain the answer the user is looking for.

OPR Explanation: The query can refer to many different things or people, and the web search results make it clear that none of them is a dominant interpretation. Furthermore, these results all seem to be only Somewhat Satisfying, since it isn't likely that most users in the United States were searching for (say) an Indonesian app or an Israeli singer from the 1990s. Therefore the two sides are About the Same.

LEFT | RIGHT
Official video for Ramos' 2021 song "Blessings" | NBC News article from February 2021
Official video for Ramos' 2021 song "Say Less" | Anthony Ramos instagram page
OPR Explanation: The query refers to an actor and singer who appeared in the original cast of the musical Hamilton.
• Results L1, R1, and R4 are all Highly Satisfying. All the rest of the results on both sides are Satisfying.
• The set on the right is more diverse, providing more different types of results.

OPR Explanation: Both sides have the same results, but they are ranked differently. Since the search was done in 2021, it's most likely that the new 2021 documentary about Tina Turner (Tina) is what the user was looking for. Since the only difference is the ranking, and the right side ranking is clearly better than the left side (moving the best result into position #1), it's Better.
Query: monster hunter stories 2
Location: Miami, FL on 2021-08-10.
LEFT | RIGHT
Wikipedia entry for the video game Monster Hunter Stories | Wikipedia link to Monster Hunter Stories 2: Wings of Ruin
OPR Explanation: The user specifically asked for Monster Hunter Stories 2. The left side has a more general result (it's about the entire video game series), while the right is about the exact thing the user asked about, so the right is Better. To be Much Better, the right side

OPR Explanation: Both sides have the brief Knowledge card describing the person (with links to her official website and twitter feed). The left side also has web videos for two of her songs, while the right side also has her official website and Twitter feed. Results R2 and R3 are more valuable than L2 and L3, but the lack of any videos makes the right side only Slightly Better.
Query: sunrise
Location: West Melbourne, FL on 2021-09-01
LEFT | RIGHT
Weather Info card for West Melbourne (with sunrise/sunset times) | A website selling the domain name http://www.sunrise.am
App store link for sunrise/sunset times | Weather Info card for West Melbourne (with sunrise/sunset times)
Knowledge Info card about the topic Sunrise | Knowledge Info card about the topic Sunrise
OPR Explanation: Both have same third result. Both have the same Highly Satisfying info card, but it's ranked better on the left. Of the remaining results, the one on the left might be useful, while the one on the right is Not Satisfying. Both of these differences favor the left side, so it is Better.

Location: Paxtonia, PA 2021-09-22.
LEFT | RIGHT
Official website | Official UK website
Twitter handle | Huffington Post News App
OPR Explanation: The user is looking for the news site Huffington Post. Official website, app, and Twitter feed are all Highly Satisfying. The UK site is Somewhat Satisfying. Left is better due to more satisfying results.