
SIDE BY SIDE JUDGMENT GUIDELINES

Part 1. General information


You will be presented with a query, the query intent, and two sets of image
results to compare. You will then be asked which side you prefer and why.

Please try to place yourself in the user’s shoes when making a judgment. We
are evaluating the usefulness and ‘attractiveness’ of results, going beyond
relevance.

Judgments will be based on the multiple factors listed below. Use the query
intent to better understand the query context and interpretation before
judging which collection of images is better.

Additionally, use the Bing/Google search URLs if you need additional context or
information to understand the query and what images should be served. Use the
search links only to understand the query; don’t use them to inspect each
image individually.

IMPORTANT: When judging a side, pay close attention to the first few images.
Images that go last are not as important as the first ones when choosing a
label.

Part 2. Judgment Process

If one side is completely blank, please check Part 4 for instructions.

Start here:

1. Did the majority of the images load?
   NO: label the hit "Didn't Load". The majority of the images did not load
   on either or both sides. This is NOT the same as a completely blank side.
   YES: go to step 2.
2. Is it possible to judge this hit?
   NO: label the hit "Can't Judge". You don't understand the query after
   investigation, the query is foreign, etc.
   YES: go to step 3.
3. Decide which side is better, based on the judgment factors listed below
   (in the order of their importance).
4. After going through each step, select the one most important judgment
   factor that determined your decision. Please mark ONE factor that was the
   main reason why you chose one side over the other.

Judgment labels, to indicate which side is better

Slightly better: If it takes some looking and getting into technicalities,
then the side should be labeled ‘slightly better’.

Better: If at first glance it is clear which side's results are better, then
go with ‘better’.

Judgment factors in the order of their importance

Adult/Racy/Offensive/Violent: Results are adult, suggestive, violent, or
politically or religiously offensive. Detrimental results even though the
query is not; something you would not expect to see on TV.

Relevance: Results are relevant and satisfy at least one query intent.
Unreadable diagrams qualify as irrelevant (not useful). If both sides look
irrelevant, judge based on other factors and go with "slightly better" instead
of "much better".

Technical Image Quality: Images have good contrast; no artifacts or
watermarks detracting from the experience. Overexposed, underexposed, or
blurred images would downgrade a side.

Freshness/Trendiness: Fresh results. This does not apply to all cases.
Ex: Pictures of a celebrity with a stale look.

Diversity of Presentation: Great variety of images clearly presented. Absence
of duplicates or near-duplicates; different perspective or light; different
pieces or elements of the subject.

Other: Another important deciding factor not mentioned above. Aspects of the
image not captured in any of the other labels but important for determining a
winner. Examples: Unique object arrangement, perspective, viewpoint;
composition or combination of colors making the picture stand out.

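The process in Part 2 can be sketched in code. This is only an illustration under stated assumptions, not part of the guidelines: the names `choose_label` and `main_factor`, the boolean inputs, and the idea of picking the first factor in the list on which the two sides differ are all hypothetical. In practice the rater decides which factor was the main reason.

```python
from enum import Enum
from typing import Dict, Optional


class Label(Enum):
    DIDNT_LOAD = "Didn't Load"
    CANT_JUDGE = "Can't Judge"
    SLIGHTLY_BETTER = "Slightly better"
    BETTER = "Better"


# Judgment factors in the order of importance given in Part 2.
FACTORS = [
    "Adult/Racy/Offensive/Violent",
    "Relevance",
    "Technical Image Quality",
    "Freshness/Trendiness",
    "Diversity of Presentation",
    "Other",
]


def choose_label(majority_loaded: bool, can_judge: bool,
                 clear_at_first_glance: bool) -> Label:
    """Walk the checks in order: loading, judgeability, strength of preference."""
    if not majority_loaded:
        return Label.DIDNT_LOAD
    if not can_judge:
        return Label.CANT_JUDGE
    # Clear at first glance -> "Better"; needs closer inspection -> "Slightly better".
    return Label.BETTER if clear_at_first_glance else Label.SLIGHTLY_BETTER


def main_factor(sides_differ_on: Dict[str, bool]) -> Optional[str]:
    """Assumption: report the most important factor on which the sides differ."""
    for factor in FACTORS:
        if sides_differ_on.get(factor, False):
            return factor
    return None
```

For example, `choose_label(True, True, False)` yields `Label.SLIGHTLY_BETTER`, and `main_factor({"Relevance": True, "Other": True})` yields `"Relevance"` because relevance ranks higher in the list.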
Part 3. Judgment Table

Winning Side                    Losing Side

Blank                     VS    Irrelevant
Relevant                  VS    Blank
Blank                     VS    Multiple Non-Famous People
Same Non-Famous Person    VS    Blank

Part 4. Important Concepts


Blank Side vs. Non-Blank Side

In cases where one side is blank, investigate whether the side with images is relevant. If the side with
images is relevant, give a win to that side. If the images are irrelevant and would waste the user’s time if
displayed to them, give a win to the blank side (this includes cases where there are multiple, non-famous
people). The final rating in this case will always be a full “Better” to one side or another.

In the above case, the left side is relevant, so the final rating would be “Left Side Better”.
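The blank-side rule can be written as a small decision function. This is a sketch under assumptions: the function name `judge_blank_case` and its boolean parameters are hypothetical, and whether the non-blank side is relevant remains the rater's own judgment.

```python
def judge_blank_case(left_blank: bool, right_blank: bool,
                     non_blank_side_relevant: bool) -> str:
    """Apply the blank vs. non-blank rule; the result is always a full 'Better'."""
    if left_blank == right_blank:
        # The rule covers only the case where exactly one side is blank.
        raise ValueError("exactly one side must be blank")
    blank_side, shown_side = ("Left", "Right") if left_blank else ("Right", "Left")
    # Relevant images beat a blank side; irrelevant images lose to it.
    winner = shown_side if non_blank_side_relevant else blank_side
    return f"{winner} Side Better"
```

With the left side relevant and the right side blank, `judge_blank_case(False, True, True)` gives "Left Side Better", matching the case described above.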

Non-Celebrity Intent

Not Relevant: Multiple people in different pictures
Relevant: Multiple images of the same person

In cases where the user is searching for a person who is not a celebrity, a side is only relevant if it
shows multiple images of the same person. In cases where multiple people are shown in different pictures,
the side should be seen as irrelevant.

In the left image above, there are multiple (random) people with the name Ben Koehler. This should be
seen as irrelevant. In the right image above, there is a single person with the name Christina Hicks in
multiple pictures, and it should be seen as relevant.
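The non-celebrity rule can be illustrated with a short check. This is a sketch under assumptions: the function name `is_relevant_non_celebrity` is hypothetical, each string in `people_shown` stands for the rater's own impression of who appears in one image, and treating any mix of different people as irrelevant is a simplification for the example.

```python
from typing import List


def is_relevant_non_celebrity(people_shown: List[str]) -> bool:
    """Relevant only if there are multiple images and all show the same person."""
    multiple_images = len(people_shown) > 1
    same_person = len(set(people_shown)) == 1
    return multiple_images and same_person
```

A side where every image is judged to show the same individual passes this check, while a side showing several different people who share the queried name does not.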
