
Catalog Quality Attribute Extraction Guideline

Last Updated - August 5th 2021

1. Intro to Commerce Attribute Extraction

What is an attribute?
An attribute (also known as an entity) provides descriptive characteristics that round out the look, feel, and details
of a product.

What is attribute extraction?


Attribute extraction is the process of extracting meaningful characteristic data from text and images.

Your Task
1. Evaluate product listings
2. Identify the attributes within a given product listing
3. Submit

2. Accessing the SRT Queue

Step 1 - Go to SRT
https://review.intern.facebook.com/intern/search_team

Step 2 - Navigate to the Correct Queue


Select the appropriate queue for your workflow according to your project.

Step 3 - Enqueue Jobs


Click “Save and Enqueue” to load jobs.

Step 4 - Just Go
Click 'Just Go' to enter the rating queue where you will be given a product listing.

3. SRT Components
# Component Description

1. Listing Image: A photo of the product(s) being offered.

2. Listing Information: Details including Product ID, Category, Category ID, and Possible Attributes.

3. Listing Description: A summary of what is being offered that can be used to highlight and extract attributes.

4. Preview: A duplicate of the Listing Description that can be used to copy and paste key terms for side searches.

5. Attribute widget(s): Where you will tag and extract the attributes related to the relevant attribute widget(s) displayed to you. Remember to scroll down until the end of the attribute widget(s) panel to ensure you have covered all attribute questions.

6. Common question: Once you have determined all the attribute widgets, you will conclude whether there are potential issues or if they can be successfully rated.

7. Actions: Where you will press “Submit” to conclude a job.

8. Job Info: Details about the status of the job, labeling queue, and job ID.

4. Attribute Labeling Process

Step 1 - Identify Potential Attributes

In this project, attributes come in a variety of types. The chart below details the 5 attribute types (referred to
as attribute widget(s) in section 3).

Please note the widget to be labelled will vary from product to product. For example, for one product you may have
to select Color and Size, while for another you only have to select Gender. The SRT queue will automatically display
the widget(s) which require annotation. Remember to scroll down until the end of the attribute widget(s) panel to
ensure you have covered all attribute questions.

Type Definition Examples Review Decision Tree Note

Type: Color

Definition: Select the most appropriate primary colour for the product among the 16 normalized colors in the drop-down options.

Examples: Black, Brown, Yellow, Blue, Gold, Orange, Gray, Bronze, Beige, White, Multi-Color, Purple, Silver, Red, Pink, Green

Review Decision Tree:
1. Can label?
   - Yes:
     ● Can label
   - No for following reasons:
     ● Images are not available and text lists multiple colors
     ● Image and text description are not aligned
     ● No, can not label due to other reasons apart from the ones listed above
2. What’s the color?
   - Drop-down for 16 normalized colors
   - Multi color
   - Irrelevant
   - No color found

Note:
Note that this is the primary color of an item, for example “red” for a red T-shirt with white letters on it. Color names can be missing from the description but if the image provides enough evidence, the answer should be submitted based on the image.

Color should be marked as “multi-color” when multiple colors are dominant in the image.

Color should be marked as “irrelevant” for products whose color is irrelevant for the user’s purchase decision. For example: the user does not base their shampoo or book purchase decisions on the colors of the products.

Color should be “No color found” when color is actually relevant to the product but is missing from images or any of the text fields (title, description, color) for the product.

For the two scenarios above, don’t select ‘No’ for cases where color is irrelevant/no color found. Select can label and these options are available in the color list dropdown.

If image and product description are contradictory (e.g. image shows red while description states ‘blue’), please select ‘No Image and text description are not aligned’ in the ‘Can label?’ section.
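
The Color decision tree above can be sketched in code. This is only an illustration under stated assumptions, not the SRT tool's logic: the function name, its inputs, and the order of the checks are hypothetical, the 'No: ...' strings paraphrase the options in the ‘Can label?’ question, and an empty image_colors list stands for "no image available or no usable color evidence in the image".

```python
from typing import List, Optional, Tuple

# Paraphrased answer options for the 'Can label?' question (assumed wording).
CAN_LABEL = "Can label"
NO_IMAGE_MULTI = "No: Images are not available and text lists multiple colors"
NOT_ALIGNED = "No: Image and text description are not aligned"


def decide_color(
    color_relevant: bool,
    image_colors: List[str],  # dominant colors the rater sees in the image
    text_colors: List[str],   # colors named in title, description, color field
) -> Tuple[str, Optional[str]]:
    """Return (answer to 'Can label?', answer to 'What's the color?')."""
    if not color_relevant:
        # e.g. shampoo or books: still 'Can label', with 'Irrelevant' as color
        return CAN_LABEL, "Irrelevant"
    if not image_colors and not text_colors:
        return CAN_LABEL, "No color found"
    if not image_colors:
        if len(text_colors) > 1:
            return NO_IMAGE_MULTI, None
        return CAN_LABEL, text_colors[0]
    if text_colors and not set(image_colors) & set(text_colors):
        # image shows red while the description states blue
        return NOT_ALIGNED, None
    if len(image_colors) > 1:
        return CAN_LABEL, "Multi-Color"
    return CAN_LABEL, image_colors[0]
```

For a red T-shirt with white letters, the rater would pass only the dominant color, for example decide_color(True, ["Red"], []), and the sketch answers 'Can label' with 'Red'.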

Type: Brand

Definition: Select the brand name which is a name given by the maker to a product or a range of products and under which it is sold.

Examples:
- Apple iPhone 6s: ‘Apple’ is the brand name
- Graco Pack ’n Play Totbloc: ‘Graco’ is the brand name
- Adidas Yeezy Boost 350 V2: ‘Adidas’ is the brand name

Review Decision Tree:
1. Can label?
   - Yes:
     ● Can label
   - No for following reasons:
     ● Image and text description are not aligned
     ● Brand is missing
     ● No, can not label due to other reasons apart from the ones listed above
2. What is the brand? (based on text fields, not image)
   If you can determine the brand name within the text, click on ‘highlight’ and hover over the listing to highlight the text. The text you highlight will automatically display in the answer box.
3. What is the brand mentioned in the image?
   - Free text. If you can determine the brand through any watermark or logo identified in the image only, please type this in the box.

Note:
Brand should be listed explicitly in the description text of the item or provided by the seller as a separate field.

When you are not sure if something is a brand, try a brief side search to figure out if terms are a relevant brand in context.

If the seller brand exists, that’s the answer, unless the seller-provided brand looks incorrect (for example, the brand is “No_brand”, a model name, or numbers).

There are scenarios where the brand information might be available in both image and text fields (i.e. description and title). In which case:
- Question 2: Extract the brand mentioned in the text fields
- Question 3: Extract the brand mentioned in the image
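
The seller-brand rule in the Brand note can be illustrated with a short sketch. The function names and the placeholder set are hypothetical. Only “No_brand” comes from this guideline, and deciding whether a value is really a model name still needs rater judgment or a side search, which the code does not attempt.

```python
import re
from typing import Optional

# Placeholder values treated as "not a brand". Only "No_brand" is named in the
# guideline; extend the set as an assumption if your queue shows other values.
PLACEHOLDER_BRANDS = {"no_brand"}


def seller_brand_looks_valid(seller_brand: Optional[str]) -> bool:
    """Heuristic check for the seller-provided brand field."""
    if seller_brand is None:
        return False
    value = seller_brand.strip()
    if not value:
        return False
    if value.lower() in PLACEHOLDER_BRANDS:
        return False
    # Reject values made only of digits, punctuation, or underscores.
    if re.fullmatch(r"[\d\W_]+", value):
        return False
    return True


def pick_text_brand(
    seller_brand: Optional[str], highlighted_brand: Optional[str]
) -> Optional[str]:
    """Prefer a valid seller brand, else the brand highlighted in the text.

    A result of None corresponds to 'Brand is missing'.
    """
    if seller_brand_looks_valid(seller_brand):
        return seller_brand.strip()
    return highlighted_brand
```

The brand read from a logo or watermark in the image is deliberately left out, because the guideline records it in a separate question.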

Type: Material

Definition: Select the primary material used to manufacture the product.

Examples:
- Apparel: Flannel, Cotton, Fleece, Velvet, Denim, Silk, Cashmere, Canvas, Leather, etc.
- Jewelry: Gold, Silver, Stainless Steel, Platinum, Copper, etc.

Review Decision Tree:
1. Can label?
   - Yes:
     ● Can label
   - No for the following reasons:
     ● Images are not available and text lists multiple materials
     ● Material is missing
     ● Image and text description are not aligned
     ● No, can not label due to other reasons apart from the ones listed above
2. Select material:
   - Drop-down contain materials to select from

Note:
If the product is made using multiple dominant materials, tag multiple materials.

If a product comes in multiple variants of different materials (i.e. silver vs. gold), only select the material of the specific variant in the job.

If the title or description does not mention anything related to materials, but you can very clearly see the material (glass, wood etc.) from the image, then label the materials from the drop-down options in question 2. For clothing materials, it might not be very obvious unless listed in some text form. In such cases where raters are not confident of the image, you should mark it as ‘material is missing’.

Don’t include packaging materials like plastic for cosmetics, shampoos etc.

Only tag the materials of the core product, in case of shampoos, you should tag the material as Other.

Think of it as when a user searches for a material, if showing the current product seems appropriate using your common sense.

For products which have metal plating, there are options for plating listed, yellow gold plating, silver plating etc. Select the appropriate plating material and base material if mentioned (Ex: Brass ring with gold plating. Materials should be Brass and yellow gold plating).

Type: Gender

Definition: Select the target gender the product is manufactured for.

Examples:
- Male (man, mens, male, Boys)
- Female (woman, womens, female, lady, ladies, Girls)
- Unisex (men & women, men or women, male & female, male or female)

Review Decision Tree:
1. Rater or skip?
   - Yes: I can rate this job
   - No: I cant rate this job
2. What is the gender?
   - Female
   - Male
   - Unisex

Note:
Depending on the type of item you are reviewing, the frequency of these labels will vary significantly. For instance when you are reviewing clothing many items will be gender specific and only some will be unisex. Whereas if you are reviewing electronics, almost all items will be unisex.

If the product information fields contradict each other you should prioritize the description. I.e. If the category includes “Women’s,” but the description explicitly says “men’s” or “for men” etc. then label the item Male.

There might be cases where you can not extract gender information from the text, and the image clearly shows indication of gender (e.g. male model), label the item Male.

Sanity check: These labels will be used for product filtering/navigation for online shopping. Ultimately, if you were shopping online and navigated by gender, where would you expect this item to show up? (i.e. when filtering clothing by “men” would you expect to see this product?)
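
The field-priority rule for Gender (description first, then other text fields, then the image) can be sketched as below. The keyword sets are built from the examples in this guideline plus a few obvious variants such as "men" and "women", which are assumptions. All function and parameter names are hypothetical, and a result of None simply means the rater has to decide using the sanity check.

```python
import re
from typing import List, Optional

# Keywords from the guideline's examples, plus assumed variants.
MALE_WORDS = {"man", "men", "mens", "male", "boy", "boys"}
FEMALE_WORDS = {"woman", "women", "womens", "female", "lady", "ladies", "girl", "girls"}


def gender_from_text(text: str) -> Optional[str]:
    """Return 'Male', 'Female', 'Unisex', or None if the text gives no signal."""
    # Drop apostrophes so "men's" becomes "mens", then split into whole words
    # (whole-word matching keeps "female" from matching "male").
    cleaned = text.lower().replace("'", "").replace("’", "")
    tokens = set(re.findall(r"[a-z]+", cleaned))
    has_male = bool(tokens & MALE_WORDS)
    has_female = bool(tokens & FEMALE_WORDS)
    if "unisex" in tokens or (has_male and has_female):
        return "Unisex"
    if has_male:
        return "Male"
    if has_female:
        return "Female"
    return None


def decide_gender(
    description: str,
    other_fields: List[str],
    image_gender: Optional[str] = None,
) -> Optional[str]:
    """Description wins over other text fields; the image is the last resort."""
    for text in [description, *other_fields]:
        found = gender_from_text(text)
        if found is not None:
            return found
    return image_gender
```

With the guideline's own example, decide_gender("Running shoes for men", ["Women's Shoes"]) returns "Male" because the description is checked before the category.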

Type: Size

Definition: Select part of the text that describes the dimension of the product.

Examples:
- Baby & Toddler Sizes: Preemie, Newborn, 3-6 Months, etc.
- Kids’ Sizes: Small, Medium, Large, 4, 5, 6X, 7, 8, 12, etc.
- Shoe Sizes: 2, 2.5, 3, 3.5, etc.
- Beauty Product Sizes: Sample Size, 1 oz, 2-ounce, 2-3 oz.
- Ring Sizes: 6.5, 13, 4, 11.5, etc.

Review Decision Tree:
1. Can label?
   - Yes:
     ● Can label
   - No for following reasons:
     ● Contains multiple variants in description
     ● Conflict between title, description, seller size
     ● Size is missing
     ● No, can not label due to other reasons apart from the ones listed above
2. What is the size? (based on text fields, not image)
   If you can determine the size from the product description, click on ‘highlight’ and hover over the listing to highlight the text. The text you highlight will automatically display in the answer box.
3. What’s the normalized size? (based on text fields, not image)
   - Drop down contain normalized options to select from
4. What’s the size mentioned in the image?
   - Free text. If you can determine the size from the image only, please type this in the box.

Note:
Normalised sizes are the list of standardized terminology we want to capture across all the products.

For example, a product might have ‘Small’ in its description, select ‘S’ in the normalised sizes drop down (question 3) and highlight ‘Small’ in the description for question 2.

Note that Small, small, s, S should be selected as ‘S’; XL, xl, Xl, Xlarge, X Large, Extra Large should be selected as ‘XL’. 2xl should be selected as XX Large.

Select One Size only when title/description/seller size explicitly mentions it.

When the size is not mentioned in any of the normalized fields, select the “Size is missing” option in the Can label? question. In this case, still capture it in the free text highlight question 3.

When the description has sizes available from S-L or S,L,M but doesn’t clearly mention size of this particular product that the user is going to buy, select “Contains multiple variants in description” in the Can label? question.

It's possible that the description has all available sizes but title, seller size might have one size variant. In such cases, raters should select that particular variant as size.

There are scenarios where the size information might be available in both image and text fields (i.e. description and title). In which case:
- Questions 2 and 3: Extract the size mentioned in the text fields
- Question 4: Extract the size mentioned in the image
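
The size normalisation examples in the Size note amount to a lookup from the highlighted text to a drop-down value. The sketch below covers only the spellings this guideline lists. The function name is hypothetical, and any other spelling returns None rather than guessing at drop-down options that are not documented here.

```python
import re
from typing import Optional

# Only the mappings stated in the guideline; keys are lowercased with
# spaces, hyphens, and underscores removed.
SIZE_ALIASES = {
    "s": "S",
    "small": "S",
    "xl": "XL",
    "xlarge": "XL",
    "extralarge": "XL",
    "2xl": "XX Large",
}


def normalize_size(highlighted: str) -> Optional[str]:
    """Map highlighted size text (question 2) to a normalized size (question 3)."""
    key = re.sub(r"[\s\-_]+", "", highlighted.strip().lower())
    return SIZE_ALIASES.get(key)
```

For example, normalize_size("X Large") and normalize_size("Extra Large") both give 'XL', while the rater still highlights the original wording for question 2.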

Step 2 - Evaluate the Listing


Once you have determined all the attribute widgets, you will conclude whether there are potential issues or if they
can be successfully rated. There are a number of scenarios you might come across when you evaluate the listing
(referred to as Common question within section 3):

Action Description

1. Can Label: None of the other labels apply and you can label the job.

2. Wrong Language: Some or all of the job content is in a language you do not understand, preventing you from understanding its core purpose and correctly extracting all the attributes required.

For example, the product description might have 1 sentence in French, but the rest is in English and you can see gender and colour in the English part of the description. In this case, label the gender and colour part, and select ‘Can label’ for the right hand side box. If the job also asks for a brand which you might not be able to extract due to the foreign language element, select ‘No can not label for other reasons apart from the ones listed above’ in the brand box.

If the job description has some or all content in foreign language which means you can not extract any of the attributes required, then select this option ‘Wrong language’ in the box on the right.

3. Widget Error: The job fails to load any information - your screen is blank and you do not see the SRT interface.

4. Violating content: The product image, description, or title contains or potentially contains any of the following:
- Child Exploitation and/or Child Nudity
- Self Injury and Suicidal Content
- Credible Threats, Violence or Calls to Violence
- Sexual Content and/or Nudity
- Hate Speech
- Acts of Terrorism
- Human Trafficking
- Bullying and Harassment

Escalate to your manager immediately and report the Job ID.

Step 3 - Finish

Once all attributes are correctly tagged, click “Submit” in the “Actions” section to finish.

5. Examples

Example 1: Color
Color should be marked as multi-color when both colors are dominant in the image (see below).

The pictures below are not multi-colored. The primary color is black for the pair of shoes and blue for the dress.

Example 2: Highlighting a text

Using the screenshot above, we can determine that the brand of the listing is “Overton”.

1. First, select ‘Can label’ in the ‘Can label?’ question.

2. Select ‘Highlight’.

3. Highlight the text by hovering over the listing.

4. The text should automatically appear in the highlighted value box. There is no indication of the brand
from the job image, so we can leave this question blank.

5. Select ‘Can label’ in the ‘Common question’ section.

6. Select ‘Submit’.
