Local Curation - Business Validation V2: A. Experiment Overview


Local Curation - Business Validation

V2

A. Experiment Overview
1. OVERVIEW

We want to determine if a given entity is a real-world public place that people can
physically go to and interact with. An entity could be a public place, a non-real
place (e.g. a fictional place), a private place (e.g. someone's house), a permanently
closed place, or not even a place at all. By examining these entities and researching
the web for corroborating information about each, you are helping users discover
real-world public places that they can visit in person.

B. Rating Instructions
1. JOB OVERVIEW

Each job will present you with the following screen:


(1) Entity Name - This will link to the Facebook page of the entity if there is one. In
this case, it should link to the Starbucks Facebook page

(2) Search and Pin links - “Open in Search” will auto-search for your place on an
external search engine. “Open in Pin” will show you the location of the entity on a map.

(3) Address & Website - These two attributes of the entity are given if Facebook
has that information on file.

==================

2. QUESTION 1

To start, do your research based on the benchmark hierarchy suggested in Section
C3 below. Some examples of benchmarks include a Google card, a Facebook page, etc.
• (1) Permanently Closed - Select this if ANY website says the place is closed.
• (2) Public 2nd Tier - Select this if the page represents a Public 2nd Tier
place (see Section D1 for the Public 2nd Tier definition).
• (3) Non-Public / Not-a-Place - Select this if the page meets the criteria for
Non-Public / Not-a-Place (see Section D2 for the definition).
• (4) None of the Above - Select this if none of the above answer choices 1 - 3
applies.
*If you think more than one answer choice applies, pick the one that is first in order.
For example, if a place is both Permanently Closed and Non-public, select
Permanently Closed.
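
The ordering rule above can be sketched in a few lines of Python. This is only an illustration: the function name pick_answer and the true/false flags are hypothetical and are not part of the rating tool.

```python
# Answer choices 1 - 3 for Question 1, in the order they appear in the job.
ANSWERS = [
    "Permanently Closed",
    "Public 2nd Tier",
    "Non-Public / Not-a-Place",
]


def pick_answer(applies):
    """Return the first answer (in job order) that applies.

    `applies` is a hypothetical dict mapping an answer label to True/False.
    Falls back to "None of the Above" when none of choices 1 - 3 applies.
    """
    for answer in ANSWERS:
        if applies.get(answer, False):
            return answer
    return "None of the Above"


# A place that is both permanently closed and non-public.
print(pick_answer({"Permanently Closed": True, "Non-Public / Not-a-Place": True}))
```

Because the list is checked in job order, the earlier answer choice always wins when several apply.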

*Permanently Closed - The first step to answering this question is to search for a
benchmark labelling the place as permanently closed, using the top search engine in
your market. If ANY website says the place is closed, it is considered closed. In the
US, the top search engines are Bing and Google. Sometimes, an entity may appear as
open on one but closed on the other. In that case, always base your decision on the
one that shows the “closed” information.

For examples of how leading sites show that a place is permanently closed, see
section E1.
To check if an entity is permanently closed at its given address, we recommend
using the following terms in your search, though feel free to use any research steps
that you feel will help you locate the relevant information:
1. [name of the entity] + “closed”
2. [name of the entity] + [full address of the entity] + “closed”
3. [name of the entity] + [partial address of the entity] + “closed”
(!) NOTE: Use the language-specific word for “closed” in non-English jobs.
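
As a rough sketch, the three recommended searches could be assembled like this. The helper name closed_queries and the sample entity are assumptions made for illustration; any research steps that locate the relevant information remain acceptable.

```python
def closed_queries(name, full_address, partial_address, closed_word="closed"):
    """Build the three recommended search strings, in the order listed above.

    `closed_word` should be the language-specific word for "closed"
    in non-English jobs.
    """
    return [
        f"{name} {closed_word}",
        f"{name} {full_address} {closed_word}",
        f"{name} {partial_address} {closed_word}",
    ]


# Hypothetical entity, used only to show the shape of the output.
for query in closed_queries(
    "Keith's Bakery", "123 Main St, Menlo Park, CA 94025", "Menlo Park"
):
    print(query)
```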

(!) TIP: Sometimes, a place will be closed at a different address than the one given in
SRT (commonly the case in chains or franchises with multiple locations). Always
remember that we are evaluating the SRT entity at the SRT address. Therefore, only
select “Permanently Closed” if the entity at the SRT address is closed.
==================

3. QUESTION 2: Please insert the URL of the benchmark

Enter the URL of the best benchmark you used to answer Question 1.

C. Benchmarks
1. BENCHMARK CRITERIA

All benchmarks must representatively match the reference entity. Don't overthink
this - just ask yourself, “Am I confident that this benchmark represents the reference
entity at the correct location?”

1. An exact or partial name match is required between the benchmark and the
reference entity
a. Finding a website listing “Keith's Bakery” would not be a valid
benchmark for an entity named “John's Bakery”
b. Abbreviations, acronyms, misspellings, extraneous information (such
as location information, like “JFK Airport New York City”), places that
are “Formerly Known As,” or have alternative names are ok to match
2. An exact or partial location match is required between the benchmark and
the reference entity
a. The benchmark MUST have some sort of location information that you
use to determine if it reflects the reference entity's location; if a
benchmark does NOT have any location information, it is not a valid
benchmark
b. If a partial address exists in SRT, and you can reasonably determine
this “matches” with the benchmark location information, the
benchmark is valid
i. Example: 123 Main St, Menlo Park [without a state or zip code]
could be matched with “123 Main St, Menlo Park, CA 94025”
ii. Example: Menlo Park, CA [without street information or zip
code] wouldn't be matched with “123 Main St, Menlo Park, CA
94025”, and a Pin Search would be used to confirm if the
benchmark is valid
c. If an address doesn't exist in SRT, or if the partial address does not
provide you with enough information to conclusively determine if the
benchmark is valid, perform a Pin Search (see Section E2) to
determine if the benchmark location is within 20 miles of the
reference entity. If it's the closest entity within 20 miles, the
benchmark is valid. If it's not the closest entity within 20 miles, the
benchmark is not valid.
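
The 20-mile rule in 2c can be sketched as follows. This is a minimal illustration that assumes the distances from the pin have already been measured; the function name valid_benchmark and the candidate names are hypothetical.

```python
MAX_MILES = 20  # threshold stated in the guidelines


def valid_benchmark(candidates):
    """Return the name of the closest candidate within MAX_MILES, or None.

    `candidates` is a list of (name, distance_in_miles) pairs measured
    from the reference entity's pin.
    """
    in_range = [c for c in candidates if c[1] <= MAX_MILES]
    if not in_range:
        return None
    # Only the closest matching entity within range counts as valid.
    return min(in_range, key=lambda c: c[1])[0]


print(valid_benchmark([("Listing A", 3.2), ("Listing B", 14.0), ("Listing C", 35.5)]))
```

Any candidate that is within range but not the closest is treated as not valid, which mirrors the last two sentences of 2c.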

2. FINDING AN OFFICIAL WEBSITE

Official websites can usually be found with a simple search-engine search. In fact,
Google provides a button to official websites when available. However, there are
lots of options to use to find an official website. Feel free to use whatever works
best for you / your locale (for example, maybe Google is not used in the job's locale).

3. BENCHMARK HIERARCHY (AKA finding another benchmark with or
without user activity)

First, we always trust information from an official website over all other
benchmarks. However, if we can't find an official website, we want to find another
website that contains user-activity. We recommend searching on “social-aggregator”
sites that compile user-feedback about a particular place because these sites will
usually have “user-activity.” As long as you find a site with user activity, that's
enough to successfully answer the question.
• Tier 1 - Best: Official Website
o Always preferred, provided it has EITHER an exact name match + an exact
or partial address match OR no address match at all but a location
within 20 miles in the Pin Search performed
o Remember: The official website must have enough location
information on it that allows you to confirm it is the official website
for the SRT entity being rated. Sometimes, official websites seem like
they match the SRT entity, but do not contain any location
information, and therefore are not a valid benchmark.
• Tier 2 - Best with user activity
o Google Entity Card - Yelp - TripAdvisor - YellowPages - WhitePages -
Manta - Zomato - Just Dial - Facebook - Foursquare - Wikipedia - etc.
(usually social aggregators)
o Wikipedia always has user activity, since all articles and updates
are user-generated!
o Please attempt to find a benchmark with the most user activity, or
very recent user activity! This usually signifies the benchmark is of
higher quality than one with only 1 piece of user activity.
• Tier 3 - Worst without user activity
o Google Entity Card - Yelp - TripAdvisor - YellowPages - WhitePages -
Manta - Zomato - Just Dial - Facebook - Foursquare - etc.
(!) NOTE: See Section D3 for User Activity definition / examples.
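
One way to picture the hierarchy is as a ranking function. The sketch below is an assumption-laden illustration, not the rating tool's logic: the field names (official, has_location, activity_count) and the example.com URLs are made up, and recency of user activity is not modelled.

```python
def benchmark_tier(is_official_site, has_location_info, has_user_activity):
    """Return 1, 2 or 3 for a usable benchmark, or None if it is not valid."""
    if not has_location_info:
        return None  # no location information means not a valid benchmark
    if is_official_site:
        return 1
    return 2 if has_user_activity else 3


def best_benchmark(benchmarks):
    """Pick the URL with the best tier; ties go to the most user activity."""
    ranked = []
    for b in benchmarks:
        tier = benchmark_tier(b["official"], b["has_location"], b["activity_count"] > 0)
        if tier is not None:
            ranked.append((tier, -b["activity_count"], b["url"]))
    return min(ranked)[2] if ranked else None


found = [
    {"url": "https://example.com/listing-a", "official": False,
     "has_location": True, "activity_count": 1},
    {"url": "https://example.com/listing-b", "official": False,
     "has_location": True, "activity_count": 12},
    {"url": "https://example.com/official-no-address", "official": True,
     "has_location": False, "activity_count": 0},
]
print(best_benchmark(found))
```

Note how the official website without location information drops out entirely, so the Tier 2 listing with the most user activity is chosen.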

4. FAQ

I can't find a benchmark. What are some things I can do?


• Local Search Engine - If a benchmark is not found using search 1 or search 2,
perform the search again using the local version of the search engine, e.g.
Google.com.br, Google.co.in, Baidu, etc.
• Modify Search Terms - Modifying the search terms can return different/better
results if no benchmark is found on the first search. Removing additional location
information and/or searching with just the entity name can produce better
search results.
• Avoid Alternative Spelling Search Recommendations - Search engines will
sometimes recommend a different spelling for the entity and show results for
that spelling by default. Reject this recommendation.
D. Definitions
1. Public 2nd Tier -------

Public 2nd Tier refers to public places that users would most likely not “check-in” to,
post reviews of, search for hours/phone numbers of, interact with other people at,
etc. Below is a complete list of Public 2nd Tier place types. Thus, if the entity is ANY
of the below, it is Public 2nd Tier. If it is ANYTHING ELSE, it is not Public 2nd Tier.

• Bus stop
o The page name does not require the words bus stop to be considered a
bus stop
o NOTE: Bus STATIONS and train STATIONS or any kind of
transportation terminal are larger entities where buses park and
people can buy tickets, food, and souvenirs. These are NOT Public 2nd
Tier. Bus stops, train stops, etc. are smaller and typically just a
designated spot on a road where the bus/train stops to pick up
riders, and thus are Public 2nd Tier.
• Intersections
o The page name consists of two intersecting road names

• Public phones
o NOTE: In India, the acronyms PCO/STD/ISD represent public phones.

• Public toilets
• Vending station
o Coinstars
o Redbox
o Electric car recharge stations
o Amazon Locker
o ATMs
▪ If there is an entity with a generic name, e.g. “ATM” - it is a
Non-Public entity, since the name is generic!
▪ If, on the other hand, the entity is specific, e.g. “Bank of America
ATM”, then rate it as Public 2nd Tier
o Kiosks
• Monuments
o except notable monuments where there are opening/closing hours or
ticketing, e.g. Eiffel Tower
• Statues
o except notable statues where there are opening/closing hours or
ticketing, e.g. Statue of Liberty
• Geo-hubs - defined areas of land with clear borders such as cities,
neighborhoods, regions, and geographical features (rivers, mountains,
oceans, etc.)
o Parks are NOT public 2nd tier - they are public places and you should
select “None of the above” in question 1
• Streets
• Any public place that occupies a certain location only temporarily
o Food Trucks
o Hot Dog Stand
o Pop up store
o Farmers’ Market
o Christmas market
o Etc.

2. Non-Public / Not-a-Place -------

Non-Public / Not-a-Place represents entities that we consider to be “junk.” They are
pages/cards that do not represent a real-world place that the public would go to and
interact with others at.

There can be one or many characteristics of a place/page that suggest it's a non-
public/not-a-place. The primary categories of so-called “junk” are listed
below. Please use any information on the reference source to determine if these
apply (e.g. looking at the pictures uploaded, looking at the name, reading comments,
reviewing the description, looking at category tags, etc.).

1. Events - generally have
a. A time frame such as start date / time and end date / time which can be
found in entity name, description, or anywhere on reference source
(e.g. event photos, etc.)
b. User response options such as "I'm attending," "I'm not attending," or
"I'm interested."
c. Note: Events are similar in nature to “Hosted Entities” - the entity
could list an address that matches another entity place's address at
specific times or dates (e.g. Coldplay Concert at Madison Square
Garden would be an event, even though it would list the address of
Madison Square Garden, but Madison Square Garden itself is a place)
2. Service Area Businesses (SAB) - businesses that service a geographical
location but do not have any public storefront that you can go to and inquire
about the service
a. Commonly plumbers, electricians, door-to-door sales, locksmiths,
cleaning/maid services
b. Some SABs will have pages with addresses, reviews, comments, etc.
However, we do NOT consider any kind of SAB to be a “place” that
people can visit and do business at. Therefore, if the page is an SAB,
regardless of if it has an address or anything else, label it as “non-
public/not-a-place.”
i. Example of non-public/not-a-place SAB with an
address: https://www.yelp.com/biz/fbf-office-cleaning-los-angeles?osq=Deep+Cleaning+Service
3. Online-only / Brand Page / Public Figures
a. Online-only - the page represents an entity that only exists on the
internet, with no real world public place a user could go and interact
with somebody at
i. Any page with a PO Box as an address is an Online-Only page,
or a Service Area Business (see #2 above), and should (either
way) be marked non-public/not-a-place
ii. SEALFiT is a page about a website
b. Brand page - the page represents a company brand and not a specific
location (typically franchise brand pages)
i. Chipotle and McDonalds are brand pages that do not represent
any location
c. Person - a page that represents an individual
i. If you come across a page with a person's name and no other
information about the person's business and physical business
location, label it as Non-Public / Not-a-Place
ii. If the page contains information (such as a description, tag,
etc.) suggesting the person may be an Individual Practitioner,
then do NOT label it as Non-Public / Not-a-Place. See Section
D4 for more information on Individual Practitioners.
1. This page is named for a person, “Amber Rottman”,
which suggests that the page is non-public, but the
category tag of “therapist” provides us enough
information to say that the person may be an Individual
Practitioner: https://www.facebook.com/pages/Amber-Rottman/428805217328410?nr
4. Other
a. The following characteristics denote that an entity should be marked
non-public/not-a-place.
i. PO Box used as an address
ii. If you click the header link and the page doesn't load, consider
the entity to be non-public/not-a-place
iii. Something vague or ambiguous
Examples: Luxury Homes Approximately 20 miles South of Boston Mass,
Close to SJSU, Near Oakland
iv. Generic - representing a broad category that isn't unique to a
specific place
Examples: Supermarket, restaurant, gym
v. A private residential area - apartments for rent, private homes,
Airbnb homes, vacation homes, VRBO listings, etc.
Examples: My Bed Room, My couch, Allen's House, The Fortress of
Solitude
(!) NOTE: Not all names containing the word home are
non-public. For example, Home, Pennsylvania (city), Home Sweet
Home Care SF (company), and Home Sweet Home Cafe (cafe)
b. Entity name is an address / zip code
i. 14751 Juniper St
ii. (!) NOTE: Most places should not have an address for a page
name, but here are the exceptions
1. Shared living complexes are businesses (condominiums,
apartments, duplexes)
2. Businesses named after the address (1015 Folsom, a
nightclub, and 900 Grayson, a restaurant)
c. Non-physical / imaginary
i. Skyline Manhattan, Hogwarts School of Wizardry
d. Action - expressing an action a person or thing is taking
i. In my bedroom sleeping, Running Lake Merritt
e. Private - intended for a non-public audience, typically a small, close-
knit group of people
i. Couins Bday Party!!!, Gracie's Going Away Party@ Smokey
Bones
ii. Commonly birthdays, happy hours, and going away parties
f. Violating policy
i. Nudity and/or Sexual Activity (including escort pages)
ii. Violence and/or Graphic Content
iii. Regulated Goods (guns, drugs, ammunition)
g. Nonsense - the page was created just for fun and has no real purpose
other than humor.
i. “Jokes and Humor for All” “Epic Fail Videos” “Hilarious Memes
of 2017”
h. Misspellings / Symbols in Entity Name
i. Lemon Croissant ? McDonalds!!!, Starbucks ^^, @Golgden Gate
theater, Keith!!s Bakery
ii. No symbols are “blacklisted” or “whitelisted” - the prevailing
logic here should be “does the misspelling / symbol usage
seem to indicate it is a junky page”
i. Hosted entity - the page itself represents a group or something else
that specifically meets at a certain place. The page does NOT
represent that place.
i. Southridge Elementary PTA, Carrie's Spinning Class, UCSD
Alumni
ii. Help groups or school clubs - use the address of the
school/community center where they meet as their page
address
iii. College alumni pages - use the address of the college as their
address
iv. Classes - use the address of the gym where the classes are
held
v. Sports teams - use the address of the stadium where games
are played
j. Compound entities - a page that represents two distinct places, even if
they share the same location
i. If a user would want to check into each place independently,
it's a compound entity
ii. O.Co and Oracle Arena, Disneyland California Adventure
k. Mobile transportation vessels
i. Boats, trains, ferries, etc.
ii. 5 Fulton, MTA 6 Train, Staten Island Ferry, Bremerton Ferry,
Alameda Ferry
l. Fan page - a page expressing an opinion about something
i. I Love Starbucks, McDonalds Best In The West, Heaven, also
known as AT&T Park
ii. Stryker Fan Club, Nico Rosberg Fan Club, Lionel Richie Fan
Club, Dean Ray Fan Club.
(!) NOTE: A blank page with a clear name and none of the above is NOT considered
a non-public/not-a-place.

3. User Activity -------

Conceptually, we define User Activity as any engagement with the page/site/card by
a third party user (i.e. not an admin, or the business owner). The only exception is
Facebook likes and reactions - these do not qualify for user activity. Please note that
the below lists are NOT exhaustive, but serve as guidance when examining a page.

• 3A. Facebook Page
o Any of the following qualify as “user activity” on a Facebook Page:
▪ Reviews
▪ Star Ratings
▪ Check-ins
▪ Comments
▪ Shares
o The following do NOT qualify as “user activity” on a Facebook Page:
▪ Likes (Page likes and Post likes)
▪ Follows
▪ Reactions
▪ Category tags
▪ Admin-activity (posts by the page / admin, updates by the page
/ admin)
• 3B. Google Card
o Any of the following qualify as “user activity” on a Google Card
▪ Google Reviews
▪ Questions and Answers
▪ Star Ratings
▪ User-uploaded photos
o The following do NOT qualify as “user activity” on a Google Card
▪ Admin / Owner updates
▪ Popular times
▪ Category tags
▪ Reviews from the web
• (!) TIP: If you see no activity on the Google card, but
reviews from other sources, feel free to click into those
sources and use them as a benchmark WITH user
activity, if there isn't an official website :)
▪ “People also search for”
▪ “See outside” (this is just the Google Street View)
▪ Map
• 3C. Non-Facebook, Non-Google Page
o Any of the following qualify as “user activity”
▪ Reviews
▪ Ratings (a user could give the place 5 stars but NOT leave a
review...this would qualify as user activity)
▪ Comments
▪ Check-ins
▪ User-uploaded photos
o The following do NOT qualify as “user activity”
▪ Admin / Owner updates
▪ Descriptions
▪ Category tags
▪ Information from other websites (such as the “Reviews from
the web” section on the Google card above...these are “on” the
benchmark but do not represent activity on the benchmark in
question)
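
The lists in 3A - 3C can be read as a simple lookup. The sketch below is illustrative only: the signal labels are hypothetical strings chosen for this example, and since the lists above are not exhaustive, this lookup is not exhaustive either.

```python
# Signals that qualify as user activity, per benchmark type (from 3A - 3C).
QUALIFYING = {
    "facebook": {"review", "star rating", "check-in", "comment", "share"},
    "google": {"google review", "question and answer", "star rating",
               "user-uploaded photo"},
    "other": {"review", "rating", "comment", "check-in", "user-uploaded photo"},
}


def has_user_activity(source, signals):
    """Return True if at least one observed signal qualifies for this source.

    Sources other than "facebook" and "google" use the 3C list.
    """
    qualifying = QUALIFYING.get(source, QUALIFYING["other"])
    return any(signal in qualifying for signal in signals)


print(has_user_activity("facebook", ["like", "reaction"]))      # likes do not count
print(has_user_activity("facebook", ["like", "share"]))         # a share counts
print(has_user_activity("google", ["reviews from the web"]))    # not on-card activity
```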

4. Individual Practitioners (IP)

People will sometimes appear as entities. While one of the signals of a “non-
public/not-a-place” is a Person, sometimes we consider individuals as places
(referred to as “Individual Practitioners”). Therefore, if presented with an entity
representing a person, you will need to confirm the person is / is not an IP before
selecting “non-public/not-a-place.”
• Definition: A person who is themself a business. A public facing professional
with his or her own customer base.
• Q: How can people ever be considered places? A: The purpose of ensuring we
capture all real-world, public places is to allow our users to check-in, derive
information from, and interact with the place. Places like McDonald's like
providing their customers the convenience of a page that allows these
features. Similarly, people who have their own client base want to have the
same functionality as a business: the ability to interact with their customers,
give them information about their location, and allow them to “check-
in.” That's why IPs are considered Public Places. A client of a doctor, for
example, would want to know the doctor's address, hours, specialty, etc, just
as a customer of McDonald's would want to know the store address, hours,
menu, etc.
• Requirements of an IP:
▪ Is a public facing professional with clients / customers
▪ Provides a professional service
▪ Benchmark MUST provide an indicator of what type of service
the IP provides
• A category (e.g. “Therapist”) will suffice to prove a
person is an IP
▪ Can be a business
▪ Has an address (the address of the individual practitioner's
larger employer is acceptable in lieu of their own address)
• If the reference entity is a proper name with no professional
abbreviations (e.g. “Keith Armington”) but you cannot prove that it's an
IP, select Non-Public/Not a Place. This means that the person is just a
private individual.
• If an IP has passed away (died) or retired, label it as “Permanently Closed”
• When in doubt, think of whether the person provides a professional service
to public individuals, and retains his/her own client base.
• Examples of non-IPs:
o Engineers
o Judges
o Pharmacists
o Social workers
• Examples of IPs: (NOTE: This list does not include all IPs.)
#  Profession                                Common designations                                     Occupations
1  Medical professionals                     MD, Dr, DPT, PT, MSPT, DC                               Chiropractor, Psychologist, Therapist, Counselor, Acupuncturist, Nurses
2  Lawyers                                   Esq, JD
3  Dentists                                  DDS
4  Financial consultants                     ChFC, CFP, CPA, EA, CLU, CEBC
5  Insurance agents                          CPCU, CIC, AAI, CLCS, CRM, ARM, CISR, AIS, PLCS, AIC
6  Realtors                                  CRS, CCIM, ABR, MRP, GRI, SRS
7  Hair stylist                              Not applicable
8  Photographers (with a physical location)  Not applicable
9  Architects

E. Appendix
1. Permanently closed examples

[Screenshots showing how a permanently closed place appears on: Yelp, Google,
Mystore411, Foursquare, Yellow Pages, Tripadvisor]

2. Performing a Pin Search

A pin search is used to find the closest representative benchmark to the reference
entity's pin location. To do this, perform the following steps:
Note: All steps/research methods are mandatory and need to be carried out before
making a final decision.

a. Click “Open Pin in Maps 1” to open the map with the reference pin.

b. Right click on top of the pin and select the “What’s here?” function.

c. Copy the address or latitude/longitude that the map shows. Sometimes, an
address won't appear. In this case, you can find an address by selecting a place close
to the pin and using that address. We only need a reference address/location!
d. Open a new tab, and use the leading search engine in the market to search for “the
name of the entity + the address provided in Google Maps.” You may need to modify
the address in the search, such as deleting the street number.

e. You may find a source near the location. If you find a source with the same or a
slightly different name than your entity but with a different address than the one found
on Google Maps, use the distance measuring feature in Google Maps to check the distance
between your pin and that address.
f. If the benchmark is within 20 miles, consider this as a valid benchmark. If
multiple entities appear, please select the one closest to the pin location.
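
If you have copied latitude/longitude values for both the pin and the benchmark, the distance check can also be approximated by hand. The sketch below uses the haversine (great-circle) formula; it is an approximation offered for illustration, it does not replace the distance measuring feature described above, and the function name miles_between is hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius


def miles_between(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_20_miles(pin, benchmark):
    """`pin` and `benchmark` are (latitude, longitude) pairs in degrees."""
    return miles_between(*pin, *benchmark) <= 20


# Made-up coordinates: two points 0.1 degrees of latitude apart.
print(within_20_miles((37.45, -122.18), (37.55, -122.18)))
```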

3. Using search engines

Use any popular internet search engine such as Google, Bing, Yahoo, Baidu, etc. to
find relevant webpages that can serve as your benchmark. These search
engines will return webpages that best fit the keywords in your query.
Keyword Search best practices
• Query using as much information provided as possible. A query for
'McDonalds Restaurant San Francisco' is better than a search for
'McDonalds' because it contains the specific location information.
• For markets outside the US, try using the local version of the search engine
o bing.com.br or bing.com.uk
o google.co.uk, google.com.au, google.com.mx, google.com.br
• Alter the search query - follow these steps until a benchmark surfaces.
1. Remove the following attributes, in this order:
a. Zip code and country
b. City/State
c. Address number
d. Any additional descriptors
2. If necessary, add new descriptors that were not in the original search/query
(e.g. “Bar”, “Restaurant”, “Park”)
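
The removal order in step 1 can be sketched as a function that produces progressively shorter queries. This is a minimal illustration: the field names and the sample entity are hypothetical, and step 2 (adding new descriptors) is left to the rater's judgment.

```python
# Order in which the parts appear in the query (hypothetical field names).
FIELD_ORDER = ["name", "descriptors", "address_number", "street",
               "city_state", "zip_country"]
# Order in which attributes are removed, per step 1 above.
REMOVAL_ORDER = ["zip_country", "city_state", "address_number", "descriptors"]


def relaxed_queries(fields):
    """Return queries from most to least specific.

    `fields` is a dict keyed by the names in FIELD_ORDER; missing or
    empty fields are skipped.
    """
    remaining = dict(fields)

    def build():
        return " ".join(remaining[k] for k in FIELD_ORDER if remaining.get(k))

    queries = [build()]
    for key in REMOVAL_ORDER:
        if remaining.pop(key, None):  # only emit a new query if something was removed
            queries.append(build())
    return queries


for query in relaxed_queries({
    "name": "Keith's", "descriptors": "Bakery", "address_number": "123",
    "street": "Main St", "city_state": "Menlo Park CA", "zip_country": "94025 USA",
}):
    print(query)
```

Each query would be tried in turn, stopping as soon as a benchmark surfaces.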

4. FAQs
• Q: What happens if my SRT address and reference source address are
different?
o A: Always use the SRT address as the source of truth when looking for
an external benchmark.
• Q: I see raters using Wikipedia as a benchmark WITH user activity, but I
don't see any user activity on that benchmark! What's up?
o A: Wikipedia is a crowdsourced encyclopedia of sorts, so every article
and revision is made by users. Therefore, Wikipedia is always
considered a benchmark “with user activity.”
