Discrimination in Online Ad Delivery

practice

doi:10.1145/2447976.2447990

Article development led by queue.acm.org

Google ads, black names and white names, racial discrimination, and click advertising.

by Latanya Sweeney

Illustration by Alex Williamson

Do online ads suggestive of arrest records appear more often with searches of black-sounding names than white-sounding names? What is a black-sounding name or white-sounding name, anyway? How do you design technology to reason about societal consequences like structural racism? Let’s take a scientific dive into online ad delivery to find answers.

“Have you ever been arrested?” Imagine this question appearing whenever someone enters your name in a search engine. Perhaps you are in competition for an award or a new job, or maybe you are in a position of trust, such as a professor or a volunteer. Perhaps you are dating or engaged in any one of hundreds of circumstances for which someone wants to learn more about you online. Appearing alongside your accomplishments is an advertisement implying you may have a criminal record, whether you actually have one or not. Worse, the ads may not appear for your competitors.

Employers frequently ask whether applicants have ever been arrested or charged with a crime, but if an employer disqualifies a job applicant based solely upon information indicating an arrest record, the company may face legal consequences. The U.S. Equal Employment Opportunity Commission (EEOC) is the federal agency charged with enforcing Title VII of the Civil Rights Act of 1964, a law that applies to most employers, prohibiting employment discrimination based on race, color, religion, sex, or national origin, and extended to those having criminal records.[5, 11] Title VII does not prohibit employers from obtaining criminal background information, but a blanket policy of excluding applicants based solely upon information indicating an arrest record can result in a charge of discrimination.

To make a determination, the EEOC uses an adverse impact test that measures whether certain practices, intentional or not, have a disproportionate effect on a group of people whose defining characteristics are covered by Title VII. To decide, you calculate the percentage of people affected in each group and then divide the smaller value by the larger to get the ratio, expressed as a percentage, and compare the result to 80. If the ratio is less than 80, then the EEOC considers the effect disproportionate and may hold the employer responsible for discrimination.[6]

What about online ads suggesting someone with your name has an arrest record? Title VII only applies if you have an arrest record and can prove the employer inappropriately used the ads.

Are the ads commercial free speech, a constitutional right to display the ad associated with your name? The First Amendment of the U.S. Constitution protects advertising, but the U.S. Supreme Court set out a test for assessing restrictions on commercial speech, which begins by determining whether the speech is misleading.[3] Are online ads suggesting the existence of an arrest record misleading if no one by that name has an arrest record?
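
The adverse impact calculation described above reduces to a few lines of arithmetic. The following is a minimal sketch only: the function names and the example rates are assumptions made for illustration and do not come from any EEOC tool.

```python
def adverse_impact_ratio(rate_a: float, rate_b: float) -> float:
    """Return the smaller rate divided by the larger, as a percentage."""
    if rate_a < 0 or rate_b < 0:
        raise ValueError("rates must be non-negative")
    larger = max(rate_a, rate_b)
    if larger == 0:
        raise ValueError("at least one rate must be positive")
    return 100.0 * min(rate_a, rate_b) / larger


def is_disproportionate(rate_a: float, rate_b: float,
                        threshold: float = 80.0) -> bool:
    """Apply the 80 threshold: a ratio below it counts as disproportionate."""
    return adverse_impact_ratio(rate_a, rate_b) < threshold


if __name__ == "__main__":
    # Hypothetical rates: 30% of one group and 50% of another are affected.
    print(adverse_impact_ratio(30, 50))
    print(is_disproportionate(30, 50))
```

The rates can be any pair of percentages measured the same way for both groups; only their ratio matters to the test.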

Communications of the ACM | May 2013 | Vol. 56 | No. 5



Figure 1. Ads from a Google search of three different names beginning with first name “Latanya” (panels a-d).

Assume the ads are free speech: what happens when these ads appear more often for one racial group than another? Not everyone is being equally affected by free speech. Is that free speech or racial discrimination?

Racism, as defined by the U.S. Commission on Civil Rights, is “any attitude, action, or institutional structure which subordinates a person or group because of their color.”[16] Racial discrimination results when a person or group of people is treated differently based on their racial origins, according to the Panel on Methods for Assessing Discrimination of the National Research Council.[12] Power is a necessary precondition, for it depends on the ability to give or withhold benefits, facilities, services, and opportunities from someone who should be entitled to them and is denied on the basis of race. Institutional or structural racism, as defined in The Social Work Dictionary, is a system of procedures/patterns whose effect is to foster discriminatory outcomes or give preferences to members of one group over another.[1]

These considerations frame the relevant socio-legal landscape. Now we turn to whether online ads suggestive of arrest records appear more often for one racial group than another among a sample of racially associated names, and if so, how technology can solve the problem.

The Pattern
What is the suspected pattern of ad delivery? Here is an overview using real-world examples.

Earlier this year, a Google search for Latanya Farrell, Latanya Sweeney, and Latanya Lockett yielded ads and criminal reports like those shown in Figure 1. The ads appeared on Google.com (Figure 1a, 1c) and on a news website, Reuters.com, to which Google supplies ads (Figure 1c). All the ads in question linked to instantcheckmate.com (Figure 1b, 1d). The first ad implied Latanya Farrell might have been arrested. Was she? Clicking on the link and paying the requisite fee revealed the company had no arrest record for her or Latanya Sweeney, but it did have a record for Latanya Lockett.
In comparison, searches for Kristen Haring, Kristen Sparrow, and Kristen Lindquist did not yield any instantcheckmate.com ads, even though the company’s database reported having records for all three names and arrest records for Sparrow and Lindquist.

Searches for Jill Foley, Jill Schneider, and Jill James displayed instantcheckmate.com ads with neutral copy; the word arrest did not appear in the ads even though arrest records for all three names appeared in the company’s database. Figure 2 shows ads appearing on Google.com and Reuters.com and criminal reports from instantcheckmate.com for the first two names.

Figure 2. Ad from a search of three different names beginning with the first name “Jill” (panels a-d).

Finally, we considered a proxy for race associated with these names. Figure 3 shows racial distinction in Google image search results for Latanya, Latisha, Kristen, and Jill, respectively. The faces associated with Latanya and Latisha tend to be black, while white faces dominate the images of Kristen and Jill.

These handpicked examples describe the suspected pattern: ads suggesting arrest tend to appear with names associated with blacks, and neutral or no ads appear with names associated with whites, regardless of whether the company placing the ad has an arrest record associated with the name.

Google AdSense

Who generates the ad’s text? Who decides when and where an ad will appear? What is the relationship among Google, a news website such as Reuters, and Instant Checkmate in the previous examples? An overview of Google AdSense, the program that delivered the ads, provides the answers.

In printed newspapers, everyone who reads the publication sees the same ad in the same space. Online ads can be tailored to the reader’s search criteria, interests, geographical location, and so on. Any two readers (or even the same reader returning to the same website) might view different ads.

Google AdSense is the largest provider of dynamic online advertisements, placing ads for millions of sponsors on millions of websites.[9] In the first quarter of 2011, Google earned $2.43 billion through Google AdSense.[10] Several different advertising arrangements exist, but for simplicity this article describes only those features of Google AdSense specific to the Instant Checkmate ads in question.
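
AdSense chooses which ads to show through an auction ranked by quality score, as the paragraphs that follow explain. A minimal sketch of that kind of ranking is given here. It assumes a made-up `Bid` record, a single numeric quality score, and simple minimum-bid and budget checks; none of these names or numbers come from Google, whose actual scoring combines many factors.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bid:
    advertiser: str
    ad_text: str
    bid: float            # amount paid per click (hypothetical units)
    quality_score: float  # stand-in for the many-factor score
    budget_left: float    # remaining account total for the advertiser


def rank_ads(bids: List[Bid], min_bid: float, slots: int) -> List[Bid]:
    """Order eligible ads by quality score, highest first.

    An ad is dropped when its bid is considered too low or when paying for
    a click would exceed the advertiser's remaining account total.
    """
    eligible = [b for b in bids if b.bid >= min_bid and b.budget_left >= b.bid]
    eligible.sort(key=lambda b: b.quality_score, reverse=True)
    return eligible[:slots]


if __name__ == "__main__":
    example = [
        Bid("advertiser-a", "Ad A", bid=0.05, quality_score=7.2, budget_left=10.0),
        Bid("advertiser-b", "Ad B", bid=0.10, quality_score=9.1, budget_left=10.0),
        Bid("advertiser-c", "Ad C", bid=0.01, quality_score=9.9, budget_left=10.0),
    ]
    for shown in rank_ads(example, min_bid=0.02, slots=3):
        print(shown.advertiser, shown.quality_score)
```

Returning an empty list models the case in which no ad is shown at all.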


When a reader enters search criteria in an enrolled website, Google AdSense embeds into the Web page of results ads believed to be relevant to the search. Figures 1 and 2 show ads delivered by Google AdSense in response to various firstname lastname searches.

An advertiser provides Google with search criteria, copies of possible ads to deliver, and a bid to pay if a reader clicks the delivered ad. (For convenience, this article conflates Google AdSense with the related Google AdWords.) Google operates a real-time auction across bids for the same search criteria based on a “quality score” for each bid. A quality score includes many factors such as the past performance of the ad and characteristics of the company’s website.[10] The ad having the highest quality score appears first, the second-highest second, and so on, and Google may elect not to show any ad if it considers the bid too low or if showing the ad exceeds a threshold (for example, a maximum account total for the advertiser). The Instant Checkmate ads in Figures 1 and 2 often appeared first among ads, implying Instant Checkmate ads had the highest quality scores.

A website owner wanting to “host” online ads enrolls in AdSense and modifies the website to send a user’s search criteria to Google and to display returning ads under a banner “Ads by Google” among search results. For example, Reuters.com hosts AdSense, and entering Latanya Sweeney in the search bar generated a new Web page with ads under the banner “Ads by Google” (Figure 1c).

There is no cost for displaying an ad, but if the user actually clicks on the ad, the advertiser pays the auction price. This may be as little as a few pennies, and the amount is split between Google and the host. Clicking the Latanya Sweeney ad on Reuters.com (Figure 1c) would cause Instant Checkmate to pay its auction amount to Google, and Google would split the amount with Reuters.

Search Criteria

What search criteria did Instant Checkmate specify? Will ads be delivered for made-up names? Ads displayed on Google.com allow users to learn why a specific ad appeared. Clicking the circled “i” in the ad banner (for example, Figure 1c) leads to a Web page explaining the ads. Doing so for ads in Figures 1 and 2 reveals that the ads appeared because the search criteria matched the exact first- and last-name combination searched.

So, the search criteria must consist of both first and last names; and the names should belong to real people because a company presumably bids on records it sells.

The next steps describe the systematic construction of a list of racially associated first and last names for real people to use as search criteria. Neither Instant Checkmate nor Google is presumed to have used such a list. Rather, the list provides a qualified sample of names to use in testing ad-delivery systems.

Black- and White-Identifying Names

Black-identifying and white-identifying first names occur with sufficiently higher frequency in one race than the other.

In 2003 Marianne Bertrand and Sendhil Mullainathan of the National Bureau of Economic Research (NBER) conducted an experiment in which they responded to job postings with resumes that were virtually identical, except some of the resumes had black-identifying names and others had white-identifying names. Results showed white names received 50% more interviews.[2]

The study used names given to black and white babies in Massachusetts between 1974 and 1979, defining black-identifying and white-identifying names as those that have the highest ratio of frequency in one racial group to frequency in the other racial group.

In the popular book Freakonomics, Steven Levitt and Stephen Dubner report the top 20 whitest- and blackest-identifying girl and boy names. The list comes from earlier work by Levitt and Roland Fryer, which shows a pattern change in the way blacks named their children starting in the 1970s.[7] It was compiled from names given to black and white children recorded in California birth records from 1961 to 2000 (more than 16 million births).

To test ad delivery, I combined the lists from these prior studies and added two black female names, Latanya and Latisha. Table 1 lists the names used here, consisting of eight for each of the categories: white female, black female, white male, and black male from the Bertrand and Mullainathan study (first row in Table 1); and the first eight names for each category from the Fryer and Levitt work (second row in Table 1). Emily, a white female name, Ebony, a black female name, and Darnell, a black male name, appear in both rows. The third row includes the observation shown in Figure 3. Removing duplicates leaves a total of 63 distinct first names.

Figure 3. Image search results for first names Latanya, Latisha, Kristen, and Jill (panels a-d).

Full Names of Real People

Web searches provide a means of locating and harvesting a real person’s first and last name (full name) by sampling names of professionals appearing on the Web; and sampling names of people active on social media sites and blogs (netizens).

Table 1. Black-identifying and white-identifying first names. An asterisk marks a name in row (b) that also appears in row (a).

Row  White Female  Black Female  White Male  Black Male
(a)  Allison       Aisha         Brad        Darnell
(a)  Anne          Ebony         Brendan     Hakim
(a)  Carrie        Keisha        Geoffrey    Jermaine
(a)  Emily         Kenya         Greg        Kareem
(a)  Jill          Latonya       Brett       Jamal
(a)  Laurie        Lakisha       Jay         Leroy
(a)  Kristen       Latoya        Matthew     Rasheed
(a)  Meredith      Tamika        Neil        Tremayne
(b)  Molly         Imani         Jake        DeShawn
(b)  Amy           Ebony*        Connor      DeAndre
(b)  Claire        Shanice       Tanner      Marquis
(b)  Emily*        Aaliyah       Wyatt       Darnell*
(b)  Katie         Precious      Cody        Terrell
(b)  Madeline      Nia           Dustin      Malik
(b)  Katelyn       Deja          Luke        Trevon
(b)  Emma          Diamond       Jack        Tyrone
(c)                Latanya
(c)                Latisha
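
The counts quoted for Table 1 can be checked mechanically. The sketch below transcribes the table into Python dictionaries (the variable and function names are illustrative, not from the study) and confirms that removing duplicates leaves 63 distinct first names.

```python
# Rows (a), (b), and (c) of Table 1, transcribed by category.
ROW_A = {
    "white_female": ["Allison", "Anne", "Carrie", "Emily", "Jill", "Laurie",
                     "Kristen", "Meredith"],
    "black_female": ["Aisha", "Ebony", "Keisha", "Kenya", "Latonya", "Lakisha",
                     "Latoya", "Tamika"],
    "white_male": ["Brad", "Brendan", "Geoffrey", "Greg", "Brett", "Jay",
                   "Matthew", "Neil"],
    "black_male": ["Darnell", "Hakim", "Jermaine", "Kareem", "Jamal", "Leroy",
                   "Rasheed", "Tremayne"],
}
ROW_B = {
    "white_female": ["Molly", "Amy", "Claire", "Emily", "Katie", "Madeline",
                     "Katelyn", "Emma"],
    "black_female": ["Imani", "Ebony", "Shanice", "Aaliyah", "Precious", "Nia",
                     "Deja", "Diamond"],
    "white_male": ["Jake", "Connor", "Tanner", "Wyatt", "Cody", "Dustin",
                   "Luke", "Jack"],
    "black_male": ["DeShawn", "DeAndre", "Marquis", "Darnell", "Terrell",
                   "Malik", "Trevon", "Tyrone"],
}
ROW_C = {"black_female": ["Latanya", "Latisha"]}


def distinct_names(*rows):
    """Merge table rows into one set of names per category."""
    merged = {}
    for row in rows:
        for category, listed in row.items():
            merged.setdefault(category, set()).update(listed)
    return merged


by_category = distinct_names(ROW_A, ROW_B, ROW_C)
all_names = set().union(*by_category.values())
white_names = by_category["white_female"] | by_category["white_male"]

if __name__ == "__main__":
    print(len(all_names), "distinct first names")
    print(len(white_names), "white-identifying first names")
```

Using sets makes the three names that appear in both rows (Emily, Ebony, and Darnell) collapse automatically.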
Professionals often have their own Web pages that list positions and describe prior accomplishments. Several professions have degree designations (for example, Ph.D., M.D., J.D., or MBA) associated with people in that profession. A Google search for a first name and a degree designation can yield lists of people having that first name and degree.

The next step is to visit the Web page associated with each full name, and if an image is discernible, record whether the person appears black, white, or other.

Here are two examples from my test. A Google search for Ebony PhD revealed links for real people having Ebony as a first name: specifically, Ebony Bookman, Ebony Glover, Ebony Baylor, and Ebony Utley. I harvested the full names appearing on the first three pages of search results, using searches with other degree designations to find at least 10 full names for Ebony. Clicking on the link associated with Ebony Glover displayed an image.[8] The Ebony Glover in this study appeared black.

Similarly, search results for Jill PhD listed professionals whose first name is Jill. Visiting links yielded Web pages with more information about each person. For example, Jill Schneider’s Web page had an image showing that she is white.[14]

PeekYou searches were used to harvest a sample of full names of netizens having racially associated first names. The website peekyou.com compiles online and offline information on individuals (thereby connecting residential information with Facebook and Twitter users, bloggers, and others), then assigns its own rating to reflect the size of each person’s online footprint. Search results from peekyou.com list people having the highest score first, and include an image of the person.
A PeekYou search of Ebony listed Ebony Small, Ebony Cams, Ebony King, Ebony Springer, and Ebony Tan. A PeekYou search for Jill listed Jill Christopher, Jill Spivack, Jill English, Jill Pantozzi, and Jill Dobson. After harvesting these and other full names, I recorded the race of the person if discernible.

Armed with the approach just described, I harvested 2,184 racially associated full names of people with an online presence from September 24 through October 22, 2012. Most images associated with black-identifying names were of black people (88%), and an even greater percentage of images associated with white-identifying names were of white people (96%).[15]

Google searches of first names and degree designations were not as productive as first name lookups on PeekYou. On Google, white male names, Cody, Connor, Tanner, and Wyatt retrieved results with those as last names rather than first names; the black female name, Kenya, was confused with the country; and black names Aaliyah, Deja, Diamond, Hakim, Malik, Marquis, Nia, Precious, and Rasheed retrieved fewer than 10 full names. Only Diamond posed a problem with PeekYou searches, seemingly confused with other online entities. Diamond was therefore excluded from further consideration.

Some black first names had perfect predictions (100%): Aaliyah, DeAndre, Imani, Jermaine, Lakisha, Latoya, Malik, Tamika, and Trevon. The worst predictors of blacks were Jamal (48%) and Leroy (50%). Among white first names, 12 of 31 names made perfect predictions: Brad, Brett, Cody, Dustin, Greg, Jill, Katelyn, Katie, Kristen, Matthew, Tanner, and Wyatt; the worst predictors of whites were Jay (78%) and Brendan (83%). These findings strongly support the use of these names as racial indicators in this study.

Sixty-two full names appeared in the list twice even though the people were not necessarily the same. No name appeared more than twice. Overall, Google and PeekYou searches tended to yield different names.

Ad Delivery

With this list of names suggestive of race, I was ready to test which ads appear when these names are searched. To do this, I examined ads delivered on two sites, Google.com and Reuters.com, in response to searches of each full name, once at each site. The browser’s cache and cookies were cleared before each search, and copies of Web pages received were preserved. Figures 1, 2, 5, and 6 provide examples.

From September 24 through October 23, 2012, I searched 2,184 full names on Google.com and Reuters.com. The searches took place at different times of day, different days of the week, with different IP and machine addresses operating in different parts of the United States using different browsers. I manually searched 1,373 of the names and used automated means[17] for the remaining 812 names.

Here are nine observations.

1. Fewer ads appeared on Google.com than Reuters.com: about five times fewer. When ads did appear on Google.com, typically only one ad showed, compared with three ads routinely appearing on Reuters.com. This suggests Google may be sensitive to the number of ads appearing on Google.com.

2. Of 5,337 ads captured, 78% were for government-collected information (public records) about the person whose name was searched. Public records in the U.S. often include a person’s address, phone number, and criminal history. Of the more than 2,000 names searched, 78% had at least one ad for public records about the person being searched.

3. Four companies had more than half of all the ads captured. These companies were Instant Checkmate, PublicRecords (which is owned by Intelius), PeopleSmart, and PeopleFinders, and all their ads were selling public records. Instant Checkmate ads appeared more than any other: 29% of all ads. Ad distribution was different on Google’s site; Instant Checkmate still had the most ads (50%), but Intelius.com, while not in the top four overall, had the second most ads on Google.com. These companies dominate the advertising space for online ads selling public records.

4. Ads for public records on a person appeared more often for those with black-associated names than white-associated names, regardless of company. PeopleSmart ads appeared disproportionately more often for black-identifying names: 41% as opposed to 29% for white names. PublicRecords ads appeared 10% more often for those with black first names than white. Instant Checkmate ads displayed only slightly more often for black-associated names (2% difference). This is an interesting finding and it spawns the question: Public records contain information on everyone, so why more ads for black-associated names?

5. Instant Checkmate ads dominated the topmost ad position. They occupied that spot in almost half of all searches on Reuters.com. This suggests Instant Checkmate offers Google more money or has higher quality scores than do its competitors.

6. Instant Checkmate had the largest percentage of ads in virtually every first-name category, except for Kristen, Connor, and Tremayne. For those names, Instant Checkmate had uncharacteristically fewer ads (less than 25%). PublicRecords had ads for 80% of names beginning with Tremayne and Connor, and 58% for Kristen, compared to 20% and less for Instant Checkmate. Why the underrepresentation in these first names? During a conference call with the company’s representatives, they asserted that Instant Checkmate gave the same ad text to Google for groups of last names (not first names).

7. Almost all ads for public records included the name of the person, making each ad virtually unique, but beyond personalization, the ad templates showed little variability. The only exception was Instant Checkmate. Almost all PeopleFinders ads appearing on Reuters.com used the same personalized template. PublicRecords used five templates and PeopleSmart seven, but Instant Checkmate used 18 different ad templates on Reuters.com. Figure 4 enumerates ad templates for frequencies of 10 or more for all four companies (replace fullname with the person’s first and last name). While Instant Checkmate’s competitors also sell criminal history information, only Instant Checkmate ads used the word arrest.

Figure 4. Templates for ads for public records on Reuters.com with frequencies of 10 or more. Full list is available.[15] Each entry gives the frequency followed by the lines of the ad template.

Instant Checkmate
  382  Located: fullname / Information found on fullname / fullname found in database.
   96  We found fullname / Search Arrests, Address, Phone, etc. / Search records for fullname.
   40  Background of fullname / Search Instant Checkmate / for the Records of fullname
   17  fullname’s Records / 1) Enter Name and State. / 2) Access Full Background Checks Instantly.
  195  fullname: Truth / Arrests and Much More. / Everything About fullname
   67  fullname Truth / Looking for fullname? / Check fullname’s Arrests
  176  fullname, Arrested? / 1) Enter Name and State. / 2) Access Full Background Checks Instantly.
   55  fullname Located / Background Check, Arrest Records, / Phone, & Address. Instant, Accurate
   62  Looking for fullname? / Comprehensive Background Report / and More on fullname

PeopleSmart
   87  We found: fullname / 1) Get Aisha’s Background Report / 2) Current Contact Info—Try Free!
  105  We found: fullname / 1) Contact fullname—Free Info! / 2) Current Address, Phone & More.
  348  We found: fullname / 1) Contact fullname—Free Info! / 2) Current Phone, Address & More.

PublicRecords
  570  fullname / Public Records Found For: fullname. / View now.
  128  fullname / Public Records Found For: fullname. / Search now.
   13  Records: fullname / Database of all lastname’s in / the Country. Search now.
   56  fullname / We have Public Records For: fullname. / Search Now.

PeopleFinders
  523  We found fullname / Current Address, Phone and Age. / Find fullname, Anywhere.

8. A greater percentage of Instant Checkmate ads using the word “arrest” appeared for black-identifying first names than for white first names. More than 1,100 Instant Checkmate ads appeared on Reuters.com, with 488 having black-identifying first names; of these, 60% used arrest in the ad text. Of the 638 ads displayed with white-identifying names, 48% used arrest. This difference is statistically significant, with less than a 0.1% probability that the data can be explained by chance (chi-square test: χ²(1) = 14.32, p < 0.001). The EEOC’s and U.S. Department of Labor’s adverse impact test for measuring discrimination is 77 in this case, so if this were an employment situation, a charge of discrimination might result. (The adverse impact test uses the ratio of neutral ads, or 100 minus the percentages given, to compute disparity: 100 - 60 = 40 and 100 - 48 = 52; dividing 40 by 52 gives about 77%.)

The highest percentages of neutral ads (where the word arrest does not appear in ad text) on Reuters.com were those for Jill (77%) and Emma (75%), both white-identifying names. Names receiving the highest percentage of ads with arrest in the text were Darnell (84%), Jermaine (81%), and DeShawn (86%), all black-identifying first names. Some names appeared counter to this pattern: Dustin, a white-identifying name, generated arrest ads in 81% of searches; and Imani, a black-identifying name, resulted in neutral ads in 75% of searches.

9. Discrimination results on Google’s site were similar, but, interestingly, ad text and distributions were different. While the same neutral and arrest ads having dominant appearances on Reuters.com also appeared frequently on Google.com, Instant Checkmate ads on Google included an additional 10 templates, all using the word criminal or arrest.

More than 400 Instant Checkmate ads appeared on Google, and 90% of these were suggestive of arrest, regardless of race. Still, a greater percentage of Instant Checkmate ads suggestive of arrest displayed for black-associated first names than for whites. Of the 366 ads that appeared for black-identifying names, 92% were suggestive of arrest. Far fewer ads displayed for white-identifying names (66 total), but 80% were suggestive of arrest. This difference in the ratios 92 and 80 is statistically significant, with less than a 1% probability that the data can be explained by chance (chi-square test: χ²(1) = 7.71, p < 0.01). The EEOC’s adverse impact test for measuring discrimination is 40%, so if this were employment, a charge of discrimination might result. (The adverse impact test gives 100 - 92 = 8 and 100 - 80 = 20; dividing 8 by 20 gives 40%.)

A greater percentage of Instant Checkmate ads having the word arrest in ad text appeared for black-identifying first names than for white-identifying first names within professional and netizen subsets, too. On Reuters.com, which hosts Google AdSense ads, a black-identifying name was 25% more likely to generate an ad suggestive of an arrest record.
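
The chi-square statistic quoted in observations 8 and 9 compares observed counts in a 2x2 table with the counts expected if ad text were independent of name group. The sketch below computes the uncorrected Pearson statistic in plain Python. The example counts are an assumption: they are rebuilt from the rounded percentages quoted for Reuters.com (60% of 488 and 48% of 638), so the result will differ somewhat from the reported 14.32 while pointing to the same conclusion. The function names are illustrative.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (1 degree of freedom, no continuity correction)
    for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a nonzero total")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def counts_from_percent(total: int, percent: float) -> tuple:
    """Split a total into (with, without) counts from a rounded percentage."""
    with_word = round(total * percent / 100)
    return with_word, total - with_word


if __name__ == "__main__":
    black = counts_from_percent(488, 60)  # (ads using "arrest", neutral ads)
    white = counts_from_percent(638, 48)
    stat = chi_square_2x2(*black, *white)
    # 10.83 is the standard critical value for p = 0.001 at 1 degree of freedom.
    print(f"chi-square = {stat:.2f}; exceeds 10.83: {stat > 10.83}")
```

With the study's exact counts in place of the reconstructed ones, the same function would be expected to reproduce the published statistic.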


These findings reject the hypothesis that no difference exists in the delivery of ads suggestive of an arrest record based on searches of racially associated names.

Additional Observations

The people behind the names used in this study are diverse. Political figures included Maryland State Representatives Aisha Braveboy (arrest ad) and Jay Jacobs (neutral ad); Jill Biden (neutral ad), wife of U.S. Vice President Joe Biden; and Claire McCaskill, whose campaign ad for the U.S. Senate in Missouri appeared alongside an Instant Checkmate ad using the word arrest (Figure 5). Names mined from academic websites included graduate students, staff, and accomplished academics, such as Amy Gutmann, president of the University of Pennsylvania. Dustin Hoffman (arrest ad) was among names of celebrities used. A smorgasbord of athletes appeared, from local to national fame (assorted neutral and arrest ads). The youngest person whose name was used in the study was a missing 11-year-old black girl.

Figure 5. Senator Claire McCaskill’s campaign ad appeared next to an ad using the word “arrest.”

More than 1,100 of the names harvested for this study were from PeekYou, with scores estimating the name’s overall presence on the Web. As expected, celebrities get the highest scores of 10s and 9s. Only four names used here had a PeekYou score of 10, and 12 had a score of 9, including Dustin Hoffman. Only two ads appeared for these high-scoring names; an abundance of ads appeared across the remaining spectrum of PeekYou scores. We might presume that the bid price needed to display an ad is greater for more popular names with higher PeekYou scores. Knowing that very few high-scoring people were in the study and that ads appeared across the full spectrum of PeekYou scores reduces concern about variations in bid prices.

Different Instant Checkmate ads sometimes appeared for the same person. About 200 names had Instant Checkmate ads on both Reuters.com and Google.com, but only 42 of these names received the same ad. The other 82% of names received different ads across the two sites. At most, three distinct ads appeared across Reuters.com and Google.com for the same name. Figure 6 shows the assortment of ads appearing for Latisha Smith. Having different possible ad texts for a name reminds us that while Instant Checkmate provided the ad texts, Google’s technology selected among the possible texts in deciding which to display. Figure 6 shows ads both suggestive of arrest and not, though more ads appear suggestive of arrest than not.

Figure 6. An assortment of ads appearing for Latisha Smith (panels a-d).

More About the Problem

Why is this discrimination occurring? Is Instant Checkmate, Google, or society to blame? We do not yet know. Google understands that an advertiser may not know which ad copy will work best, so the advertiser may provide multiple templates for the same search string, and the “Google algorithm” learns over time which ad text gets the most clicks from viewers. It does this by assigning weights (or probabilities) based on the click history of each ad. At first, all possible ad texts are weighted the same and are equally likely to produce a click. Over time, as people tend to click one ad copy over others, the weights change, so the ad text getting the most clicks eventually displays more frequently.

Did Instant Checkmate provide ad templates suggestive of arrest disproportionately to black-identifying names? Or did Instant Checkmate provide roughly the same templates evenly across racially associated names but users clicked ads suggestive of arrest more often for black-identifying names? As mentioned earlier, during a conference call with the founders of Instant Checkmate and their lawyer, the company’s representatives asserted that Instant Checkmate gave the same ad text to Google for groups of last names (not first names) in its database; they expressed no other criteria for name and ad selection.

This study is a start, but more research is needed. To preserve research opportunities, I captured additional results for 50 hits on 2,184 names across 30 Web sites serving Google Ads to learn the underlying distributions of ad occurrences per name. While analyzing the data may prove illuminating, in the end the basic message presented in this study does not change: there is discrimination in delivery of these ads.

Technical Solutions

How can technology solve this problem? One answer is to change the quality scores of ads to discount for unwanted bias. The idea is to measure real-time bias in an ad’s delivery and then adjust the weight of the ad accordingly at auction. The general term for Google’s technology is ad exchange. This approach generalizes to other ad exchanges (not just Google’s); integrates seamlessly into the way ad exchanges operate, allowing minimal modifications to harmonize ad deliveries with societal norms; and works regardless of the cause of the discrimination, whether advertiser bias in placing ads or society bias in selecting ads.

Discrimination, however, is at the heart of online advertising. Differential delivery is the very idea behind it. For example, if young women with children tend to purchase baby products and retired men with bass boats tend to purchase fishing supplies, and you know the viewer is one of these two types, then it is more efficient to offer ads for baby products to the young mother and fishing rods to the fisherman, not the other way around.

On the other hand, not all discrimination is desirable. Societies have identified groups of people to protect from specific forms of discrimination. Delivering ads suggestive of arrest much more often for searches of black-identifying names than for white-identifying names is an example of unwanted discrimination, according to American social and legal norms. This is especially true because the ads appear regardless of whether actual arrest records exist for the names in the company’s database.

tion typically concerns the ability to give or withhold benefits, facilities, services, employment, or opportunities. Instead of lumping all ads together, it is better to use search strings, ad texts, products, or URLs that display with ads to decide which ads to assess.

This study assessed search strings of first and last names of real people, ads for public records, and ads having a specific display URL (instantcheckmate.com), the latter being the most informative because the adverse ads all had the same display URL.

Of course, the audience for the ads is not necessarily the people who are the subject of the ads. In this study, the audience is a person inquiring about the person whose name is the subject of the ad. This distinction is important when thinking about the identity of groups that might be impacted by an ad. Group membership is based on the ad’s search string and text. The audience may resonate more with a distinctly positive or negative characterization of the group.

3. Determining Ad Sentiment. Origi-
The good news is that we can use the nally associated with summarizing
mechanics and legal criteria described product and movie reviews, sentiment
earlier to build technology that distin- analysis is an area of computer science
guishes between desirable and unde- that uses natural-language process-
sirable discrimination in ad delivery. ing and text analytics to determine the
Here I detail the four key components: overall attitude of a writing.13 Senti-
1. Identifying Affected Groups. A set ment analysis can measure whether an
of predicates can be defined to identify ad’s search string and accompanying
members of protected and comparison text has positive, negative, or neutral
groups. Given an ad’s search string and sentiment. A literature search does not
text, a predicate returns true if the ad find any prior application to online ads,
can impact the group that is the sub- but a lot of research has been done as-
ject of the predicate and returns false sessing sentiment in social media (sen-
otherwise. Statistics of baby names can timent140.com, for example, reports
identify first names for constructing the sentiment of tweets, which like ad-
race and gender groups and last names vertisements have limited words).
for grouping some ethnicities. Special In this study, ads containing the
word lists or functions that report de- word arrest or criminal were classified as
gree of membership may be helpful for having negative sentiment; ads without
other comparisons. those words were classified as neutral.
In this study, ads appeared on 4. Testing for Adverse Impact. Con-
searches of full names for real people, sider a table where columns are com-
and first names assigned to more black parative groups, rows are sentiment,
or white babies formed groups for test- and values are the number of ad im-
ing. These black and white predicates pressions (the number of times an
evaluate to true or false based on the ad appears, though the ad is not nec-
first name of the search string. essarily clicked). Ignore neutral ads.
2. Specifying the Scope of Ads to As- Comparing the percentage of ads hav-
sess. The focus should be on those ing the same positive or negative senti-
ads capable of impacting a protected ment across groups reveals the degree
group in a form of discrimination pro- to which one group may be impacted
hibited by law or social norm. Protec- more or less by the ad’s sentiment.
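The first three components, plus the tabulation that feeds the fourth, can be sketched in a few lines of Python. This is a minimal illustration rather than the study's implementation: the two name sets are tiny hypothetical lists (the study derived its groups from statistics on first names given to black and white babies), the impression records and ad texts are invented, and matching "arrest" or "criminal" as a substring is an assumption about how the word rule would be applied.

```python
from collections import Counter

# Hypothetical, tiny name lists for illustration only; not the study's lists.
BLACK_IDENTIFYING = {"latisha", "ebony", "aisha", "lakisha", "jamal"}
WHITE_IDENTIFYING = {"jill", "emily", "greg", "meredith", "claire"}


def first_name(search_string):
    """Return the lowercased first token of a full-name search string."""
    parts = search_string.split()
    return parts[0].lower() if parts else ""


def is_black_identifying(search_string, ad_text=""):
    """Component 1 predicate: decided by the first name of the search string."""
    return first_name(search_string) in BLACK_IDENTIFYING


def is_white_identifying(search_string, ad_text=""):
    """Component 1 predicate for the comparison group."""
    return first_name(search_string) in WHITE_IDENTIFYING


def in_scope(display_url):
    """Component 2: assess only ads carrying the display URL of interest."""
    return "instantcheckmate.com" in display_url.lower()


def sentiment(ad_text):
    """Component 3: 'arrest' or 'criminal' in the text means negative."""
    text = ad_text.lower()
    return "negative" if ("arrest" in text or "criminal" in text) else "neutral"


def tabulate(impressions):
    """Count impressions per (group, sentiment) cell for component 4."""
    table = Counter()
    for search_string, ad_text, display_url in impressions:
        if not in_scope(display_url):
            continue
        if is_black_identifying(search_string, ad_text):
            group = "black"
        elif is_white_identifying(search_string, ad_text):
            group = "white"
        else:
            continue  # name belongs to neither group under test
        table[(group, sentiment(ad_text))] += 1
    return table


# Invented impression records: (search string, ad text, display URL).
impressions = [
    ("Latisha Smith", "Latisha Smith, Arrested?", "www.instantcheckmate.com"),
    ("Ebony Jones", "Looking for Ebony Jones", "www.instantcheckmate.com"),
    ("Meredith Jones", "We found Meredith Jones", "www.instantcheckmate.com"),
    ("Jill Biden", "Jill Biden, Arrested?", "www.example.com"),  # out of scope
]
table = tabulate(impressions)
print(dict(table))
```

Keeping the predicates, scope filter, and sentiment rule as separate functions mirrors the separation of components in the text, so any one of them could be swapped (for example, a degree-of-membership function in place of a word list) without touching the tabulation.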



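The two bias tests named in this article can be checked numerically against the impression counts the study reports in Table 2 (291 negative and 197 neutral impressions for the black group; 308 negative and 330 neutral for the white group). The sketch below is an assumption-laden illustration: it uses a Pearson chi-square without continuity correction, obtains the p-value for one degree of freedom from the complementary error function, and the `may_display` gate is a hypothetical stand-in for however an ad exchange might fold the test result into a quality score.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > stat) equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def adverse_impact_ratio(rate_a, rate_b):
    """Divide the smaller rate by the larger, as in the EEOC-style test."""
    low, high = sorted((rate_a, rate_b))
    return low / high


def may_display(ratio, threshold=0.80):
    """Hypothetical gate: suppress when the ratio falls below the threshold."""
    return ratio >= threshold


# Rows are sentiment (negative, neutral); columns are groups (black, white).
stat, p_value = chi_square_2x2(291, 308, 197, 330)
print(f"chi-square(1) = {stat:.2f}, p = {p_value:.5f}")

# Ratio from the rounded percentages of neutral ads, as the article reports it.
rounded_ratio = adverse_impact_ratio(40, 52)
# Ratio from the unrounded neutral rates, for comparison.
exact_ratio = adverse_impact_ratio(197 / 488, 330 / 638)
print(f"rounded: {rounded_ratio:.0%}, exact: {exact_ratio:.0%}")
print("display allowed:", may_display(exact_ratio))
```

The unrounded rates give a ratio about one percentage point higher than the rounded percentages do, but both fall below the 80 threshold, so the gate would suppress some deliveries in either case.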

Table 2. Negative and neutral sentiments of black and white groups.

            Black          White
Negative    291   60%      308   48%
Neutral     197   40%      330   52%
Positive
Totals      488            638

A chi-square test can determine statistical significance, and the adverse impact test used by the EEOC and the U.S. Department of Labor can alert whether in some circumstances legal risks may result.

In this study the groups are black and white, and the sentiments are negative and neutral. Table 2 shows a summary chart. Of the 488 ads that appeared for the black group, 291 (or 60%) had negative sentiment. Of the 638 ads displayed for the white group, 308 (or 48%) had negative sentiment. The difference is statistically significant (χ²(1) = 14.32, p < 0.001) and has an adverse impact measure of (40/52), or 77%.

An easy way of incorporating this analysis into an ad exchange is to decide which bias test is critical (for example, statistical significance or adverse impact test) and then factor the test result into the quality score for the ad at auction. For example, if we were to modify the ad exchange not to display any ad having an adverse impact score of less than 80, which is the EEOC standard, then arrest ads for blacks would sometimes appear, but would not be overly disproportionate to whites, regardless of advertiser or click bias.

Though this study served as an example throughout, the approach generalizes to many other forms of discrimination and combats other ways ad exchanges may foster discrimination. Suppose female names tend to get neutral ads such as “Buy now,” while male names tend to get positive ads such as “Buy now. 50% off!” Or suppose black names tend to get neutral ads such as “Looking for Ebony Jones,” while white names tend to get positive ads such as “Meredith Jones. Fantastic!” Then the same analysis would suppress some occurrences of the positive ads so as not to foster a discriminatory effect.

This approach does not stop the appearance of negative ads for a store placed by a disgruntled customer or ads placed by competitors on brand names of the competition, unless these are deemed to be protected groups.

Nonprotected marketing discrimination can continue even to protected groups. For example, suppose search terms associated with blacks tend to get neutral ads for some music artists, while those associated with whites tend to get neutral ads for other music artists. All ads would appear regardless of the disproportionate distribution because the ads are not subject to suppression.

As a final example, this approach allows everyone to be negatively impacted as long as the impact is approximately the same. Suppose all ads for public records on all names, regardless of race, were equally suggestive of arrest and had almost the same number of impressions; then no ads suggestive of arrest would be suppressed.

Computer scientist Cynthia Dwork and her colleagues have been working on algorithms that assure racial fairness.4 Their general notion is to ensure similar groups receive similar ads in proportions consistent with the population. Utility is the critical concern with this direction because not all forms of discrimination are bad, and unusual and outlier ads could be unnecessarily suppressed. Still, their research direction looks promising.

In conclusion, this study demonstrates that technology can foster discriminatory outcomes, but it also shows that technology can thwart unwanted discrimination.

Acknowledgments
The author thanks Ben Edelman, Claudine Gay, Gary King, Annie Lewis, and weekly Topics in Privacy participants (David Abrams, Micah Altman, Merce Crosas, Bob Gelman, Harry Lewis, Joe Pato, and Salil Vadhan) for discussions; Adam Tanner for first suspecting a pattern; Diane Lopez and Matthew Fox in Harvard’s Office of the General Counsel for making publication possible in the face of legal threats; and Sean Hooley for editorial suggestions. Data from this study is available at foreverdata.org and the IQSS Dataverse Network. Supported in part by NSF grant CNS-1237235 and a gift from Google, Inc.

Related articles on queue.acm.org

Modeling People and Places with Internet Photo Collections
David Crandall, Noah Snavely
http://queue.acm.org/detail.cfm?id=2212756

Interactive Dynamics for Visual Analysis
Jeffrey Heer, Ben Shneiderman
http://queue.acm.org/detail.cfm?id=2146416

Social Perception
James L. Crowley
http://queue.acm.org/detail.cfm?id=1147531

References
1. Barker, R. The Social Work Dictionary (5th ed.). NASW Press, Washington, DC, 2003.
2. Bertrand, M. and Mullainathan, S. Are Emily and Greg more employable than Lakisha and Jamal? A field experiment on labor market discrimination. NBER Working Paper No. 9873, 2003; http://www.nber.org/papers/w9873.
3. Central Hudson Gas & Electric Corp. v. Public Service Commission of New York. Supreme Court of the United States, 447 U.S. 557, 1980.
4. Dwork, C., Hardt, M., et al. 2011. Fairness through awareness. arXiv:1104.3913; http://arxiv.org/abs/1104.3913.
5. Equal Employment Opportunity Commission. Consideration of arrest and conviction records in employment decisions under Title VII of the Civil Rights Act of 1964. Washington, DC, 915.002, 2012. http://www.eeoc.gov/laws/guidance/arrest_conviction.cfm.
6. Equal Employment Opportunity Commission. Uniform guidelines on employee selection procedures. Washington, DC, 1978.
7. Fryer, R. and Levitt, S. The causes and consequences of distinctively black names. The Quarterly Journal of Economics 59, 3 (2004); http://pricetheory.uchicago.edu/levitt/Papers/FryerLevitt2004.pdf.
8. Glover, E.; http://www.physiology.emory.edu/FIRST/ebony2.htm (archived at http://foreverdata.org/onlineads).
9. Google AdSense; http://google.com/adsense.
10. Google. Google announces first quarter 2011 financial results; http://investor.google.com/earnings/2011/Q1_google_earnings.html.
11. Harris, P. and Keller, K. Ex-offenders need not apply: The criminal background check in hiring decisions. Journal of Contemporary Criminal Justice 21, 1 (2005), 6-30.
12. Panel on Methods for Assessing Discrimination, National Research Council. Measuring racial discrimination. National Academy Press, Washington, DC, 2004.
13. Pang, B. and Lee, L. A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts. Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics (2004).
14. Schneider, J. http://www.lehigh.edu/bio/jill.html (archived at http://foreverdata.org/onlineads).
15. Sweeney, L. Discrimination in online ad delivery (2013). (For details, see full technical report at http://ssrn.com/abstract=2208240. Data, including Web pages and ads, archived at http://foreverdata.org/onlineads).
16. U.S. Commission on Civil Rights. Racism in America and how to combat it. Washington, DC, 1970.
17. WebShot Command Line Server Edition. Version 1.9.1.1; http://www.websitescreenshots.com/.

Latanya Sweeney (latanya@fas.harvard.edu) is professor of government and technology in residence at Harvard University. She creates and uses technology to assess and solve societal, political, and governance problems and teaches others how to do the same. She is also founder and director of the Data Privacy Lab at Harvard.

© 2013 ACM 0001-0782/13/05

Communications of the ACM | May 2013 | Vol. 56 | No. 5
