Big Questions for Social Media Big Data: Representativeness, Validity and Other
Methodological Pitfalls

Article · March 2014


DOI: 10.1609/icwsm.v8i1.14517 · Source: arXiv

Zeynep Tufekci
University of North Carolina at Chapel Hill

All content following this page was uploaded by Zeynep Tufekci on 06 May 2014.



Tufekci, Zeynep. (2014). Big Questions for Social Media Big Data: Representativeness, Validity and Other Methodological
Pitfalls. In ICWSM ’14: Proceedings of the 8th International AAAI Conference on Weblogs and Social Media, 2014.
[forthcoming]

Big Questions for Social Media Big Data:
Representativeness, Validity and Other Methodological Pitfalls
Zeynep Tufekci
University of North Carolina, Chapel Hill
zeynep@unc.edu

Abstract

Large-scale databases of human activity in social media have captured scientific and policy attention, producing a flood of research and discussion. This paper considers methodological and conceptual challenges for this emergent field, with special attention to the validity and representativeness of social media big data analyses. Persistent issues include the over-emphasis of a single platform, Twitter, sampling biases arising from selection by hashtags, and vague and unrepresentative sampling frames. The socio-cultural complexity of user behavior aimed at algorithmic invisibility (such as subtweeting, mock-retweeting, use of “screen captures” for text, etc.) further complicates interpretation of big data social media. Other challenges include accounting for field effects, i.e. broadly consequential events that do not diffuse only through the network under study but affect the whole society. The application of network methods from other fields to the study of human social activity may not always be appropriate. The paper concludes with a call to action on practical steps to improve our analytic capacity in this promising, rapidly-growing field.

Introduction

Very large datasets, commonly referred to as big data, have become common in the study of everything from genomes to galaxies, including, importantly, human behavior. Thanks to digital technologies, more and more human activities leave imprints whose collection, storage and aggregation can be readily automated. In particular, the use of social media results in the creation of datasets which may be obtained from platform providers or collected independently with relatively little effort as compared with traditional sociological methods.

Social media big data has been hailed as key to crucial insights into human behavior and extensively analyzed by scholars, corporations, politicians, journalists, and governments (Boyd and Crawford 2012; Lazer et al, 2009). Big data reveal fascinating insights into a variety of questions, and allow us to observe social phenomena at a previously unthinkable level, such as the mood oscillations of millions of people in 84 countries (Golder et al., 2011), or in cases where there is arguably no other feasible method of data collection, as with the study of ideological polarization on Syrian Twitter (Lynch, Freelon and Aday, 2014). The emergence of big data from social media has had impacts in the study of human behavior similar to the introduction of the microscope or the telescope in the fields of biology and astronomy: it has produced a qualitative shift in the scale, scope and depth of possible analysis. Such a dramatic leap requires a careful and systematic examination of its methodological implications, including trade-offs, biases, strengths and weaknesses.

This paper examines methodological issues and questions of inference from social media big data. Methodological issues include the following: 1. The model organism problem, in which a few platforms are frequently used to generate datasets without adequate consideration of their structural biases. 2. Selecting on dependent variables without requisite precautions; many hashtag analyses, for example, fall in this category. 3. The denominator problem created by vague, unclear or unrepresentative sampling. 4. The prevalence of single platform studies which overlook the wider social ecology of interaction and diffusion.

There are also important questions regarding what we can legitimately infer from online imprints, which are but one aspect of human behavior. Issues include the following: 1. Online actions such as clicks, links, and retweets are complex social interactions with varying meanings, logics and implications, yet they may be aggregated together. 2. Users engage in practices that may be unintelligible to algorithms, such as subtweets (tweets referencing an unnamed but implicitly identifiable individual), quoting text via screen captures, and “hate-linking” (linking to denounce rather than endorse). 3. Network methods from other fields are often used to study human behavior without evaluating their appropriateness. 4. Social media data almost solely captures “node-to-node” interactions, while “field” effects (events that affect a society or a group in a wholesale fashion either through shared experience or through broadcast media) may often account for observed phenomena. 5. Human self-awareness needs to be taken into account; humans will alter behavior because they know they are being observed, and this change in behavior may correlate with big data metrics.

Pre-print version.
Copyright © 2014, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Methodological Considerations

1. Model Organisms and Research: Twitter as the Field’s Drosophila Melanogaster.

While there are many social media platforms, big data research focuses disproportionately on Twitter. For example, ICWSM, perhaps the leading selective conference in the field of social media, had 72 full papers last year (2013), almost half of which presented data that was primarily or solely drawn from Twitter. (Disclosure: I’ve long been involved in this conference in multiple capacities and think highly of it. The point isn’t about any one paper’s worth or quality but rather about the prevalence of attention to a single platform.)

This preponderance of Twitter studies is mostly due to availability of data, tools and ease of analysis. Very large data sets, millions or billions of points, are available from this source. In contrast to Facebook, the largest social media platform, almost all Twitter activity, other than direct messages and private profiles, is visible on the public Internet. More Facebook users (estimated to be more than 50%) have made their profiles “private”, i.e. not accessible on the public Internet, as compared with Twitter users (estimated as less than 10%). While Twitter has been closing some of the easier means of access, the bulk of Facebook is largely inaccessible except by Facebook’s own data scientists. (Facebook public pages, though, are available through its API.) Unsurprisingly, only about 5% of the papers presented in ICWSM 2013 were about Facebook, and nearly all of them were co-authored with Facebook data scientists.

Twitter data also has a simple and clean structure. In contrast to the finer-grained privacy settings on other social media platforms, Twitter profiles are either “all public” or “all private.” With only a few basic functions (retweet, mention, and hashtags) to map, and a maximum of 140 characters per tweet, the datasets generated by Twitter are relatively easy to structure, process and analyze as compared with most other platforms. Consequently, Twitter has emerged as a “model organism” of big data.

In biology, “model organisms” refer to species which have been selected for intensive examination by the research community in order to shed light on key biological processes such as fundamental properties of living cells. Focusing on model organisms is conducive to progress in basic questions underlying the entire field, and this approach has been spectacularly successful in molecular biology (Fields and Johnson, 2007; Geddes, 1990). However, this investigative path is not without tradeoffs.

Biases in “model-organism” research programs are not the same as sample bias in survey research, which also impacts social media big data studies. The focus on just a few platforms introduces non-representativeness at the level of mechanisms, not just at the level of samples. Thus, drawing a Twitter sample that is representative of adults in the target population would not solve the problem.

To explain the issue, consider that all dominant biological model organisms, such as the fruit fly Drosophila melanogaster, the bacterium Escherichia coli, the nematode worm Caenorhabditis elegans, and the mouse Mus musculus, were selected for rapid life cycles (quick results), ease of breeding in artificial settings and small adult size (lab-friendliness), “rapid and stereotypical” development (making experimental comparisons easier), and “early separation of germ line and soma” (reducing certain kinds of variability) (Bolker, 1995; Jenner and Wills, 2007). However, the very characteristics that make them useful for studying certain biological mechanisms come at the expense of illuminating others (Gilbert, 2001; Jenner and Wills, 2007). Being easy to handle in the laboratory, in effect, implies “relative insensitivity to environmental influences” (Bolker, 1995, p:451–2) and thus unsuitability for the study of environmental interactions. The rapid development cycle depresses mechanisms present in slower-growing species, and small adult size can imply “simplification or loss of structures, the evolution of morphological novelties, and increased morphological variability” (Bolker, 1995, p:451).

In other words, model organisms can be unrepresentative of their taxa, and more importantly, may be skewed with regard to the importance of mechanisms in their taxa. They are chosen because they don’t die easily in confinement and therefore encapsulate mechanisms specific to surviving in captivity, a trait that may not be shared with species not chosen as model organisms. The fruit fly, which breeds easily and with relative insensitivity to the environment, may lead to an emphasis on genetic factors over environmental influences. In fact, this appears to be a bias common to most model organisms used in biology.

Barbara McClintock’s discovery of transposable genes, though much later awarded the Nobel prize, was initially disbelieved and disregarded partly because the organism she used, maize, was not a model organism at the time (Pray and Zhaurova, 2008). Yet the novel mechanisms which she discovered were not as visible in the more accepted model organisms, and would not have been found had she kept to those. Consequently, biologists interested in the roles of ecology and development have widened the range of organisms studied in order to uncover a wider range of mechanisms.

The dominance of Twitter as the “model organism” for social media big data analyses similarly skews analyses of mechanisms. Each social media platform carries with it a suite of affordances - things that it allows and makes easy
versus things that are not possible or difficult - which help structure behavior on the platform. For Twitter, the key characteristics are short message length, rapid turnover, public visibility, and a directed network graph (“follow” relationships do not need to be mutual). It lacks some of the characteristics that blogs, LiveJournal communities, or Facebook possess, such as longer texts, lengthier reaction times, stronger integration of visuals with text, the mutual nature of “friending” and the evolution of conversations over longer periods of time.

Twitter’s affordances and the mechanisms it engenders interact in multiple ways. Its lightweight interface, suitable to mobile devices and accessible via texting, means that it is often the platform of choice when on the move, in low-bandwidth environments or in high-tension events such as demonstrations. The retweet mechanism also generates its own complex set of status-related behaviors and norms that do not necessarily translate to other platforms. Also, crucially, Twitter is a directed graph in that one person can “follow” another without mutuality. In contrast, Facebook’s backbone is mostly an undirected graph in which “friending” is a two-way relationship and requires mutual consent. Similarly, LiveJournal’s core interactions tend to occur within “friends lists”. Consequently, Twitter is more likely than other platforms to sustain bridge mechanisms between communities and support connections between densely interconnected clusters that are otherwise sparsely connected to each other.

To see the implications for analysis and interpretation of big data, let’s look at bridging as a mechanism and consider a study which shows that bit.ly shortened links, distributed on Twitter using revolutionary hashtags during the Arab Spring, played a key role as an information conduit from within the Arab uprisings to the outside world, in other words, as bridges (Aday et al, 2012). Given its dependence on Twitter’s affordances, this finding should not be generalized to mean that social media as a whole also acted as a bridge, that social media was primarily useful as a bridging mechanism, nor that Twitter was solely a bridge mechanism (since the study analyzed only tweets containing bit.ly links). Rather, this finding speaks to a convergence of user needs and certain affordances in a subset of cases which fueled one mechanism: in this case, bridging.

Finally, there is indeed a sample bias problem. Twitter is used by less than 20% of the US population (Mitchell and Hitlin, 2014), and that is not a randomly selected group. While Facebook has wider diffusion, its use is also structured by race, gender, class and other factors (Hargittai, 2008). Thus, the use of a few online platforms as “big data” model organisms raises important questions of representation and visibility, as different demographic or social groups may have different behavior, online and offline, and may not be fully represented or even sampled via current methods.

All this is not to say that Twitter is an inappropriate platform to study. Research in the model organism paradigm allows a large community to coalesce around shared datasets, tools and problems, and can produce illuminating results. However, the specifics of the “model organism” and biases that may result should not be overlooked.

2. Hashtag Analyses, Selecting on the Dependent Variable, Selection Effects and User Choices.

The inclusion of hashtags in tweets is a Twitter convention for marking a tweet as part of a particular conversation or topic, and many social media studies rely on them for sample extraction. For example, the Tunisian uprising was associated with the hashtag #sidibouzid, while the initial Egyptian protests of January 25, 2011, were associated with #jan25. Facebook’s adoption of hashtags makes the methodological specifics of this convention even more important. While hashtag studies can be a powerful tool for examining network structure and information flows, all hashtag analyses, by definition, select on a dependent variable, and hence display the concomitant features and weaknesses of this methodological path.

“Selecting on the dependent variable” occurs when inclusion of a case in a sample depends on the very variable being examined. Such samples have specific limits to their analytic power. For example, analyses that only examine revolutions or wars that have occurred will overlook cases where the causes and correlates of revolution and war have been present but in which there have been no resulting wars or revolutions (Geddes, 2010). Thus, selecting on the dependent variable (the occurrence of war or revolution) can help identify necessary conditions, but those may not be sufficient. Selecting on the dependent variable can introduce a range of errors, the specifics of which depend on the characteristics of the uncorrelated sample.

In hashtag datasets, a tweet is included because the user chose to use it, a clear act of self-selection. Self-selected samples often will not only have different overall characteristics than the general population, they may also exhibit significantly different correlational tendencies which create thorny issues of confounding variables. Famous examples include the hormone replacement therapy (HRT) controversy in which researchers had, erroneously, believed that HRT conferred health benefits to post-menopausal women based on observational studies of women who self-selected to take HRT. In reality, HRT was adopted by healthier women. Later randomized double-blind studies showed that HRT was, in fact, harmful, so harmful that the researchers stopped the study in its tracks to reverse advice that had been given to women for a decade.
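The self-selection problem described above can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not a reconstruction of any HRT study: the effect size, the noise levels, and the rule that healthier individuals are more likely to opt in are all assumptions chosen for illustration.

```python
import random
from statistics import mean

TRUE_EFFECT = -0.5  # assumed for illustration: the treatment lowers the outcome


def estimate_effect(self_selected, n=50_000, seed=0):
    """Return mean(outcome | treated) - mean(outcome | untreated)."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        health = rng.gauss(0.0, 1.0)  # unobserved baseline health
        if self_selected:
            # Assumed opt-in rule: healthier people are more likely to take it.
            takes = health + rng.gauss(0.0, 1.0) > 0.0
        else:
            # Randomized assignment: a coin flip unrelated to health.
            takes = rng.random() < 0.5
        outcome = health + (TRUE_EFFECT if takes else 0.0) + rng.gauss(0.0, 1.0)
        (treated if takes else untreated).append(outcome)
    return mean(treated) - mean(untreated)


observational = estimate_effect(self_selected=True)
randomized = estimate_effect(self_selected=False)
print(f"self-selected estimate: {observational:+.2f}")
print(f"randomized estimate:    {randomized:+.2f}")
print(f"true effect:            {TRUE_EFFECT:+.2f}")
```

Under these assumptions the self-selected comparison reports a benefit even though the simulated treatment is harmful, while the randomized comparison recovers an estimate close to the true effect. The same logic applies to any sample whose members chose to be in it, including users who chose to attach a hashtag.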
Figure 1: The frequency of top 20 hashtags associated with Gezi Protests. (Banko and Babacan, 2013)

Samples drawn using different hashtags can differ in important dimensions, as hashtags are embedded in particular cultural and socio-political frameworks. In some cases, the hashtag is a declaration of particular sympathy. In other cases, there may be warring messages as the hashtag emerges as a contested cultural space. For example, two years of regular monitoring of activity (checking for at least an hour once a week) on the hashtags #jan25 and #Bahrain shows their divergent nature. Those who choose to use #jan25 are almost certain to be sympathetic to the Egyptian revolution, while #Bahrain tends to be used both by supporters and opponents of the uprising in Bahrain. Data I systematically sampled on three occasions showed that only about 1 in 100 #jan25 tweets were neutral while the rest were all supporting the revolution. Only about 5 out of 100 #Bahrain tweets were neutral, and 15 out of 100 were strongly opposed to the uprising, while the rest, 80 out of 100, were supportive. In contrast, #cairotraffic did not exhibit any overt signs of political preference. Consequently, since the hashtag users are a particular community, and thus prone to selection biases, it would be difficult to generalize from their behavior to other samples. Political users may be more prone to retweeting, say, graphic content, whereas non-political users may react with aversion. Hence, questions such as “does graphic content spread quickly on Twitter” or “do angry messages diffuse more quickly” might have quite different answers if the sample is drawn through different hashtags.

Hashtag analyses can also be affected by user activity patterns. An analysis of twenty hashtags used during the height of Turkey’s Gezi Park protests in June 2013 (#occupygezi, #occupygeziparki, #direngeziparki, #direnankara, #direngaziparki, etc.) shows a steep rise in activity on May 30th when the protests began, dropping off by June 3rd (Figure 1). Looking at this graph, one might conclude that either the protests had died down, or that people had stopped talking about the protests on Twitter. Both conclusions would be very mistaken, as revealed by the author’s interviews with hundreds of protesters on the ground during the protests, online ethnography that followed hundreds of people active in the protests (some of them also interviewed offline), monitoring of Twitter, trending topics, news coverage and the protests themselves.

What had happened was that as soon as the protests became the dominant story, large numbers of people continued to discuss them heavily, almost to the point that no other discussion took place on their Twitter feeds, but stopped using the hashtags except to draw attention to a new phenomenon or to engage in “trending topic wars” with ideologically-opposing groups. While the protests continued, and even intensified, the hashtags died down. Interviews revealed two reasons for this. First, once everyone knew the topic, the hashtag was at once superfluous and wasteful on the character-limited Twitter platform. Second, hashtags were seen only as useful for attracting attention to a particular topic, not for talking about it.

In August 2013, a set of stairs near Gezi Park which had been painted in rainbow colors was painted over in drab gray by the local municipality. This sparked outrage as a symbolic moment, and many people took to Twitter under the hashtag #direnmerdiven (roughly “#occupystairs”). The hashtag quickly and briefly trended and then disappeared from the trending list as well as users’ Twitter streams. However, this would be a misleading measure of activity on the painting of the stairs, as monitoring a group who had been using the hashtag showed that almost all of them continued to talk about the topic intensively on Twitter, but without the hashtag. Over the next week, hundreds, maybe thousands of stairs in Turkey were painted in rainbow colors as a form of protest, a phenomenon not at all visible in any data drawn from the hashtag.

Finally, most hashtags used to build big datasets are successful hashtags: ones that became well known, were distributed widely and generated a large amount of interest. It is likely that the dynamics of such events differ significantly from those of less successful ones. In sum, hashtag datasets should be seen as self-selected samples with data “missing not at random” and interpreted accordingly (Allison, 2001; Meiman and Freund, 2012; Outhwaite et al, 2007).

All this is not to argue that hashtag datasets are not useful. On the contrary, they can provide illuminating glimpses into specific cultural and socio-political conversations. However, hashtag dataset analyses need to be accompanied by a thorough discussion of the culture surrounding the specific hashtag, and analyzed with careful consideration of selection and sampling biases.
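The hashtag drop-off described in the Gezi examples can be made concrete with a toy comparison. The sketch below counts daily activity two ways: by tweets that carry a hashtag, and by all tweets from users who used that hashtag at least once. The records, user names and function names are invented for illustration and do not come from any real dataset or API.

```python
from collections import Counter

# Toy records, invented for illustration only: (user, day, text)
TWEETS = [
    ("a", 1, "Heading to the park #occupygezi"),
    ("b", 1, "Police moving in #occupygezi"),
    ("c", 1, "Traffic is terrible today"),
    ("a", 2, "Still at the park, crowds growing #occupygezi"),
    ("b", 2, "Tear gas near the square"),
    ("a", 3, "More people arriving at the park tonight"),
    ("b", 3, "Barricades going up near the square"),
    ("c", 3, "Anyone know a good lunch place?"),
]


def has_tag(text, tag):
    """True if the whitespace-separated tokens of text include the hashtag."""
    return tag.lower() in text.lower().split()


def daily_counts_by_hashtag(tweets, tag):
    """Count tweets per day that explicitly carry the hashtag."""
    return Counter(day for _, day, text in tweets if has_tag(text, tag))


def daily_counts_by_user(tweets, tag):
    """Count all tweets per day from users who used the hashtag at least once."""
    users = {user for user, _, text in tweets if has_tag(text, tag)}
    return Counter(day for user, day, _ in tweets if user in users)


by_tag = daily_counts_by_hashtag(TWEETS, "#occupygezi")
by_user = daily_counts_by_user(TWEETS, "#occupygezi")
for day in (1, 2, 3):
    print(f"day {day}: hashtag sample={by_tag[day]}, user sample={by_user[day]}")
```

In this toy data the hashtag sample falls to zero by day 3 while the same users keep posting about the events. A user-based sample has its own cost: it also pulls in off-topic tweets from those users, so some topical filtering or manual coding would still be needed.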
There might be ways to structure the sampling of Twitter datasets so that the hashtag is not the sole criterion. For example, Freelon, Lynch and Aday (2014) extracted a dataset first based on the use of the word “Syria” in Arabic or English, and then extracted hashtags from that dataset while also performing analyses on the wider dataset. Another method might be to use the hashtag to identify a sample of users and then collect tweets of those users (who will likely drop using the hashtag) rather than collecting the tweets via the hashtag.

Above all, hashtag analyses should start from the principle of understanding user behavior first, and should follow the user rather than following the hashtag.

3. The Missing Denominator: We Know Who Clicked But We Don’t Know Who Saw Or Could:

One of the biggest methodological dangers of big data analyses is insufficient understanding of the denominator. It’s not enough to know how many people have “liked” a Facebook status update, clicked on a link, or “retweeted” a message without knowing how many people saw the item and chose not to take any action. We rarely know the characteristics of the sub-population that sees the content even though that is the group, and not the entire population, from which we are sampling. Normalization is rarely done, or may even be actively decided against because the results start appearing more complex or more trivial (Cha, 2008).

While the denominator is often not calculable, it may be possible to estimate. One measure might be “potential exposure,” corresponding to the maximum number of people who may have seen a message. However, this highlights another key issue: the data is often proprietary (Boyd and Crawford, 2012). It might be possible to work with the platforms to get estimates of visibility, click-through and availability. For example, Facebook researchers have disclosed that the mean and median fraction of a user’s friends that see status update posts is about 34 to 35%, though the distribution of the variable seems to have a large spread (Bernstein et al., 2013).

With some disclosure from proprietary platforms, it may be possible to calculate “likely” exposure numbers based on “potential” exposure, similar to the way election polls model “likely” voters or TV ratings try to measure people watching a show rather than just being in the room where the TV is on. Steps in this direction are likely to be complex and difficult, but without such efforts, our ability to interpret raw numbers will remain limited. The academic community should ask for more disclosure and access from the commercial platforms.

It’s also important to normalize underlying populations when comparing “clicks,” “links,” or tweets. For example, Aday et al. (2012) compares numbers of clicks on bit.ly links in tweets containing hashtags associated with the Arab uprisings and concludes that “new media outlets that use bit.ly are more likely to spread information outside the region than inside it.” This is an important finding. However, interpretation of this finding should take into account the respective populations of Twitter users in the countries in question. Egypt’s population is about 80 million, about 1 percent of the global population. Any topic of global interest about Egypt could very easily generate a larger absolute number of clicks outside the country even if the activity within the country remained much more concentrated in relative proportions. Finally, the size of these datasets makes traditional measures like statistical significance less valuable (Meiman and Freund, 2012), a problem exacerbated by lack of information about the denominator.

4. Missing the Ecology for the Platform:

Most existing big data analyses of social media are confined to a single platform (often Twitter, as discussed). However, most of the topics of interest in such studies, such as influence or information flow, can rarely be confined to the Internet, let alone to a single platform. The difficulty in obtaining high-quality multi-platform data does not mean that we can treat a single platform as a closed and insular system. Information in human affairs flows through all available channels.

The emergent media ecology is a mix of old and new media which is not strictly segregated by platform or even by device. Many “viral” videos take off on social media only after being featured on broadcast media, which often follows their being highlighted on intermediary sites such as Reddit or Buzzfeed. Political news flowing out of Arab Spring uprisings to broadcast media was often curated by sites such as Nawaat.org that had emerged as trusted local information brokers. Analysis from Syria shows a similar pattern (Aday et al. 2014). As these examples show, the object of analysis should be this integrated ecology, and there will be significant shortcomings in analyses which consider only a single platform.

Link analyses on hashtag datasets for the Arab uprisings show that the most common links from social media are to the websites of broadcast media (Aday et al. 2012). The most common pattern was that users alternate between Facebook, Twitter, broadcast media, cell-phone conversations, texting, face-to-face and other methods of interaction and information sharing (Tufekci & Wilson, 2012).

These challenges do not mean single-platform analyses are not valuable. However, all such analyses must take into account that they are not examining a closed system and that there may be effects which are not visible because the relevant information is not contained within that platform. Methodologically, single-platform studies can be akin to looking for our keys under the light. More research, admittedly much more difficult and expensive than scraping data
from one platform, is needed to understand broader patterns of connectivity. Sometimes, the only way to study people is to study people.

Inferences and Interpretations

The question of inference from analyses of social media big data remains underconceptualized and underexamined. What’s a click? What does a retweet mean? In what context? By whom? How do different communities interpret these interactions? As with all human activities, interpreting online imprints engages layers of complexity.

1. What’s in a Retweet? Understanding our Data:

The same act can have multiple, even contradictory meanings. In many studies, for example, retweets or mentions are used as proxies for influence or agreement. This may hold in some contexts; however, there are many conceptual steps and implicit assumptions embedded in this analysis. It is clear that a retweet is information exposure and/or reaction; however, after that, its meaning could range from affirmation to denunciation to sarcasm to approval to disgust. In fact, many social media acts which are designed as “positive” interactions by the platform engineers, ranging from Retweets on Twitter to even “Likes” on Facebook, can carry a range of meanings, some quite negative.

Figure 2: Retweeted widely, but mostly in disgust

As an example, take the recent case of the Twitter account of fashion store @celebboutique. In July 2012, the account tweeted with glee that the word “#aurora” was trending and attributed this to the popularity of a dress named #aurora in its shop. The hashtag was trending, however, because Aurora, Colorado was the site of a movie theatre massacre on that day. There was an expansive backlash against @celebboutique’s crass and insensitive tweet. There were more than 200 mentions and many hundreds of retweets with angry messages in as little as sixty seconds. The tweet itself, too, was retweeted thousands of times (see Figure 2). After about an hour, the company realized its mistake and stepped in. This was followed by more condemnation: a few hundred mentions per minute at a minimum. (For more analysis, see Gilad, 2012.) Hence, without understanding the context, the spike in @celebboutique mentions could easily be misunderstood.

Polarized situations provide other examples of “negative retweets.” For example, during the Gezi protests in Turkey, the mayor of Ankara tweeted personally from his account, often until late hours of the night, engaging Gezi protesters individually in his idiosyncratic style, which involved the use of “ALL CAPS” and colorful language. He became highly visible among supporters as well as opponents of these protests. His visibility, combined with his style, meant that his tweets were widely retweeted, but not always by supporters. Gezi protesters would retweet his messages and then follow the retweet with a negative or mocking message. His messages were also retweeted without comment by people whose own Twitter timelines made clear that their intent was to “expose” or ridicule, rather than agree. A simple aggregation would find that thousands of people were retweeting his tweets, which might be interpreted as influence or agreement.

One of the most cited Twitter studies (Kwak et al.) grapples with how to measure influence, and asks whether the number of followers or the number of retweets is a better measure. That paper settles on retweets, stating that “The number of retweets for a certain tweet is a measure of the tweet’s popularity and in turn of the tweet writer’s popularity.” The paper then proceeds to rank users by the total number of retweets, and refers to this ranking alternatively as influence or popularity. Another important social media study, based on Twitter, speaks of in-degree (number of followers) as a user’s popularity, and retweets as influence (Cha et al., 2010). Both are excellent studies of retweet and following behavior, but in light of the factors discussed above, “influence” and “popularity” may not be the best terms to use for the variables they are measuring. Some portion of retweets and follows are, in fact, negative or mocking, and do not represent “influence” in the way it is ordinarily understood. The scale of such behavior remains an important, unanswered question (Freelon, 2014).

2. Engagement Invisible to Machines: Subtweets, Hate-Links, Screen Captures and Other Methods:

Social media users engage in practices that alter their visibility to machine algorithms, including subtweeting, discussing a person’s tweets via “screen captures,” and hate-linking. All these practices can blind big data analyses to this mode of activity and engagement.

Subtweeting is the practice of making a tweet referring to a person algorithmically invisible to that person, and consequently to automated data collection, even as the reference remains clear to those “in the know.” This manipulation of visibility can be achieved by referring to a person who has a Twitter handle without either “mentioning” this handle, or by inserting a space between the @
sign and the handle, or by using their regular name or a nickname rather than the handle, or even sometimes deliberately misspelling the name. In some cases, the reference can only be understood in context, as there is no mention of the target in any form. These many forms of subtweeting come with different implications for big data analytics.

For example, a controversial article by Egyptian-American Mona Eltahawy sparked a massive discussion in Egypt’s social media. In a valuable analysis of this discussion, sociologists Alex Hanna and Marc Smith extracted the tweets which mentioned her or linked to the article. Their network analysis revealed great polarization in the discussion, with two distinctly clustered groups. However, while watching this discussion online, I noticed that many of the high-profile bloggers and young Egyptian activists discussing the article, and greatly influencing the conversation, were indeed subtweeting. Later discussions with this community revealed that this was a deliberate choice that had been made because many people did not want to give Eltahawy “attention,” even as they wanted to discuss the topic and her work.

Figure 3: Two people “subtweeting” each other without mentioning names. The exchange was clear enough, however, to be reported in newspapers.

In another example drawn from my primary research on Turkey, Figure 3 shows a subtweet exchange between two prominent individuals that would be unintelligible to anyone who did not already follow the broader conversation and was not intimately familiar with the context. While each person is referring to the other, there are no names, nicknames, or handles. In addition, neither follows the other on Twitter. It is, however, clearly a direct engagement and conversation, if a negative one. A broad discussion of this “Twitter spat” on Turkish Twitter proved that people were aware of this as a two-way conversation. It was so well understood that it was even reported in newspapers.

While the true prevalence of this behavior is hard to establish, exactly because the activity is hidden from large-scale, machine-led analyses, observations of Turkish Twitter during the Gezi protests of June 2013 revealed that such subtweets were common. In order to get a sense of its scale, I undertook an online ethnography in December 2013, during which two hundred Twitter users from Turkey, assembled as a purposive sample including ordinary users as well as journalists and pundits, were followed for an hour at a time, totaling at least 10 hours of observation dedicated to catching subtweets. This resulted in a collection of 100 unmistakable subtweets; many more were undoubtedly missed because they are not always obvious to observers. In fact, the subtweets were widely understood and retweeted, which increases the importance of such practices. Overall, the practice appears common enough to be described as routine, at least in Turkey.

Figure 4: Algorithmically Invisible Engagement: A columnist responds to critics by screen captures.

Using screen captures rather than quotes is another practice that adds to the invisibility of engagement to algorithms. A “caps” is done when Twitter users reference each other’s tweets through screen captures rather than links, mentions or quotes. An example is shown in Figure 4. This practice is so widespread that a single hour following the same purposive sample resulted in more than 300 instances in which users employed such “caps.”

Yet another practice, colloquially known as “hate-linking,” limits the algorithmic visibility of engagement, although this one is potentially traceable. “Hate-linking” occurs when a user links to another user’s tweet rather than mentioning or quoting the user. This practice, too, would skew analyses based on mentions or retweets, though in this case, it is at least possible to look for such links.

Subtweeters, “caps” users, and hate-linkers are obviously a smaller community than tweeters as a whole. While it is unclear how widespread these practices truly are, studying Turkish Twitter shows that they are not uncommon, at least in that context. Other countries might have specific
social media practices that confound big data analytics in different ways. Overall, a simple “scraping” of Turkish Twitter might produce a polarized map of groups not talking to each other, whereas the reality is a polarized situation in which contentious groups are engaging each other but without the conventional means that make such conversations visible to algorithms and to researchers.

3. Limits of Methodological Analogies and Importing Network Methods from Other Fields:

Do social media networks operate through similar mechanisms as networks of airlines? Does information work the way germs do? Such questions are rarely explicitly addressed even though many papers import methods from other fields on the implicit assumption that the answer is a yes. Studies that do look at this question, such as Romero et al. (2011) and Lerman et al. (2010), are often limited to a single platform or a few platforms, which limits their explanatory power because information among humans does not diffuse in a single platform whereas viruses do, indeed, diffuse in a well-defined manner. To step back further, even representing social media interactions as a network requires a whole host of implicit and important assumptions that should be considered explicitly rather than assumed away (Butts, 2009).

Epidemiological or contagion-inspired analyses often treat connected edges in social media networks as if they were “neighbors” in physical proximity. In epidemiology, it is reasonable to treat “physical proximity” as a key variable, assuming that adjacent nodes are “susceptible” to disease transmission, for very good reason: the underlying model is a well-developed, empirically verified germ theory of disease in which small microbes travel in actual space (where distance matters) to infect the next person by entering their body. This physical process has well-understood properties, and underlying probabilities can often be calculated with precision.

Creating an analogy from social media interactions to physical proximity may be reasonable and justified under certain conditions and assumptions, but this step is rarely subjected to critical examination (Salathé et al., 2013). There are significant differences between germs and information traveling in social media networks. Adjacency in social media is multi-faceted; it cannot always be mapped to physical proximity; and human “nodes” are subject to information from a wide range of sources, not just those they are connected to in a particular social media platform. Finally, whether there is a straightforward relation between information exposure and the rate of “influence,” as there often is for exposure to a disease agent and the rate of infection, is something that should be empirically investigated, not assumed.

There are clearly similar dynamics in different types of networks, human and otherwise, and the different fields can learn much from each other. However, importation of methods needs to rely on more than some putative universal, context-independent property of networked interaction simply by virtue of the existence of a network.

4. Field Effects: Non-Network Interactions

Another difference between spatial or epidemiological networks and human social networks is that human social information flows do not occur only through node-to-node networks but also through field effects, large-scale societal events which impact a large group of actors contemporaneously. Big events, weather occurrences, etc. all have society-wide field effects and often do not diffuse solely through interpersonal interaction (although they also do greatly impact interpersonal interaction by affecting the agenda, mood and disposition of individuals).

For example, most studies agree that Egyptian social media networks played a role in the uprising which began in Egypt in January 2011 and were a key conduit of protest information (Lynch, 2012; Aday et al., 2012; Tufekci and Wilson, 2012). However, there was almost certainly another important information diffusion dynamic. The influence of the Tunisian revolution itself on the expectations of the Egyptian populace was a major turning point (Ghonim, 2012; Lynch, 2012). While analysis of networks in Egypt might not have revealed a major difference between the second and third week of January 2011, something major had changed in the field. To translate it into epidemiological language, due to the Tunisian revolution and the example it presented, all the “nodes” in the network had a different “susceptibility” and “receptivity” to information about an uprising. The downfall of the Tunisian president, which showed that even an enduring autocracy in the Middle East was susceptible to street protests, energized the opposition and changed the political calculation in Egypt. This information was diffused through multiple methods, and broadcast media played a key role. Thus, the communication of the Tunisia effect to the Egyptian network was not necessarily dependent on the network structure of social media.

Figure 5: Clear meaning only in context and time.

Social media itself is often incomprehensible without reference to field events outside it. For example, take the tweet in Figure 5. The tweet merely states: “Getting
crowded under that bus.” Strangely, it has been tweeted more than sixty times and favorited more than 50. For those following in real time, this was an obvious reference to New Jersey Governor Chris Christie’s press conference in which he blamed multiple aides for the closing of a bridge which caused massive traffic jams, allegedly to punish a mayor who did not endorse him. Without understanding the Chris Christie press conference, neither the tweet nor the many retweets of it are interpretable.

The turn to networks as a key metaphor in social sciences, while fruitful, should not diminish our attention to the multi-scale nature of human social interaction.

5. You Name It, Humans Will Game it: Reflexivity and Humans:

Unlike disease vectors or gases in a chamber, humans understand, evaluate and respond to the same metrics that big data researchers are measuring. For example, political activists, especially in countries such as Bahrain, where the unrest and repression have received less mainstream global media attention, often undertake deliberate attempts to make a hashtag “trend.” Indeed, “to trend” is increasingly used as a transitive verb among these users, as in: “let’s trend #Hungry4BH”. These efforts are not mere blind stabs at massive tweeting; they often display a fine-tuned understanding of Twitter’s trending topics algorithm, which, while not public, is widely understood through reverse engineering (Lotan, 2012). In coordinated campaigns, Bahrain activists have trended hashtags such as #100strikedays, #Bloodyf1, #KillingKhawaja, #StopTearGasBahrain, and #F1DontRaceBahrain, among others.

Figure 6: Ankara Mayor leads a hashtag campaign that will eventually trend worldwide. [Translation: Yes… I’m announcing our hashtag. #stoplyingCNN]

Campaigns to trend hashtags are not limited to grassroots activists. In Figure 6, drawn from my primary research in Turkey, you can see AKP’s Ankara mayor, an active figure in Turkish Twitter discussed before, announcing the hashtag that will be “trended”: #cnnislying. This was retweeted more than 4000 times. He had been announcing the campaign and had asked people to be in front of their devices at a set time; in these campaigns, the actual hashtag is often withheld until a pre-agreed time so as to produce a maximum spike which Twitter’s algorithm is sensitive to. That hashtag indeed trended worldwide. Similar coordinated campaigns are common in Turkey and occurred almost every day during the contentious protests of June 2013.

Such behaviors, aimed at avoiding detection, amplifying a signal, or other goals, by deliberate gaming of algorithms and metrics, should be expected in all analyses of human social media. Currently, many studies do take into account “gaming” behaviors such as spam and bots; however, coordinated or active attempts by actual people to alter metrics or results, which often can only be discovered through qualitative research, are rarely taken into account.

Conclusion: Practical Steps, a Call to Action

Social media big data is a powerful addition to the scientific toolkit. However, this emergent field needs to be placed on firmer methodological and conceptual footing. The meaning of social media imprints, the context of human communications, and the nature of socio-cultural interactions are multi-faceted and complex. People’s behavior differs in significant dimensions from other objects of network analyses such as germs or airline flights.

The challenges outlined in this paper are the results of systematic qualitative and quantitative inquiry over two years; however, this paper cannot be conclusive in identifying the scale and the scope of these issues. In fact, the challenge is exactly that such issues are resistant to the large-scale machine analyses that would allow us to pinpoint them exactly. While the challenges are thorny, there are many practical steps. These include:

1. Target non-social dependent variables. No data source is perfect, and every dataset is imperfect in its own ways. Most big data analyses remain within the confines of the dataset, with little means to probe validity. Seeking dependent variables outside big data, especially those for which there are other measures obtained using traditional, tested and fairly reliable methods, and looking at the convergent and divergent points, will provide much needed clarity on the strengths and weaknesses of these datasets. Such dependent variables could range from election results to unemployment numbers.

2. Qualitative pull-outs. Researchers can include a “qualitative pull-out” from their sample to examine variations in behavior. For example, what percent of retweets are “hate-retweets”? A small random subsample can provide a check. This may involve asking questions of people. For example: did the same people hear of X from TV as well as from Twitter? These qualitative pull-outs need not be huge to help interpretation.

3. Baseline panels. Establishing a panel to study people’s digital behavior, similar to panel studies in social sciences, could develop “baselines” and “guidelines” for the whole community. Data sought could include those in this paper.

4. Industry outreach. The field should solicit cooperation from the industry for data such as “denominators”, similar to Facebook’s recent release of what percent of a Facebook network sees status updates. Industry scientists who participate in the research community can be conduits.

5. Convergent answers and complementary methods. Multi-method, multi-platform analyses should be sought and rewarded. As things stand, these exist (Adar et al., 2007 or Kairam et al., 2013) but are rare. Whenever possible, social media big data studies should be paired with surveys, interviews, ethnographies, and other methods so that biases and shortcomings of each method can be used to balance each other to arrive at richer answers.

6. Multi-disciplinary teams. Scholars from fields where network methods are shared should cooperate to study the scope, differences and utility of common methods.

7. Methodological awareness in review. These issues should be incorporated into the review process and go beyond soliciting “limitations” sections.

A future study that recruited a panel of ordinary users, from multiple countries, and examined their behavior online and offline, and across multiple platforms to detect the frequency of behaviors outlined here, and those not detected yet, would be a path-breaking next step for understanding and grounding our social media big data.

References

Adar, Eytan, Daniel S. Weld, Brian N. Bershad, and Steven S. Gribble. 2007. “Why We Search: Visualizing and Predicting User Behavior.” In Proceedings of the 16th International Conference on World Wide Web, 161–70. WWW ’07. New York, NY, USA: ACM.
Aday, S.; Farrell, H.; Lynch, M.; Sides, J.; and Freelon, D. 2012. Blogs and Bullets II: New Media and Conflict after the Arab Spring. United States Institute of Peace.
Lynch, M.; Freelon, D.; and Aday, S. 2014. Syria’s Socially Mediated Civil War. United States Institute of Peace.
Allison, P.D. 2001. Missing data. Thousand Oaks, CA: SAGE.
Banko, M.; and Babaoğlan, A.R. 2013. Gezi Parkı Sürecine Dijital Vatandaş’ın Etkisi. Ali Riza Babaoglan.
Bernstein, M.S. 2013. Quantifying the Invisible Audience in Social Networks. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 21–30. New York: Association for Computing Machinery.
Bolker, J.A. 1995. Model systems in developmental biology. BioEssays 17(5): 451–455.
boyd, d., and Crawford, K. 2012. Critical Questions for Big Data. Information, Communication & Society 15(5): 662–679.
Butts, C.T. 2009. Revisiting the foundations of network analysis. Science 325(5939): 414.
Cha, M.; Haddadi, H.; Benevenuto, F.; and Gummadi, K. 2010. Measuring user influence in twitter: The million follower fallacy. ICWSM 10: 10–17.
Fields, S., and Johnston, M. 2005. Whither Model Organism Research? Science 307(5717): 1885–1886.
Freelon, D. 2014. On the interpretation of digital trace data in communication and social computing research. Journal of Broadcasting & Electronic Media 58(1): 59–75.
Geddes, B. 1990. How the Cases You Choose Affect the Answers You Get: Selection Bias in Comparative Politics. Pol. Analysis 2(1): 131–150.
Ghonim, W. 2012. Revolution 2.0: A Memoir and Call to Action. New York: Houghton Mifflin Harcourt.
Gilbert, S.F. 2001. Ecological Developmental Biology: Developmental Biology Meets the Real World. Developmental Biology 233(1): 1–12.
Golder, S.A. and Macy, M.W. 2011. Diurnal and seasonal mood vary with work, sleep, and daylength across diverse cultures. Science 333(6051): 1878–1881.
Hargittai, E. 2008. Whose Space? Differences Among Users and Non-Users of Social Network Sites. Journal of Computer-Mediated Communication 13(1): 276–297.
Jenner, R.A., and Wills, M.A. 2007. The choice of model organisms in evo–devo. Nature Reviews Genetics 8(4): 311–314.
Kairam, Sanjay Ram, Meredith Ringel Morris, Jaime Teevan, Dan Liebling, and Susan Dumais. 2013. “Towards Supporting Search over Trending Events with Social Media.” In Seventh International AAAI Conference on Weblogs and Social Media.
Kwak, Haewoon, Changhyun Lee, Hosung Park, and Sue Moon. 2010. “What Is Twitter, a Social Network or a News Media?” In Proceedings of the 19th International Conference on World Wide Web, 591–600. WWW ’10. New York, NY, USA: ACM.
Lazer, D.; Pentland, A.; Adamic, L.; et al. 2009. Computational Social Science. Science 323(5915): 721–723.
Lerman, Kristina, and Rumi Ghosh. 2010. “Information Contagion: An Empirical Study of the Spread of News on Digg and Twitter Social Networks.” ICWSM 10: 90–97.
Lotan, G. October 12, 2011. Data Reveals That “Occupying” Twitter Trending Topics is Harder than it Looks. SocialFlow. Accessed at: http://blog.socialflow.com/post/7120244374/data-reveals-that-occupying-twitter-trending-topics-is-harder-than-it-looks.
Lynch, M. 2012. The Arab uprising: the unfinished revolutions of the new Middle East. New York: PublicAffairs.
Mitchell, A., and Hitlin, P. March 4, 2014. Twitter Reaction to Events Often at Odds with Overall Public Opinion. Pew Research Center. Accessed at: http://www.pewresearch.org/2013/03/04/twitter-reaction-to-events-often-at-odds-with-overall-public-opinion/.
Meiman, Jon, and Jeff E. Freund. 2012. “Large Data Sets in Primary Care Research.” The Annals of Family Medicine 10(5): 473–74.
Outhwaite, W.; Turner, S.; Dunning, T.; and Freedman, D.A., eds. 2007. Modeling Selection Effects. In The SAGE handbook of social science methodology. Thousand Oaks, CA: SAGE Publications.
Pray, L., and Zhaurova, K. 2008. Barbara McClintock and the discovery of jumping genes (transposons). Nature Education 1(1): 169.
Romero, Daniel M., Brendan Meeder, and Jon Kleinberg. 2011. “Differences in the mechanics of information diffusion across topics: idioms, political hashtags, and complex contagion on twitter.” In Proceedings of the 20th International Conference on World Wide Web. ACM.
Salathé, Marcel, Duy Q. Vu, Shashank Khandelwal, and David R. Hunter. 2013. “The Dynamics of Health Behavior Sentiments on a Large Online Social Network.” EPJ Data Science 2(1): 4. doi:10.1140/epjds16.
Tufekci, Z. and Wilson, C. 2012. Social Media and the Decision to Participate in Political Protest: Observations From Tahrir Square. Journal of Communication 62(2): 363–379.
