Professional Documents
Culture Documents
Social Problems
Social Problems
O
n 3 November 1948, the day after tinos, women, urban residents” (9) whereas cess to platform-specific data, algorithms,
Harry Truman won the United States Pinterest is dominated by females, aged 25 to and resources) is creating a divided social
presidential elections, the Chicago 34, with an average annual household income media research community. Such research-
Tribune published one of the most of $100,000 (10). These sampling biases are ers, for example, can see a platform’s inner
famous erroneous headlines in rarely corrected for (if even acknowledged). workings and make accommodations, but
newspaper history: “Dewey Defeats Proprietary algorithms for public data. may not be able to reveal their corrections
Truman” (1, 2). The headline was informed Platform-specific sampling problems, for or the data used to generate their findings.
and capture human behavior according to Incomparability of methods and data. (so as to permit the calculation of a signifi-
behavioral norms that develop around and With few exceptions, the terms of usage for cance score within the study itself).
as a result of the specific platforms. For in- social media platforms forbid the retention
stance, the ways in which users view Twitter or sharing of data sets collected from their CONCLUSIONS. The biases and issues high-
as a space for political discourse affects how sites. As a result, canonical data sets for lighted above will not affect all research in
representative political content will be. The the evaluation and comparison of compu- the same way. Well-reasoned judgment on
challenge of accounting for platform-specific tational and statistical methods—common the part of authors, reviewers, and editors is
behavioral norms is compounded by their in many other fields—largely do not exist. warranted here. Many of the issues discussed
temporal nature: They change with shifts Furthermore, few researchers publish code have well-known solutions contributed by
in population composition, the rise and fall implementing their methods. The result is other fields such as epidemiology, statistics,
of other platforms, and current events (e.g., a culture in which new methods are intro- and machine learning. In some cases, the
revelations concerning interest and tracking duced (and often touted as being “better”) solutions are difficult to fit with practical
of social media platforms by intelligence ser- without having been directly compared to realities (e.g., as in the case of proper sig-
vices). In the absence of new methodologies, existing methods on a single data set. Given nificance testing) whereas in other cases the
we must rely on assessments of where such community simply has not broadly adopted
entanglements likely occur. best practices (e.g., independent data sets for
Distortion of human behavior. Develop- testing machine learning techniques) or the
ers of online social platforms are building There is “ … the need for existing solutions may be subject to biases
Published by AAAS
Social media for large studies of behavior
Derek Ruths and Jürgen Pfeffer
Science (ISSN 1095-9203) is published by the American Association for the Advancement of Science. 1200 New York Avenue NW,
Washington, DC 20005. The title Science is a registered trademark of AAAS.