Current Issues in Criminal Justice

Social Media Sentiment Analysis:
A New Empirical Tool for Assessing
Public Opinion on Crime?
Jeremy Prichard, * Paul Watters, † Tony Krone,‡ Caroline Spiranovic § and
Helen Cockburn **


‘Big data’ presents many interesting opportunities and challenges. This article focuses on
the potential use of social media sentiment analysis as a legitimate tool for criminological
research to better understand public perceptions of crime problems and public attitudes to
responses to crime. While a degree of scepticism should always apply to the use of
unsubstantiated sources on the internet, SMSA is likely to be a rich source of valuable
information. Observational SMSA research presents low-level risks in terms of human
research ethics principally because the information derived is unlikely to lead to the
identification of research subjects. It is arguable, but less certain, that material posted
publicly online does not attract a reasonable expectation of privacy for the author. However,
the strength of this argument may depend on the particular circumstances in which the
material to be analysed was posted.

Keywords: sentiment analysis – big data – criminological research – privacy –
attitudes to crime – research ethics

Sentiment analysis (or ‘opinion mining’) is the use of information technology to automatically
evaluate opinions expressed across multiple texts. The internet is a rich source of opinions —
in posts or comments on news and other websites, as well as many different social media
platforms such as Facebook, Twitter, blogs and message boards. On just one day in early June
2015 it was estimated that there were more than three billion internet users collectively using
almost a billion sites, sending more than 200 billion emails, making nearly four million blog
posts and sending more than 750 million Tweets, while almost 1.5 billion Facebook accounts were
active (Real Time Statistics Project 2015). These raw figures are incredible but are likely to
be inflated by what is effectively ‘junk’, such as spam.
Opinion mining is one way of manipulating part of the staggering amount of information
or ‘big data’ that modern information and communications technology generates (Moorthy et
al 2015). Opinion mining across online news media and social media is referred to as social
media sentiment analysis (‘SMSA’). A basic form uses natural language processing
techniques to extract binary sentiments on particular issues. SMSA may also involve more
nuanced techniques, such as clustering, to analyse related opinions or constructs that do not
fall neatly into binary categories (Layton et al 2013a).
This article discusses the use of SMSA to observe and record public commentary on the
internet that has not been solicited by the researcher (Veltri 2013). In terms of privacy and
human research ethics concerns, this is arguably the least intrusive research application of
SMSA.
There are other applications of SMSA in academic research beyond the scope of this
article. Each raises differing questions about privacy and ethics (Freeman Cook and Hoas
2013), including the effect of the active role taken by the researcher interacting with the
subjects of the research (Hesse-Biber and Griffin 2013). Studies observed can be categorised
into three types:
1. researcher use of a virtual space to engage others and elicit comments (Allen 2014);
2. researcher–participant interaction facilitated through social media (Curtis 2014);
3. researchers monitoring participants in clinical research studies (Glickman et al 2012).
In the technical literature, much attention has been given to refining SMSA to collate
macro-level, real-time indicators of public opinion. Feldman (2013) estimated that
information technology (‘IT’) researchers published over 7000 articles on SMSA. This effort
is, at least in part, driven by the demand for SMSA from governments (Gray and Gordo 2014),
the corporate sector (Zhang and Vos 2014) and in politics (Groshek and Al-Rawi 2013;
Hawthorne et al 2013).
SMSA techniques are established in fields as diverse as health (Christensen et al 2014),
product safety (Isah et al 2014; Shan et al 2014), crisis management (Johansson et al 2012),
economic development (Schroeder 2014) and education (Granitz and Koernig 2011). It is
clear that the vast array of platforms for the expression of opinion presents distinct
opportunities and challenges for research. For example, Hesse-Biber and Griffin (2013)
reviewed different research studies targeting particular interests, including an investigation of
social capital in online gaming communities, a study of hyperlinking (for network analysis)
on ‘living wage’ activist sites, and online social support groups on a parenting site.

SMSA and criminology

The internet and the rise of big data, particularly on platforms that transcend national
boundaries, raise many important theoretical and practical challenges for criminology that
involve the possible use or misuse of big data techniques such as SMSA. A complex mix of
concerns intersects the fields of surveillance, privacy, freedom of expression, intellectual
property, national security, law enforcement and public perceptions of crime and fear of
crime.
The use of data mining, including SMSA, for law enforcement and national security
purposes is already entrenched (McCue 2015). State surveillance online is possibly
considered as routine as the use of closed-circuit television (Sykora 2013) but with much
greater power, raising vital questions about potential abuse (Qin 2015; Clement 2014).
Social media has also become both process and record — as a platform for movements for
social change and as a repository of information documenting the events and processes of
change (Lewis et al 2011; Creech 2014; Byun and Hollander 2015). There are some notable
examples of an immediate interplay between social media and crime and responses to crime.
These include the negative role of some social media following the Boston Marathon bombing
in 2013 (Potts and Harrison 2013; Marx 2013), the viral Kony2012 campaign originating in
the United States that pushed for international action to arrest Joseph Kony, founder of the
Lord’s Resistance Army (Thomas et al 2015), and the unsuccessful campaign in Australia to
try to save Andrew Chan and Myuran Sukumaran from the death penalty in Indonesia
(Mayfield 2015).
Although SMSA represents a powerful tool for researching public attitudes to crime and
crime control, criminologists have paid little attention to its potential. This may reflect what
McQuade (2009) considers a bias in criminology towards social science research methods
with which practitioners are familiar and, thus, a lack of awareness or confidence in alternative
research methods (see also Savage and Burrows 2007).
Criminologists have a long-standing interest in exploring social attitudes to various aspects
of the criminal justice system (Indermaur and Roberts 2009). Research has focused on
attitudes to issues such as what is criminalised, how categories of criminals are perceived and
how police operate. The interaction between public attitudes to punitiveness and sentencing
law and practice has also been explored (Doob and Roberts 1983; Warner and Davis 2012).
There is particular value in work of this type in that it helps us to understand the interplay
between justice institutions, the public, the media and politics in shaping and reforming (or
ossifying) the criminal justice system (Pickett et al 2013).
This interdisciplinary article describes social science methods currently used to examine
public attitudes, noting their relative strengths and weaknesses. We explain how SMSA works
and is applied in different contexts. Disciplinary views on the meaning of ‘public attitudes’,
‘sentiment’ and ‘opinion’ are examined. Apparent advantages for researchers will be
identified, including very large sample sizes, low cost and speed. The quality and richness of
SMSA data is critically considered, particularly whether SMSA could be an efficient means
to examine democratised versions of media — such as blogging and social media — which
provide greater opportunities for a diverse set of interests within public debate (Meraz 2009).
Limitations of SMSA will be highlighted, not least of which is the fact that it excludes
non-internet users and hence probably underrepresents vulnerable groups. Finally, we discuss
SMSA and human research ethics and privacy issues. This article helps define ethical
boundaries for criminologists considering SMSA in stand-alone studies or in combination
with traditional social science methods.

Current methods for examining public attitudes

Public opinion may be gauged using any number of methods. The criminological literature
tends to rely on four main methods to assess public views on crime and justice matters: media
polls, representative surveys, focus groups and deliberative polls (see Gelb 2006 for a useful
overview of these approaches). Each of the four major methods has weaknesses and strengths.
Media polls
Simple media poll style questions such as ‘Are you in favour of x?’ are often cited by
politicians as evidence of the level of support of the public for a given crime and justice policy.
Media polls are relatively quick to run and inexpensive, and often generate large samples.
However, there are a number of disadvantages, which most notably include the fact that they
typically measure the views of a select and unrepresentative group of respondents (Gelb 2006)
and do not provide contextual information or choices and therefore encourage respondents to
provide what has been referred to as ‘mass opinion’ or uninformed ‘top-of-the-head’ opinion
(Green 2006; Yankelovich 2010). In other words, respondents may give flippant answers to
media polls or be constrained by the question asked, as in the controversial Roy Morgan
Research poll reported in January 2015 which indicated majority support for the death penalty
to be carried out in the cases of Chan and Sukumaran (Meade 2015). The poll found that
52 per cent of those surveyed said ‘yes’ to the question, ‘In your opinion if an Australian is
convicted of drug trafficking in another country & sentenced to death, should the penalty be
carried out?’ (Roy Morgan Research 2015).
Representative surveys
Many criminologists use representative surveys to gauge public opinion (for example, see
Gelb 2006:12). These surveys employ representative samples and typically ask a variety of
questions, rather than a single question. This enables researchers to gain a better
understanding of individual perspectives and the impact of variables, such as demographic
differences, across a sample. A closed choice response format makes it possible to generate
quantitative data on public opinion that can be readily summarised.
Representative surveys may be administered via telephone or via face-to-face interviews
and, now less commonly, via paper-based postal surveys. Compared with media polls,
representative surveys are relatively expensive to run, particularly for face-to-face interviews,
but they often provide more detailed information. Face-to-face interviews are a particularly
good choice when sensitive information is being elicited from respondents; telephone
administered surveys can include respondents from rural and remote regions. However,
telephone surveys relying on landline numbers will not capture the views of mobile-only
users, who are most likely aged 18 to 25 (see Gelb 2006 for a discussion of these and other
issues). Although representative surveys are generally considered superior to media polls,
survey design, item wording and response options may also determine the quality of responses
obtained. At worst, representative surveys may only crudely gauge mass opinion. However,
where relevant contextual information accompanies a choice of responses, the quality of
responses is greatly improved (Varma and Marinos 2013).
The predetermined nature of questions and the typical forced-choice format of
representative surveys gives rise to criticism that the results miss nuanced and complex views
(Gelb 2006) and lack richness in detail (Stobbs, Mackenzie and Gelb 2014). They may also
fail to gauge informed opinion that is well-considered, stable, consistent and relatively
enduring (Price and Neijins 1998; Yankelovich 2010).

Deliberative processes
As summarised by Indermaur and colleagues (2012), scholarly literature has identified a number
of prerequisites of informed opinion including information, responsibility taking and
deliberation (see, for example, Price and Neijins 1998). Information refers to the fact that
respondents require a certain level of knowledge and must be provided with relevant
contextual information in order to arrive at an informed opinion. Responsibility taking refers
to respondents feeling some personal investment or responsibility for their answers.
Deliberation requires an in-depth consideration of the available information and choices
available and the pros and cons of these choices before reaching a decision. The process of
deliberation has been described as a social process whereby individuals discuss their views
with others and must consider the alternative views of others (Yankelovich 2010). Adopting
this strict definition of ‘informed opinion’ would mean that even well-designed and well-
worded representative surveys cannot tap into informed opinions because respondents are not
able to deliberate with others when answering.
Focus groups
Due to these and other weaknesses of representative surveys, some criminologists prefer to
use focus groups to gauge public opinion on crime and justice issues (Gelb 2006:16). Focus
groups usually involve small groups of respondents brought together to discuss a particular
issue(s) and a facilitator who ensures that discussions stay on topic and necessary issues are
covered. The samples generated from focus group studies tend not to be representative of the
population as a whole as the numbers participating are generally small and self-selection
biases may determine who is willing to participate in this more time-intensive method.
Focus groups also tend to generate qualitative, as opposed to quantitative, data. However,
it has been argued that this approach provides richer data than media polls or representative
surveys, as participants can explain and qualify their views in more detail and are encouraged
to think about the issues more deeply by discussing them with others (Gelb 2006; Stobbs et
al 2014). In this sense, focus groups may better tap into informed opinions at least with respect
to the deliberation component. The extent to which respondents are informed and encouraged
to take responsibility largely depends on the design of the study, including the information
and instructions provided to respondents, and respondents’ understanding of the implications
of the study for criminal justice policy.

Mixed methods
Due to the strengths and weaknesses of these approaches, many researchers gauging public
opinion towards crime and justice issues advocate the use of mixed-methods approaches
involving both representative surveys and focus groups. The rich data obtained from focus
groups is said to complement and supplement the information obtained from representative
surveys. Mixed methods have also been used in juror studies (see, for example, Warner and
Davis 2012) investigating attitudes to sentencing using both surveys and semi-structured
interviews to provide a richly textured understanding of the attitudes of ordinary people
presented with legally admissible material relevant to sentencing of individual offenders
(Warner and Davis 2012; Gwin 2010).
Deliberative polls
Deliberative polls combine the key features of representative surveys and focus groups and
capitalise on the strengths of these methods. Deliberative polls essentially involve the use of
mixed methods in a pre-test–post-test design. A large representative survey of public opinion
is first conducted. A large sub-sample (often in excess of n=500) of respondents is then
invited to join in a day- or weekend-long session involving small group deliberations with
other members of the public and experts. Experts may include researchers, criminal justice
professionals, such as judges, and even offenders and victims. The dialogue between experts
and the public is described as ‘two-way’. The views of participants are gauged typically in
relation to a single major policy issue using more open-ended responses. Finally, the initial
survey is readministered to respondents. The pre-test–post-test survey design helps to
demonstrate any possible shift in views from initial to informed opinion. Sturgis, Roberts and
Allum (2005) provide a useful overview of the deliberative polls method and Hartz-Karp et
al (2010) outline a case study of a deliberative forum.
Due to the level of information provided, as well as the opportunities for deliberation and
deep reflection offered, deliberative polls appear to be a superior method of gauging informed
opinion when compared with representative surveys or focus groups used in isolation (Green
2006; Price and Neijins 1998). However, whether deliberative polls do actually gauge
informed opinions has been questioned, as some research shows that the attitudes garnered in
a deliberative poll may be relatively inconsistent with other values and views held by the
individual (Sturgis, Roberts and Allum 2005). Consistency in views is said to be a hallmark
of informed opinion (Yankelovich 2010). Furthermore, a notable limitation of deliberative
polls is the time and costs involved in conducting them. Thus, although they may be a superior
method, deliberative polls are rarely used. It has also been noted that deliberative polls can
only gather informed views on one particular policy issue and thus the results obtained from
this approach may be of little use to researchers who are interested in exploring public opinion
on a number of issues or in determining the relationship between opinions on differing issues
(Gelb 2006).

How social media sentiment analysis works

The computational analysis of sentiments and opinions has a simple goal: to summarise
publicly expressed thoughts, beliefs and arguments in social media. SMSA falls into the
category of ‘big data’, since it must deal with velocity, variety and volume of data (McAfee
and Brynjolfsson 2012):
• velocity, since opinions posted as messages on news websites, social media sites or
short messaging services appear instantaneously, creating novel phenomena such as
‘trending topics’ or ‘going viral’;
• variety, because these opinions can be expressed using natural language, graphics,
emoticons, voice clips and videos, and other types of user-generated content; and
• volume, because a globally connected user base of billions of people contributes
opinions on all manner of topics in a variety of forums every day.
Each of these dimensions poses its own technical challenges, and some are more easily
solved than others. The capacity of systems to deal with volume and velocity is a function of
Moore’s law (Moore 1988), which predicts that the number of transistors that can be packed
into an integrated circuit doubles every two years; this increase enables computer systems and
networks to process and transmit data at an ever-faster rate, from more and more users. The
fundamental limitation of these systems is in developing computational intelligence that can
accurately map subjective opinions, nuanced arguments, strongly (or weakly) held attitudes,
mediated through a range of emotional states, and more or less coherently expressed
sentiments into a simple, quantitative statement, such as ‘90% of respondents agree that sex
offenders deserve life in jail’.
A recent Australian online newspaper article proposed increases to sentences for child sex
offences. In this example, a journalist wrote a short news story covering a proposal to change
the law, and 46 users responded with their own opinions. The responses range in length from
one or two words (‘good’ or ‘great idea’), to 293 words. Other responses include both natural
language, as well as links to tweets and images. The opinions range from ‘kill everyone before
they commit crime’, and ‘physical castration’, through to crime prevention and rehabilitation.
Many responses contain spelling or grammatical errors. To reduce this complex set of data to
one or more statements expressing sentiment, accompanied by a frequency analysis, a
significant amount of natural language processing and information retrieval is required.
Some approaches to opinion mining attempt to circumvent the information retrieval
problem by forcing users to provide quantitative ratings against qualitative descriptors. For
example, allows users to rank products from one to five stars and to leave a
comment or write a review. Similarly, TripAdvisor provides an equivalent five-point scale for
hotel reviews. Yet these kinds of scales do not represent the range of opinion, emotion or
attitudes that might be revealed from a computational analysis of text; indeed, sometimes the
quantitative ratings are not consistent with the qualitative reviews, or with external standards.
A user may rate an externally rated three-star hotel with five stars, since the experience met
his or her expectations, but this does not mean that the hotel is actually ‘5-star’ (Layton et al
2013c). To some extent, this reflects the subjective nature of sentiments, rather than more
fact-based schemes; for example, to achieve an extra star rating, a hotel may simply have to
install a pool, rather than meet the subjectively identified needs of its patrons.
In describing the development of sentiment analysis, Pang and Lee (2008) note the range
of data sources first able to be mined, beginning with e-commerce sites, review sites and
blogs. With Web 2.0, this extended to social media including tweets, Facebook and LinkedIn.
Common constraints apply to the computational processing required to identify and extract
sentiment from these newer sources.
An additional problem is that short message services like Twitter provide very little textual
material to process. Returning to the child sex offender story, a single comment like ‘Good’
is ambiguous, since the subject must be inferred from the story. Is it ‘good’ that proposed
sentences are longer or was there some other aspect of the story or comments made that was
‘good’? A reader may be able to infer a sequence within the discussion forum threads, but it
is not always the case that users will reply in the most ‘logical’ place, and an automated
technique for analysing opinion may struggle without a clearly defined context. These types
of ambiguity continue to make SMSA a challenge. For example, Bartlett and Norrie (2015)
describe a study of public attitudes towards immigration which was initially based on
automated ‘natural language processing’ analysis of Twitter feeds. The authors found it
necessary to include manual analysis to determine the direction of sentiments (whether
positive, negative or neutral).
Most approaches to SMSA need three components to operate: a model for representing
text to perform computations on it; an algorithm for identifying and measuring sentiment; and
a reporting system.
Representational models
The most common approach to natural language processing is to use a vector representation,
or a ‘bag of words’ approach, which is described in detail by Perone (2011). In a bag of words,
each document, such as a comment on a news story, is coded with the frequency of term
occurrence, where each unique term is coded as a dictionary entry. Coding as a dictionary
entry means creating a data dictionary of the unique terms across all of the documents and
numbering each unique term, starting at 1. So ‘crime’ is term 1, ‘to’ is term 2, and so on. The
order of terms is not considered by most algorithms. Thus, if we take two or more documents
(from our newspaper opinion example above), such as:
Opinion 1: ‘crime to come to the attention of the police’ and
Opinion 2: ‘get tough on crime’
we can construct a dictionary thus:
‘crime’: 1,
‘to’: 2,
‘come’: 3,
‘the’: 4,
‘attention’: 5,
‘of’: 6,
‘police’: 7,
‘get’: 8,
‘tough’: 9,
‘on’: 10,
which has 10 distinct terms. We then create a vector space representation of the terms in each
document:
Opinion 1: [1, 2, 1, 2, 1, 1, 1, 0, 0, 0]
Opinion 2: [1, 0, 0, 0, 0, 0, 0, 1, 1, 1]
Reading the first vector, which corresponds to the first document, from left to right, it
means that there is one instance of the term ‘crime’, two of the term ‘to’, one of the term
‘come’, two of the term ‘the’, and so on. The term ‘crime’ appears in each document, so the
frequency count shown here is ‘1’ for each vector. For the terms ‘to’ and ‘the’, the frequency
count for the first document is ‘2’, but since the terms do not appear in the second document, the
frequency count there is ‘0’. This is an example only and this sort of analysis is obviously unlikely
to be meaningful with a small number of documents.
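The construction above can be sketched in a few lines of Python. This is a minimal illustration only: the function names build_dictionary and to_vector are illustrative rather than taken from any library, and splitting on whitespace stands in for proper tokenisation.

```python
from collections import Counter


def build_dictionary(documents):
    """Number each unique term in order of first appearance, starting at 1."""
    dictionary = {}
    for doc in documents:
        for term in doc.lower().split():
            if term not in dictionary:
                dictionary[term] = len(dictionary) + 1
    return dictionary


def to_vector(document, dictionary):
    """Return the term-frequency vector of one document over the dictionary."""
    counts = Counter(document.lower().split())
    # Sort terms by dictionary index so that positions line up across documents.
    ordered_terms = sorted(dictionary, key=dictionary.get)
    return [counts[term] for term in ordered_terms]


opinions = [
    "crime to come to the attention of the police",
    "get tough on crime",
]
dictionary = build_dictionary(opinions)
vectors = [to_vector(doc, dictionary) for doc in opinions]

print(vectors[0])  # [1, 2, 1, 2, 1, 1, 1, 0, 0, 0]
print(vectors[1])  # [1, 0, 0, 0, 0, 0, 0, 1, 1, 1]
```

Because word order is discarded, any two documents containing the same terms with the same frequencies produce identical vectors.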
While the frequency count is critical to determining the relevance of a certain term to a
particular document, this can also be offset by weighting each term against its frequency in
natural language at large. Schemes such as Term Frequency-Inverse Document Frequency
(‘TF-IDF’) operate on this principle. High-frequency words such as ‘to’ and ‘the’ can also be
removed by creating a stoplist, since they do not help in computationally extracting
meaning from documents (Wu et al 2008). Standard natural language processing technologies
can be applied to improve the quality of the vectors: verbs can be stemmed to ensure that their
inflected forms are not counted as separate features, and misspelled words could be identified
and counted within the frequencies for the correctly spelled word.
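As a rough sketch of the weighting and stoplist ideas, the following Python computes one common TF-IDF variant, tf × log(N / df), over the two sample opinions. The stoplist contents and the function names are assumptions made for illustration, and here document frequency is estimated from the sample itself rather than from natural language at large.

```python
import math
from collections import Counter

# Illustrative stoplist only; a real study would use a much longer list.
STOPLIST = frozenset({"to", "the", "of", "on"})


def tokenize(text, stoplist=frozenset()):
    """Lower-case, split on whitespace and drop any stoplisted terms."""
    return [t for t in text.lower().split() if t not in stoplist]


def tf_idf(documents, stoplist=frozenset()):
    """Return one {term: weight} mapping per document, using tf * log(N / df)."""
    tokenized = [tokenize(doc, stoplist) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: the number of documents in which each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


opinions = [
    "crime to come to the attention of the police",
    "get tough on crime",
]
weighted = tf_idf(opinions, STOPLIST)
```

In this tiny sample ‘crime’ receives a weight of zero because it occurs in every document and so does nothing to distinguish them. Without the stoplist, ‘to’ and ‘the’ would keep a high weight, since they happen to occur in only one of the two documents; down-weighting by TF-IDF alone becomes effective only with a much larger collection.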
Once feature vectors of this kind have been developed, they can act as input for various
learning algorithms that could be used to measure sentiment. The approach is similar to
spam classification for electronic mail, where a message containing more terms from
the ‘spam’ set of terms than from the non-spam ‘ham’ set is more likely to be classed as
spam. In the simplest case of sentiment analysis, such as a proposition to increase jail terms
for sex offenders, it should be possible to separate documents into two separate groups
(for/against) using a binary classifier, such as Bayes’ algorithm. If sufficiently large
representative samples are obtained for each class, this kind of probabilistic classifier can
produce highly accurate results. It may also be possible to improve the classification results
by using a form of semi-supervised learning, such that a human judge can provide feedback
on the judgments made by a supervised algorithm (Goldberg and Zhu 2006).
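One way to read ‘Bayes’ algorithm’ here is as a multinomial naive Bayes classifier, and the sketch below takes that reading as an assumption. The training comments are invented for illustration and are not drawn from the newspaper example; add-one smoothing is likewise a choice made for the sketch.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes over whitespace-separated terms."""

    def __init__(self):
        self.term_counts = defaultdict(Counter)  # class -> term frequencies
        self.doc_counts = Counter()              # class -> number of training documents
        self.vocabulary = set()

    def train(self, labelled_docs):
        for text, label in labelled_docs:
            tokens = text.lower().split()
            self.term_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocabulary.update(tokens)

    def log_scores(self, text):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label in self.doc_counts:
            score = math.log(self.doc_counts[label] / total_docs)  # log prior
            total_terms = sum(self.term_counts[label].values())
            for token in text.lower().split():
                # Add-one smoothing, so an unseen term does not zero out a class.
                count = self.term_counts[label][token]
                score += math.log((count + 1) / (total_terms + vocab_size))
            scores[label] = score
        return scores

    def classify(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical hand-labelled comments, for illustration only.
training = [
    ("get tough on crime", "for"),
    ("longer sentences protect children", "for"),
    ("lock them up for life", "for"),
    ("prison does not rehabilitate", "against"),
    ("spend the money on prevention and rehabilitation", "against"),
    ("longer sentences do not deter", "against"),
]
model = NaiveBayes()
model.train(training)
label = model.classify("tough sentences protect children")
```

With this toy training set the unseen comment is classed as ‘for’. Six training documents are far too few for reliable results; the point is only to show how term frequencies per class drive the decision.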
To automatically identify which group is associated with each side of a proposition, it is
necessary to match cases to keywords that are typically ‘for’ the proposition and to those that
are typically ‘against’.
This could be achieved by using data gathered from human judges (Pang et al 2002), or by
using a set of hypernyms extracted from a semantic database like WordNet (Baccianella et al
2010). For an exploration of concepts relevant to determining meaning in social media text,
see Lomborg (2015).
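Keyword matching of this kind can be sketched as a simple lexicon lookup. Both keyword sets below are hypothetical; in practice they would come from human judges or from a resource built on a semantic database, as described above.

```python
# Hypothetical keyword sets, for illustration only.
FOR_TERMS = {"good", "great", "tough", "longer", "deserve"}
AGAINST_TERMS = {"rehabilitation", "prevention", "harsh", "unfair"}


def lexicon_sentiment(text):
    """Label a comment by counting 'for' keywords against 'against' keywords."""
    tokens = text.lower().split()
    score = sum(t in FOR_TERMS for t in tokens) - sum(t in AGAINST_TERMS for t in tokens)
    if score > 0:
        return "for"
    if score < 0:
        return "against"
    return "neutral"


print(lexicon_sentiment("great idea"))                              # for
print(lexicon_sentiment("focus on prevention and rehabilitation"))  # against
```

A lookup like this ignores negation, sarcasm and context, so a comment such as ‘not good’ would be mislabelled. This is one reason the learning algorithms discussed above are generally preferred.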
The easiest propositions to test for sentiment are those that are polarising and likely to fall
into two separate camps. As is apparent from the two sample vectors above, there is not a lot
of overlap. If this pattern were repeated at large scales, with many respondents, separating out
the terms associated with each argument (good/bad, for/against etc) should be relatively easy.
One aspect of sentiment analysis that makes it more complicated than email filtering is that
the classes to be identified may not be known a priori. It is not the case that posters
in the online article referred to above only had two opinions; the issues raised were
multifaceted and complex, so multiclass classification may be necessary.
In the simple example above, the data was drawn from posts on a single news article. To
investigate sentiments more broadly, it may be necessary to integrate raw data sampled from
a range of sources, which is technically relatively easy to achieve. Any data that can eventually
be represented as a case, using the bag of words model, can be analysed for sentiment. Many
social media applications provide Application Programming Interfaces (‘APIs’) that make it
easy to search for, identify and download relevant data. A range of data interchange formats
is widely in use, including the eXtensible Markup Language (‘XML’), and the so-called
‘semantic web’ technologies for representing and reasoning about web data (including the
Resource Description Framework). Each API will have its own formats and available
services; Google, for example, has a set of APIs that allows data to be searched for and
integrated across web, mail and geographic data sets. However, there may be proprietary
barriers to accessing data in bulk and many services limit the rate at which data can be
downloaded, so that competitors cannot simply create a ‘carbon copy’ of all of the company’s
data; Twitter, for example, limits search rates to between 15 and 180 requests per 15 minutes
(Twitter 2015). When services place time or capacity limits on data downloads, this can
significantly lengthen the data acquisition phase of the study — a data retrieval task that might
take ten minutes ordinarily may take 24 hours if delays are introduced. Depending on the
study design, it may be helpful to pool all data together into a single dataset, or at least retain
the source, so that comparisons could be made between different providers (Facebook,
Twitter) or modalities (news commentary, social media).
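The effect of a rate limit on the data acquisition phase can be estimated with simple arithmetic. The sketch below assumes a fixed-window limit whose allowance resets at the end of each window; the figure of 1500 requests is hypothetical, and the function name is illustrative.

```python
import math


def minimum_wait_minutes(total_requests, requests_per_window, window_minutes=15):
    """Lower bound on the waiting time imposed by a fixed-window rate limit.

    Assumes a full allowance is available at the start and that the allowance
    resets at the end of every window.
    """
    windows_needed = math.ceil(total_requests / requests_per_window)
    # Requests in the final window can be sent as soon as it opens, so only
    # the windows before it have to be waited out in full.
    return max(windows_needed - 1, 0) * window_minutes


# A hypothetical study needing 1500 search requests:
print(minimum_wait_minutes(1500, 15))   # 1485 minutes, just under 25 hours
print(minimum_wait_minutes(1500, 180))  # 120 minutes
```

The same job therefore takes about a day at the lower limit and two hours at the higher one, before any processing time is counted.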
Researchers intending to use sentiment analysis are faced with a range of practical
considerations. Sample sizes required depend entirely on the classification algorithms being
used and the application at hand. An example is the sentiment analysis of H1N1 tweets to
predict the spread of the virus. In this case, a maximum of 600 tweets per day over nine days
was sufficient to achieve a high level of predictability (Chew and Eysenbach 2010). The cost
of implementing a system for undertaking sentiment analysis will depend on: whether
commercial or open source software is used; whether API access to data sources is free; the
scale of the data to be extracted; whether custom APIs or screen-scraping software need to be
developed; and the not-insignificant hardware costs for storing and processing data. The
expertise required to implement these systems includes natural language engineering skills,
data integration knowledge, and experience with various machine learning algorithms.
Researchers with these IT skill sets would exist at many universities in countries like Australia
and New Zealand. However, for their skills to effectively address criminal justice-related
research questions, clearly they would need to collaborate with criminologists.
As a note of caution, the accuracy of even some of the best techniques is far from perfect.
For example, Agarwal et al (2011) used a completely automated model of SMSA. They
undertook binary opinion mining of a large Twitter corpus, and found accuracy ranged
from 71.35 to 75.39 per cent using various sorts of SMSA algorithms, including unigram,
tree kernel, senti-features and combinations of these. Standard deviation for test accuracy
ranged between 0.65 and 1.95. Given that chance level accuracy would be 50 per cent, it
seems that current iterations of 100 per cent automated SMSA involve an unacceptable risk of
error. Where statistical analyses were concerned, this could translate into Type 1 and 2 errors
(erroneous acceptance or erroneous rejections of hypotheses). Consequently, those interested
in investigating SMSA for criminological research are — at least for the foreseeable future
— likely to want to include the sorts of human judgment and supervision employed by
Goldberg and Zhu (2006). Perhaps these results suggest that a certain level of automation may
be desirable, and may reduce the human effort required by about 50 per cent, but, ultimately,
human assessment is required for greatest reliability.
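To make the comparison with chance concrete, the following is a minimal sketch in Python and not the procedure used by Agarwal et al (2011). It assumes a hypothetical sample of posts that has been labelled both by an automated classifier and by human judges; the function names and labels are illustrative only. It computes the share of agreement with the human labels and, under a normal approximation, how far that share sits above a 50 per cent chance baseline.

```python
import math


def accuracy(predicted, human):
    """Share of items where the automated label matches the human label."""
    if len(predicted) != len(human) or not human:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(p == h for p, h in zip(predicted, human))
    return matches / len(human)


def z_against_chance(acc, n, chance=0.5):
    """z-score of an observed accuracy against a chance baseline.

    Uses the normal approximation to the binomial, which is an
    assumption that is only reasonable for fairly large n.
    """
    standard_error = math.sqrt(chance * (1 - chance) / n)
    return (acc - chance) / standard_error


if __name__ == "__main__":
    # Hypothetical labels for four posts (illustration only).
    machine = ["negative", "positive", "negative", "positive"]
    judges = ["negative", "positive", "positive", "positive"]
    acc = accuracy(machine, judges)
    print(acc, z_against_chance(acc, len(judges)))
```

A check of this kind can show that a classifier performs well above chance while still misclassifying roughly one post in four, which is the practical concern raised above.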

SMSA for criminological research

For many criminologists, embarking on a SMSA study would first require establishing new
relationships with academics from IT disciplines (Shneiderman et al 2011). The research team
would need to clearly understand both the technical requirements to achieve optimum
accuracy with SMSA, and the shortcomings of SMSA. Chief among the limitations of SMSA
is that the method almost certainly underrepresents the opinions of vulnerable groups who,
for a wide variety of reasons (for example, homelessness, incarceration, mental illness,
physical illness, physical disability, and illiteracy) may not be able to use or access the internet
(Wilkerson et al 2014; Grace 2014). This article suggests that even among frequent internet
users, SMSA is likely to over-represent the views of those labelled by Prensky (2001) as
‘digital natives’, typically younger groups who are heavy users of social media and the Web
2.0 in general. This means that ‘digital immigrants’ (Prensky 2001) — like most of the authors
of this article — are less likely to be heard through a SMSA method because of their
comparatively lower use of social media. Reporting the results of SMSA studies in
criminological journals may also present hurdles. The novelty and technical dimension of the
findings may be difficult for journal editors and peer reviewers to assess. For a discussion of
the sorts of disciplinary challenges that big data (like SMSA) has presented to empirical sociology,
see Savage and Burrows (2007).
Notwithstanding these complexities and challenges, this article suggests that SMSA is a
promising method for gauging public opinion either alone or in combination with traditional
empirical approaches. Certain strengths of SMSA ought to be considered from the empirical
perspective. First, after establishing new collaborations and implementing and refining SMSA
methods, research teams would have a tool that could be used efficiently and frequently. This
would be ideal, for example, to use cross-sectional repeated measures to track public opinion
on a particular topic over time. Second, although this article has highlighted how error can
operate within SMSA, the traditional methods are themselves not protected from human error.
For instance, a researcher’s handwritten interview notes may capture some of the sentiment
expressed by a participant, but miss other points conveyed. Additional errors may be made
when typing the notes into an electronic format, coding the qualitative data, or cleaning the
data in preparation for analysis (McCrady et al 2010). Third, and perhaps most strikingly,
SMSA sample sizes can be very large indeed, as discussed above – many hundreds of
thousands of people. Fourth, unlike traditional methods of studying public opinion, SMSA
does not recruit participants. It only analyses what participants express in public settings
online. This means that SMSA limits some of the selection effects capable of biasing results
in traditional methods. For example, for practical reasons, recruitment for traditional studies
may be limited to certain geographical areas. Alternatively, participation in a study may be
inconvenient for a class of people because of work, leisure or family commitments — despite
the fact that they fall within a study’s target population.
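As an illustration of the repeated cross-sectional tracking mentioned above, the following Python sketch groups already-classified posts by calendar month and reports the share labelled negative in each month. It assumes hypothetical input in the form of (date, label) pairs produced by some earlier classification step; the function name and the label 'negative' are illustrative and not drawn from any particular SMSA system.

```python
from collections import defaultdict
from datetime import date


def negative_share_by_month(posts):
    """Return {(year, month): share of posts labelled 'negative'}.

    `posts` is an iterable of (posted_on, label) pairs, where
    `posted_on` is a datetime.date and `label` is a string.
    """
    totals = defaultdict(int)
    negatives = defaultdict(int)
    for posted_on, label in posts:
        key = (posted_on.year, posted_on.month)
        totals[key] += 1
        if label == "negative":
            negatives[key] += 1
    # Sort the keys so that the months come out in chronological order.
    return {key: negatives[key] / totals[key] for key in sorted(totals)}


if __name__ == "__main__":
    # Hypothetical classified posts (illustration only).
    sample = [
        (date(2015, 1, 5), "negative"),
        (date(2015, 1, 20), "positive"),
        (date(2015, 2, 3), "negative"),
    ]
    for month, share in negative_share_by_month(sample).items():
        print(month, share)
```

Because each month is a fresh cross-section of whoever happened to post, shifts in the monthly share may reflect changes in who is posting as well as changes in opinion.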
Finally, participation in empirical research can itself influence participants’ behaviour in
different ways — a phenomenon that is sometimes called the ‘observer effect’. Among other
things, participant responses can be affected by their desire to be seen in a positive light by
the researcher, particularly in face-to-face interviews (Krumpal 2013). This suggests that
another potential value of SMSA data is that it removes researchers from the environment
under analysis. It is likely that if individual concerns about ‘social desirability’ (Krumpal
2013:2026) affect behaviour in empirical interviews, then social desirability probably also
influences online behaviour. However, arguably social desirability loses potency when
internet users feel anonymous. The perception of anonymity is considered a powerful factor
in criminal decision-making (Clarke 2008), including serious online crimes (Wortley and
Smallbone 2012) and engaging in other forms of deviant behaviour (Demetriou and Silke
2003). The implication for criminologists is that SMSA may have particular advantages in
capturing honest but extreme views on contentious criminal justice issues that would not be
expressed in other forums.

Privacy issues and research ethics

There is no doubt that the widespread use of social media raises interesting questions
concerning the distinction between public and private life (Papathanassopoulos 2015; Waltorp
2013). Equally, there are legitimate concerns about the reliability and accuracy of what is
posted, whether posts are created by human actors or automated software, and about the intent
of those posting material. There is a risk that some forms of SMSA may impinge on privacy
concerns as:

• some techniques may identify individuals by gathering data, such as names, images,
dates of birth or addresses. Individuals who post under pseudonyms may
inadvertently reveal information about themselves. There have been numerous cases
of individuals posting opinions on social media whose employment has been
terminated for failing to adhere to their employer’s social media policies (Berkelaar
2014; Jacobson and Tufts 2013; Moussa 2015; O’Connor and Schmidt 2015; Van
Iddekinge 2013; West and Bowman 2014);
• sometimes opinions given in restricted circumstances may inadvertently be leaked.
Tagging a friend in Facebook posts, for example, may make these opinions available
to friends of friends. It is not clear that users always understand the implications of
opinion leakage;
• open source intelligence algorithms also make it possible to match, with 90 per cent
accuracy, text being composed by the same individual using different aliases or
pseudonyms (Layton et al 2013b). However, when this step is taken alone, the
identity of the person using those aliases is not revealed.
Privacy laws in Australia such as the Privacy Act 1988 (Cth) currently have a narrow scope,
being ‘concerned with the security of personal information held by certain entities, rather than
with privacy more generally’ (ALRC 2014:46). The Australian Law Reform Commission
(‘ALRC’) recommended a new tort of invasion of privacy with two limbs: intrusion into a
reasonable expectation of privacy; and misuse of private information with a test that ‘the
invasion of privacy must be committed intentionally or recklessly, must be found to be
serious, and must not be justified by broader public interest considerations, such as freedom
of speech’ (ALRC 2014:78). Importantly, the ALRC noted that the terms under which a
person posts material to the internet are usually determined by the End User Licence Agreement
set by the website administrator and agreed to as a condition of use. A comprehensive review
of these agreements showed widely varying practices that are unlikely to be fully appreciated
by users (MacGibbon and Phair 2013).
Most internet users included in a SMSA study are at very low risk of being identified by
algorithms designed for the limited purpose of analysing public opinion. Importantly, SMSA
can be designed to explicitly exclude identifying information from the data collection, or the
risk of inadvertent identification can be reduced by cloaking the results when reported.
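One way of excluding identifying information at the point of collection can be sketched as follows. This is a minimal Python illustration under stated assumptions, not a complete de-identification method: the regular expressions, the placeholder tokens and the function names are hypothetical, pattern matching of this kind will miss names and other identifiers written in free text, and a salted hash is a pseudonym rather than a guarantee of anonymity.

```python
import hashlib
import re

# Illustrative patterns only; real posts will contain identifiers
# that these simple expressions do not catch.
URL = re.compile(r"https?://\S+")
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
HANDLE = re.compile(r"@\w+")


def scrub_text(text):
    """Replace URLs, email addresses and @handles with neutral tokens."""
    text = URL.sub("[url]", text)
    text = EMAIL.sub("[email]", text)  # before handles, so emails stay whole
    text = HANDLE.sub("[user]", text)
    return text


def pseudonymise(author_id, salt):
    """Map an account identifier to a short salted hash.

    The salt is assumed to be kept secret by the research team.
    """
    digest = hashlib.sha256((salt + author_id).encode("utf-8")).hexdigest()
    return digest[:16]


def prepare_record(author_id, text, salt):
    """Build the record that would be stored for analysis."""
    return {"author": pseudonymise(author_id, salt), "text": scrub_text(text)}


if __name__ == "__main__":
    print(prepare_record("jane_doe", "@sam see http://example.com/x", "secret"))
```

Storing only the scrubbed text and the pseudonym still allows posts by the same account to be grouped, while the original handle is never written to the research dataset.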
Australia’s National Statement on Ethical Conduct in Human Research (NHMRC 2007),
updated in May 2015, does not contain specific provisions regarding social media. However,
SMSA clearly falls under its broad definition of ‘human research’ because it involves
analysing ‘data’ or ‘other materials’ generated by individuals (NHMRC 2007:7). In a sense,
SMSA also involves human ‘observation’, albeit in an online environment and not usually in
real time. Like similar research-ethics documents that operate in other countries, the National
Statement (NHMRC 2007) recognises cornerstone ethical principles for human research.
These principles are not intended to be applied in a formulaic way. Rather, they are used to
balance the ethical strengths and weaknesses of potential research.
One such principle is respect for individuals’ autonomy. Autonomy is most obviously
respected by the fact that researchers usually seek individuals’ voluntary and fully informed
consent before including them in a study. In addition, participants’ autonomy is respected
through taking steps to safeguard participants’ confidentiality and to protect their personal
information (Beauchamp and Childress 2001). The other ethical principles are
non-maleficence, beneficence and distributive justice. Respectively, these principles require that
research:
• mitigates risks of harming anyone, including participants (Brody 1998);
• has a prospect of benefiting participants or the broader community; and
• evenly and fairly distributes burdens, risks and benefits between participants
(Beauchamp and Childress 2003; Hall et al 2012; NHMRC 2007).
This brings into relief the core ethical problem facing SMSA: individuals’ data are studied
without their consent, breaching the respect for autonomy principle. However, it is possible
for a human research ethics committee (‘HREC’) to nonetheless approve a SMSA study —
effectively granting a waiver of consent — provided the committee abides by the
considerations set out in sections 2.3.9–2.3.11 of the National Statement (NHMRC 2007).
Certainly, an application to waive consent could argue that the object of the SMSA is the
public good (beneficence) and that seeking consent would be impracticable (NHMRC
2007:2.3.10(c)) because of the extraordinarily large numbers of people involved. It might also
be important to emphasise the fact that the SMSA research focused on public opinion of
criminological issues, as distinct from using SMSA in some way to expose illegal activity
(NHMRC 2007:2.3.11). Weighed against these considerations would be the HREC’s
perspective of the risks of harm for participants (non-maleficence), and the sufficiency of the
steps taken to safeguard confidentiality and privacy (autonomy).

This article deals with the observation and recording of ‘public’ commentary for the purposes
of criminological research. The use of SMSA to distil opinions from publicly posted writings
is unlikely to identify persons and, in any event, is based on material where there is unlikely
to be a reasonable expectation of privacy. In our view, to answer the question we posed in the
title to this article, SMSA is a potentially useful new empirical tool for assessing public
opinion on crime.
SMSA can be designed so that, from the data gathered, all or most of the participants are
non-identifiable — meaning that the data do not contain individual identifiers. Steps can be
taken to further mitigate the low risk of identifying participants. As noted, human judges
improve the accuracy of SMSA data (Goldberg and Zhu 2006). They could also be employed
to test the efficacy of SMSA identity safeguards in preparatory phases. Once a study
commences, human judges could play a central role in monitoring the SMSA project’s HREC
compliance. Adverse or unexpected outcomes would need to be reported to the relevant
HREC. In some cases, it may be possible to rectify the SMSA algorithm to address the
safeguard problem. Since SMSA is a form of big data, researchers are not likely to be
interested in reporting specific sections of text, although if some text is worth quoting it can
be suitably cloaked to minimise identification. If researchers are committed to following
protocols about reporting qualitative data, they could further reduce risks of harm to
participants (for example, by ensuring individuals are not linked with views that may
embarrass them or cause them to be discriminated against).
Other forms of research using big data or SMSA techniques may be more problematic and
would have to be considered individually on their merits. The ‘mosaic theory’, which suggests
that expectations of privacy may be engaged for the aggregation of disparate personalised
data, may serve as a useful guide for considering the implications of other uses of SMSA
(Gray et al 2013).

Finally, a note of caution is required. As with anything on the internet, common sense and
experience tell us to be sceptical and critical. The potential for misinformation, distortion,
trolling and manipulation of social media is ever present and we should consider carefully the
wider context in which all comments appear on the internet.

