Professional Documents
Culture Documents
Robert Ackland - Web Social Science - Concepts, Data and Tools For Social Scientists in The Digital Age-SAGE (2013) - 81-115
Robert Ackland - Web Social Science - Concepts, Data and Tools For Social Scientists in The Digital Age-SAGE (2013) - 81-115
Robert Ackland - Web Social Science - Concepts, Data and Tools For Social Scientists in The Digital Age-SAGE (2013) - 81-115
While some authors may be willing to apply the term ‘social network’ to
any network involving people, this book does not regard all interpersonal
networks as social networks. Some interpersonal networks that are formed in
social media environments are difficult to be conceived of and studied as
social networks. A good example is Twitter: the founders of Twitter
emphasise that it is not a social network (in comparison to, say, Facebook)
but an information network, and Kwak et al. (2010) confirmed this after
studying the patterns of activity within Twitter (for more on this, see Section
3.4).
Network nodes are often represented in sociograms using symbols such as
a disc or square, with the choice and size of the symbol reflecting the
attribute of the node. There are two major types of node attribute. Graph-
theoretic attributes are determined from the position of the node in the
network (e.g. number of connections or degree), while non-graph-theoretic
attributes are intrinsic to the node (e.g. gender, race, age) and are not related
to network position.2
Network edges can represent different connections (e.g. collaboration,
kinship, friendship), and there are two major types of edges. Directed edges
have a clear origin and destination (e.g. a Twitter user following another
user, a hyperlink from one page to another), and may or may not be
reciprocated. A directed edge is represented in a sociogram as a line with an
arrow head. Undirected edges, in contrast, represent a mutual relationship
with no origin or destination (e.g. marriage, Facebook friend), and they do
not exist unless they are reciprocated. An undirected edge is represented in a
sociogram as a line without an arrow head.
Finally, we can illustrate the different types of networks (Box 3.2) using
the school friendship data. Figure 3.2 shows the 1.0-degree egonet for person
C, while the 1.5- and 2.0-degree egonets are shown respectively in Figure
3.3 and Figure 3.4.
SNA differs markedly from other social scientific approaches in that it puts
emphasis on the structure of social relations, rather than the attributes of an
individual actor, in determining that actor’s behaviour and outcomes (see
Box 3.3).
The network perspective focuses on relationships between actors as the
unit of analysis rather than the actors themselves. With ordinary least
squares – a hallmark technique of ‘non-relational’ social science – the unit
of analysis is the individual. In contrast, with exponential random graph
models (Section 3.2.2) – an increasingly used statistical SNA technique –
the unit of analysis is the dyad.
While ordinary least squares involves the assumption that observations are
independent (e.g. the error term in a labour market regression model is
assumed to be identical and independently drawn from the normal
distribution), the network perspective explicitly assumes the
interdependence of observations (see Section 3.2.2).
The network perspective recognises that social networks can have both
direct and indirect impacts on individual behaviour and outcomes. Hence,
in order to understand the flow of information and resources between two
people it is important to know about the entire social network which they
are a part of, and not just the fact that they are friends (Box 3.4).
People tend to belong to several overlapping social networks. That is,
individuals tend to be members of several groups and the group boundaries
are often fuzzy.
The structure of actor relations both constrains and provides opportunities to individual actors
– social structure can therefore influence the behaviour of actors. Suppose Sue has a wider
network of friends and contacts than Jill. Sue is more likely to hear about possible jobs than Jill,
with her more limited social network. Thus, Sue’s social network is providing opportunities
(information) that are affecting her behaviour (applying for a suitable job).
The structure of social relations can affect outcomes – the likelihood of particular outcomes can
be influenced by the structure of the social network. Imagine that Sue and Jill both apply for the
same position. Further, imagine that Sue’s social network includes people who are known to the
hiring committee, and are prepared (possibly informally) to vouch for her; suppose one of the
people on the hiring committee asks informally, ‘So is this person any good?’ Meanwhile Jill does
not have any contacts who will vouch for her. We can expect that Sue, who has a good social
network, will (all other things considered) get the job.
In the above example, what do we mean by saying that the person with the ‘wider network’ will
have more opportunities to find information on good jobs? We assume that Jill mainly has close
friendships with people who also know each other (e.g. they went to the same school, work at the
same places), while Sue has more acquaintances in her social network (e.g. people whom she
knows, but who do not know each other). Jill’s friends all tend to move in the same circles and
hence she is unlikely to pick up novel information that can help with her job seeking (e.g. that a firm
is looking to hire a person in her area of work). Her friends can thus be thought of as ‘redundant
ties’, in that they are likely to provide the same information. In contrast, Sue’s acquaintances, by
definition, know people whom she does not know, and Sue is therefore more likely to receive novel
information from these ties that can assist with her job seeking.
This idea was formalised by the sociologist Mark Granovetter (1973) in the influential concept of
the strength of weak ties. Montgomery (1991, 1992) demonstrated that weak ties are positively
related to wages and employment rates.
There are many SNA measures (or metrics) that can be used to describe
individual actors within a network and the network as a whole – see Section
3.5 for some basic network- and node-level metrics for the example school
friendship network.
Network effects
There are two types of features in social networks. Purely structural or
endogenous network effects are network ties that have nothing to do with
actor attributes, but arise because of social norms. Section 3.1 introduced
two examples of endogenous network effects: reciprocity (if i nominates j as
a friend, then it is more likely that j will nominate i as a friend), and
transitivity (if i nominates j, and j nominates k, then there is a higher
probability that i will also nominate k, thus forming a transitive triad).
The second major feature in social networks is actor–relation effects –
these are network ties that are created because of the attributes of actors.
There are three main types of actor-relation effects:
C may have nominated J as a friend because they are both girls (this would
be an example of homophily).
C may have nominated J as a friend because J is older and C would like to
be seen to be ‘hanging out with the older girls’ (this would be an example
of a receiver effect, with J receiving more nominations because she is
older).
The tie from C to J could also be a purely structural effect: the fact that C
nominates H and H nominates J means there is a higher likelihood that C
will also nominate J (thus forming a transitive triad).
Blogs
Two people (A and B) each blog on a similar topic. Blogger A writes a blog
post where he comments on a post that was earlier written by blogger B, and
to assist the reader he creates a hyperlink pointing to the earlier blog post.
This can be represented as a directed and unweighted network between the
two bloggers (the diagram describing this is analogous to Figure 3.7). Note
that to capture the temporal nature of blogging, we could have also
represented this behaviour using Figure 3.6.
Wikis
Person A and person B both edit wiki page 1. Person B also edits wiki page
2, which is also edited by person C (Figure 3.8). At this point, it is useful to
introduce some additional social network terminology. A unimodal network
contains only one type of node, while a multimodal network contains more
than one type of node. The wiki network just described is an example of a
bimodal network containing exactly two types of vertices: wiki editors
(people) and the wiki pages they edit (this is also known as an affiliation
network). So, in this particular affiliation network, people do not connect
directly with people and wiki pages do not connect directly with wiki pages.
However, a bimodal affiliation network can be transformed into two separate
undirected and unweighted unimodal networks (wiki editor to wiki editor
and article to article), as is shown in Figure 3.8.
Microblogs
Person A follows person B on Twitter (see Section 3.3.4 for more on Twitter
terminology). Person C authored a tweet where person B was mentioned.
Therefore, there are two types of directed edges (follows, mentions) and this
is an example of a multiplex network, which is a network with multiple types
of edges. As noted in Section 3.3.4, with Twitter there are actually three
types of directed edges: following relationships, ‘reply to’ relationships and
‘mention’ relationships. Multiplex ties are often reduced to a simplex tie
(e.g. a tie exists if any of the multiplex ties exist). Thus we can represent this
example multiplex Twitter network as a directed and unweighted simplex
network with person A nominating person B, and person C nominating
person B (Figure 3.10).
Virtual worlds
Three people are playing World of Warcraft. Person A and person B both
join raiding party 1 (parties allow players to work together as a team,
enabling private in-game communication and sharing of resources). Person B
also joins raiding party 2, which person C also belongs to. This can be
represented as an affiliation network with exactly the same structure as the
wiki network in Figure 3.8 (and the two unimodal networks for people and
raiding parties).
ver. 9.0.1 will not open a mail window from another program
Thunderbird 8 and 9 chooses wrong printer
What is the Profile folder?
Boyd and Ellison (2008) argue that social network sites are different from
earlier forms of online communities such as newsgroups because they
primarily support pre-existing social relations (e.g. maintaining or
solidifying existing offline relations rather than being used to meet new
people). Lampe et al. (2006) found that Facebook users tend to search for
people with whom they already have an offline connection rather than
browsing for complete strangers to meet. Lenhart and Madden (2007) found
that 91% of teenagers in the USA using social network sites do so to connect
with offline friends. Research has therefore shown that people generally use
Facebook to connect with friends they have in the ‘real world’, hence
Facebook data are useful indicators of real-world friendships. As Wimmer
and Lewis (2010, p. 585) note with regard to Facebook picture friends:
‘Online pictures document an existing ‘‘real-life’’ tie and are therefore
qualitatively different from the ‘‘virtual’’ network studied by other
researchers.’
3.3.4 Microblogs
The underlying technology for microblogs is quite similar to blogging
(Section 4.3.3), but posts (‘tweets’, as they are called by Twitter) are limited
to 140 characters. While there are several examples of microblogs, Twitter
is the best known and so the following discussion is focused on Twitter
(many of the features described here are common to other microblogs).
Tweets are typically public and ‘one-to-many’. So how can we extract or
identify directed ties between particular Twitter users? There are four types
of interactions between Twitter users that can be represented as network ties.
First, it is possible to subscribe to or follow another Twitter user and you
will then receive that person’s tweets. The people you are following are your
friends, while the people who follow you are your followers.
It is also possible to directly include another Twitter user in the body of a
tweet. This may be a better indication of the existence of a social tie between
two users than friends or follower lists. In the same way, it is argued by some
that in blogs, permalinks (hyperlinks in blog posts) are a better indication of
connections between bloggers than blogrolls, which can be notoriously stale
(Section 4.3.3).
@replies are when the tweet starts with a username (prefixed with ‘@’),
and are used to show that a tweet is intended for a particular user. @replies
can be contrasted with @mentions, which is where the username (prefixed
with ‘@’) appears somewhere in the tweet, but not at the start. So the tweet
‘We saw @rob at the football’ is an @mention, while ‘@rob – was that you
we saw at the football?’ is an @reply. So @mentions indicate
acknowledgement while @replies show that a tweet is being directed to a
particular user.
The fourth type of interaction in Twitter is the retweet function, which is
where a Twitter user forwards a tweet to their followers, prefixing the tweet
with ‘RT @user’, where ‘user’ is the person who authored the original
tweet. So, if Twitter user @dave tweeted ‘We saw @rob at the football’ and
one of his followers decided to retweet this, this would be done using ‘RT
@dave: We saw @rob at the football’. Retweeting is a sign that the person
doing the retweet thinks the original tweet is interesting or important enough
that it needs to be read by a wider audience (obviously not so in this
sample), and hence is a sign of acknowledgement.
As discussed below, there are tools that can be used to find, for a given
user, who they follow and who they have @mentioned, @replied to or
retweeted, thus allowing the construction of a directed multiplex 1.0-degree
egonet. By repeating this process for each of the users or alters identified in
this way, it is possible to create 1.5- and 2.0-degree egonets. By repeating
this process for egos, it is possible to create a complete network showing the
follower relationships amongst the Twitter users. In some contexts it may
make sense to just focus on one type of tie (e.g. follower) or else to convert
the multiplex network into a simplex network.
In terms of identifying the boundaries to a given Twitter network, the
Twitter Application Programming Interface (API) allows you to identify all
the users (over a particular period of time) who have authored tweets on a
particular topic. This can be done via the Twitter hashtag (#) which is the
convention for identifying the topic of a tweet, or else by searching for a
particular word or phrase. So it would be possible to construct the network
of Twitter users who tweeted on ‘#climatechange’ or ‘climate change’, for
example.
3.6 CONCLUSION
This chapter has provided an introduction to social network analysis and has
shown how SNA concepts and tools can be used (and adapted) for studying
social media network data. Particular focus was given to three types of
social media data: threaded conversations, social network services and
microblogs (hyperlink networks are covered separately in Chapter 4). Many
of the concepts in this chapter are used in the remainder of the book.
Further reading
The canonical social network analysis reference is Wasserman and Faust
(2004). An introduction to SNA that is available online is Hanneman and
Riddle (2005). For more on analysis of communication networks, see Monge
and Contractor (2003), and for a treatment from the network science (applied
physics/computer science) and economics perspectives, see Easley and
Kleinberg (2010). For an introduction to social media network analysis, see
Hansen et al. (2010a).
_____
_____
_____
___
1Hyperlink network analysis is covered separately in Chapter 4.
2In mathematics and computer science, the term ‘graph’ is used synonymously with ‘network’; we use
the terms interchangeably here.
3Throughout this book, lower-case letters are used to denote actors in generic examples, while upper-
case letters are used for specific examples (e.g. the school friendship network).
4See also Katz et al. (2004) for a summary of these features.
5See, for example, Robins et al. (2007) for a detailed introduction.
6Faust and Skvoretz (2002; and Skvoretz and Faust (2002)) referred to these as structural signatures.
7This section draws from Hansen et al. (2010b). See also Vergeer and Hermans (2008) for more on
using content analysis and network techniques to analyse online political discussions.
8http://groups.google.com/group/comp.os.linux.advocacy/topics
9http://forums.mozillazine.org/viewforum.php?f=39
10See http://www.google.com/googlegroups/archive_announce_20.html
11There are other examples of research into friendship formation that use data from social network
sites (see, for example, Mayer and Puller, 2008; Thelwall, 2009a).
12http://namegen.oii.ox.ac.uk/namegenweb/learnmore.php
13https://dev.twitter.com
14At the time of writing, Twitter are planning to phase out the Search API in favour of the Streaming
API, which allows researchers to compile their own databases of historical Twitter data by accessing
near real-time flows of tweets (this is referred to as accessing the Twitter ‘firehose’).
15http://tweepy.github.com
16http://nodexl.codeplex.com
17See also Park (2003) who compares social networks, information networks and hyper-link networks.
18A connected component is a set of nodes that are connected (either directly or indirectly) to one
another. A strongly connected component is where each node is reachable by all other nodes.