15 07 026 PDF

Linked Activity Spaces:
Embedding Social Networks in

Urban Space
Yaoli Wang
Chaogui Kang
Luís M. A. Bettencourt
Yu Liu
Clio Andris
SFI WORKING PAPER: 2015-07-026
SFI Working Papers contain accounts of scienti5ic work of the author(s) and do not necessarily represent
the views of the Santa Fe Institute. We accept papers intended for publication in peer-‐reviewed journals or
proceedings volumes, but not papers that have already appeared in print. Except for papers by our external
faculty, papers must be based on work done at SFI, inspired by an invited visit to or collaboration at SFI, or
funded by an SFI grant.
©NOTICE: This working paper is included by permission of the contributing author(s) as a means to ensure
timely distribution of the scholarly and technical work on a non-‐commercial basis. Copyright and all rights
therein are maintained by the author(s). It is understood that all persons copying this information will
adhere to the terms and constraints invoked by each author's copyright. These works may be reposted
only with the explicit permission of the copyright holder.
www.santafe.edu
SANTA FE INSTITUTE
Linked Activity Spaces: Embedding Social Networks in Urban Space
Yaoli Wanga,b, Chaogui Kangc,d, Luis M. A. Bettencourtb, Yu Liuc and Clio Andrisb*

a. Dept. of Geography, 120K, Geog-‐Geol Building, University of Georgia, Athens, GA30602
b. Santa Fe Institute, 1399 Hyde Park Rd. Santa Fe NM 87501
c. Institute of Remote Sensing and Geographical Information Systems, Peking University, Beijing
100871, China
d. MIT Senseable City Laboratory, 9-‐209, 77 Massachusetts Ave. Cambridge MA 02139
*Corresponding author
Keywords: call data records, interpersonal relationships, mobile phones, pervasive computing,
geography, GIS, urban planning

Abstract
We examine the likelihood that a pair of sustained telephone contacts (e.g. friends, family,
professional contacts) use the city similarly. Using call data records from an undisclosed city in
China, we define a proxy for the daily activity spaces of each individual subscriber by
interpolating the points of geo-‐located cell towers he or she uses most frequently. We calculate
the overlap of pairs of linked activity spaces, (e.g. the pairs of space between two established
telephone contacts) and find that friends and second degree friends are more likely to overlap
than random pairs. We find that higher degree users and users with many network triangles
(connected groups of three nodes) tend to congregate in the central business district. We also
find that the downtown area hosts many heterogeneous modular communities of social groups
(derived from telephone calls), but that two distinct neighborhoods contain distinct social
clusters and act like a boundary around these friendships. We connect our findings with the role
of social capital in urban planning.

Book chapter proposal submission: Computational Approaches for Urban Environments, 2013

I Introduction
In this chapter, we present a methodology that can help elucidate how groups of friends, family
and professional contacts use the city. We know that cities are comprised of two interacting
components, social networks and physical infrastructure, and that the social dynamics of
encounters in urban space form the backbone of city life (Bettencourt 2013). Yet our ability to
model social networks in urban spaces is very limited. This presents a problem because often our
behavior results from the influence of others (Salganik and Watts 2008). Similarly, our social ties
affect how we use the city: where we choose to meet, live and work.
It is paradoxical whether a citizen reaps peak benefits of the city (such as social support, quality
of life and economic opportunities) (Bettencourt et al 2007, Bettencourt 2013) by having his or
her contacts nearby or dispersed. At one extreme, dispersed contacts can expose the ego to new
neighborhoods and a variety of urban knowledge (such as finding the quickest post office, the
best doctor or an exciting new restaurant), due to their variety of experiences in diverse parts of
a city. Yet, it may be more difficult to meet with dispersed contacts, and having friends in
disparate parts of the city is likely to form a social network where one has few friends “in
common” with other friends, which can be a key strength of social networks.
At the other extreme, a socially-‐tight neighborhood forms trusted bonds through multiple
channels of social validation (Centola and Macy 2007) and through increased outdoor exposure
and neighborhood institutions such as local schools. Proximal social contacts can meet
conveniently, benefiting elderly and the mobility-‐challenged and, in some cases, poorer or
immigrant communities who likely rely on friends and family for help with amenities such as
2


child-‐care. Yet, in these enclosed neighborhoods, information and social capital from other parts
of the city may be less accessible (Granovetter 2005) resulting in missing or unsupportive social
systems across the same city (Granovetter 1973).
Do citizens seem take advantage of the benefits, and bear the drawbacks, of spatially-‐dispersed
social contacts or spatially-‐clustered social contacts? In order to answer this question, we must
model the social network of egos within the built environment.
The state of social-‐spatial modeling
There is no way to extricate social life from the built environment, yet the methods available to
understand the dispersion or clustering of a set of individual’s social networks in geography are
limited, as social network and urban spatial models have matured in separate domains, and are
analysed in separate spheres (Andris 2011). Network methods are also rarely used by those who
study city form (Sevtsuk and Mekonnen 2012). Social networks represent influences and social
capital as graph configurations of nodes (agents) and links (e.g. edges) between nodes that
represent connections between agents. Alternatively, spatial (e.g. Geographic Information
Systems (GIS)) models are represented in a contiguous topological plane, where adjacency and
proximity links entities, such as commercial buildings. These entities can be represented as
polygons or lines and unlike in the social network model, direct connections (e.g. links) are rarely
made between these entities.
As a result, social/spatial phenomena are examined separately by those inclined towards
computational sociologists or geography who analyze a social network or a GIS model,
respectively. One example is the study of obesity, where social networks (Christakis and Fowler
3


2008) and city form (Papas et al 2007) are examined as causal factors. These mechanisms may
be factors separately, but it may be how social ties influence choices within the confines of the
built environment that best explain this health problem.. Similarly, research showing how
students use a college campus in space and time via WiFi usage makes strides in examining the
flexibility of meeting places due to mobile computing from an architectural standpoint, but could
be extended to assess social gatherings in time and place (Sevtsuk and Ratti 2005, Sevtsuk et al
2009). Concurrently, on the same college campus, Eagle et al (2009) show the temporal social
patterns of dyadic (pairwise) relationships in terms of calls, SMS messages and co-‐location. This
equally ground-‐breaking study alludes to the role of the campus in providing space for social
groups and pairs, but without a spatial analysis of where dyads meet, where they travel on the
campus and how these factors correlate with friendship ties.
This is not to say that datasets on interpersonal communication and movement have not been
embedded into geography; analyses of inter-‐place networks of social flows such as commodities,
telecommunications, migration and commuting are common in computational urban research
(examples abound). Yet, these represent place-‐to-‐place aggregate flows instead of person-‐to-‐
person flows, and thus do not include the decisions of individuals. Small-‐scale examples of
spatially-‐embedded social networks include topics of gang membership (Radil et al 2010,
Papachristos et al 2013), transportation (Frei and Axhausen 2011 and Arentze et al 2012) and
epidemiology (Emch et al 2008). We take these initiatives a step further by creating a general
method that can respond to the general nature of human fraternization in a built environment.
These studies can elucidate where and when (different types of) relationships form, and could be
4


used to advise architects, urban and transportation planners in creating places that support
social connectivity.
In working towards this goal, those looking to examine social/spatial problems are aided by the
recent proliferation of large datasets evidencing human social contact and movement (such as
GPS or cell tower usage records) in the city (Reades et al 2007). The integration of human
movement and activity data, such as information from GPS traces (Gao et al 2013), check-‐in data
(Cho et al 2011), online social networks (Scellato et al 2010) and photo-‐sharing sites (Crandall et
al 2011, Girardin et al 2008, Sun et al 2013), into urban models show how humans use the built
environment. Specifically, the use of mobile telephone calls to understand city usage patterns
are becoming a cornerstone of modern urban informatics, planning and transportation (Ratti et
al 2006). We take advantage of mobile telephone call data to test our research questions about
local versus dispersed social ties in the city.
Linked activity spaces
We leverage cell phone call data records (CDRs) to model friendships (i.e. interpersonal
relationships) as a social network, inferred by the frequency calls between of two agents, and
the set of locations (i.e. activity space) of each member of the social network within the city. A
pair of activity spaces of an ego and alter are called linked activity spaces (LAS) if the ego and
alter are friends (i.e. contacts) in the data set. The two activity spaces of friends are manifested
as two polygons in a Geographic Information System (GIS) context, and spatially analyzed for
similarity, via the number of ‘third places’ shared among the pair (Rosenbaum 2006). Moreover,
5


we analyze the social network as a whole to find whether high-‐degree egos (those with many
friends), triangles (groups of three agents) and communities use the city in significant ways.
We have four main hypotheses for the analysis of LAS. (I) We expect that friends’ activity spaces
will overlap more often than a random pair of activity spaces, indicating that friends use the city
more similarly than a random pair of people. (II) We also hypothesize that egos with high
degrees or high clustering coefficients (Jackson 2010) will use be inclined toward the city center,
as this denser environment tends to have more meeting places, diverse services, commercial
areas and nightlife. (III) In terms of city form and groups, we believe that central areas will play
an enhanced role in supporting “clique-‐like” and modular groups instead of being a mixing pot
for many groups. We expect the downtown area to host tight-‐knit social groups who do not
venture to the suburbs often. (IV) Finally, we expect that peripheral areas will be more mixed
since the automobile may be a more popular mode of transportation, and may not limit visits to
those accessible via walking distance.
This chapter proceeds as follows. We first describe study area and the subsetting of the CDR
dataset. We then describe in the methods section how to delineate each user’s activity spaces
from these points. We then analyze how linked activity spaces (LAS) are spatially correlated in an
urban environment by shared points of interest (POIs). We conclude with a discussion of the
usefulness of this method, its drawbacks, potential applications and future work.
II Study area and dataset

Our study area is city located in northeastern China, with population 2.5 million (est. 2010). This
industrial city serves as a producer of wood pulp and newsprint and participates in the global
6


economy via a thriving international trade harbor. The city is nearly 18 by 10 kilometers and on
average citizens travel 1 kilometer in a day (Kang et al 2012). We focus on calls made within the
city area and exclude long-‐distance calls. We use a CDR (call data record) dataset of mobile cell
phone calls, from a mobile phone provider in China.
Dataset and sampling
The original CDR dataset is comprised of nearly 424,000 users’ activity for 31 days. Users are
anonymized in the dataset. All users combined make an average of 1,600,000 calls daily. In the
31 day timespan, each user participates in an average of 328 calls for a total duration of 6.15
hours. Each record of a mobile phone call contains the start time, call duration and locations of
the caller and receiver. The locations are geo-‐referenced to one of 96 cell towers closest to the
mobile phone’s location (Table 1). The dataset does not include text messages (e.g. SMS).
Table 1: Call Data Record (CDR) variables with original data fields (top).A social
network and spatial data summary table are listed in the middle and bottom
rows, respectively.
Caller
ORIGINAL Receiver
Caller Receiver Location (x, Start time Duration
TABLE Location (x, y)
y)
Social Number of Total

User1 User2 -‐ -‐
Network Calls Duration
Spatial Location (x,
User1 Location (x, y) Duration
Patterns y)

We process the dataset into two parts, a social network of agents (social network in Table 1) and
the activity spaces of each agent (spatial patterns in Table 1). We filter the network by including
only those who use at least 3 cell towers during the study period in order to eliminate users who
7


may be confined to home and thus interact with the city differently than a typical mobile user.
Also, an individual may have multiple mobile phones, and a phone with fewer than three cell
towers used may represent a “secondary” or less-‐frequently-‐used device. In the social network,
the number of calls is determined between a unique pair of users and duration is the sum of call
time between the two users. We represent the network as undirected in order to reflect each
member’s inclination to participate in the conversation regardless of the initiator. In other
words, records showing that A calls B, or B calls A, are summed to represent a connection
between unique, undirected pair A, B. Each pair must have at least 10 mutual phone calls and 10
minutes of total call duration in the given month in order to eliminate non-‐friend calls such as
sales calls, as these do not represent persistent relationships.
The spatial patterns table inventories the locations of each user to geo-‐locate a pair of callers in
the social network. The coordinates of the cell phone tower where a user places or receives a
call are summed and weighted by the number of calls the user places or receives at that cell
tower location. We use the resulting set of weighted locations to represent the user’s geographic
activity pattern (such as Carrasco et al 2006), which are known to capture anchor points
(Golledge 1999) such as the home and workplace (or school), as they are the most visited
locations for the average traveler and thus, frequent calling points (Schönfelder and Axhausen
2003).
Social network and geographic characteristics
Because the universal CDR dataset is very large, we draw a sample from the original dataset by
8


selecting a random sample of 150 “seed” users, and query for their contacts (first degree ties),
their second degree and third degree ties. The 150 seed users are connected to 248 first degree,
1,674 second degree and 6,159 third degree ties, totaling 8,231 individual users, and 47,297
friendship links between the users. On average, each user has 44 network friends (distribution in
Figure 1) and participates in a total of 328 calls for 6.15 hours during the study period.
Figure 1: The degree distribution of the universal network is right-‐skewed with mean 44.
Since the network is configured in a step-‐wise process, where 150 “seed” users are randomly
selected, the remainder of the network grows from the relationships of the 150 users. This
configuration yields a network that is focused on the social interactions of a small sample of
users. As a result, the social network does not fully represent a typical “universal” sample of a
9


social network in degree distribution (such as Albert and Barabási 2002), traversability or density
(Newman and Park 2003). The network has 8,197 nodes and 47,297 links. The degree values for
the core network range from 1 – 344, with a mean degree of 43.8. The average weighted degree
(by the duration of calls in seconds) is 22150, where the degree is weighted by number of
seconds of phone call activity. The diameter of the network is 12. Clustering coefficient values
range from 0 – 1, with a mean coefficient of 0.11.The network is visualized in Figure 2.
10

Figure 2: The core network is visualized with a force-‐directed method that places nodes in
feature space based on their density of linkages in Gephi (Bastian et al 2009). Larger and more
purple nodes denote higher degree (max degree= 287) and small orange nodes indicate smaller
degree (minimum degree = 1).
11


Spatial Distribution There are 96 cell phone towers in our dataset, most of which are clustered in the
downtown area. The spatial granularity of CDR datasets reflects the distributions of cell towers, though
these are scattered unevenly throughout the city, as we see in Figures 3A and 3B. The towers’ actual
locations are randomly offset for security and proprietary reasons. The caller’s actual location is
estimated in a larger radius around the tower so the tower’s slightly offset location does not significantly
add to the dataset’s natural spatial noise. It is also possible that a caller’s call is routed to a cell tower that
is not the closest to him or her, as the closest cell tower may be saturated with calls or out of service, as
indicated by the heterogeneity of usage statistics in the downtown (Figures 3A, 3B). This inconsistency is
not expected to detrimentally affect the data analysis as many of the heterogeneous towers are in the
downtown, where reliable nearby towers can help delineate a user’s activity space. It also becomes clear
that some towers are used by many subscribers, while others are used by few (Figure 3C).
12


Figure 3A: A map of calls transferred (e.g. routed) through cell towers show that some towers
host many calls while adjacent towers host very few.
Figure 3B: A map of cell towers by user popularity illustrates that some cell towers are used by
50% of all users, where others are used by less than 1% of the population. Given the clustered
spatial distribution of towers that vary in usage, many callers could be nearest low usage towers,
but these towers are administratively less adroit to host a call, and their call is routed instead to
more fertile nearby tower.
13

Figure 3C: A rank-‐size distribution shows each tower’s popularity with users. Each of the 10 most
popular cell phone towers (x axis) is used by at least 20% of the population whereas the 40th-‐91st
most popular towers each is used by less than 10% of the population.

III Method
Creating Activity Spaces To represent how the user moves in the city (e.g. Axhausen et al
2002, Axhausen 2007) we build activity spaces that likely encompass a user’s home, work and
“third places” (Ahas et al 2009, Schneider et al 2013). We choose a polygon method in order to
represent the area surrounding the cell towers where the user is likely to be found, since he or
she uses the nearby towers.
14


These activity spaces summarize a user’s set of frequently-‐visited points (e.g. cell towers) in an
ellipse-‐shaped polygon that encapsulates 68% (i.e. one standard deviation) of points visited by
capturing points that are concentrated in the center and neglecting sparse points in the
periphery (as in Carrasco et al 2006). Ellipses are first centered on the mean geometric center of
a user’s tower locations (mean of x coordinates and y coordinates; repeating values are allowed
if a user visits towers more than once). The value of the standard deviation is calculated for all x
coordinates to obtain an axis, and y coordinates to obtain a second, perpendicular axis. The
ellipse is tilted in a direction that captures the major axis (long edge) of the distribution (see
Mitchell 2005).
As mentioned, the ellipse does not typically encompass all visited towers and excludes those not
frequently visited, encapsulating daily activity space (Figure 4A). Instead, it captures the essence
of the central tendency, dispersion and direction of the user’s travel patterns without including
anomaly cell tower usage (such as a traveler’s phone call from the airport).
Assessing Similarly of Activity Spaces After each individual is assigned an activity space,
we quantify the similarity between an ego’s and an alter’s activity spaces. A method for finding
whether two activity spaces are similar is not straightforward. The percent overlap between two
activity spaces will not account for how much physical area two friends’ spaces share.
Additionally, using the area that two activity spaces share does not tell us how big their spaces
are, i.e. whether this shared area is actually “convenient” relative to their whole activity spaces.
Also, with these two methods, we will not be able to understand what amenities and places for
meeting each individual has in his or her activity space. We overcome these drawbacks by
15


creating a third layer of relevant points, as suggested by geometric probability theory (Santalo
2004).
Agents’ activity spaces are qualified by the Points of Interest (POIs) they spatially intersect
(Figure 3C). These POIs are where ‘optional’ activities are likely to occur (Gehl 1987) and include
services, transport and recreational areas as a proxy for how agents use the city. POIs are good
landmarks for social interactions, as third places and the activities performed in third places,
have been shown to be essential for relationships, social health and quality of life (Rosenbaum
2006). Friends may visit POIs to dine or to do business.
POIs are selected and digitized from Google Maps (Google 2013) with data provided by Auto-‐
Navi (2011) and analyzed in the ESRI ArcMap 10.1 environment (ESRI 2012). POIs include the
city’s recreation spots (including parks, internet cafés, personal wellness centers), commercial
centers (restaurants and bars, stores, markets and shopping centers), public services and
institutions (hospitals, post offices, police stations), transportation centers (airports, train
stations), and named villages in the exurban area. POIs of similar types (e.g. restaurants) in a 50
meter radius are grouped into a single POI to eliminate redundant information.
As mentioned, two activity spaces are considered linked if they are connected via the social
network. These friendships are embedded in geographic space through the two activity spaces.
Each unique pair of linked activity spaces is compared by the number common of POIs shared by
the two activity spaces (in absolute number and percent of the user’s total POIs). In Figure 4C,
two pairs of linked activity space ellipses and share few POIs (red pair) or many POIs (yellow
16


pair). We use a statistical t-‐test to compare friends’ typical number of shared POIs versus that of
a random pair of users. Our hypothesis is that friends share more POI points than random pairs.
Figure 4A (Top Left): An individual activity space is denoted as an ellipse and captures a user’s
most frequently-‐used towers. Figure 4B (Upper Right): Two single activity space ellipses intersect
underlying Points of Interest (POIs). The green ellipse intersects more POIs than the pink ellipse.
Figure 4C (Lower Left): Two pairs of linked activity spaces, in red and yellow, show that the
yellow pair intersects more common POIs. Figure 4D (Lower Right): The activity space of a focal
17


user (in pink) is overlaid alongside the activity spaces of her frequently-‐contacted friends (black).
The pattern shows that the focal user and friends use the downtown area.
IV Results
In this section, we respond to our initial hypotheses regarding how dyads (pairs), social network
personas, triads (groups of three), and communities use the city. First, we find that friends tend
to use the city more frequently than a random pair. Second, we also find that egos with high
degrees are inclined toward the city center, while ego’s with high clustering coefficients show no
significant spatial correlation. Third, counter to our hypothesis, central areas are found to be a
mixing pot for many groups and clusters, many of whom frequent the inner city and the suburbs.
We do, however, find two specific neighborhoods that tend to harbor enclosed (‘clique-‐like’)
social groups. Finally, also counter to our initial hypothesis, peripheral areas show less frequent
mixing of social groups and friends than any other part of the city.
Dyadic Relationships
We find, as hypothesized, that friends’ activity spaces will overlap more often than a random pair
of activity spaces, indicating that friends use the city more similarly than a random pair of
people.
The percentages of friends and of random pairs found to share no POIs are 0.50% and 0.11%,
respectively. Of pairs who share POIs, friend pairs share an average of 55.8 POIs and a random
pair shares 45.77 out of 212 total possible POIs (Figure 5A). A t-‐test using the ellipse activity-‐
18


spacespace method yields a t-‐value of 56.03, (degree of freedom (df)) = 44,706, p-‐value < 0.001),
and allows us to reject the null hypothesis that the mean difference between these two groups is
insignificant, indicating that friends share more POIs than random pairs.
Second degree friends also utilize urban infrastructure significantly more similarly, in terms of
shared POIs, than random users (Figure 5B). The t-‐value is 94.82 (df = 152,070, p-‐value < 0.001).
The count of POIs shared by second degree friends on average is 55.23.
Figure 5A (Left): The distribution of frequency of shared POIs in the linked activity spaces of a
random pair (in blue) and of a pair of friends (in red) depicts a similar rate or frequency of shared
POIs, though direct friends have more shared POIs on average. Friends also have higher
frequencies of sharing POIs (depicted by red bars above blue bars). We exclude pairs (random or
linked) who share no common POIs. Figure 5B: The distribution of second-‐degree linked activity
spaces shows that second degree friends share more POI’s on average than random pairs.
Additionally, it is rare that an ego’s alters will visit a POI that the ego herself does not visit. The
average proportion of total egos who have visited a specific POI (e.g. “Flower Park”) is
19


proportional to the average percentage of their friends who have also visited the POI. For
example, consider a group of 100 egos, each of whom has 50 alters (summing to a total of 5,000
friends). If an average of 50% of alters (25 alters) per ego have visited a certain POI, then about
50% of egos (50 egos) will have visited there as well. If the average at another park is 20% of all
alters (10 alters), then about 20% of egos (20 egos) will have visited this park as well. The
proportion of egos who have visited a POI is 1.036 times the number of total alters who have
visited the POI, with an r2 correlation of 0.9684. This means there are few, if any, POIs where
one frequents and her friends do not frequent. Conversely, there are also few POIs where one
does not frequent yet a high percentage of her friends visit often.
Social Personas
In traditional social network analysis, the role that a user plays in the network is used to define
her importance and prominence in various facets of social life, such as information about new
job opportunities. For instance, a central figure in a social network (i.e. a figure with many
friends, or who is a “common friend” between poorly-‐connected groups) can be identified
through social network metrics such as betweenness centrality, degree centrality, or brokerage
statistics (Jackson 2010). It has been shown that those with higher network centrality live in
more central places on the Euclidean grid of longitude and latitude for the network (Onnela et al
2011).
Confirming our second hypothesis, we find that high degree users use the city center more often
than expected, given a random set of users. The top 1% high degree users (each with 150 or
more friends) concentrate at the urban center (Figure 6). A Fisher-‐Snedecor test (F-‐test) of
20


ANOVA yields a p value of 0.006 (95% confidence interval) signifying that the spatial variance
between the high-‐degree users’ activity spaces is significantly lower than the spatial variance
between activity spaces of the universal population. This result illustrates high degree agents’
proclivity towards high-‐density areas that are shown to be more innovative, dynamic and
energetic environments (Bettencourt 2013).
Figure 6. Spatial distribution of activity spaces in the city shows that high degree users generally
congregate near the city center.
We do not find significant spatial patterning with ego characteristics such as clustering
coefficient (Jackson 2010). Users with high centrality measures, such as betweenness centrality,
21


congregate in the central east side of the city. Because we initially sampled 150 seed users, the
centrality measures may privilege these egos, and would not fully characterize users.
Additionally, the relationship between degree and total call duration (Table 2) is minimal,
indicating that having more contacts does not lead to more time talking. Generally, users with
more contacts make shorter calls, while those with fewer friends speak for longer. We also find
no correlation between degree or phone activity and area of activity space.
Table 2: Spearman correlation analysis results

Degree Number of Calls Duration
Number of Calls 0.7803
Total Call Duration .0252 0.0120
Area of Activity Space -‐0.0085 .0003 -‐.0.0032

Community and City Form
We hypothesized that central areas play an enhanced role in supporting social communities of
friends. The results do not confirm this hypothesis; the central area of the city does not favor
tight-‐knit social circles but hosts heterogeneous groups of friends.
We define “groups” of friends (i.e. communities) in the Gephi environment with Blondel et al’s
(2008) modularity algorithm. This process produces 58 clusters with a modularity quotient of
0.732, indicating that, on average, those assigned to a cluster call within the same cluster 73.2%
of the time.
Each network agent (node) is assigned one modularity group. These agents are denoted by their
ellipse centroid (geographic centers) (Figure 7A). Agents denoted by yellow or teal (Figure 7A)
22


are examples from two social network clusters with significant spatial clusters that differ from
the overall distribution, deviating westward and north-‐eastward, respectively. The yellow and
teal clusters are examples of spatially-‐embedded social groups that are significantly clustered in
a way that deviates from the expected spatial distribution of agents the city.
Including these two examples, the spatial distribution of 58 social clusters, in total, do form
statistically significant “hot spots” (significantly dense clusters) and “cold spots”: a mixture of
modular groups (Getis-‐Ord Gi* statistic (Getis and Ord 1992), red and blue areas have P values
>= .0001). Hot spots (dark red areas in Figure 7A) contain agents of the same modular group in
two major regions. Cold spots (blue areas in Figure 7A) cover the downtown, signifying that the
groups that frequent the downtown are not clustered in the downtown, but have other group
members around the city.
23


Figure 7A. A modularity algorithm is applied to the “aspatial” social network. Two clusters are
now mapped in yellow and teal to show how social network groups use the city. Tight, insular,
social groups cluster in some parts of the city (in dark red), and show no clear pattern in others
(yellow, light red). In the downtown core, covered in blue, clusters are significantly mixed,
meaning that many social groups use the downtown but are not confined to the region.
Another prominent pattern of community configuration, at a more local scale, is the prevalence
of social triangles in the network. A social triangle can be defined as a group three nodes who
connect to one another (Latapy 2008), and pragmatically, will have meeting needs that are
different and more complex than those of a dyad but perhaps not as complex as a modular
24


group, which can contain many nodes. In our dataset, agents with the most social triangles
cluster in the downtown area. However, this may be an artifact of the high-‐degree users
downtown, as they are likely to have more social triangles.
The distribution of high-‐triangle nodes (denoted in green and yellow in Figure 7B) follows a
series of parallel roads downtown. Those with the highest number of social triangles (green
activity space centroids) form a tight linear cluster in the core area. This pattern is statistically
significant, showing high Getis-‐Ord GI* Z scores with P values > .0001 in the area slightly east of
the downtown (in red, Figure 7B). P values are not significant in other parts of the city.
Interestingly, centroids covering this neighborhood also saw the most significant modularity
clusters (Figure 7A).
25

Figure 7B. A clustering of agents who are a part of many social triangles (i.e. groups of three
friends) congregate towards the urban core (green and yellow triangles). This clustering is
statistically significant in one area east of downtown (denoted by red), via statistical Gi* Z scores.
We also hypothesized that peripheral areas may be more mixed. We reject this hypothesis as the
peripheral areas seem to host more cohesive communities than the downtown area. The POIs
shared by linked activity spaces are more frequently on the periphery of the city. A POI can have
from .01%-‐2.8% of all its pairs shown to be friends. Higher values represent places where linked
activity spaces frequent, and lower values, where random pairs frequent. The POI hosting 2.8%
linked activity spaces is located near many popular hotels and the city’s largest park—a notable
26


tourist site in the city. This site might be a popular meeting spot for friends, but it may also be
the artifact of many local business calls to one another in nearby buildings.
More generally, this ratio increases further from the city center (Figures 8A and 8B), so that POIs
on the outskirts of the city are more likely to host pairs of contacts. This may be because many
people have activity spaces that stretch into the downtown for work, but have their social
contacts closer to their residences in another part of the city. Though agents with higher degree
tend to frequent the dense downtown, given a POI with 100 random people, one is more likely
to friend the suburban areas. In other words, although the downtown core attracts more people,
the majority seem to be mutual strangers, while in the suburban area, POIs serve as intentional
meeting points. We interpret this finding with care, as there are fewer POI points on the
periphery of the city, thus reducing the granularity and precision to capture activity spaces found
in the city’s outskirts. For instance, an activity space in the shape of a narrow line can be
captured via the dense, granular points in downtown, but such a detailed structure could not be
found in the periphery, since there are so few cell towers and POIs to delineate a more complex
polygon.
27

Figure 8A: As the POIs are located farther away from the city center, there is a higher ratio of
friends to not-‐friend pairs using these points.
28

Figure 8B: The ratio of POI usage for friends vs. a random pairs shows that friends are more likely
to use the same POIs if they live on the periphery of the city. These POIs are denoted in red,
where up to 2.8% of their usage is linked with friendship.
In summary, friends and even second-‐degree friends tend to use common points of interest in
the city. Additionally, high degree users (i.e. those with many friends) tend to congregate
towards the downtown (central business district), but that those with many social triangle
friendships center in a neighborhood east of the central business district. The downtown core
hosts many heterogeneous social groups instead of small social clusters.
V Discussion, research challenges and conclusion

29


Discussion
In summary, we leverage social-‐spatial data from call data records in a new way that emphasizes
social relationships embedded in urban physical space. We find that the density (clustering) of
social ties deviates from the density users in the city.
It is clear that we are only at the beginning of understanding how interpersonal relationships
manifest themselves in the built environment. Yet it is a phenomena we experience daily as we
meet colleagues at work, family at home, and perhaps, friends in third places. These dueling
variables also guide our decisions about raising families through the choice of neighborhoods
and school districts, as well as migrating through the choice of leaving established social circles
for new circles (or vice versa).
We know that a healthy city is built on strong networks (Gilchrist 2009), but we still do not know
what kinds of ties exist in cities and neighborhoods, and we cannot see these networks in detail.
We cannot currently use type of social network configuration (clustered, decentralized, hub-‐
spoke, etc.) as a cause or effect variable in in assessing planning choices for new or existing
neighborhoods and cities. Yet, they certainly are a cause and effect variable.
Our ability to socialize with others is affected by urban planning and government decisions
regarding low-‐income housing, immigration reform, and health codes, such as number of people
to an urban residence as well as building subdivisions versus condominiums (Farber and Li 2013)
and narrow versus wide roads (Montgomery 2013). The spatial and social clustering of ties change
with the creation and dissolution of institutions: such as firms, universities, military bases, sports
franchises and religious institutions. Less socialization may also lead to stagnated mobility, not
30


due to lack of accessibility, but to a reduced need for third places to socialize (Rosenbaum 2006),
and visit others’ homes.
Urban planners, geographers, government officials, civil engineers and transportation planners
that focus on improving social life in their city can improve residents’ quality of life (Cacioppo
and Patrick 2008), more so than traditional economic stimuli, such as an increase in income
(Montgomery 2013). Instead, planners and geographers have worked towards better urban
environments investigating residents’ accessibility to amenities (such as hospitals), travel time to
work, and social justice issues such as susceptibility to industrial and environmental hazards
(Cutter et al 2003). In addition to these variables, we should continue to gage social capital
within the city.
We cannot infer these social patterns from city form, or social networks alone. There must be a
combination, as currently there is no one-‐to-‐one ratio that links a certain type of urban fabric to
a certain type of social network. Thus, our task with this chapter is to illustrate the linked activity
space method that allows for the integration of information from a social network into
geographic space, a combination that is rarely perused (Andris 2011). Although we use a call
dataset record (CDR) for our analysis, this method can be employed to any dataset that has both
evidence of social ties between agents and the geo-‐location of the agents.
We do find a number of methodological and pragmatic challenges to this type of research. Many
of these challenges stem from the nascent state of big data analysis that will perhaps become
more fluid in the future, with better collection methods. Nevertheless, there are issues with
these data that can be addressed: CDR datasets do not capture an ego with alters who do not
31


appear in the social network. Multiple cell phones per person and multiple people per cell
phones do not ensure that the telephone number is a proxy for an individual’s communication
patterns. Without figures on the provider’s market penetration rate is difficult to understand
friendships via calls to users who use a different provider. We also note a number of subjective
decisions in creating a meaningful sample, such as the minimum number of towers frequented in
order to be included in the dataset, the number of seed users, and number of friend “levels” to
draw from the networks. Future endeavors include accounting for evening, holiday and weekend
dynamics, where social life may be different than a traditional workday, and the use of this
method in larger and smaller cities. We hope to see more research on the integration of social
networks and urban areas in the future.
Acknowledgements
Thank you to Joe Hand for discussions and revisions. This research was partially supported by the
Army Research Office (grant no. W911NF-‐121-‐0097), John Templeton Foundation (grant no.
15705), the Bill and Melinda Gates Foundation (grant no. OPP1076282), the Rockefeller
Foundation, the James S. McDonnell Foundation (grant no. 220020195), the National Science
Foundation (grant no. 103522) a gift from the Bryan J. and June B. Zwan Foundation and the
University of Georgia.
List of Abbreviations
Call Data Records (CDRs)
Environmental Systems Research Institute (ESRI)
32


Geographic Information System(s) (GIS)
Linked Activity Spaces (LAS)
Points of Interest (POIs)
References
1. Ahas, R., Silm, S., Saluveer, E. and Järv, O. (2009) Modelling home and work locations of
populations using passive mobile positioning data. Location Based Services and
TeleCartography II Lecture Notes in Geoinformation and Cartography, Section III Ch. 18.
2. Albert, R., and Barabási, A. L. (2002) Statistical mechanics of complex networks. Reviews
of Modern Physics, 74(1), 47.
3. Andris, C. (2011) Methods and metrics for social distance. Doctoral Dissertation, MIT,
Dept. of Urban Studies and Planning.
4. Arentze,T., van den Berg, P. and Timmermans, H. (2012) Modeling social networks in
geographic space: approach and empirical application" Environment and Planning A,
44(5), 1101–1120.
5. Axhausen, K. W., Zimmermann, A., Schönfelder, S., Rindsfüser, G. and Haupt, T. (2002)
Observing the rhythms of daily life: A six-‐week travel diary. Transportation, 29(2), 95-‐124.
6. Axhausen, K. W. (2007) Activity spaces, biographies, social networks and their welfare
gains and externalities: Some hypotheses and empirical results. Mobilities, 2(1), 15-‐36.
7. Bastian, M., Heymann, S. and Jacomy, M. (2009) Gephi: an open source software for
exploring and manipulating networks. In: Proceedings of the International conference on
weblogs and social media (ICWSM) 17–20 May, San Jose, California.
33


8. Bettencourt, L. M. A., Lobo, J., Helbing, D., Kühnert, C. and West, G. B. (2007) Growth,
innovation, scaling, and the pace of life in cities. Proceedings of the National Academy of
Sciences, 104(17), 7301-‐7306.
9. Bettencourt, L. M. A. (2013) The Origins of Scaling in Cities. Science, 340(6139), 1438–
1441.
10. Blondel, V., Guillaume, J-‐L. Lambiotte, R. and Lefebvre, E. (2008) Fast unfolding of
communities in large networks. Journal of Statistical Mechanics: Theory and Experiment
(10), 1000.
11. Cacioppo, J. T. and Patrick, W. (2008) Loneliness: Human nature and the need for social
connection. New York: WW Norton & Company.
12. Carrasco, J.A., Miller, E. and Wellman, B. (2006) Spatial and Social Networks: The case of
travel for social activities. 11th International Conference on Travel Behavior Research. 16-‐
20 August, Kyoto, Japan.
13. Centola, D. and Macy, M. (2007) Complex contagions and the weakness of long ties1.
American Journal of Sociology, 113(3), 702-‐734.
14. Christakis, N. and Fowler, J. (2007) The spread of obesity in a large social network over 32
years. New England Journal of Medicine, 357(4), 370-‐379.
15. Cho, E. Myers, S. A. and Leskovec. J. (2011) Friendship and mobility: User movement in
location-‐based social networks. In Proceedings of the 17th ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining (KDD). 21 – 24 August, San Diego,
California. 1082-‐1090.
34


16. Crandall, D. J., Backstrom, L., Cosley, D., Suri, S., Huttenlocher, D. and Kleinberg, J. (2010)
Inferring social ties from geographic coincidences. Proceedings of the National Academy
of Sciences, 107(52), 22436-‐22441.
17. Cutter, S. L., Boruff, B. J., & Shirley, W. L. (2003). Social vulnerability to environmental
hazards*. Social science quarterly, 84(2), 242-‐261.
18. De Montis, A,, Chessa, A., Campagna, M., Caschili, S. and Deplano, G. (2010) Modeling
commuting systems through a complex network analysis: A study of the Italian islands of
Sardinia and Sicily. The Journal of Transport and Land Use, 2(3), 39-‐55.
19. Eagle, N., Pentland, A. S., and Lazer, D. (2009) Inferring friendship network structure by
using mobile phone data. Proceedings of the National Academy of Sciences, 106(36),
15274-‐15278.
20. Emch, M., Root, E. D., Giebultowicz, S., Ali, M., Perez-‐Heydrich, C. and Yunus, M. (2012)
Integration of spatial and social network analysis in disease transmission studies. Annals
of the Association of American Geographers, 102(5), 1004-‐1015.
21. Farber, S. and Li, X. (2013) Urban sprawl and social interaction potential: an empirical
analysis of large metropolitan regions in the United States. Journal of Transport
Geography (31) 267-‐277
22. Frei, A. and Axhausen, K.W. (2011) Modeling spatial embedded social networks, Working
paper 685: Transport and Spatial Planning, IVT, ETH Zurich, Zurich.
23. Gao, S., Wang, Y., Gao, Y. and Liu, Y. (2013) Understanding urban traffic-‐flow
characteristics: a rethinking of betweenness centrality. Environment and Planning B:
Planning and Design, 40(1), 135–153.
35


24. Gehl, J. (1987) Life between buildings: Using public space. New York: Van Nostrand
Reinhold. Trans: Jo Koch.
25. Getis, A. and Ord, J.K. (1992) The analysis of spatial association by use of distance
statistics. Geographical Analysis 24(3).
26. Gilchrist, A. (2009) The well-‐connected community: A networking approach to
community development, second edition. London: Policy Press.
27. Girardin, F., Calabrese, F., Fiore, F. D., Ratti, C., and Blat, J. (2008) Digital footprinting:
Uncovering tourists with user-‐generated content. IEEE Pervasive Computing, 7(4), 36-‐43.
28. Golledge, R. G. (1999) Human wayfinding and cognitive maps. In: Wayfinding behavior:
Cognitive mapping and other spatial processes, Ed. R. Golledge. Baltimore: Johns Hopkins
University Press. 5-‐45.
29. Granovetter, M. (1973) The strength of weak ties. American Journal of Sociology,
78(6),1360-‐1380.
30. Granovetter, M. (2005) The impact of social structure on economic outcomes. The
Journal of Economic Perspectives, 19(1), 33-‐50.
31. Jackson, M. O. (2010) Social and economic networks. Princeton University Press.
32. Kang, C., Ma, X., Tong, D., and Liu, Y. (2012) Intra-‐urban human mobility patterns: An
urban morphology perspective. Physica A: Statistical Mechanics and its Applications,
391(4), 1702–1717.
33. Latapy, M. (2008) Main-‐memory triangle computations for very large (sparse (power-‐law))
graphs. Theoretical Computer Science 407 (1-‐3) 458-‐473.
36


34. Mitchell, A. (2005) The ESRI guide to GIS analysis, Volume 2 . Redlands, California: ESRI
Press.
35. Montgomery, C. (2013) Happy city: Transforming our lives through urban design. New
York: Farrar, Straus and Giroux.
36. Newman, M. E., and Park, J. (2003) Why social networks are different from other types of
networks. Physical Review E, 68(3), 036122.
37. Onnela, J. P., Arbesman, S., González, M., Barabási, A. L., and Christakis, N. (2011)
Geographic constraints on social network groups. PLoS one, 6(4), e16939.
38. Papas, M. A., Alberg, A. J., Ewing, R., Helzlsouer, K. J., Gary, T. L. and Klassen, A. C. (2007)
The built environment and obesity. Epidemiologic reviews, 29(1), 129-‐143.
39. Radil, S.M. Flint, C. and Tita, G. (2010) Spatializing social networks: Using social network
analysis to investigate geographies of gang rivalry, territoriality, and violence in Los
Angeles. Annals of the American Association of Geographers, 100(2), 307-‐326.
40. Papachristos, A., Hureau, D. and Braga, A. (2013) The corner and the crew: The influence
of geography and social networks on gang violence. American Sociological Review 78(3)
417–447.
41. Ratti, C., Williams, S., Frenchman, D. and Pulselli, R. M. (2006) Mobile landscapes: using
location data from cell phones for urban analysis. Environment and Planning B: Planning
And Design, 33(5), 727-‐748.
42. Reades, J., Calabrese, F., Sevtsuk, A. and Ratti, C. (2007) Cellular census: Explorations in
urban data collection. IEEE Pervasive Computing, 6(3), 30-‐38.
37


43. Rosenbaum, M. S. (2006) Exploring the social supportive role of third places in
consumers' lives. Journal of Service Research, 9(1), 59-‐72.
44. Salganik, M. J. and Watts, D. J. (2008) Leading the herd astray: An experimental study of
self-‐fulfilling prophecies in an artificial cultural market. Social Psychology Quarterly, 71(4),
338-‐355.
45. Santaló, L. A. (2004) Integral geometry and geometric probability. Cambridge: Cambridge
University Press.
46. Scellato, S., Noulas, A., Lambiotte, R. and Mascolo, C. (2011) Socio-‐spatial properties of
online location-‐based social networks. In Proceedings of the International Conference on
Weblogs and Social Media (ICWSM), 329-‐336.
47. Schneider, C. M., Belik, V., Couronné, T., Smoreda, Z. and González, M. C. (2013)
Unravelling daily human mobility motifs. Journal of the Royal Society Interface, 10(84),
20130246.
48. Schönfelder, S. and Axhausen, K. W. (2003) Activity spaces: Measures of social exclusion?
Activity spaces: Measures of social exclusion? Arbeitsbericht Verkehrs-‐ und Raumplanung,
140, Institut für Verkehrsplanung und Transportsysteme (IVT), ETH Zürich, Zürich.
49. Sevtsuk, A. and Mekonnen, M. (2012) Urban network analysis: A new toolbox for ArcGIS.
Revue Internationale de Géomatique, 22(2), 287-‐305.
50. Sevtsuk, A., Huang, S., Calabrese, F., and Ratti, C. (2009). Mapping the MIT campus in real
time using WiFi. In: Handbook of Research on Urban Informatics: The Practice and
Promise of the Real-‐Time City. Ed. Marcus Foth Hershey, PA: IGI Global. 326-‐337.
38


51. Sevtsuk, A. and Ratti, C. (2005) iSpots. How Wireless Technology Is Changing Life on the
MIT Campus. Proceedings of CUPUM 05: The Ninth International Conference on
Computers in Urban Planning and Urban Management, 29 June -‐ 1 July, London.
52. Sun, Y., Fan, H., Helbich, M. and Zipf, A. (2013) Analyzing human activities through
volunteered geographic information: Using Flickr to analyze spatial and temporal pattern
of tourist accommodation. In: Progress in Location-‐Based Services, Springer Berlin
Heidelberg, 57-‐69.
39

15 07 026 PDF

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

15 07 026 PDF

Uploaded by

Copyright:

Available Formats

Linked Activity Spaces:

Embedding Social Networks in

SFI WORKING PAPER: 2015-07-026

of social capital in urban planning.

systems across the same city (Granovetter 1973).

The state of social-­‐spatial modeling

represent connections between agents. Alternatively, spatial (e.g. Geographic Information

made between these entities.

As a result, social/spatial phenomena are examined separately by those inclined towards

computational sociologists or geography who analyze a social network or a GIS model,

campus and how these factors correlate with friendship ties.

telecommunications, migration and commuting are common in computational urban research

spatially-­‐embedded social networks include topics of gang membership (Radil et al 2010,

local versus dispersed social ties in the city.

Linked activity spaces

those accessible via walking distance.

II Study area and dataset

phone calls, from a mobile phone provider in China.

Dataset and sampling

Social Number of Total

member’s inclination to participate in the conversation regardless of the initiator. In other

sales calls, as these do not represent persistent relationships.

Social network and geographic characteristics

degree (minimum degree = 1).

host many calls while adjacent towers host very few.

more fertile nearby tower.

she uses the nearby towers.

friends, or who is a “common friend” between poorly-­‐connected groups) can be identified

energetic environments (Bettencourt 2013).

congregate near the city center.

Table 2: Spearman correlation analysis results

Community and City Form

tight-­‐knit social circles but hosts heterogeneous groups of friends.

members around the city.

clusters (Figure 7A).

friends to not-­‐friend pairs using these points.

V Discussion, research challenges and conclusion

for new circles (or vice versa).

and visit others’ homes.

within the city.

networks and urban areas in the future.

University of Georgia.

List of Abbreviations

Call Data Records (CDRs)

Environmental Systems Research Institute (ESRI)

Linked Activity Spaces (LAS)

Points of Interest (POIs)

of Modern Physics, 74(1), 47.

Dept. of Urban Studies and Planning.

Sciences, 104(17), 7301-­‐7306.

connection. New York: WW Norton & Company.

20 August, Kyoto, Japan.

American Journal of Sociology, 113(3), 702-­‐734.

years. New England Journal of Medicine, 357(4), 370-­‐379.

of Sciences, 107(52), 22436-­‐22441.

hazards*. Social science quarterly, 84(2), 242-­‐261.

of the Association of American Geographers, 102(5), 1004-­‐1015.

Geography (31) 267-­‐277

characteristics: a rethinking of betweenness centrality. Environment and Planning B:

Planning and Design, 40(1), 135–153.

Reinhold. Trans: Jo Koch.

statistics. Geographical Analysis 24(3).

community development, second edition. London: Policy Press.

The state of social-‐spatial modeling

spatially-‐embedded social networks include topics of gang membership (Radil et al 2010,

friends, or who is a “common friend” between poorly-‐connected groups) can be identified

tight-‐knit social circles but hosts heterogeneous groups of friends.

friends to not-‐friend pairs using these points.

Sciences, 104(17), 7301-‐7306.

American Journal of Sociology, 113(3), 702-‐734.

years. New England Journal of Medicine, 357(4), 370-‐379.

of Sciences, 107(52), 22436-‐22441.

hazards*. Social science quarterly, 84(2), 242-‐261.

of the Association of American Geographers, 102(5), 1004-‐1015.

Geography (31) 267-‐277

University Press. 5-‐45.

Journal of Economic Perspectives, 19(1), 33-‐50.

graphs. Theoretical Computer Science 407 (1-‐3) 458-‐473.

The built environment and obesity. Epidemiologic reviews, 29(1), 129-‐143.

And Design, 33(5), 727-‐748.

urban data collection. IEEE Pervasive Computing, 6(3), 30-‐38.

consumers' lives. Journal of Service Research, 9(1), 59-‐72.

Weblogs and Social Media (ICWSM), 329-‐336.

Revue Internationale de Géomatique, 22(2), 287-‐305.

of tourist accommodation. In: Progress in Location-‐Based Services, Springer Berlin