Fallsem2019-20 Cse3021 Eth Vl2019201006662 Reference Material I 24-Jul-2019 Sin m1 Full
Networks (SIN)
Course Code : CSE3021
Slot: C2/G2
Instructor Details
O Dr. W.B. Vasantha
O Cabin Number: SJT 310 A19
O Phone: 9094230877
O Email: vasantha.wb@vit.ac.in
Module one –
Introduction
O Introduction to social network analysis
O Fundamental concepts in network analysis
O Social network data
O Notations for social network data
O Graphs and Matrices.
Module Two –
Measures & Metrics
O Strategic network formation
O network centrality measures:
O Degree, betweenness, closeness,
eigenvector
O Network centralization–density –
reciprocity – transitivity
O Ego network – measures for ego network
O Dyadic network – triadic network - cliques
- groups- clustering – search.
Module Three –
Community Networks
O Community structure
O Modularity,
O Overlapping communities
O Detecting communities in social networks
O Discovering communities: methodology,
applications –
O Community measurement
O Evaluating communities
O Applications
Module Four –
Models
O Small world network
O Watts–Strogatz networks
O Statistical Models for Social Networks
O Network evolution models: dynamical models,
growing models
O Nodal attribute model: exponential random
graph models
O Preferential attachment - Power Law
O Random network model: Erdos-Renyi and
Barabasi-Albert Epidemics
O Hybrid models of Network Formation
Module Five – Semantic Web
O Modelling and aggregating social network
data
O Developing social semantic application
O Evaluation of web-based social network
extraction
O Data Mining
O Text Mining in social network
O Tools
O Case study
Module Six – Visualization
O Visualization of social networks
O Novel visualizations and interactions for
social networks
O Applications of social network analysis
O Tools - Social Network Analysis
O R Tools for Social Network Analysis
O Social Networks Visualiser (SocNetV) -
Pajek.
Module Seven –
Security & Applications
O Managing Trust in online social network
O Security and Privacy in online social
network
O Security requirement for social network in
Web 2.0
O Say It with Colors: Language-Independent
Gender Classification on Twitter
O Friends and Circles - TUCAN: Twitter User
Centric ANalyzer.
Module Eight – Recent Trends
O Based on recent research
Ref Books
O Stanley Wasserman,
Katherine Faust, Social
network analysis:
Methods and
applications, Cambridge
university press, 2009.
O PDF is available online
Ref Books
O R. Zafarani, M. A. Abbasi,
and H. Liu, Social Media
Mining: An Introduction,
Cambridge University
Press, 2014.
O Free book and slides at
http://socialmediamining
.info/
Ref Books
O Peter Mika, Social
network and semantic
web, Springer 2007.
O PDF is available online
Course Split up
O Weightage is as follows:
O Quiz – 10 Marks (Before CAT I)
O DA1 – 10 marks (Visualisation Tools)
O DA2 – 10 marks (Recent Trends)
O CAT 1 – 15 marks
O CAT2 -15 marks
O FAT – 40 marks
O Additional Learning – Extra 10 marks
J-component – Project
O Maximum 5 Members
O Review 1 – Topic selection and Literature
Survey – Before CAT I
O Review 2 – 40% implementation of project
– Before CAT II
O Review 3 – Complete project + Viva – one
week before FAT
Relevance of Social Networks
Mark Zuckerberg
Chief Executive Officer of
Facebook
• Co-Founder of Facebook
• World’s 3rd richest person
• Age 34
Guess the Company from the
logo
• Cambridge Analytica Ltd was a British political
consulting firm which combined data mining, data
brokerage, and data analysis with strategic
communication during the electoral processes.
• Alongside social media giant Facebook, Cambridge
Analytica is at the center of an ongoing dispute over
the alleged harvesting and use of personal data.
• Trump's election victory and the Brexit vote.
• Social networks are very important in today's
world
Introduction to Social Network
Analysis
O Social network analysis is based on an
assumption of the importance of
relationships among interacting
units.
O The social network perspective
encompasses theories, models, and
applications that are expressed in
terms of relational concepts or
processes.
Important Terms
O Actors and their actions are viewed as
interdependent rather than independent,
autonomous units
O Relational ties (linkages) between actors are
channels for transfer or "flow" of resources (either
material or nonmaterial)
O Network models focusing on individuals view the
network structural environment as providing
opportunities for or constraints on individual action
O Network models conceptualize structure (social,
economic, political, and so forth) as lasting patterns
of relations among actors
O The unit of analysis in network analysis is not
the individual, but an entity consisting of a
collection of individuals and the linkages
among them.
O Network methods focus on dyads (two actors
and their ties), triads (three actors and their
ties), or larger systems (subgroups of
individuals, or entire networks).
O As defined in Wasserman, S. and K. Faust,
1994, Social Network Analysis. Cambridge:
Cambridge University Press.
O Social network analysis has emerged as a set
of methods for the analysis of social
structures, methods which are specifically
geared towards an investigation of the
relational aspects of these structures.
O The use of these methods, therefore,
depends on the availability of relational
rather than attribute data.
O Scott, J., 1992, Social Network
Analysis. Newbury Park CA: Sage.
O Network analysis (or social network
analysis) is a set of mathematical methods
used in social psychology, sociology,
ethology, and anthropology.
O Network analysis assumes that the way the
members of a group can communicate with
each other affects some important features
of that group (efficiency when performing a
task, moral satisfaction, leadership).
O Network analysis makes use of mathematical
tools and concepts that belong to graph theory.
O A network models a communication group. It
consists of a number of nodes (each node
corresponding to a member of the group) and a
number of edges (or ties), each one
associated with a communication connection
between two actors.
O Network data is stored in an adjacency matrix.
Commonly, the [i,j] element of the adjacency
matrix corresponds to the communication
behavior of actor i to actor j.
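A minimal sketch of how such an adjacency matrix could be built from a list of directed ties. The actor names and ties are hypothetical illustration data, and `adjacency_matrix` is a helper name introduced here, not part of any library.

```python
# Hypothetical actors and directed communication ties (illustrative data only).
actors = ["Ana", "Ben", "Cy"]
ties = [("Ana", "Ben"), ("Ben", "Ana"), ("Ana", "Cy")]


def adjacency_matrix(actors, ties):
    """Return X where X[i][j] = 1 if actor i sends a tie to actor j, else 0."""
    index = {name: position for position, name in enumerate(actors)}
    matrix = [[0] * len(actors) for _ in actors]
    for sender, receiver in ties:
        matrix[index[sender]][index[receiver]] = 1
    return matrix


X = adjacency_matrix(actors, ties)
for name, row in zip(actors, X):
    print(name, row)
```

Row i lists the ties actor i sends and column j lists the ties actor j receives, so a directed relation generally gives a non-symmetric matrix.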
O Social network analysis is focused on uncovering
the patterning of people's interaction.
O Network analysis is based on the intuitive notion
that these patterns are important features of the
lives of the individuals who display them.
O Network analysts believe that how an individual
lives depends in large part on how that individual is
tied into the larger web of social connections.
O Many believe, moreover, that the success or failure
of societies and organizations often depends on the
patterning of their internal structure.
O From the outset, the network approach to the study of
behavior has involved two commitments:
O (1) it is guided by formal theory organized in mathematical
terms,
O (2) it is grounded in the systematic analysis of empirical data.
O It was not until the 1970s, therefore--when modern discrete
combinatorics (particularly graph theory) experienced rapid
development and relatively powerful computers became
readily available--that the study of social networks really
began to take off as an interdisciplinary specialty.
O Since then its growth has been rapid. It has found important
applications in organizational behavior, inter-organizational
relations, the spread of contagious diseases, mental health,
social support, the diffusion of information and animal social
organization.
The Social Networks Perspective
O Social network analysis is a distinct research
perspective within the social and behavioral
sciences; distinct because social network analysis
is based on an assumption of the importance of
relationships among interacting units.
O The perspective encompasses theories, models, and
applications that are expressed in terms of relational
concepts or processes.
O That is, relations defined by linkages among units
are a fundamental component of network theories.
O Actors and their actions are viewed as
interdependent rather than independent,
autonomous units
O Relational ties (linkages) between actors
are channels for transfer or "flow" of
resources (either material or nonmaterial)
O Network models focusing on individuals
view the network structural environment as
providing opportunities for or constraints
on individual action
O Network models conceptualize structure
(social, economic, political, and so forth) as
lasting patterns of relations among actors
Fundamental concepts in
network analysis
O Actors : Social network analysis is concerned
with understanding the linkages among social
entities and the implications of these linkages.
O The social entities are referred to as actors .
O Actors are discrete individual, corporate, or
collective social units.
O Examples of actors are people in a group,
departments within a corporation, public
service agencies in a city, or nation-states in the
world system.
Relational Ties
O Actors are linked to one another by
social ties.
O The range and type of ties can be
quite extensive.
O The defining feature of a tie is that it
establishes a linkage between a pair
of actors.
Types of Relational ties
O Evaluation of one person by another (for example expressed
friendship, liking, or respect)
O Transfers of material resources (for example business transactions,
lending or borrowing things)
O Association or affiliation (for example jointly attending a social
event, or belonging to the same social club)
O Behavioural interaction (talking together, sending messages)
O Movement between places or statuses (migration, social or physical
mobility)
O Physical connection (a road, river, or bridge connecting two points)
O Formal relations (for example authority)
O Biological relationship (kinship or descent)
Dyad
O At the most basic level, a linkage or relationship establishes
a tie between two actors.
O The tie is inherently a property of the pair and therefore is
not thought of as pertaining simply to an individual actor.
O Many kinds of network analysis are concerned with
understanding ties among pairs.
O All of these approaches take the dyad as the unit of analysis.
O A dyad consists of a pair of actors and the (possible) tie(s)
between them.
O Dyadic analyses focus on the properties of pairwise
relationships, such as whether ties are reciprocated or not,
or whether specific types of multiple relationships tend to
occur together.
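As a rough illustration of dyadic analysis, the sketch below classifies every pair of actors in a directed, dichotomous relation as mutual (tie reciprocated), asymmetric (tie in one direction only), or null (no tie). The matrix is hypothetical data and `dyad_census` is a helper name introduced here.

```python
from itertools import combinations


def dyad_census(matrix):
    """Count mutual, asymmetric, and null dyads in a 0/1 adjacency matrix."""
    counts = {"mutual": 0, "asymmetric": 0, "null": 0}
    for i, j in combinations(range(len(matrix)), 2):
        sent = bool(matrix[i][j])
        returned = bool(matrix[j][i])
        if sent and returned:
            counts["mutual"] += 1
        elif sent or returned:
            counts["asymmetric"] += 1
        else:
            counts["null"] += 1
    return counts


# Hypothetical relation: 0 and 1 choose each other, 0 chooses 2, 2 chooses no one.
choices = [
    [0, 1, 1],
    [1, 0, 0],
    [0, 0, 0],
]
census = dyad_census(choices)
print(census)
```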
Triad
O Relationships among larger subsets of actors may also
be studied.
O Many important social network methods and models
focus on the triad; a subset of three actors and the
(possible) tie(s) among them.
O Balance theory has informed and motivated many
triadic analyses. Of particular interest are whether the
triad is transitive (if actor i "likes" actor j, and actor j in
turn "likes" actor k, then actor i will also "like" actor k),
and whether the triad is balanced (if actors i and j like
each other, then i and j should be similar in their
evaluation of a third actor, k, and if i and j dislike each
other, then they should differ in their evaluation of a
third actor, k).
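The two triad properties described above can be sketched as simple checks. This is an illustration under stated assumptions: "likes" is a directed 0/1 matrix, the signed relation is symmetric with +1 for "like" and -1 for "dislike", and balance is tested by the sign of the product of the three ties. The function names and data are hypothetical.

```python
from itertools import permutations


def is_transitive(likes, triad):
    """True if, for every ordering (i, j, k), i->j and j->k imply i->k."""
    for i, j, k in permutations(triad, 3):
        if likes[i][j] and likes[j][k] and not likes[i][k]:
            return False
    return True


def is_balanced(signs, i, j, k):
    """True if the product of the three signed ties in the triad is positive."""
    return signs[i][j] * signs[i][k] * signs[j][k] > 0


# Directed "likes": 0 likes 1, 1 likes 2, and 0 also likes 2.
likes = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
# Same ties without the 0 -> 2 tie, which breaks transitivity.
likes_broken = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]

# Signed ties: 0 and 1 dislike each other and differ in their view of 2.
signs = [[0, -1, 1], [-1, 0, -1], [1, -1, 0]]

print(is_transitive(likes, (0, 1, 2)))
print(is_transitive(likes_broken, (0, 1, 2)))
print(is_balanced(signs, 0, 1, 2))
```

Note that a triad with no two-step path at all counts as transitive under this check, since the condition is never violated.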
Subgroups
O Dyads are pairs of actors and
associated ties, triads are triples of
actors and associated ties.
O It follows that we can define a
subgroup of actors as any subset of
actors, and all ties among them.
O Locating and studying subgroups
using specific criteria has been an
important concern in social network
analysis.
Group
O Network analysis is not simply concerned
with collections of dyads, or triads, or
subgroups. To a large extent, the power of
network analysis lies in the ability to model
the relationships among systems of actors.
O A system consists of ties among members
of some (more or less bounded) group. The
notion of group has been given a wide range
of definitions by social scientists.
O A group is the collection of all actors on
which ties are to be measured.
Relation
O The collection of ties of a specific
kind among members of a group is
called a relation.
O For example, the set of friendships
among pairs of children in a
classroom, or the set of formal
diplomatic ties maintained by
pairs of nations in the world, are
ties that define relations.
Social Network
O Having defined actor, group, and
relation we can now give a more explicit
definition of social network.
O A social network consists of a finite set
or sets of actors and the relation or
relations defined on them.
O The presence of relational information
is a critical and defining feature of a
social network.
Social network data
O What are Network Data?
O Social network data measures at
least one structural variable in a
set of actors
O Concerns and theories focus on
identifying structural variables
and measurement techniques
Social network data
Structural and Composition
Variables
O Structural Variables – variables measured on
pairs of actors, cornerstone of social network
data sets (ex. transactions among corporations,
friendships between people, trade between
nations)
O Composition Variables – actor attribute
variables; measurements of actor attributes that
are of the standard social and behavioral science
variety, and defined at the level of the individual
(ex. gender, ethnicity for people, geographical
location)
Modes
Mode –
distinct set of entities on which
the structural variables are
measured:
one-mode (one set of actors),
two-mode (two sets of actors),
etc.
Affiliation Variables
O Affiliation variables – variables that are part of
affiliation networks
O Affiliation networks – special two-mode social
networks in which one mode is a set of actors and
the other is a set of events (ex. clubs or volunteer
organizations)
O Example: Considering a set of actors, and three elite
clubs in some city, we define an affiliation variable for
each of these three clubs. Each of these variables gives
us a subset of actors who belong to one of the clubs.
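The three-club example can be sketched as an actor-by-event affiliation matrix, one column (affiliation variable) per club. The actor names and memberships below are hypothetical, and the co-membership count is one common way to derive a one-mode relation from affiliation data, shown here as an assumption rather than as the method of the course text.

```python
# Hypothetical actors and club memberships (illustrative data only).
actors = ["Ana", "Ben", "Cy", "Dee"]
clubs = {
    "Club A": {"Ana", "Ben"},
    "Club B": {"Ben", "Cy"},
    "Club C": {"Ana", "Ben", "Dee"},
}

# Affiliation matrix: rows are actors, columns are clubs (1 = member).
club_names = list(clubs)
affiliation = [
    [1 if actor in clubs[club] else 0 for club in club_names]
    for actor in actors
]


def co_membership(affiliation):
    """Return the actor-by-actor matrix counting shared events."""
    n = len(affiliation)
    return [
        [sum(a * b for a, b in zip(affiliation[i], affiliation[j]))
         for j in range(n)]
        for i in range(n)
    ]


shared = co_membership(affiliation)
for name, row in zip(actors, shared):
    print(name, row)
```

The diagonal of `shared` gives the number of clubs each actor belongs to; off-diagonal entries give the number of clubs a pair has in common.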
Boundary Specification and
Sampling
O What is your population?
O Boundary – allowing a researcher to describe and
identify the populations of a study
O Defined based on frequency of interaction and
intensity of ties among members as contrasted with
non-members
O Approaches to boundary specification - while the
realist approach defines boundaries as actors in the
data set perceive themselves, the nominalist approach
defines boundaries through the theoretical research
concerns
O A set of actors consists of all social units on which
measurements are taken (either structural
variables, or structural and compositional
variables)
O Small populations have clearly defined actor set
boundaries (ex. classrooms, offices, social club,
and villages); large populations have less well-
defined boundaries (ex. interorganizational
networks in a community)
O Snowball sampling and random nets – special
sampling techniques when the boundary is
unknown
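A minimal sketch of snowball sampling, assuming each sampled actor can be asked to name the actors they are tied to. The nomination data and the function name `snowball_sample` are hypothetical; a real study would collect nominations wave by wave from respondents.

```python
def snowball_sample(nominations, seeds, waves):
    """Start from seed actors and add the actors they name, wave by wave."""
    sampled = set(seeds)
    frontier = set(seeds)
    for _ in range(waves):
        next_frontier = set()
        for actor in frontier:
            for alter in nominations.get(actor, ()):
                if alter not in sampled:
                    sampled.add(alter)
                    next_frontier.add(alter)
        frontier = next_frontier
    return sampled


# Hypothetical nominations: who each actor names when asked.
nominations = {
    "Ana": ["Ben", "Cy"],
    "Ben": ["Dee"],
    "Cy": [],
    "Dee": ["Eve"],
    "Eve": [],
}

print(sorted(snowball_sample(nominations, ["Ana"], waves=1)))
print(sorted(snowball_sample(nominations, ["Ana"], waves=2)))
```

The boundary of the actor set is then whatever the chosen number of waves reaches, which is why the choice of seeds and waves matters.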
Types of Networks
O Number of modes refers to the
number of distinct kinds of
social entities in the network
O Networks categorized by how
many modes the network has,
and by whether affiliational
variables are present
One-Mode Networks
Actors
Actors may be people, subgroups, organizations,
collectives/aggregates (communities, nation-
states)
Subgroups usually consist of people
Collectives/aggregates usually consist of
organizations and subgroups
O Actor attributes.
O The characteristics of the actors constituting
the network can be measured
Relations
O Relations are usually viewed as representing specific
connections, or “relational contents”
O The kinds of relations may be:
O Individual evaluations: friendship, liking, respect, etc
O Transactions or transfer of material resources:
lending/borrowing, buying/selling
O Transfer of non-material resources: communications,
sending/receiving information
O Interactions
O Movement: physical (from place-to-place), social
(between occupations or statuses)
O Formal roles
O Kinship: marriage, descent
Two-Mode Networks
O Two Sets of Actors
O Actors can be of the same general types as those
in one-mode networks
O Relations are measured in at least one way
between actors in the two sets
O Ex. Collection of corporate headquarters
and non-profit groups in a city,
measuring flow of donations from
corporations to non-profit groups
(unidirectional flow)
Affiliation Networks
O One Set of Actors – One Set of Events
O Affiliation Networks (or Membership
Networks), arise when one set of actors
is measured with respect to attendance
at, or affiliation with, a set of events or
activities; the first mode in an affiliation
network is a set of actors, and the
second is a set of events which affiliates
the actors
O Actor types can be exactly the same as those in
one-mode and two-mode networks
O Actors must be affiliated with at least one event
O Events are defined on the basis of membership,
attendance, or socializing in a group, etc.
O The nature of the events depends on the types of actors
involved
O Attributes. Actor attribute variables are of the same
types as those for one-mode and two-mode
networks
O Two sets of attribute variables can be found: one for
actors and one for events
Special Dyadic Networks
O Special Dyadic – non-network
relational data sampled from a larger
population centering on the interaction
between pairs; ex. husband-wife, father-
son
O An actor may relate to a limited number
of “special” other actors; this design can
constrain interactions among actors so
that all people cannot interact with all
others
Ego-centered Network
O Ego-centered Network consists of a
focal actor (ego), a set of alters with
ties to the ego, and measurements on
the ties
O Ex. Each respondent reports on a set
of alters to whom they are tied, and
on the ties among these alters
(Personal network data)
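As an illustration, an ego-centered network can be pulled out of a larger list of ties by keeping the ego, its alters, and the ties among those alters. The sketch assumes nondirectional ties with no self-ties; the names and the helper `ego_network` are hypothetical.

```python
def ego_network(ties, ego):
    """Return (alters, ties from ego to alters, ties among the alters)."""
    alters = {b if a == ego else a for a, b in ties if ego in (a, b)}
    ego_ties = [(a, b) for a, b in ties if ego in (a, b)]
    alter_ties = [(a, b) for a, b in ties if a in alters and b in alters]
    return alters, ego_ties, alter_ties


# Hypothetical nondirectional ties among four people.
ties = [("Ana", "Ben"), ("Ana", "Cy"), ("Ben", "Cy"), ("Cy", "Dee")]

alters, ego_ties, alter_ties = ego_network(ties, "Ana")
print(sorted(alters))
print(ego_ties)
print(alter_ties)
```

Dee is tied to Cy but not to the ego, so Dee falls outside this ego-centered network.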
Network Data, Measurement,
and Collection
O Measurement
O Social network data differ from standard social and
behavioral science data in that they consist of at least one
relation measured among a set of actors
O The presence of relations has implications for many
aspects of measurement, such as the unit of observation (actor,
pair of actors, relational tie, or event), the modeling unit
(the actor, dyad, triad, subset of actors, or network), and
the quantification of relations (directional vs.
nondirectional)
O Modeling unit – level of network analysis being studied
O Unit of observation is the entity on which
measurements are taken and an actor from
whom information about ties is elicited.
O Modeling Unit
O Levels at which network data can be
modeled or summarized are Actor, Dyad,
Triad, Subgroup, Set of actors or network
Relational quantification
O Relational quantification refers to how a relation is
measured: whether it is directional or
nondirectional, and whether it is dichotomous or
valued
O Directional – relational tie has an origin and
destination
O Nondirectional – relation has no direction
O Dichotomous – relation is coded as either present or
absent
O Valued – relation has values such as strength,
intensity, or frequency
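The directional/nondirectional and dichotomous/valued distinctions can be sketched on a small matrix. The values below are hypothetical tie strengths, and the threshold used to dichotomize is an assumption made for illustration; the helper names are introduced here.

```python
def dichotomize(valued, threshold=1):
    """Code a valued relation as present (1) or absent (0)."""
    return [[1 if value >= threshold else 0 for value in row] for row in valued]


def is_nondirectional(matrix):
    """True if every tie i->j equals the tie j->i (symmetric matrix)."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))


# Hypothetical valued, directional relation (e.g. messages sent per week).
valued = [
    [0, 3, 0],
    [1, 0, 2],
    [0, 2, 0],
]

print(is_nondirectional(valued))
print(dichotomize(valued, threshold=2))
print(is_nondirectional(dichotomize(valued, threshold=1)))
```

Dichotomizing discards strength information, so a relation that is directional when valued can look nondirectional once coded as present or absent.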
Collection
O Social network data can be collected through the
following techniques: Questionnaires, Interviews,
Observations, Archival records, Experiments, and Other
techniques such as ego-centered, small world, and
diaries/journals
O Questionnaire
O Most commonly used collection method
O Usually contains questions used to identify relations
between actors
O Three types of question formats: Roster vs. free recall,
Free vs. fixed choice, and Ratings vs. complete rankings
O Roster vs. Free Recall – whether respondents
should be presented with a complete list of actors
(roster) or be allowed to generate the list of names
themselves (free recall)
O Roster can only be used when the researcher knows all
members of the set prior to data gathering (Ex. Friends
in a class)
O Free recall (a format where respondents generate the
lists of names) can be used when the full membership of
the set is not known in advance (Ex. Actors are asked to
name other actors without being presented with a
roster, in studies such as Friends in two junior high
schools or community elites)
O Free vs. Fixed Choice
O Concerns how many nominations respondents can provide
O Free choice – actors are given no constraint on the number of nominations
O Fixed choice – actors are limited to a fixed number of nominations
O Ratings vs. Complete Ranking
O Used to reflect intensity or strength of ties
O Ratings require respondents to assign a value or rating
to each tie
O Complete rankings require respondents to rank their
ties to all other actors
O Full rank-orders and rating scales with multiple
responses generate valued relations; dichotomous and
directional
O Interview
O Used to gather network data when questionnaires are not
feasible (Ex. CEO interviews in Minneapolis/St. Paul)
O Observation
O Used to gather network data in field research, usually
relations among relatively small groups of people who are
engaged in face-to-face interactions
O Useful with people who are not able to respond to
questionnaires or interviews
O Useful for collecting affiliation data for attendance at events
O Archival records
O Measures ties through examining measurements from past
recorded interactions
O Ex. Patterns of citations among scholars examining “who
cites whom” to study diffusion of a scientific innovation
Other
O Cognitive social structure is a design where
respondents are asked about perceptions of
network ties and perceived relations are
measured (Ex. Fast food restaurant
perceptions)
O This design can collect more information
than general sociometric designs as the
respondent reports not only on their own
ties, but ties belonging to other actors
Experimental
O Method 1 – select a group of actors, observe
their interactions in an experimentally
controlled situation, then record interactions
between pairs of actors
O Method 2 – select a group of actors, specify
which actors can interact with each other
during the experiment, then record
interactions between only those specified
pairs of actors (Ex. Group problem-solving
experiments)
O Ego-centered – respondent is set up as ego with data
measured among the ties from the ego to the alters (Ex.
Survey about the people with whom you discussed matters
important to you)
O Small world is an attempt to determine how many actors a
respondent is removed from a target individual based on
acquaintanceship
O Can be used to compare demographic characteristics and
chains
O Reverse small world focuses on ties from a specific
respondent to a variety of hypothetical targets
O Diary – respondents are asked to keep a daily record of
those with whom they interact
O A variant of the ego-centered design
O Data sets include information on the relation type and
characteristics of the alters
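For the small world design mentioned in the list above, the number of steps separating a respondent from a target can be sketched as a breadth-first search over acquaintance ties. This computes the shortest possible chain, which is an assumption made for illustration: chains actually produced by respondents may be longer. The data and the function name `chain_length` are hypothetical.

```python
from collections import deque


def chain_length(acquaintances, source, target):
    """Steps in the shortest acquaintance chain from source to target, or None."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        actor, steps = queue.popleft()
        for other in acquaintances.get(actor, ()):
            if other == target:
                return steps + 1
            if other not in seen:
                seen.add(other)
                queue.append((other, steps + 1))
    return None  # no chain connects the two actors


# Hypothetical acquaintanceship ties, listed in both directions.
acquaintances = {
    "Ana": ["Ben"],
    "Ben": ["Ana", "Cy"],
    "Cy": ["Ben", "Dee"],
    "Dee": ["Cy"],
    "Eve": [],
}

print(chain_length(acquaintances, "Ana", "Dee"))
print(chain_length(acquaintances, "Ana", "Eve"))
```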
Longitudinal Data Collection
O Focuses on how ties in a network
change over time and how well the
past can predict the future using
methods previously discussed
(questionnaires, interview,
observation, etc.)
O Commonly used to examine
friendships over time
O Ex. Interaction among fraternity
members over time
Measurement Validity,
Reliability, Accuracy, Error
O True structure – social structure referring to
a relatively prolonged and stable pattern of
interpersonal relations
O Observed structure – measured network
data that might contain error
Accuracy
O Issue of informant accuracy – the discrepancy between
information collected using verbal reports and information
collected through observation
O People are not good at accurately reporting on their
interactions in particular situations
O “True” structures are of most interest, and
network studies should focus on long-term
patterns, not particular interactions of individuals
O Issue comes up when looking at interactions among
organizations being reported on by members with
imperfect information about the organization
Validity
O A measure of a concept is valid to the extent
that it measures what it is intended to
measure
O Construct validity, a more formal notion of validity,
arises when measures of a concept behave
as expected in theoretical predictions and
can be studied by examining how these
measures behave in a range of theoretical
propositions
Reliability
O A measure of a variable or concept is reliable if
repeated measurements give the same estimates
of the variable
O Three approaches have been used to assess the
reliability of social network data: test-retest
comparison, comparison of alternative question
formats, and reciprocity of sociometric choices
O “True” value of a variable must be assumed to
not change over time for test-retest comparison
to be appropriate
O Reliability can be assessed at different levels of analysis
O Sociometric questions using ratings or full rank
orders are more reliable than fixed choice
designs
O Sociometric questions about more intense or
intimate relations have higher rates of
reciprocation than sociometric questions about
less intense or intimate relations (Marsden, et
al.)
O Reliability of aggregate measures (ex.
popularity) is higher than the reliability of
“choices” made by individual actors (Burt, et
al.): the two are different levels of analysis
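Two of the reliability approaches listed above, test-retest comparison and reciprocity of sociometric choices, can be sketched as simple proportions. These formulas are illustrative assumptions, not the specific measures used in the studies cited, and the two data waves are hypothetical.

```python
def retest_agreement(wave1, wave2):
    """Proportion of ordered pairs coded the same way in both waves."""
    n = len(wave1)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    same = sum(1 for i, j in pairs if wave1[i][j] == wave2[i][j])
    return same / len(pairs)


def reciprocity_rate(matrix):
    """Proportion of sent choices that are returned, or None if no choices."""
    n = len(matrix)
    sent = [(i, j) for i in range(n) for j in range(n) if i != j and matrix[i][j]]
    if not sent:
        return None
    return sum(1 for i, j in sent if matrix[j][i]) / len(sent)


# Hypothetical sociometric choices collected at two points in time.
wave1 = [[0, 1, 1], [1, 0, 0], [0, 0, 0]]
wave2 = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]

print(retest_agreement(wave1, wave2))
print(reciprocity_rate(wave1))
print(reciprocity_rate(wave2))
```

The test-retest figure is only meaningful if the "true" relation is assumed not to have changed between the two waves.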
Measurement error
O Occurs when there is a discrepancy
between the “true” score or value of a
concept and the observed (measured)
value of that concept
O Measurement error – the difference
between the true and observed values
O Levels of analysis must be kept in mind
when determining the implications of
measurement error
Notations for social network data