Topic 2

Big Data Analytics

MS4252
Social Network Analysis II

Source: Jennifer Golbeck (2013). Analyzing the Social Web. Elsevier (Chapters 4, 5, 6).

1
Chapter 4

NETWORK VISUALIZATION

2
Information Visualization
 Humans are wired to find patterns visually.
 We have a natural ability to see anomalies, patterns,
clusters, and changes.
 We recognize many of these things without consciously
looking for them.

3
Information Visualization
 Patterns that are difficult to see in lists of numbers,
adjacency lists, or other textual representations of data
can often be recognized when the data is shown visually.
 Information visualization deals with the presentation of
data in visual format.
 The data may be numeric, categorical, network data
(like social networks), text, and other types.
 Good information visualization supports users in better
understanding the data they are seeing.

4
Information Visualization
 Take advantage of humans’ natural abilities to see
 Patterns
 Anomalies
 Relationships
 Trends
 Clusters
 Get an overview of complex data, then explore further from the
visualization.
 Visualizations are a qualitative way to begin
understanding the data.
 From there, quantitative experiments or analysis can follow to
explain any insights.

5
Graph Layout
 A network is made up of nodes and edges.
 How the network is laid out is critical to what an observer is
able to understand about it.
 There are many types of layout algorithms that position
the nodes and edges in different ways for network
visualization:
 Random layout
 Circular layout
 Grid layout
 Force-directed layout
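
The first three layouts in the list can be sketched in a few lines each. This is a minimal illustration using only the Python standard library; the function names and the five-node example are assumptions made for the sketch, not taken from the course tools. Note that none of these three layouts looks at the edges at all.

```python
import math
import random

def random_layout(nodes, seed=0):
    """Place every node at a uniformly random point in the unit square."""
    rng = random.Random(seed)
    return {n: (rng.random(), rng.random()) for n in nodes}

def circular_layout(nodes, radius=1.0):
    """Space the nodes evenly around a circle centered at the origin."""
    nodes = list(nodes)
    step = 2 * math.pi / len(nodes)
    return {n: (radius * math.cos(i * step), radius * math.sin(i * step))
            for i, n in enumerate(nodes)}

def grid_layout(nodes):
    """Fill a roughly square grid row by row; returns (column, row)."""
    nodes = list(nodes)
    cols = math.ceil(math.sqrt(len(nodes)))
    return {n: (i % cols, i // cols) for i, n in enumerate(nodes)}

# Hypothetical node labels, for illustration only.
nodes = ["a", "b", "c", "d", "e"]
print(random_layout(nodes))
print(circular_layout(nodes))
print(grid_layout(nodes))
```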

6
What makes a good visualization?
Criteria from Dunne and Shneiderman (2009):

 Every node is visible.
 For every node you can count its degree.
 For every link you can follow it from source to
destination.
 Clusters and outliers are identifiable.

7
Random layout

8
Circular layout

9
Grid layout

10
Force-directed layout
The layout is dynamic and determined by the connections
between the nodes.
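
One way to see how connections determine positions is a small spring-style simulation: every pair of nodes pushes apart, and every edge pulls its two endpoints together. The sketch below is an assumption-laden illustration, not the algorithm of any particular tool; the constants (ideal distance `k`, starting `temperature`, cooling factor) are arbitrary choices for the example.

```python
import math
import random

def force_directed_layout(nodes, edges, iterations=200, seed=0):
    """Spring-style layout: all node pairs repel, connected pairs attract."""
    nodes = list(nodes)
    rng = random.Random(seed)
    pos = {n: [rng.random(), rng.random()] for n in nodes}
    k = 1.0 / math.sqrt(len(nodes))  # assumed ideal distance between nodes
    temperature = 0.1                # maximum move per step, cooled over time

    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                push = k * k / dist
                disp[u][0] += dx / dist * push
                disp[u][1] += dy / dist * push
                disp[v][0] -= dx / dist * push
                disp[v][1] -= dy / dist * push
        # Attraction along every edge.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            dist = max(math.hypot(dx, dy), 1e-9)
            pull = dist * dist / k
            disp[u][0] -= dx / dist * pull
            disp[u][1] -= dy / dist * pull
            disp[v][0] += dx / dist * pull
            disp[v][1] += dy / dist * pull
        # Move each node along its net force, capped by the temperature.
        for n in nodes:
            length = max(math.hypot(disp[n][0], disp[n][1]), 1e-9)
            step = min(length, temperature)
            pos[n][0] += disp[n][0] / length * step
            pos[n][1] += disp[n][1] / length * step
        temperature *= 0.98
    return {n: (p[0], p[1]) for n, p in pos.items()}

# Hypothetical example: a triangle a-b-c with a pendant node d.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
layout = force_directed_layout(["a", "b", "c", "d"], edges)
print(layout)
```

Because the starting positions are random, a different `seed` gives a different (but structurally similar) picture.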

11
Visualizing network features
 Labels (node labels and edge labels; it is hard to show all the
labels even for a small network)

 Size, Shape, and color

 Larger graph properties

12
Size, Shape, and color
 Showing other attributes of nodes and edges in graphs
can be easier than showing labels.
 Categorical or quantitative attributes are
particularly easy to show by adjustments in size,
shape, or color.
 There are many statistics about the nodes in a
network:
 degree, centrality, and so on.
 These can be encoded using color, size, or both.

13
Node size and color
 A graph indicating degree with node color and clustering
coefficient with node size
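
The encoding in this figure can be sketched as a mapping from node statistics to drawing attributes. The example below assumes an undirected graph with at least one edge, stored as a dictionary of neighbor sets; the names `node_styles`, `shade`, and `size`, and the size range, are hypothetical choices for illustration.

```python
def clustering_coefficient(adj, node):
    """Fraction of pairs of a node's neighbors that are themselves linked."""
    neighbors = sorted(adj[node])
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for i, u in enumerate(neighbors)
                for v in neighbors[i + 1:] if v in adj[u])
    return 2 * links / (k * (k - 1))

def node_styles(adj, min_size=10.0, max_size=40.0):
    """Degree drives the color shade (0..1); clustering drives the size."""
    max_degree = max(len(nbrs) for nbrs in adj.values())
    styles = {}
    for node, nbrs in adj.items():
        cc = clustering_coefficient(adj, node)
        styles[node] = {
            "shade": len(nbrs) / max_degree,
            "size": min_size + (max_size - min_size) * cc,
        }
    return styles

# Hypothetical graph: triangle a-b-c plus a pendant node d attached to c.
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
styles = node_styles(adj)
for node, style in styles.items():
    print(node, style)
```

Here node c gets the darkest shade (highest degree), while a and b are drawn largest because all of their neighbors are connected to each other.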

14
Edge weight
 Indicate the strength of a relationship, the
frequency of communication, or other factors.

15
Large graph properties (clusters)
 Example: YouTube videos, where nodes represent videos
and edges connect videos that share a common tag.
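
A network of this kind can be built directly from a table of tags. The sketch below uses made-up video IDs and tags (they are not real data) and connects two videos whenever their tag sets overlap.

```python
from itertools import combinations

# Hypothetical data: video ID -> set of tags.
video_tags = {
    "v1": {"cats", "funny"},
    "v2": {"cats"},
    "v3": {"music"},
    "v4": {"music", "live"},
    "v5": {"funny"},
}

def shared_tag_edges(video_tags):
    """Connect every pair of videos that has at least one tag in common."""
    edges = set()
    for u, v in combinations(sorted(video_tags), 2):
        if video_tags[u] & video_tags[v]:
            edges.add((u, v))
    return edges

print(sorted(shared_tag_edges(video_tags)))
```

In this toy data the videos fall into two clusters: v1, v2, v5 on one side and v3, v4 on the other.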

16
Scale Issues
 Networks with too many nodes (about 10,000 or more) or edges are
almost impossible to visualize.

 A dense network may not reveal patterns.

 Filtering for visualization is crucial!

17
Example: Senate Voting Records
 Density can be a problem even when the number of nodes is
small.
 The example shows a network of members of the U.S. Senate,
based on how the senators voted.
 There are only 100 nodes but over 4,100 edges.
 An edge indicates that two senators voted the same way
at least 40% of the time.
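
To see how dense this network is, compare the edge count with the maximum possible number of edges. The sketch assumes an undirected network without self-loops, so the maximum is n(n-1)/2; the function name is an illustrative choice.

```python
def density(num_nodes, num_edges):
    """Share of all possible undirected edges that are actually present."""
    max_edges = num_nodes * (num_nodes - 1) / 2
    return num_edges / max_edges

# 100 senators allow at most 4,950 pairs; 4,100 of them are connected.
print(round(density(100, 4100), 3))  # 0.828
```

With more than 4,100 edges, over 80% of all possible pairs are linked, which is why the drawing shows little structure.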

18
Filtering for visual patterns
 One way to compensate for density is to filter the
network when possible.
 Filter out edges based on their weight.
 E.g., keep an edge only if the two senators voted
together on at least two-thirds of the bills.
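
Weight-based filtering amounts to a threshold on the edge weights. In this sketch the senators' names and agreement rates are invented placeholders; only the two-thirds threshold comes from the slide.

```python
def filter_edges(weighted_edges, threshold):
    """Keep only the edges whose weight is at least the threshold."""
    return {pair: w for pair, w in weighted_edges.items() if w >= threshold}

# Hypothetical agreement rates: share of bills on which two senators voted alike.
agreement = {
    ("A", "B"): 0.92,
    ("A", "C"): 0.45,
    ("B", "C"): 0.70,
    ("C", "D"): 0.41,
}

strong = filter_edges(agreement, 2 / 3)
print(strong)
```

All four edges pass the 40% threshold, but only two survive the two-thirds threshold, so the filtered picture is much sparser.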

19
Visualization Tools
 Gephi

 SAS Enterprise Miner (link analysis)

 R – we will use R!

 Python

20
Chapter 5

TIE STRENGTH

21
Tie Strength
 Social relationships are complicated.
 The type of relationship people have will draw on many
things like their history and similarity, each person’s
personal background and preferences, environmental
factors, and more.
 Relationships are also multifaceted, and many
relationship types can be used in social network
analysis.
 One of the most useful is the idea of tie strength.

22
Tie Strength
 A measure of the strength of a relationship between
people (Mark Granovetter, 1973).
 “The strength of a tie is a combination of the
amount of time, the emotional intensity, the
intimacy (mutual confiding), and the reciprocal
services which characterize the tie.”
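
The definition names four ingredients but not how to combine them. As a purely illustrative assumption, the sketch below scores each ingredient between 0 and 1 and takes an equally weighted sum; the field names, the 0..1 scale, and the weights are all hypothetical and would need to be chosen for a real data set.

```python
from dataclasses import dataclass

@dataclass
class Tie:
    # Each factor is assumed to be pre-scaled to the range 0..1.
    time: float        # amount of time spent together
    intensity: float   # emotional intensity
    intimacy: float    # mutual confiding
    services: float    # reciprocal services

def tie_strength(tie, weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of the four factors (equal weights are an assumption)."""
    factors = (tie.time, tie.intensity, tie.intimacy, tie.services)
    return sum(w * f for w, f in zip(weights, factors))

close_friend = Tie(time=0.9, intensity=0.8, intimacy=0.9, services=0.7)
classmate = Tie(time=0.2, intensity=0.1, intimacy=0.0, services=0.3)
print(tie_strength(close_friend), tie_strength(classmate))
```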

23
Tie Strength
 Two main types:
 Strong ties are rare and trusted, and are usually family
members or very close friends.
 Usually people a person sees frequently, with whom one
shares personal details of one’s life, and for whom the
person will do and expect favours.
 Weak ties are much more common and include
acquaintances and more casual friendships.
 E.g., co-workers, or people you know from a class but
don’t spend a lot of time with.
 Tie strength is a spectrum: any relationship
may fall anywhere along the scale from weak to strong.

24
The Strength of Weak Ties
 Tie strength is a very important factor to consider
in social network analysis.
 Consider the flow of information through a
network
 Weak ties often connect to diverse groups of people
with different perspectives
 These ties allow information to move throughout the
network
 E.g. A disease.
 Someone is more likely to catch a cold from a weak tie
 But because of the high level of close contact, it will
likely spread quickly to one’s strongest connections.

25
The Strength of Weak Ties
 This is NOT to say that tie strength is the only factor
influencing trust, reliability, and closeness in
social networks.
 Weak ties may provide highly trusted
information.
 E.g., a physician may be more trusted about medical
information than someone’s family members.
 The authority of the physician outweighs tie strength.
26
Replicating Milgram’s “six degrees”
 Booklets were passed from the original participants toward a
target person unknown to them.
 Lin et al. (1978) showed that successful chains
made heavy use of weak ties.

27
The benefit of weak ties
 Weak ties connect people to different social circles,
exposing them to more information.
 A person has more weak ties than strong ties.

28
Tie strength and network
 To analyze tie strength in social network analysis,
the network must include relationship
information.

29
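Network Structure – Forbidden Triad
 Suppose A has strong ties to both B (Bob) and C (Chuck).
What does that tell us about the relationship between
Bob and Chuck?
 We cannot draw any absolute conclusions.
 But we expect that some sort of tie exists between B and C,
either strong or weak.
 The forbidden triad is the case where A has strong ties to B
and C but there is no tie at all between B and C.
 Counterexample: A is married to B and having
an affair with C.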
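
Forbidden triads can be searched for mechanically: look for a node with two strong ties whose other endpoints share no tie at all. The sketch below assumes ties are stored as unordered pairs labeled "strong" or "weak"; this representation and the function name are illustrative choices.

```python
from itertools import combinations

def forbidden_triads(ties):
    """ties maps frozenset({u, v}) -> "strong" or "weak".

    Returns (center, b, c) triples where center has strong ties to b and c
    but b and c have no tie of any strength.
    """
    strong_neighbors = {}
    for pair, strength in ties.items():
        if strength == "strong":
            u, v = tuple(pair)
            strong_neighbors.setdefault(u, set()).add(v)
            strong_neighbors.setdefault(v, set()).add(u)
    found = []
    for center, neighbors in strong_neighbors.items():
        for b, c in combinations(sorted(neighbors), 2):
            if frozenset((b, c)) not in ties:
                found.append((center, b, c))
    return sorted(found)

ties = {frozenset(("A", "B")): "strong", frozenset(("A", "C")): "strong"}
print(forbidden_triads(ties))   # A-B and A-C are strong, B-C is missing

ties[frozenset(("B", "C"))] = "weak"
print(forbidden_triads(ties))   # even a weak B-C tie resolves the triad
```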

30
Network Structure – Bridge
 If the bridge between P and F were a strong tie, many
forbidden triads could be found, e.g., PFO, PFH,
and PFN.
 If ties were added to resolve those triads, the edge
between P and F would no longer be a bridge.
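
A bridge is an edge whose removal leaves its two endpoints with no other path between them. The sketch below tests each edge that way with a breadth-first search. Apart from P and F, the node labels are invented for the example and do not reproduce the slide's figure.

```python
from collections import deque

def find_bridges(edges):
    """Return the edges whose removal disconnects their two endpoints."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def reachable(start, goal, skipped):
        # Breadth-first search that refuses to cross the skipped edge.
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if node == goal:
                return True
            for nxt in adj[node]:
                if {node, nxt} == skipped or nxt in seen:
                    continue
                seen.add(nxt)
                queue.append(nxt)
        return False

    return [(u, v) for u, v in edges if not reachable(u, v, {u, v})]

# Hypothetical network: two triangles joined only by the edge P-F.
edges = [("P", "P1"), ("P1", "P2"), ("P2", "P"),
         ("F", "F1"), ("F1", "F2"), ("F2", "F"),
         ("P", "F")]
print(find_bridges(edges))                   # [('P', 'F')]
print(find_bridges(edges + [("P1", "F")]))   # []
```

Adding a single tie from P's side to F gives a second route between the groups, so P-F stops being a bridge, which mirrors the argument on the slide.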

31
Tie Strength and Propagation
 Tie strength
 Strong tie – more trusted
 Weak tie – wider spread
 Network propagation
 A phenomenon where things spread through a network
 Diseases spreading through a social network,
 Computer viruses on the Internet, or
 Rumors and fads through a social network
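
Propagation can be explored with a small simulation in which each newly affected node gets one chance to pass the item to each neighbor. The sketch below is one simple model chosen for illustration; the per-tie transmission probabilities (0.8 for strong, 0.2 for weak) are made-up values, not estimates from the source.

```python
import random
from collections import deque

def spread(adj, seeds, rng):
    """adj maps node -> list of (neighbor, transmission probability)."""
    reached = set(seeds)
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        for neighbor, prob in adj.get(node, []):
            # Each tie gets one chance to transmit.
            if neighbor not in reached and rng.random() < prob:
                reached.add(neighbor)
                queue.append(neighbor)
    return reached

STRONG, WEAK = 0.8, 0.2  # hypothetical transmission probabilities

adj = {
    "ann": [("bob", STRONG), ("cat", WEAK)],
    "bob": [("ann", STRONG)],
    "cat": [("ann", WEAK), ("dan", STRONG)],
    "dan": [("cat", STRONG)],
}
print(spread(adj, ["ann"], random.Random(1)))
```

In this toy network the only route from ann's group to dan runs through the weak tie ann-cat, so whether the item reaches the second group depends on that one weak link.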

32
Chapter 6

TRUST

33
Definition – trust
 Trust is a relationship with which we are all
familiar, but which we rarely define or describe.
 Loan money
 We expect the person will pay us back
 Ask for a movie recommendation
 The person's recommendation will match our taste and the
movie or restaurant or hotel will be good
 Tell a secret
 The person will keep a secret, not tell others, and not
judge us for it
 Ask for a recommendation or reference
 The recommendation will be positive and help us get the
position we are applying to

34
General definition
 Trust is putting oneself in a vulnerable position
based on the belief that another person will act
with our best interest in mind.
 Definition: “A person trusts another if she is
willing to take a risk based on her expectation
that the trusted person’s actions will lead to a
positive outcome.”

35
Development of trust
 Calculation-based trust:
 A rational decision about whether to trust someone,
where the costs and benefits of trusting are factored in.
 Personality-based trust:
 A person's propensity to trust, developed over the
course of their life.
 Cognition-based trust:
 The instant rapport and trust that can develop between
people who share similar backgrounds, beliefs, and
values
 Institution-based trust:
 How trust may form in the presence of guarantees and
protections offered by an institution.

36
Asymmetry
 Trust is not necessarily identical in both directions.
 Extreme example: parents and children
 A child must have almost complete trust in his parent
while the parent should have very little trust in a child
(on substantive matters)

 More common asymmetries are smaller
 E.g., bosses and employees
 Employees tend to trust superiors more than superiors
trust employees

37
Context and Time
 Trust will vary among contexts
 I may trust Bob to recommend a restaurant, but not to
repair my car.
 Trust can also transfer between contexts
 I may build trust in a co-worker that is entirely in work
context, but later trust that person to recommend a
plumber.

 Trust changes over time
 People tend to develop trust over time, but trust may
disappear completely if there is one dramatic failure.

38
Measuring trust
 Measuring trust is important but difficult.

1. A person’s propensity to trust.
 This can be measured with a simple thought experiment
called the Investment Game.

2. One person’s decision about whether to trust another person.
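
The slide does not spell out the rules of the Investment Game. The sketch below assumes the commonly described form: a sender receives an endowment, chooses how much to send to an anonymous receiver, the amount sent is tripled, and the receiver decides how much to return. The amount sent is then read as a measure of the sender's propensity to trust. The function name and the default multiplier of 3 are assumptions of this sketch.

```python
def investment_game(endowment, sent, returned, multiplier=3):
    """Return (sender payoff, receiver payoff) for one round."""
    if not 0 <= sent <= endowment:
        raise ValueError("sent must be between 0 and the endowment")
    received = sent * multiplier
    if not 0 <= returned <= received:
        raise ValueError("returned must be between 0 and the amount received")
    sender_payoff = endowment - sent + returned
    receiver_payoff = received - returned
    return sender_payoff, receiver_payoff

# The sender risks 5 of 10; the receiver gets 15 and gives 7 back.
print(investment_game(endowment=10, sent=5, returned=7))  # (12, 8)
# A sender who trusts nobody keeps everything.
print(investment_game(endowment=10, sent=0, returned=0))  # (10, 0)
```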

39
Trust in social media
 Apply these same estimates to people we know
online.
 Ask people explicitly to rate their trust in others
 Customer rating
 Issue: most people online are strangers
 Find some way to leverage information that people have
shared about the trustworthiness of others to infer how
much one person may trust a stranger
 Example: eBay
40
Trust inference
 Infer trust between two people who do not know each other
using network structure.
 If A trusts B and B trusts C, how much
should A trust C?
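[Figure: chain A → B → C, with trust value t_AB on the edge from A to B and t_BC on the edge from B to C; the trust of A in C is marked "?".]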

41
Trust inference algorithm
 Network-based Inference
 Use network structure to infer trust

 Example approach
 Find neighbors who are trusted.
 Ask them how much to trust the stranger.
 Average their responses weighted by how much we trust
each neighbor.
 Neighbors repeat this if they do not know the stranger.

 Computer scientists have developed many algorithms to do this.
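
The example approach above can be sketched as a short recursive function. This is a minimal illustration of the weighted-average idea only, not any specific published algorithm; the 0..1 trust scale, the data layout, and the names are assumptions, and the plain recursion would be too slow for large networks.

```python
def infer_trust(trust, source, target, visited=None):
    """trust maps rater -> {ratee: value in 0..1}.

    Returns a direct rating if one exists; otherwise the average of the
    neighbors' (possibly inferred) opinions, weighted by how much the
    source trusts each neighbor. Returns None if no path reaches the target.
    """
    visited = (visited or set()) | {source}
    direct = trust.get(source, {})
    if target in direct:
        return direct[target]
    weighted_sum = 0.0
    total_weight = 0.0
    for neighbor, weight in direct.items():
        if neighbor in visited or weight <= 0:
            continue
        # The neighbor repeats the process if it does not know the target.
        opinion = infer_trust(trust, neighbor, target, visited)
        if opinion is not None:
            weighted_sum += weight * opinion
            total_weight += weight
    return weighted_sum / total_weight if total_weight else None

# Hypothetical ratings: A knows B and D; both of them know C.
trust = {
    "A": {"B": 0.9, "D": 0.3},
    "B": {"C": 0.8},
    "D": {"C": 0.2},
}
print(infer_trust(trust, "A", "C"))  # (0.9*0.8 + 0.3*0.2) / (0.9 + 0.3), about 0.65
```

The highly trusted neighbor B dominates the result, so A's inferred trust in C ends up much closer to B's opinion than to D's.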

42
Network based inference
 Inferring over many paths?
 Favor highly trusted connections and short paths over
long ones

43
Similarity based trust inference
 Research has shown that people who trust one
another tend to be similar (Ziegler and Golbeck,
2007).
 A person will trust his friend about movies if they have
similar taste.
 A parent will trust a babysitter to watch her child if they
have similar ideas about the appropriate way to care for
the child and respond in an emergency.
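
One simple way to turn "similar taste" into a number is to compare two people's ratings of the items both have rated. The measure below (one minus the normalized mean absolute difference) is an assumption chosen for illustration and is not the method of the cited study; the 1 to 5 rating scale and the names are also hypothetical.

```python
def rating_similarity(ratings_a, ratings_b, scale=4.0):
    """Similarity in 0..1 over co-rated items; None if nothing is shared.

    scale is the largest possible rating difference (4 for a 1..5 scale).
    """
    shared = ratings_a.keys() & ratings_b.keys()
    if not shared:
        return None
    mean_diff = sum(abs(ratings_a[i] - ratings_b[i]) for i in shared) / len(shared)
    return 1.0 - mean_diff / scale

# Hypothetical movie ratings on a 1..5 scale.
alice = {"m1": 5, "m2": 3, "m3": 1}
bob = {"m1": 4, "m2": 3, "m4": 2}
print(rating_similarity(alice, bob))  # 0.875
```

A value like this could serve as a stand-in for trust about movies when no explicit trust rating is available.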

44
Application of Trust
 Once trust is computed, how can we use it?

 Filtering information
 e.g. show reviews only from the most trusted people
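
Filtering by trust is a threshold on the reviewer's trust value. The reviewers, trust values, and cutoff below are invented for the example; unknown reviewers are assumed to have trust 0.

```python
def trusted_reviews(reviews, trust, min_trust):
    """Keep only reviews whose author is trusted at least min_trust."""
    return [r for r in reviews if trust.get(r["author"], 0.0) >= min_trust]

# Hypothetical trust values and reviews.
trust = {"ann": 0.9, "bob": 0.1, "cat": 0.7}
reviews = [
    {"author": "ann", "text": "Great food."},
    {"author": "bob", "text": "Terrible."},
    {"author": "cat", "text": "Good value."},
    {"author": "eve", "text": "Best ever!"},
]
for review in trusted_reviews(reviews, trust, min_trust=0.7):
    print(review["author"], review["text"])
```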

45
Application of Trust
 Sorting Information
 Show Facebook posts from my most trusted friends
first, and least trusted friends last

 Aggregating Information
 Give more weight to restaurant ratings from trustworthy
people and less weight to lower-trust people when
computing an average rating.
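
Both uses reduce to a few lines once trust values are available. In this sketch the names and numbers are invented, and people without a trust value are assumed to have trust 0.

```python
def sort_by_trust(posts, trust):
    """Most trusted authors first, least trusted last."""
    return sorted(posts, key=lambda p: trust.get(p["author"], 0.0), reverse=True)

def trust_weighted_average(ratings, trust):
    """Average of ratings, each weighted by the trust in its author."""
    total = sum(trust.get(author, 0.0) for author in ratings)
    if total == 0:
        return None
    return sum(trust.get(author, 0.0) * r for author, r in ratings.items()) / total

# Hypothetical trust values, posts, and restaurant ratings.
trust = {"ann": 0.9, "bob": 0.1, "cat": 0.5}
posts = [{"author": "bob"}, {"author": "ann"}, {"author": "cat"}]
print([p["author"] for p in sort_by_trust(posts, trust)])  # ['ann', 'cat', 'bob']

ratings = {"ann": 5, "bob": 1}
print(trust_weighted_average(ratings, trust))  # about 4.6; the plain mean is 3.0
```

The weighted average sits close to the trusted friend's rating of 5, while the plain mean treats both opinions equally.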

46
