Social Media Analytics


For Part A, there are two sheets:


 G30_A1.xlsx: Contains the metrics of all Twitter users and the overall graph metrics, and
uses the Group by connected components partitioning.
 G30_A2.xlsx: Uses the Group by cluster partitioning.

Comments and Observations:

 The table below shows the metrics of the two Twitter profiles with the highest in-degree:
one is Salesforce and the second is the verified account of Dreamforce, Salesforce's annual conference.

o The higher in-degree means that Salesforce receives far more incoming connections
in these conversations than Dreamforce (458 versus 131).
o The eigenvector centrality of Dreamforce (0.011) is also much lower than that of
Salesforce (0.034). This means that, compared to Dreamforce, Salesforce is
connected to more influential users.

Vertex      In-Degree  Out-Degree  Betweenness Centrality  Closeness Centrality  Eigenvector Centrality
salesforce  458        75          1432538.018             0.001                 0.034
dreamforce  131        16          178171.916              0.000                 0.011

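
The in-degree and out-degree columns can be illustrated with a minimal sketch. The edge list below is hypothetical (it is not taken from the workbook), and the sketch assumes each distinct directed pair is counted once.

```python
from collections import Counter

# Hypothetical directed edges (source, target); not data from the workbook.
edges = [
    ("alice", "salesforce"),
    ("bob", "salesforce"),
    ("carol", "salesforce"),
    ("salesforce", "dreamforce"),
    ("alice", "dreamforce"),
]

def degree_counts(edge_list):
    """Return (in_degree, out_degree) Counters, counting each distinct directed pair once."""
    in_deg, out_deg = Counter(), Counter()
    for src, dst in set(edge_list):
        out_deg[src] += 1   # one more outgoing connection for the source
        in_deg[dst] += 1    # one more incoming connection for the target
    return in_deg, out_deg

in_deg, out_deg = degree_counts(edges)
print("salesforce:", in_deg["salesforce"], "in,", out_deg["salesforce"], "out")
print("dreamforce:", in_deg["dreamforce"], "in,", out_deg["dreamforce"], "out")
```

A vertex with a high in-degree and a low out-degree, as in this toy data, is talked about much more than it talks to others.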
G30_A1.xlsx: The partition used in this sheet is Group by connected components.
As a result, a total of 394 groups were formed, as shown in Figure 1.
Figure 1
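
As an illustration of what partitioning by connected components does, the sketch below groups the vertices of a small hypothetical graph with a breadth-first search. It assumes edge direction is ignored when forming components; the vertex names and edges are made up.

```python
from collections import defaultdict, deque

def connected_components(vertices, edges):
    """Group vertices into connected components, ignoring edge direction."""
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    seen, components = set(), []
    for start in vertices:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:
            v = queue.popleft()
            component.append(v)
            for w in neighbors[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        components.append(component)
    return components

# Hypothetical graph: a chain of three, a pair, and one vertex with only a self-loop.
vertices = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("d", "e"), ("f", "f")]

components = connected_components(vertices, edges)
singletons = [c for c in components if len(c) == 1]
largest = max(len(c) for c in components)
print(len(components), "components,", len(singletons), "single-vertex, largest has", largest)
```

Every isolated vertex becomes its own group under this partition, which is why a network with many neighborless users produces a large number of groups.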

Below are the overall graph metrics for the connected components partition.

Graph Metric                                 Value
Graph Type                                   Directed
Vertices                                     1903
Unique Edges                                 3197
Edges With Duplicates                        1307
Total Edges                                  4504
Number of Edge Types                         1543
Self-Loops                                   747
Reciprocated Vertex Pair Ratio               0.039529566
Reciprocated Edge Ratio                      0.076052797
Connected Components                         392
Single-Vertex Connected Components           283
Maximum Vertices in a Connected Component    1343
Maximum Edges in a Connected Component       3849
Maximum Geodesic Distance (Diameter)         11
Average Geodesic Distance                    3.664077
Graph Density                                0.000879125
Modularity                                   0.305135

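
Two of the reported values can be cross-checked with simple arithmetic. The sketch below assumes the usual definitions: the reciprocated vertex pair ratio r is the share of connected vertex pairs that have edges in both directions, and the reciprocated edge ratio is the share of edges that are reciprocated, which then equals 2r / (1 + r). The function name is illustrative.

```python
def reciprocated_edge_ratio(pair_ratio):
    """Convert a reciprocated vertex pair ratio r into an edge ratio, 2r / (1 + r).

    Each mutual pair contributes two edges and each one-way pair contributes one,
    so the share of reciprocated edges is 2m / (2m + a) = 2r / (1 + r).
    """
    return 2 * pair_ratio / (1 + pair_ratio)

# Values copied from the graph metrics table.
pair_ratio = 0.039529566
reported_edge_ratio = 0.076052797
unique_edges, duplicate_edges, total_edges = 3197, 1307, 4504

edge_ratio = reciprocated_edge_ratio(pair_ratio)
print("edge ratio from pair ratio:", edge_ratio)
print("difference from reported value:", abs(edge_ratio - reported_edge_ratio))
print("edge counts add up:", unique_edges + duplicate_edges == total_edges)
```

Under these assumptions the two reciprocity figures agree with each other, and the unique and duplicate edge counts sum to the reported total.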
 The average geodesic distance is small (about 3.66, with a diameter of 11), which suggests
that information about Salesforce can travel through the network in only a few hops.
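
To make the geodesic metrics concrete, here is a minimal sketch that computes the average and maximum shortest-path length on a tiny hypothetical graph. It assumes edges are treated as undirected and that only pairs of distinct, mutually reachable vertices are averaged; the tool that produced the table may use a slightly different convention.

```python
from collections import defaultdict, deque

def geodesic_stats(edges):
    """Return (average, maximum) shortest-path length over reachable pairs of distinct vertices."""
    neighbors = defaultdict(set)
    for src, dst in edges:
        if src != dst:  # self-loops do not shorten any path
            neighbors[src].add(dst)
            neighbors[dst].add(src)

    total, pairs, diameter = 0, 0, 0
    for start in list(neighbors):
        # Breadth-first search gives hop counts from `start` to every reachable vertex.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in neighbors[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        for target, d in dist.items():
            if target != start:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    if pairs == 0:
        return 0.0, 0
    return total / pairs, diameter

# Hypothetical path graph a - b - c - d.
avg, diameter = geodesic_stats([("a", "b"), ("b", "c"), ("c", "d")])
print("average geodesic distance:", avg, "diameter:", diameter)
```

A low average relative to the diameter, as in the table, indicates that most users are only a few hops apart even though a few pairs are far from each other.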

G30_A2.xlsx: The partition used in this sheet is Group by cluster. We used the
Clauset-Newman-Moore algorithm and put all neighborless vertices into a single group,
which produced the graph in Figure 2.

Figure 2
 Since all neighborless vertices were put into a single group, the overall number of groups is
much smaller than with the connected components partition.

Below are the overall graph metrics for the Group by cluster partition:

Graph Metric                                 Value
Graph Type                                   Directed
Vertices                                     1903
Unique Edges                                 3197
Edges With Duplicates                        1307
Total Edges                                  4504
Number of Edge Types                         1543
Self-Loops                                   747
Reciprocated Vertex Pair Ratio               0.039529566
Reciprocated Edge Ratio                      0.076052797
Connected Components                         392
Single-Vertex Connected Components           283
Maximum Vertices in a Connected Component    1343
Maximum Edges in a Connected Component       3849
Maximum Geodesic Distance (Diameter)         11
Average Geodesic Distance                    3.664077
Graph Density                                0.000879125
Modularity                                   0.551622

 The modularity has increased from 0.305 to 0.552, which indicates that the cluster partition
divides the network into better-defined sub-groups than the connected components partition.
 The graph shows high modularity (> 0.5), which means that connections inside the groups
are dense compared to connections between groups, i.e., the overall group quality is good.
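
The idea behind the modularity score can be sketched on a toy example. The code below evaluates the standard Newman modularity of a given partition, assuming edges are treated as undirected and unweighted with no self-loops. It does not run the Clauset-Newman-Moore clustering itself, and the graph and group labels are hypothetical.

```python
from collections import defaultdict

def modularity(edges, group_of):
    """Modularity Q = sum over groups of (internal edges / m) - (group degree / 2m)^2."""
    m = len(edges)
    internal = defaultdict(int)  # edges with both endpoints in the same group
    degree = defaultdict(int)    # total degree of the vertices in each group
    for u, v in edges:
        degree[group_of[u]] += 1
        degree[group_of[v]] += 1
        if group_of[u] == group_of[v]:
            internal[group_of[u]] += 1
    return sum(internal[g] / m - (degree[g] / (2 * m)) ** 2 for g in degree)

# Hypothetical graph: two triangles joined by a single bridge edge (c, d).
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]

two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {v: 0 for v in "abcdef"}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
print("two groups:", round(q_two, 4), "one group:", round(q_one, 4))
```

Splitting the toy graph along the bridge scores higher than lumping everything together, which is the same effect seen when the cluster partition raises modularity relative to the connected components partition.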
