Rezvanian Meybodi Computers in Human Behavior 2016 Off Print
Article info
Article history:
Received 5 April 2016
Received in revised form 11 July 2016
Accepted 22 July 2016
Keywords:
Complex social networks
Social network analysis
User behavior
Stochastic graphs
Network measures

Abstract
Social networks are usually modeled and represented as deterministic graphs with a set of nodes as users and edges as connections between users of the networks. Due to the uncertain and dynamic nature of user behavior and human activities in social networks, their structural and behavioral parameters are time-varying, and for this reason deterministic graphs may not be appropriate for modeling and analyzing the behavior of users. In this paper, we propose that stochastic graphs, in which the weights associated with edges are random variables, may be a better candidate as a graph model for social network analysis. Thus, we first propose a generalization of some network measures for stochastic graphs and then propose six learning automata based algorithms for calculating these measures under the situation that the probability distribution functions of the edge weights of the graph are unknown. Simulations on different synthetic stochastic graphs for calculating the network measures using the proposed algorithms show that, in order to obtain good estimates for the network measures, the required number of samples taken from the edges of the graph is significantly lower than that of the standard sampling method, with the aim of analyzing human behavior in online social networks.
© 2016 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.chb.2016.07.032
0747-5632/© 2016 Elsevier Ltd. All rights reserved.
622 A. Rezvanian, M.R. Meybodi / Computers in Human Behavior 64 (2016) 621e640
behavior of users on networks vary with time. Moreover, in analyzing online social networks, not only is understanding the structure and topology of the network important, but the degree of association among the users in the network is also important for the analysis of user behaviors in online social networks.

According to the aforementioned points, stochastic graphs, in which the weights associated with the edges are random variables, appear to be a better candidate as a graph model for real-world network applications with a time-varying nature. When a stochastic graph is chosen as the graph model, every feature, measure and concept of the graph, such as path (Beigy & Meybodi, 2006), cover (Rezvanian & Meybodi, 2015b), clique (Rezvanian & Meybodi, 2015a), and spanning tree (Akbari Torkestani & Meybodi, 2012), should be treated as a stochastic feature. For example, by choosing a stochastic graph as the graph model of an online social network, defining community structure in terms of cliques, and treating the associations among the humans within the community as random variables, the concept of stochastic clique may be used to study community structure properties.

In this paper, after a brief overview of recent studies on user behavior and human activities in online social networks, we first redefine some network measures for stochastic graphs and then design six algorithms for calculating them under the situation that the probability distribution functions of the weights associated with the edges are unknown. The proposed algorithms estimate the distribution of the network measures by taking samples from the edges of the stochastic graph. The sampling process is guided by learning automata in such a way that the number of samples that must be taken from the edges of the stochastic graph to estimate the network measures is reduced as much as possible. In the proposed algorithms, the guided sampling process implemented by learning automata aims to take more samples from the promising regions of the graph, that is, the regions that reflect a higher rate of change (e.g., a higher rate of user activity), instead of walking around and taking unnecessary samples from non-promising regions of the graph.

In order to study the performance of the proposed algorithms for calculating network measures in stochastic graphs, several experimental studies on different synthetic stochastic graphs are conducted. Experimental results show that, to obtain good estimates for the network measures, the number of samples that must be taken from the edges of the graph by the proposed guided sampling algorithms is significantly lower than the number required when the standard sampling method is used for the analysis of human activities and user behaviors in online social networks.

The rest of this paper is organized as follows. Section 2 is dedicated to material and methods, including a brief introduction to some existing network measures for deterministic networks, an overview of recent works on the distribution of user behaviors in online social network studies, and a brief description of learning automata theory. In section 3, the proposed network measures for stochastic graphs are described. The proposed algorithms for calculating the proposed network measures in stochastic graphs are described in section 4, and section 5 presents the simulation results. Finally, section 6 discusses this study and its results and concludes the paper.

2. Material and methods

In this section, to provide the necessary background, we present a brief description of some network measures for deterministic unweighted and weighted networks. We also briefly review the studies performed by researchers on the distributions of user behaviors and activities of users in online social networks. At the end of this section, learning automata and variable action set learning automata are introduced.

2.1. Network measures for deterministic networks

Network measures and their calculation play a significant role in social network analysis (Borgatti, 2005). Popular network measures such as degree, betweenness, closeness and clustering coefficient are used not only for evaluating node importance in actual complex network studies but also as a part of some algorithms, such as the Girvan-Newman community detection algorithm using betweenness (Girvan & Newman, 2002) and overlapping community detection using node closeness (Badie et al., 2013). In this section, some well-known network measures for deterministic networks are introduced.

2.1.1. Degree
Degree, as a basic network measure, has been widely used in many studies. The degree of node $v_i$ is defined in a binary network (also called an unweighted network) as follows

$$k_i = \sum_{j \neq i} a_{ij} \qquad (1)$$

where $j$ is the index of all other nodes of the graph and $a_{ij}$ is 1 if node $v_i$ is adjacent to node $v_j$, and 0 otherwise. In other words, the degree of node $v_i$ is the number of nodes that are directly connected to node $v_i$. Degree centrality is useful in the context of finding the single human who is affected by the diffusion of any information in the network. This follows from the fact that a human with high degree centrality has the chance of being affected by many sources (Freeman, 1979).

2.1.2. Strength
The strength of node $v_i$ is defined as the sum of the weights of its adjacent edges in a weighted network as follows

$$s_i = \sum_{j \neq i} w_{ij} \qquad (2)$$

where $w_{ij}$ is greater than 0 if node $v_i$ is adjacent to node $v_j$, and its value indicates the weight of the edge between node $v_i$ and node $v_j$. Similar to degree, a human with high strength centrality is known as a popular human with high-strength links to other humans; however, a human with high strength does not necessarily have the maximum number of friends. Strength centrality is useful in the context of finding a human who is affected by the amount of spreading of any information in the network (Freeman, 1979).

2.1.3. Closeness
The closeness of a node is the inverse of the sum of the shortest path lengths from that node to all other nodes, and is defined for a binary network with $n$ nodes as follows

$$c_i = \frac{1}{\sum_{j \neq i} d_{ij}} \qquad (3)$$

where $d_{ij}$ is the length of the shortest path between node $v_i$ and node $v_j$. Closeness can be regarded as a measure of how long it will take to spread information from that node to all other nodes sequentially. Since the spread of information can be modeled by the use of shortest paths, in applications such as the spread of information, a human with high closeness centrality can be considered as the central point because that human can spread the information faster
than other humans with lower closeness centrality (Freeman, 1979).

2.1.4. Betweenness
The betweenness of a node is the number of shortest paths from all nodes to all other nodes of the graph that pass through that node. The betweenness of node $v_i$ is defined for a binary network as follows

$$b_i = \frac{g_{st}(i)}{g_{st}} \qquad (4)$$

where $g_{st}$ is the number of shortest paths between all pairs of source node $v_s$ and destination node $v_t$ in a binary network and $g_{st}(i)$ is the number of those paths that pass through node $v_i$. Since a node with high betweenness centrality will typically be one that acts as a bridge between many pairs of nodes, betweenness was introduced as a measure for quantifying the control of a human over the communication between other humans in a social network. In this conception, humans that have a high probability of occurring on a randomly chosen shortest path between two randomly chosen humans have a high betweenness. Betweenness can be regarded as a measure of how information flows over communication links are controlled. Information flows can also be dominated by humans with high betweenness centrality in a network (Freeman, 1979; Newman, 2005).

2.1.5. Clustering coefficient
The clustering coefficient for node $v_i$ is defined as the ratio of the number of edges among its neighbors to the possible number of edges between them. The local clustering coefficient for a node $v_i$ in a binary network can be defined as follows

$$CC_i = \frac{1}{k_i (k_i - 1)} \sum_{v_j, v_h \in N_i} a_{ij} \cdot a_{ih} \cdot a_{jh} \qquad (5)$$

where $k_i$ is the degree of node $v_i$, $n$ is the total number of nodes of the graph, $N_i$ is the set of nodes $v_j$ adjacent to node $v_i$, and $a_{ij}$ is 1 if node $v_i$ is adjacent to node $v_j$, and 0 otherwise. The clustering coefficient of a node is a measure of the connectivity among the neighborhood of the node (transitivity of a node). A high clustering coefficient indicates a high tendency of that human to form a cluster with other humans. In other words, the clustering coefficient measures the transitivity of a network. In addition, most of the friends of a human with a high clustering coefficient can collaborate with one another even if the focal human is removed from the network.

2.2. User behavior in online social networks

Since generating and sharing information through social networks is done by users, online users have become the main feature of online social networks. In general, the activities of users in most online social networks include maintaining social relationships with their friends, answering an online questionnaire, joining online communities, participating in an online discussion, posting and sharing new favorite contents (e.g., photos, music, videos, scientific papers, services, etc.), visiting a user profile and its contents, and commenting on or liking a post. Studying how users behave and interact with their friends in online social networks plays a significant role in social network analysis for several reasons (Buccafurri, Lax, Nicolazzo, & Nocera, 2015; Shen, Brdiczka, & Ruan, 2013; Zhu, Su, & Kong, 2015). In some fundamental studies, researchers can generate synthetic graphs similar to real networks for further simulation and investigation of a number of phenomena using computer generated graph models, in order to exploit the structures and dynamics of networks and their user behaviors in a systematic manner. In viral marketing, companies can exploit user behavior models to predict user interactions and the process of adoption or spread of an idea, and then spread or promote their new products or services. In some knowledge based research, understanding the patterns of user participation in generating new contents and propagating information on the networks via users is imperative in order to distinguish active users from spamming users, attract new users and keep existing users, predict the trends of topics in user communities, and perform efficient content management within their basic system requirements. Modeling user behavior and human activities in online social networks in a quantitative manner is also important for understanding user interactions, and is a challenging task due to the variety of user interactions on the structure of online social networks and the abundance of data from some online social networks. In recent years, several research works regarding user behavior modeling, characterizing the distribution of user behaviors and user activities, and statistical observations from online users have been reported in the literature, some of which are pointed out in the rest of this section.

Golder et al. (Golder, Wilkinson, & Huberman, 2007) analyzed the interactions among Facebook users. Their observations revealed strong daily and weekly patterns and showed that the communication of the users is clustered mostly by university. They reported that the interactions occurred almost exclusively among friends and that only small percentages of friends interact with other friends, as a result of which the distribution of the number of messages sent by a user followed a power-law distribution. Leskovec et al. (Leskovec, McGlohon, Faloutsos, Glance, & Hurst, 2007) analyzed about 45 thousand blogs and 2.2 million postings. They reported that the popularity of posts drops with a power-law instead of exponentially, and that the size of the cascades distribution follows a perfect Zipfian distribution. Leskovec et al. (Leskovec, Backstrom, Kumar, & Tomkins, 2008) also focused directly on the microscopic node behavior of four large online social network datasets (Flickr, Delicious, Yahoo Answers, and LinkedIn). They observed that user lifetime follows an exponential distribution, the time gap between link creations follows a power-law with exponential cutoff, and user activity is uniform over its lifetime or decreases exponentially with time. Nazir et al. (Nazir, Raza, & Chuah, 2008) presented three applications for Facebook to measure the behavior of the users. The authors detected communities whose members frequently interact with each other, and several kinds of human activities were monitored. Based on the measurements, they showed that the popularity of the applications follows a power-law distribution with an exponential decay.

Guo et al. (Guo, Tan, Chen, Zhang, & Zhao, 2009) analyzed the patterns of user content generation in online social networks. They identified three groups of users. The first group is users with distinct posting behaviors, including steadily posting and inactively posting, and the second group is users with occasional posting in the network. They found that the overall user lifetime does not follow the exponential distribution and that the distribution of different users' posting activities does not follow power-law distributions; rather, it follows the stretched exponential distribution, similar to the reference rank distribution of objects in Internet media systems, including viral video social networks such as YouTube. Kwak et al. (Kwak, Lee, Park, & Moon, 2010) crawled the entire Twitter site and obtained user profiles, social relations, trending topics and tweets. In their analysis of the follower-following topology, they found that the distribution of followers is not power-law; rather, the distribution of the users in a retweet tree follows a power-law distribution. Gyarmati et al. (Gyarmati & Trinh, 2010) analyzed the degree distribution of nodes for online sessions of users in Bebo, MySpace, Netlog, and Tagged based on the crawled public part of users' profile pages, which contained online status
information of the users. They revealed that the time spent online by users follows a Weibull distribution and that a fraction of users tend to lose interest surprisingly soon after subscribing to the service. They claimed that the distribution of session times of users and the number of sessions can be modeled with power-law distributions.

Ding et al. (Ding et al., 2010) studied the behaviors of users in a bulletin board system social network by analyzing the read and reply data and found that the distributions of discussion sizes and user participation levels follow a power-law distribution. Galuba et al. (Galuba, Aberer, Chakraborty, Despotovic, & Kellerer, 2010) analyzed information propagation in Twitter and investigated the propagation of URLs through posts. They found that posting frequencies followed a power-law distribution across users and URLs. Based on the obtained datasets, the authors also proposed a model that predicts the propagation of information. Falck-Ytter et al. (Falck-Ytter & Øverby, 2012) compared the number of followers who followed popular contents in Twitter and YouTube. Their observations show that Zipf's law was not suitable to describe followers in Twitter. In contrast, it was suitable to describe YouTube viewers. Zhong et al. (Zhong, Fan, Wang, Xiao, & Li, 2012) studied user behavior and interests in different online social networks and observed that they influenced one another. Besides, only a small fraction of users actually participated in many activities in online social networks, which resulted in a distribution of users with a long-tail property. Morales et al. (Morales, Losada, & Benito, 2012) analyzed the effects of user behavior on social structure emergence using the information flow on Twitter and found that community structure was formed inside the network. In addition, the distribution of the number of posts by users was shown to fit an exponentially truncated power-law.

Yan et al. (Yan, Wu, & Zheng, 2013) explored the behavior of users in microblog posting systems and found that the distribution of the interval time between posts did not fit a normal distribution; rather, it was a power-law distribution. Liu et al. (Liu, Nazir, Joung, & Chuah, 2013) found that the number of daily active users using some applications in Facebook follows a power-law distribution. Feng et al. (Feng, Cong, Chen, & Yu, 2013) explored pin distributions (visual bookmarks of images) and board distributions as well as the number of comments, likes and repins in Pinterest and discovered that those characteristics follow a power-law. Vongsingthong et al. (Vongsingthong, Boonkrong, Kubek, & Unger, 2015) tracked the activities of some Facebook users through a questionnaire. They observed that user activities such as posting photos and videos, visiting popular pages and friend profiles, playing online games, and asking about products follow a power-law distribution. Bild et al. (Bild, Liu, Dick, Mao, & Wallach, 2015) empirically studied aggregate user behavior using exhaustive observations on Twitter with a focus on quantitative descriptions. They found that the lifetime tweet distribution is a type-II discrete Weibull stemming from a power-law hazard function, that the tweet rate distribution, although asymptotically power-law, exhibits a lognormal cutoff over finite sample intervals, and that the inter-tweet interval distribution is a power-law with exponential cutoff.

The summary of studies for types of user behaviors/activities in online social networks and their respective distributions is given in Table 1.

2.3. Learning automata theory

The first learning automata models were developed in mathematical psychology (Burke, Estes, & Hellyer, 1954; Estes, 1950) and have become a topic of interest to computer engineers in view of their role in machine intelligence (Narendra & Thathachar, 1989). A learning automaton (LA) is an abstract model which randomly chooses an action out of its finite set of allowed actions and performs it on an unknown random environment. The action is chosen at random based on a probability distribution kept over the action set, and at each instant the chosen action serves as the input to the random environment. The environment then evaluates the chosen action and responds to the automaton with a reinforcement signal. Based on the chosen action and the received signal, the automaton updates its internal state and chooses its next action. Generally, the goal of a learning automaton is to find the optimal action from the action set so that the average penalty received from the environment is minimized.

Learning automata can be classified into two main classes: fixed structure learning automata and variable structure learning automata (VSLA) (Narendra & Thathachar, 1989); the latter class is used in this paper. A VSLA is represented by a 6-tuple $\langle \beta, \phi, \alpha, P, G, T \rangle$, where $\beta$ denotes the set of inputs, $\phi$ is a set of internal states, $\alpha$ is the action set as output, $P$ is the state probability vector governing the choice of state at each instant $k$, $G$ is the output mapping, and $T$ is the learning algorithm (also known as the learning scheme). The learning algorithm $T$ is a recurrence relation which is used to update the action probability vector. Let $\alpha_i(k) \in \alpha$ be the action chosen by a learning automaton and $p(k)$ be the probability vector defined over the action set at instant $k$. At each instant $k$, the action probability vector $p(k)$ is updated by the linear learning algorithm given in equation (6) if the chosen action $\alpha_i(k)$ is rewarded by the random environment ($\beta = 0$), and it is updated according to equation (7) if the chosen action is penalized ($\beta = 1$).

$$p_j(k+1) = \begin{cases} p_j(k) + a\,[1 - p_j(k)] & j = i \\ (1 - a)\,p_j(k) & \forall j \neq i \end{cases} \qquad (6)$$

$$p_j(k+1) = \begin{cases} (1 - b)\,p_j(k) & j = i \\ \dfrac{b}{r - 1} + (1 - b)\,p_j(k) & \forall j \neq i \end{cases} \qquad (7)$$

where $a$ denotes the reward parameter, which determines the amount of increase of the action probability values, $b$ is the penalty parameter, which determines the amount of decrease of the action probability values, and $r$ is the number of actions that the learning automaton can take. If $a = b$, the learning algorithm is called the linear reward-penalty ($L_{R-P}$) algorithm; if $b = \varepsilon a$ where $0 < \varepsilon < 1$, the learning algorithm is called the linear reward-$\varepsilon$-penalty ($L_{R-\varepsilon P}$) algorithm; and finally, if $b = 0$, the learning algorithm is called the linear reward-inaction ($L_{R-I}$) algorithm, which is perhaps the earliest scheme considered in mathematical psychology.

In recent years, learning automata have been successfully applied to a wide variety of applications such as optimization (Mahdaviani, Kordestani, Rezvanian, & Meybodi, 2015), image processing (Mofrad, Sadeghi, Rezvanian, & Meybodi, 2015), graph problems (Mousavian, Rezvanian, & Meybodi, 2013, 2014), social network analysis (Soleimani-Pouri, Rezvanian, & Meybodi, 2012; Soleimani-Pouri, Rezvanian, & Meybodi, 2014), sampling social networks (Rezvanian & Meybodi, 2016; Rezvanian, Rahmati, & Meybodi, 2014), and community detection in complex networks (Khomami, Rezvanian, & Meybodi, 2016).

3. Proposed network measures for stochastic graphs

In this section, we first define stochastic graphs and then define the network measures strength, closeness, betweenness and clustering coefficient for stochastic graphs.

Definition 1. A stochastic graph G can be described by a triple
Table 1
Summary of studies for user behaviors and their respective distributions.
where $w_{ij}$ is the random variable associated with edge $e_{ij}$ outgoing from node $v_i$.

Before we give definitions for closeness and betweenness in stochastic graphs, we need to define the concept of shortest path in stochastic graphs.

Definition 3. In a stochastic graph, a path $p_i$ with weight $w_{p_i}$ and length $n_i$ from source node $v_s$ to destination node $v_d$ can be defined as an ordering $\{p_i^1, p_i^2, \ldots, p_i^{n_i}\} \subseteq V$ of nodes in such a way that $p_i^1 = v_s$ and $p_i^{n_i} = v_d$ are the source and destination nodes, respectively, and edge $e(p_i^j, p_i^{j+1}) \in E$ for $1 \le j < n_i$, where $p_i^j$ is the $j$th node in path $p_i$.

Definition 4. Let there be $r$ distinct paths $P_{s,t} = \{p_1, p_2, \ldots, p_r\}$ between source node $v_s$ and destination node $v_d$ in a stochastic graph. The shortest path between source node $v_s$ and destination node $v_d$ is defined as a path with minimum expected weight, $w_{p^*} = \min_{p \in P_{s,t}} \{w_p\}$, and is called the stochastic shortest path.

Definition 5. Closeness of node $v_i$ in a stochastic graph is a random variable defined by the following equation

$$EB_i = \frac{Eg_{st}(i)}{Eg_{st}} \qquad (10)$$

where $Eg_{st}(i)$ is the number of stochastic shortest paths between node $v_s$ and node $v_t$ that pass through node $v_i$ and $Eg_{st}$ is the number of all stochastic shortest paths between node $v_s$ and node $v_t$.

Definition 7. Clustering coefficient of a node $v_i$ in a stochastic graph is a random variable and can be defined as follows

$$ECC_i = \frac{1}{(k_i - 1) \cdot ES_i} \sum_{j \neq i} \sum_{k \neq i,j} \frac{w_{ij} + w_{ik}}{2}\, a_{ij} \cdot a_{ik} \cdot a_{jk} \qquad (11)$$

where $k_i$ is the number of nodes adjacent to node $v_i$, and $a_{ij}$ is 1 if node $v_i$ is adjacent to node $v_j$, and 0 otherwise. $ES_i$ is the strength of node $v_i$ and $w_{ij}$ is the random variable associated with the weight of edge $e_{ij}$.
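As a rough illustration of how the strength and the clustering coefficient of Eq. (11) might be estimated when only samples of the edge weights are available, the sketch below evaluates both quantities on repeated realizations of the graph and averages them. It is plain uniform sampling, not one of the learning automata based algorithms of this paper. The helper names (`edge`, `strength`, `clustering`, `estimate`), the uniform weight distributions, and the choice to evaluate $ES_i$ on the same realization as the numerator are all assumptions made for the example; the inner index is called `h` in the code to avoid a clash with $k_i$.

```python
import random
from statistics import mean


def edge(u, v):
    """Undirected edge key (hypothetical helper)."""
    return frozenset((u, v))


def strength(weights, adj, i):
    """Strength of node i for one realization of the edge weights."""
    return sum(weights[edge(i, j)] for j in adj[i])


def clustering(weights, adj, i):
    """Eq. (11) for node i, evaluated on one realization of the weights."""
    k_i = len(adj[i])
    if k_i < 2:
        return 0.0  # assumption: nodes with fewer than two neighbors get 0
    total = 0.0
    for j in adj[i]:
        for h in adj[i]:
            if h != j and h in adj[j]:  # a_ij * a_ih * a_jh == 1
                total += (weights[edge(i, j)] + weights[edge(i, h)]) / 2.0
    return total / ((k_i - 1) * strength(weights, adj, i))


def estimate(bounds, adj, i, n_samples, rng):
    """Sample means of the strength and clustering coefficient of node i."""
    s_vals, cc_vals = [], []
    for _ in range(n_samples):
        # One realization: each edge weight is drawn from its own
        # (here uniform, purely illustrative) distribution.
        w = {e: rng.uniform(lo, hi) for e, (lo, hi) in bounds.items()}
        s_vals.append(strength(w, adj, i))
        cc_vals.append(clustering(w, adj, i))
    return mean(s_vals), mean(cc_vals)


# Toy graph: triangle 0-1-2 plus a pendant node 3 attached to node 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
bounds = {edge(0, 1): (0.5, 1.5), edge(0, 2): (0.5, 1.5),
          edge(1, 2): (0.5, 1.5), edge(0, 3): (1.0, 3.0)}
es0, ecc0 = estimate(bounds, adj, 0, 2000, random.Random(1))
print(f"estimated E[S_0] = {es0:.3f}, estimated E[CC_0] = {ecc0:.3f}")
```

On a single realization, a node whose neighbors are all mutually adjacent gets a coefficient of 1 under this formula, which is a convenient sanity check for any implementation.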
4. Proposed algorithms for calculating network measures in stochastic graphs

In the previous section, we defined some network measures for stochastic graphs. In this section, several algorithms based on learning automata are proposed for calculating network measures in stochastic graphs under the situation that the probability distribution functions of the weights associated with the edges of the graph are unknown. The proposed algorithms estimate the distribution of the network measures by taking samples from the edges of the stochastic graph. The sampling process is guided by learning automata in such a way that the number of samples that must be taken from the edges of the stochastic graph to estimate the network measures is reduced as much as possible. In the proposed algorithms, the guided sampling process implemented by learning automata aims to take more samples from the promising regions of the graph, that is, the regions that reflect a higher rate of activity, instead of walking around and taking unnecessary samples from non-promising regions of the graph.

action $\alpha_1$ of all learning automata are rewarded proportional to the amount of improvement and penalized otherwise.

4.3. Stopping phase

The updating phase is repeated until one of the following stopping conditions is met.

1. The number of iterations $k$ exceeds a given number $K_{max}$,
2. The difference between the calculated measure in two consecutive iterations becomes lower than a specified error $E_{min}$, or
3. The average entropy of the probability vectors of the learning automata reaches a predefined value $T_{min}$.

In the proposed algorithm, the entropy is used for measuring the learning process. The information entropy for a learning automaton with $r$ actions can be defined as follows (Mousavian et al., 2013):

$$H = -\sum_{i=1}^{r} p_i \log(p_i)$$
Fig. 1. Pseudo-code for algorithm 1 for calculating network measures in stochastic graphs.
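The two building blocks that the algorithms rely on, the linear update of a variable structure learning automaton (Eqs. (6) and (7)) and the entropy of its action probability vector used in the stopping phase, can be sketched as follows. This is a minimal illustration, not the pseudo-code of Algorithm 1: the class name `LinearAutomaton`, its method names, and the parameter values are assumptions, and the entropy is taken to be the usual Shannon form with natural logarithm.

```python
import math
import random


class LinearAutomaton:
    """Variable structure learning automaton with a linear update scheme.

    Hypothetical helper class: a is the reward parameter, b the penalty
    parameter, r the number of actions (b = 0 gives reward-inaction).
    """

    def __init__(self, r, a=0.1, b=0.0):
        self.r = r
        self.a = a
        self.b = b
        self.p = [1.0 / r] * r  # start from a uniform action probability vector

    def choose(self, rng=random):
        """Pick an action index at random according to p."""
        return rng.choices(range(self.r), weights=self.p, k=1)[0]

    def update(self, i, beta):
        """Update p after action i received signal beta (0 reward, 1 penalty)."""
        if beta == 0:  # Eq. (6)
            self.p = [pj + self.a * (1.0 - pj) if j == i
                      else (1.0 - self.a) * pj
                      for j, pj in enumerate(self.p)]
        else:  # Eq. (7)
            self.p = [(1.0 - self.b) * pj if j == i
                      else self.b / (self.r - 1) + (1.0 - self.b) * pj
                      for j, pj in enumerate(self.p)]

    def entropy(self):
        """Shannon entropy of the action probability vector."""
        return -sum(pj * math.log(pj) for pj in self.p if pj > 0.0)


la = LinearAutomaton(r=4, a=0.2, b=0.1)
h0 = la.entropy()      # entropy of the uniform vector
la.update(0, beta=0)   # action 0 is rewarded
h1 = la.entropy()      # the vector is now less uniform
print(la.p, h0, h1)
```

Both updates keep the vector normalized, and the entropy falls as the automaton concentrates probability on one action, which is why a low average entropy can serve as a stopping signal.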
iteration $k$ and $H_i(k)$ is the information entropy of learning automaton $A_i$, which reflects the uncertainty of $A_i$ about its actions. In this way, the nodes whose estimated edge weights change more will be more likely to be chosen in the future for sampling. The results of experiment 5 show that such a strategy leads to a higher rate of convergence and fewer samples taken from the graph. Algorithm 1 in which the learning rate is adapted using the information entropy is called Algorithm 3.

4.4.3. Improvement 3
In algorithm 1, we may reach a point at which the changes in the action probability vectors of some of the learning automata in two consecutive iterations are negligible. In this case, we may turn off these learning automata in order to avoid taking more samples from edges that may not lead to a much better estimate of the edge weights and, consequently, of the network measures. Algorithm 1 in which the above improvement is
applied is called Algorithm 4. Experiment 5 shows that algorithm 4, compared to algorithm 1, produces a higher convergence speed for all test graphs.

have a greater similarity; and the closer it is to unity, the greater the discrepancy between the two distributions. This measure has been defined as
Fig. 2. Pseudo-code for algorithm 6 for calculating network measures in stochastic graphs.
parameter distribution in each region is calculated as distribution percentiles. The absolute difference between these extracted values for the real and estimated distributions, in the original data and the obtained sample data, is considered as the distance between the two distributions, as given in the following equation.

$$DD_{QC}(G, G') = \sum_{i=1}^{8} \left| Q_i(G) - Q_i(G') \right| \qquad (18)$$

where $Q(G) = \{IDP(I(i))\}_{i=1,\ldots,8}$ is the set of values for the eight regions of graph $G$ and $IDP(I)$ is the interval distribution probability, which can be calculated as given below

as strength, closeness and betweenness in an ongoing fashion. The algorithm proposed in section 4 can be used by the network as a means for observing the time-varying parameters of the network for the purpose of computing the network's measurements. The aim of this algorithm is to collect information from the network in order to find good estimates for the network's measurements using fewer samples than standard sampling methods. The first experiment that follows gives a hint of how one can use the approach proposed in the paper to analyze a social network modeled as a stochastic graph. The remaining experiments (experiments 2 through 5) study the performance of the algorithms proposed for estimation of the network's measurements.
Fig. 3. Distributions of betweenness for some randomly selected nodes of the synthetic stochastic BA network (six panels; horizontal axis X, vertical axis P(X)).
Fig. 4. Distributions of clustering coefficient for some randomly selected nodes of the synthetic stochastic BA network (six panels; horizontal axis X, vertical axis P(X)).
synthetic stochastic ER, BA and WS graphs, respectively. In order to show the statistically significant difference among the proposed algorithms, a non-parametric statistical test called the Wilcoxon rank sum test is conducted for independent samples (García et al., 2009; Wilcoxon, 1945) at the 0.05 significance level. In Tables 7-9, the test result regarding the corresponding Algorithm versus Algorithm 6 is shown as y, x and z when the corresponding algorithm is better than, worse than, and similar to Algorithm 6, respectively. From the results, several conclusions can be made: 1) For all the proposed algorithms, the average number of samples taken from each edge of the graph is lower than the number of samples taken using SSM for all synthetic stochastic graphs and different confidence
Fig. 5. Distributions of closeness for some randomly selected nodes of the synthetic stochastic BA network (six panels; horizontal axis X, vertical axis P(X)).
Fig. 6. Distributions of strength for some randomly selected nodes of the synthetic stochastic BA network (six panels; horizontal axis X, vertical axis P(X)).
levels; 2) Algorithm 6 requires the lowest number of samples from each edge as compared to Algorithms 1 through 4 for all synthetic stochastic graphs and different confidence levels.
Moreover, multiple-comparison statistical tests using the Friedman and Iman–Davenport tests with a 95% confidence level (α = 0.05) among the proposed algorithms are also conducted (García et al., 2009) for all synthetic stochastic graphs. As a statistical analysis, Friedman's test is first applied to obtain rankings. To obtain the adjusted p-values for each comparison between the control algorithm (the
Table 2
Average ranking of Friedman's test for randomly selected nodes of the synthetic stochastic BA network.
Average ranking Ranking Average ranking Ranking Average ranking Ranking Average ranking Ranking
Table 3
p-values of the comparing nodes on the synthetic stochastic BA networks in terms of betweenness of nodes.
i    Hypothesis             Unadjusted p-value   Holm p-value   Shaffer p-value
15   Node 12 vs. Node 19    0.00E00              3.33E-03       3.33E-03
14   Node 19 vs. Node 46    0.00E00              3.57E-03       5.00E-03
13   Node 12 vs. Node 27    0.00E00              3.85E-03       5.00E-03
12   Node 12 vs. Node 52    0.00E00              4.17E-03       5.00E-03
11   Node 12 vs. Node 70    0.00E00              4.55E-03       5.00E-03
10   Node 27 vs. Node 46    0.00E00              5.00E-03       5.00E-03
9    Node 46 vs. Node 52    0.00E00              5.56E-03       7.14E-03
8    Node 46 vs. Node 70    0.00E00              6.25E-03       7.14E-03
7    Node 19 vs. Node 70    0.00E00              7.14E-03       7.14E-03
6    Node 19 vs. Node 52    0.00E00              8.33E-03       8.33E-03
5    Node 19 vs. Node 27    0.00E00              1.00E-02       1.25E-02
4    Node 12 vs. Node 46    0.00E00              1.25E-02       1.25E-02
3    Node 27 vs. Node 70    1.12E-02             1.67E-02       1.67E-02
2    Node 27 vs. Node 52    2.20E-02             2.50E-02       2.50E-02
1    Node 52 vs. Node 70    8.06E-01             5.00E-02       2.50E-02
Note: Holm's procedure rejects those hypotheses that have an unadjusted p-value ≤ 5.00E-02; Shaffer's procedure rejects those hypotheses that have an unadjusted p-value ≤ 3.33E-03.

Table 5
p-values of the comparing nodes on the synthetic stochastic BA networks in terms of closeness of nodes.
i    Hypothesis             Unadjusted p-value   Holm p-value   Shaffer p-value
15   Node 19 vs. Node 70    0.00E00              3.33E-03       3.33E-03
14   Node 19 vs. Node 52    0.00E00              3.57E-03       5.00E-03
13   Node 12 vs. Node 70    0.00E00              3.85E-03       5.00E-03
12   Node 27 vs. Node 70    0.00E00              4.17E-03       5.00E-03
11   Node 19 vs. Node 46    0.00E00              4.55E-03       5.00E-03
10   Node 12 vs. Node 52    0.00E00              5.00E-03       5.00E-03
9    Node 27 vs. Node 52    0.00E00              5.56E-03       7.14E-03
8    Node 46 vs. Node 70    0.00E00              6.25E-03       7.14E-03
7    Node 19 vs. Node 27    0.00E00              7.14E-03       7.14E-03
6    Node 12 vs. Node 46    0.00E00              8.33E-03       8.33E-03
5    Node 12 vs. Node 19    0.00E00              1.00E-02       1.25E-02
4    Node 27 vs. Node 46    0.00E00              1.25E-02       1.25E-02
3    Node 52 vs. Node 70    0.00E00              1.67E-02       1.67E-02
2    Node 46 vs. Node 52    0.00E00              2.50E-02       2.50E-02
1    Node 12 vs. Node 27    0.00E00              5.00E-02       5.00E-02
Note: Holm's procedure rejects those hypotheses that have an unadjusted p-value ≤ 3.33E-03; Shaffer's procedure rejects those hypotheses that have an unadjusted p-value ≤ 3.33E-03.
Table 4
p-values of the comparing nodes on the synthetic stochastic BA networks in terms of clustering coefficient of nodes.
i    Hypothesis             Unadjusted p-value   Holm p-value   Shaffer p-value
15   Node 46 vs. Node 52    0.00E00              3.33E-03       3.33E-03
14   Node 12 vs. Node 52    0.00E00              3.57E-03       5.00E-03
13   Node 19 vs. Node 46    0.00E00              3.85E-03       5.00E-03
12   Node 27 vs. Node 46    0.00E00              4.17E-03       5.00E-03
11   Node 52 vs. Node 70    0.00E00              4.55E-03       5.00E-03
10   Node 12 vs. Node 19    0.00E00              5.00E-03       5.00E-03
9    Node 12 vs. Node 27    0.00E00              5.56E-03       7.14E-03
8    Node 46 vs. Node 70    0.00E00              6.25E-03       7.14E-03
7    Node 27 vs. Node 52    0.00E00              7.14E-03       7.14E-03
6    Node 19 vs. Node 70    0.00E00              8.33E-03       8.33E-03
5    Node 19 vs. Node 52    0.00E00              1.00E-02       1.25E-02
4    Node 27 vs. Node 70    0.00E00              1.25E-02       1.25E-02
3    Node 12 vs. Node 70    0.00E00              1.67E-02       1.67E-02
2    Node 12 vs. Node 46    0.00E00              2.50E-02       2.50E-02
1    Node 19 vs. Node 27    9.24E-03             5.00E-02       5.00E-02
Note: Holm's procedure rejects those hypotheses that have an unadjusted p-value ≤ 3.33E-03; Shaffer's procedure rejects those hypotheses that have an unadjusted p-value ≤ 3.33E-03.

Table 6
p-values of the comparing nodes on the synthetic stochastic BA networks in terms of strength of nodes.
i    Hypothesis             Unadjusted p-value   Holm p-value   Shaffer p-value
15   Node 19 vs. Node 46    0.00E00              3.33E-03       3.33E-03
14   Node 19 vs. Node 70    0.00E00              3.57E-03       5.00E-03
13   Node 27 vs. Node 46    0.00E00              3.85E-03       5.00E-03
12   Node 27 vs. Node 70    0.00E00              4.17E-03       5.00E-03
11   Node 19 vs. Node 52    0.00E00              4.55E-03       5.00E-03
10   Node 12 vs. Node 46    0.00E00              5.00E-03       5.00E-03
9    Node 12 vs. Node 70    0.00E00              5.56E-03       7.14E-03
8    Node 27 vs. Node 52    0.00E00              6.25E-03       7.14E-03
7    Node 12 vs. Node 19    0.00E00              7.14E-03       7.14E-03
6    Node 46 vs. Node 52    0.00E00              8.33E-03       8.33E-03
5    Node 52 vs. Node 70    0.00E00              1.00E-02       1.25E-02
4    Node 12 vs. Node 52    0.00E00              1.25E-02       1.25E-02
3    Node 19 vs. Node 27    0.00E00              1.67E-02       1.67E-02
2    Node 12 vs. Node 27    0.00E00              2.50E-02       2.50E-02
1    Node 46 vs. Node 70    0.00E00              5.00E-02       5.00E-02
Note: Holm's procedure rejects those hypotheses that have an unadjusted p-value ≤ 3.33E-03; Shaffer's procedure rejects those hypotheses that have an unadjusted p-value ≤ 3.33E-03.
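The Holm column in Tables 3–6 matches the step-down thresholds α/i with α = 0.05 and i running from 15 down to 1. The following is a minimal sketch of that procedure, not the authors' implementation; the function name `holm_step_down` is hypothetical, and the input is the unadjusted p-value column of Table 3.

```python
def holm_step_down(p_values, alpha=0.05):
    """Return (thresholds, rejected) for Holm's step-down procedure.

    p_values: unadjusted p-values in any order.
    Both returned lists are aligned with the input order.
    """
    m = len(p_values)
    # Visit hypotheses from the smallest p-value to the largest.
    order = sorted(range(m), key=lambda k: p_values[k])
    thresholds = [0.0] * m
    rejected = [False] * m
    still_rejecting = True
    for position, k in enumerate(order):
        i = m - position            # index i as printed in the tables (m down to 1)
        thresholds[k] = alpha / i   # Holm threshold for this hypothesis
        if still_rejecting and p_values[k] <= thresholds[k]:
            rejected[k] = True
        else:
            still_rejecting = False  # after the first failure, nothing else is rejected
    return thresholds, rejected


# Unadjusted p-values from Table 3 (betweenness), rows i = 15 down to 1.
table3 = [0.0] * 12 + [1.12e-2, 2.20e-2, 8.06e-1]
thresholds, rejected = holm_step_down(table3)
print([f"{t:.2E}" for t in thresholds])
print(sum(rejected), "of", len(table3), "hypotheses rejected")
```

Under this sketch, only the last comparison in Table 3 (Node 52 vs. Node 70, p = 8.06E-01) is retained, because its p-value exceeds the final threshold of 0.05.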
best-performing one) and the other algorithms, Nemenyi, Holm and Shaffer tests are conducted as post-hoc methods (if significant differences are detected). The rankings obtained by Friedman's test are presented in Table 10. The p-value computed by Friedman's test is 4.64E-11, which is below the significance level α = 0.05 (95% confidence). Thus, a significant difference exists among the observed results. Post-hoc methods (Nemenyi, Holm and Shaffer tests) are also performed to obtain the adjusted p-values, which are shown in Table 11. According to the results of statistical significance in Table 10, one can conclude that Algorithm 6 outperforms the other proposed algorithms in terms of the average number of samples taken from each edge.

5.2.3. Experiment III
This experiment is conducted to study the performance of the proposed algorithms (Alg. 1, Alg. 2, Alg. 3, Alg. 4, Alg. 5 and Alg. 6) and the standard sampling method (SSM) for sampling from stochastic
Table 7
Average results [± standard deviation (std)] for SSM and the proposed algorithms for synthetic stochastic ER graphs with different error rates in terms of ESR.
Methods CL
Note: y and x indicate a 0.05 level of significance by Wilcoxon rank sum test. y, x and z denote that the performance of the corresponding algorithm is better than, worse than, and similar to that of Algorithm 6, respectively.
Table 8
Average results [± standard deviation (std)] for SSM and the proposed algorithms for synthetic stochastic BA graphs with different error rates in terms of ESR.
Methods       CL
SSM           24.66 ± 3.39x   30.16 ± 2.28x   37.19 ± 2.82x   46.65 ± 3.55x   60.59 ± 4.65x   85.62 ± 6.59x
Algorithm 1   9.27 ± 1.74x    12.83 ± 1.70x   15.04 ± 1.78x   19.19 ± 1.94x   27.18 ± 2.57x   43.45 ± 4.39x
Algorithm 2   9.77 ± 1.19x    12.48 ± 1.36x   15.55 ± 2.24x   19.56 ± 2.15x   24.98 ± 2.53x   42.62 ± 5.86x
Algorithm 3   8.92 ± 1.32x    11.98 ± 2.09x   15.36 ± 4.43x   21.19 ± 7.70x   23.50 ± 2.23x   38.61 ± 2.81x
Algorithm 4   10.09 ± 0.49x   11.91 ± 1.19x   14.30 ± 1.49x   17.75 ± 1.94x   24.95 ± 3.56x   40.59 ± 6.18x
Algorithm 5   7.51 ± 1.53z    8.75 ± 1.13z    11.46 ± 1.87x   16.04 ± 1.25x   21.72 ± 3.28z   35.22 ± 6.89x
Algorithm 6   7.29 ± 1.52     8.31 ± 1.81     10.51 ± 1.84    14.88 ± 1.63    21.30 ± 5.76    30.86 ± 3.03
Note: y and x indicate a 0.05 level of significance by Wilcoxon rank sum test. y, x and z denote that the performance of the corresponding algorithm is better than, worse than, and similar to that of Algorithm 6, respectively.
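The markers in Tables 7–9 come from pairwise Wilcoxon rank sum tests against Algorithm 6. As a rough illustration, the sketch below implements the two-sided test with the normal approximation and no tie correction in the variance, which is a simplification. The function name `rank_sum_test` and the two lists of per-run ESR values are hypothetical; the per-run data behind the tables are not given in the text.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation, no tie correction).

    Returns (z, p) where z > 0 means sample x tends to be larger than y.
    """
    n1, n2 = len(x), len(y)
    # Pool both samples, remembering which group (0 = x, 1 = y) each value came from.
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    ranks = [0.0] * len(pooled)
    start = 0
    while start < len(pooled):
        end = start
        while end + 1 < len(pooled) and pooled[end + 1][0] == pooled[start][0]:
            end += 1
        midrank = (start + end) / 2 + 1      # tied values share the average rank
        for k in range(start, end + 1):
            ranks[k] = midrank
        start = end + 1
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)  # rank sum of x
    mean = n1 * (n1 + n2 + 1) / 2
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / std
    p = math.erfc(abs(z) / math.sqrt(2))     # two-sided p-value under N(0, 1)
    return z, p


# Hypothetical per-run ESR values for two algorithms (illustration only).
runs_a = [9.1, 8.7, 10.2, 9.8, 9.5, 8.9, 10.0, 9.3]
runs_b = [7.2, 7.9, 6.8, 7.5, 8.1, 7.0, 7.7, 7.4]
z, p = rank_sum_test(runs_a, runs_b)
print(f"z = {z:.3f}, p = {p:.5f}, significant at 0.05: {p < 0.05}")
```

For small samples an exact test would be more appropriate than the normal approximation used here.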
Table 9
Average results [± standard deviation (std)] for SSM and the proposed algorithms for synthetic stochastic WS graphs with different error rates in terms of ESR.
Methods CL
Note: y and x indicate a 0.05 level of significance by Wilcoxon rank sum test. y, x and z denote that the performance of the corresponding algorithm is better than, worse than, and similar to that of Algorithm 6, respectively.
Table 10
Average ranking of Friedman's test of comparison algorithms on the stochastic test networks.
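Friedman's test ranks the methods within each experimental setting and then averages the ranks. The sketch below illustrates that computation on the mean ESR values of Table 8 only, treating each CL column as one block. This is an assumption made for illustration: Table 10 is computed over all stochastic test networks, so these numbers are not expected to reproduce it. The function names are hypothetical.

```python
# Mean ESR values copied from Table 8 (BA graphs); one list entry per CL column.
esr = {
    "SSM":         [24.66, 30.16, 37.19, 46.65, 60.59, 85.62],
    "Algorithm 1": [9.27, 12.83, 15.04, 19.19, 27.18, 43.45],
    "Algorithm 2": [9.77, 12.48, 15.55, 19.56, 24.98, 42.62],
    "Algorithm 3": [8.92, 11.98, 15.36, 21.19, 23.50, 38.61],
    "Algorithm 4": [10.09, 11.91, 14.30, 17.75, 24.95, 40.59],
    "Algorithm 5": [7.51, 8.75, 11.46, 16.04, 21.72, 35.22],
    "Algorithm 6": [7.29, 8.31, 10.51, 14.88, 21.30, 30.86],
}


def friedman_average_ranks(results):
    """Average rank of each method over all blocks (rank 1 = smallest value)."""
    names = list(results)
    n_blocks = len(next(iter(results.values())))
    totals = {name: 0.0 for name in names}
    for b in range(n_blocks):
        column = [results[name][b] for name in names]
        for name in names:
            v = results[name][b]
            less = sum(u < v for u in column)
            equal = sum(u == v for u in column)
            totals[name] += less + (equal + 1) / 2   # midrank if values tie
    return {name: totals[name] / n_blocks for name in names}


def friedman_statistic(avg_ranks, n_blocks):
    """Friedman chi-square statistic computed from the average ranks."""
    k = len(avg_ranks)
    sum_sq = sum(r * r for r in avg_ranks.values())
    return 12 * n_blocks / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4)


ranks = friedman_average_ranks(esr)
chi2 = friedman_statistic(ranks, n_blocks=6)
for name, r in sorted(ranks.items(), key=lambda item: item[1]):
    print(f"{name}: {r:.2f}")
print(f"Friedman chi-square: {chi2:.2f}")
```

On these values Algorithm 6 has the lowest ESR in every column, so its average rank is the best possible.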
graphs in terms of different distance metrics. For this experiment, different synthetic stochastic graphs (BA, WS and ER) with sizes from 1000 to 10000 are used. The results are averages taken over all the test networks with respect to different distance metrics: Kolmogorov-Smirnov (KS) distance, skew divergence (SD), normalized L1 distance (ND) and Pearson's correlation coefficient (PCC), for the estimated measures betweenness, clustering coefficient, closeness and strength. The averages and standard errors with 95% confidence intervals of the mentioned metrics are given in Figs. 7 through 9 for the synthetic stochastic graphs. From Figs. 7–9, Algorithm 6 outperforms the other proposed algorithms and the standard sampling method for all test graphs, and the standard sampling method achieves low accuracy in terms of KS, SD, ND and PCC for the mentioned metrics. For all test graphs, the results for betweenness, clustering coefficient and strength are more reliable than the results for closeness in terms of KS, SD and ND. Also, the results for betweenness in terms of ND are larger than the results for other metrics.

5.2.4. Experiment IV
This experiment is conducted to compare the proposed algorithms (Algorithm 1 and Algorithm 6) with the pure chance algorithm (Algorithm 1 in which the learning automaton residing in each node is replaced by a pure chance automaton) with respect to the number of samples taken from the edges of the graph. In a pure chance automaton, the actions are chosen with equal probabilities.
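The contrast between a pure chance automaton and a learning automaton can be sketched as follows. This section does not restate the learning scheme used by the proposed algorithms, so the linear reward-inaction (L_R-I) update below is an assumption chosen for illustration. The class names, the reward probabilities and the learning rate are all hypothetical.

```python
import random


class PureChanceAutomaton:
    """Chooses every action with equal probability and never learns."""

    def __init__(self, n_actions):
        self.p = [1.0 / n_actions] * n_actions

    def choose(self, rng):
        return rng.randrange(len(self.p))

    def update(self, action, rewarded):
        pass  # the action probabilities stay uniform


class LinearRewardInaction:
    """Variable-structure automaton with an L_R-I update (assumed scheme)."""

    def __init__(self, n_actions, rate=0.1):
        self.p = [1.0 / n_actions] * n_actions
        self.rate = rate

    def choose(self, rng):
        return rng.choices(range(len(self.p)), weights=self.p)[0]

    def update(self, action, rewarded):
        if not rewarded:
            return  # "inaction": a penalty leaves the probabilities unchanged
        for j in range(len(self.p)):
            if j == action:
                self.p[j] += self.rate * (1.0 - self.p[j])
            else:
                self.p[j] *= 1.0 - self.rate


def run(automaton, reward_prob, steps, rng):
    """Let the automaton interact with a stationary random environment."""
    for _ in range(steps):
        a = automaton.choose(rng)
        automaton.update(a, rng.random() < reward_prob[a])
    return automaton.p


rng = random.Random(0)
reward_prob = [0.2, 0.8]  # hypothetical environment: action 1 is rewarded more often
print("pure chance:", run(PureChanceAutomaton(2), reward_prob, 500, rng))
print("L_R-I:      ", run(LinearRewardInaction(2, rate=0.05), reward_prob, 500, rng))
```

The pure chance automaton keeps its uniform probability vector regardless of feedback, which is consistent with the unchanged ESR curve reported for the pure chance algorithm.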
Table 11
p-values of the comparing algorithms on the stochastic test networks.
Note: Nemenyi's procedure rejects those hypotheses that have an unadjusted p-value ≤ 6.67E-03; Holm's procedure rejects those hypotheses that have an unadjusted p-value ≤ 1.67E-02; Shaffer's procedure rejects those hypotheses that have an unadjusted p-value ≤ 6.67E-03.
To perform this experiment, we plot the average number of samples taken from each edge of the graph (ESR) during the execution of the algorithms for calculating network measures versus the iteration number for both proposed algorithms (Algorithm 1 and Algorithm 6) and the pure chance algorithm. The results are averaged for each synthetic stochastic test graph and are given in Fig. 10. In Fig. 10, the points along the curves show the average value and the error bars represent the 95% confidence interval. As shown, as the algorithms proceed, the number of samples taken from each edge (ESR) increases sharply in the initial iterations for all algorithms and then decreases and gradually approaches zero for Algorithm 1 and Algorithm 6, whereas the ESR value for the pure chance algorithm remains unchanged. This implies that for the proposed algorithms, the average number of samples taken from each edge of the graph gradually approaches zero. Moreover, this is achieved with fewer samples taken from the graph than in the case where learning is absent, which indicates the important impact of learning on the superiority of the proposed
Fig. 7. Comparing results of average KS distance, average skew divergence, average normalized L1 distance and average Pearson's correlation coefficient over betweenness, clustering coefficient, closeness and strength for synthetic stochastic ER graphs.
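Three of the distance metrics compared in Fig. 7 can be sketched as below. The paper's own definitions are given outside this section, so the forms used here (largest gap between empirical CDFs for KS, mean relative absolute error for ND, and the sample Pearson coefficient for PCC) are common textbook versions and should be read as assumptions. Skew divergence is left out because its smoothing parameter is not stated in this section. All function names and the example vectors are hypothetical.

```python
import math
from bisect import bisect_right


def ks_distance(a, b):
    """Largest gap between the empirical CDFs of samples a and b."""
    sa, sb = sorted(a), sorted(b)
    points = sorted(set(sa) | set(sb))
    return max(
        abs(bisect_right(sa, x) / len(sa) - bisect_right(sb, x) / len(sb))
        for x in points
    )


def normalized_l1(true, est):
    """Mean relative absolute error; every entry of `true` must be non-zero."""
    return sum(abs(t - e) / abs(t) for t, e in zip(true, est)) / len(true)


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


# Hypothetical reference and estimated values of one measure for six nodes.
reference = [0.20, 0.35, 0.17, 0.16, 0.60, 0.14]
estimate = [0.21, 0.33, 0.18, 0.15, 0.57, 0.15]
print(f"KS  = {ks_distance(reference, estimate):.3f}")
print(f"ND  = {normalized_l1(reference, estimate):.3f}")
print(f"PCC = {pearson(reference, estimate):.3f}")
```

Lower KS and ND values and a PCC closer to 1 indicate estimates that are closer to the reference values.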
Fig. 8. Comparing results of average KS distance, average skew divergence, average normalized L1 distance and average Pearson's correlation coefficient over betweenness, clustering coefficient, closeness and strength for synthetic stochastic BA graphs.
learning automaton based algorithms for calculating network measures in stochastic graphs. From the error bars shown in Fig. 11, we conclude that the variance in the results decreases as the iteration number increases. Comparing the network models, we can see that for synthetic stochastic ER graphs, the average number of samples taken from each edge of the graph is much higher than that for both synthetic stochastic BA and WS graphs. Similar results can be obtained for the other algorithms.

5.2.5. Experiment V
In this experiment, we study the convergence behavior of the proposed algorithms using the information entropy (as defined in equation (12)) of the probability vectors of the learning automata residing in the nodes of the graph, which are responsible for deciding whether or not to take samples from the edges of the graph. For this purpose, for each algorithm we plot the average information entropy taken over all learning automata residing in the nodes versus the iteration number, as given in Fig. 11, for the three kinds of synthetic stochastic graphs mentioned before (ER-2000, ER-5000, ER-10000; BA-2000, BA-5000, BA-10000; WS-2000, WS-5000, WS-10000). As shown, for every algorithm the average entropy decreases as the algorithm proceeds. Using this fact and the result of the previous experiment, we may conclude that the learning automaton of each node gradually converges to the action "do not take samples from the edges of the chosen node". We can also conclude that Algorithm 2 has the highest and Algorithm 6 has the lowest speed of convergence. The convergence behaviors of the proposed algorithms among the network models indicate that the difference between Algorithms 5–6 and Algorithms 1–4 is more obvious for synthetic stochastic small world graphs than for the other network models.

6. Discussion

This study argued for the important role of the activities of online users as the main features of online social networks for generating and sharing information through social networks. In general, activities of users in online social networks, such as interacting with their friends, answering comments of their friends, participating in online communities, posting and sharing new information content on the pages of their friends, visiting profile pages of their friends, and commenting on or liking a post of their friends, are considered as the weights associated with the edges of the graph, and due to the nature of user activities in social networks these weights may be uncertain, unpredictable and time-varying. Therefore, in this study, we proposed that stochastic graphs, in which weights associated with the edges are random variables, may be a better candidate as a graph model for social network analysis. We assumed that the structural and behavioral parameters of online social networks, which are time-varying parameters, can be observed by the network analyzers. Using the observed parameters, the network analyzers can calculate different measurements such as strength, closeness and betweenness in an ongoing fashion. For example, one may want to find a user who can influence a large number of users by spreading influence in a social network for marketing goals, or control the information flows over communication between other
Fig. 9. Comparing results of average KS distance, average skew divergence, average normalized L1 distance and average Pearson's correlation coefficient over betweenness, clustering coefficient, closeness and strength for synthetic stochastic WS graphs.
Fig. 10. The plot of average cumulative ESR versus iteration number for stochastic test graphs.
Fig. 11. Comparison of convergence behaviors of the proposed algorithms with respect to the information entropy for synthetic stochastic graphs. (Nine panels: (a) ER-2000, (b) ER-5000, (c) ER-10000, (d) BA-2000, (e) BA-5000, (f) BA-10000, (g) WS-2000, (h) WS-5000, (i) WS-10000; horizontal axis: Iteration, vertical axis: Entropy; one curve each for Algorithms 1 through 6.)
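The quantity plotted in Fig. 11 is the information entropy of the action probability vectors, averaged over all automata. Equation (12) lies outside this section, so the sketch below assumes the standard Shannon form. The logarithm base 2 is also an assumption, chosen so that a two-action automaton with uniform probabilities has entropy 1. The function names are hypothetical.

```python
import math


def entropy(p, base=2.0):
    """Shannon entropy of a probability vector; zero entries contribute nothing."""
    return -sum(q * math.log(q, base) for q in p if q > 0.0)


def average_entropy(vectors, base=2.0):
    """Mean entropy over the probability vectors of all automata."""
    return sum(entropy(p, base) for p in vectors) / len(vectors)


# Hypothetical snapshots of three two-action automata at two points in time.
early = [[0.5, 0.5], [0.6, 0.4], [0.5, 0.5]]
late = [[0.02, 0.98], [0.01, 0.99], [0.05, 0.95]]
print(f"early average entropy: {average_entropy(early):.3f}")
print(f"late average entropy:  {average_entropy(late):.3f}")
```

An average entropy near zero means that almost every automaton has concentrated its probability on a single action, which is how convergence is read from the curves.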
users for managing misinformation. This may be easily realized by calculating network centralities using one of the proposed algorithms as the network operates. The proposed algorithms for calculating network measures can be used by the network analyzers as a means of observing the time-varying parameters of the network for the purpose of computing the network's measurements.
In the simulation, it was shown that modeling social networks as stochastic graph models by applying the framework of learning automata can be very beneficial for reaching good results with a certain confidence level. In many fundamental studies, researchers generate synthetic graphs similar to real networks in order to run further simulations and investigate a number of phenomena on computer-generated graph models, so as to exploit the structure and dynamics of a network in a systematic manner. For this reason, several synthetic stochastic networks were used in the simulation: the Barabási–Albert model (Barabasi & Albert, 1999) as a synthetic scale-free network, the Watts–Strogatz model (Watts & Strogatz, 1998) as a synthetic small world network and the Erdős–Rényi model (Erdos & Renyi, 1960) as a synthetic random network, in which edge weights are assumed to be random variables with exponential distributions. Furthermore, it was shown how the proposed algorithms can collect information from the network in order to find good estimates for the network's measurements using fewer samples than standard sampling methods. The overall results confirm that using two sets of learning automata can achieve better results than using one set of learning automata in terms of accuracy and convergence behavior.
Analyzing social networks and their user behavior with stochastic graph models can be useful in several applications. One such application is viral marketing, in which companies want to exploit user behavior models to predict user interactions and the process of adoption of an idea, and then spread or promote their new products or services. Another example is in the domain of knowledge based systems: understanding the patterns of user participation in generating new content, propagating information on the networks via users, distinguishing active users from spamming users, attracting new users and retaining existing users, predicting the trends of topics in user communities, and performing efficient content management.
In summary, due to the important role of modeling the user behavior of
online social networks in a quantitative manner and the variety of user interactions on the structure of social networks, analyzing social networks with stochastic graph models may be a better candidate as a graph model for the analysis of social networks and their user behaviors, since it can take into consideration the continuum of behavioral parameters of the network occurring over time.

7. Conclusions

The conventional social network models consider either only the existence of connections between users, in the form of binary networks, or fixed weights for the edges, in the form of weighted networks. In this paper, we first proposed stochastic graphs as a model for social networks and then redefined some of the social network measurements to be applicable to stochastic graphs. Several algorithms based on learning automata were also designed for finding good estimates of these measurements for the purpose of analysis. We believe that the approach proposed in this paper for modeling and analysis of social networks can provide a better way of studying online social networks, such as the analysis of user behavior and online human activities. In the future, we would like to generalize our measures and algorithms for some applications of social network analysis such as network sampling and information diffusion.

Acknowledgements

The authors appreciate the valuable comments of the editor and anonymous reviewers.

References

Akbari Torkestani, J., & Meybodi, M. R. (2012). A learning automata-based heuristic algorithm for solving the minimum spanning tree problem in stochastic graphs. The Journal of Supercomputing, 59(2), 1035–1054.
Aliakbary, S., Habibi, J., & Movaghar, A. (2015). Feature extraction from degree distribution for comparison and analysis of complex networks. The Computer Journal, 58(9), 2079–2091.
Badie, R., Aleahmad, A., Asadpour, M., & Rahgozar, M. (2013). An efficient agent-based algorithm for overlapping community detection using nodes' closeness. Physica A: Statistical Mechanics and Its Applications, 392(20), 5231–5247.
Barabási, A. L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286(5439), 509–512.
Beigy, H., & Meybodi, M. R. (2006). Utilizing distributed learning automata to solve stochastic shortest path problems. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 14(5), 591–615.
Bild, D. R., Liu, Y., Dick, R. P., Mao, Z. M., & Wallach, D. S. (2015). Aggregate characterization of user behavior in Twitter and analysis of the retweet graph. ACM Transactions on Internet Technology (TOIT), 15(1), 4.
Borgatti, S. P. (2005). Centrality and network flow. Social Networks, 27(1), 55–71.
Buccafurri, F., Lax, G., Nicolazzo, S., & Nocera, A. (2015). Comparing twitter and facebook user behavior: Privacy and other aspects. Computers in Human Behavior, 52, 87–95.
Burke, C. J., Estes, W. K., & Hellyer, S. (1954). Rate of verbal conditioning in relation to stimulus variability. Journal of Experimental Psychology, 48(3), 153.
Cha, M., Kwak, H., Rodriguez, P., Ahn, Y.-Y., & Moon, S. (2007). I tube, you tube, everybody tubes: Analyzing the world's largest user generated content video system. In Proceedings of the 7th ACM SIGCOMM conference on internet measurement (pp. 1–14). ACM.
Costa, L. F., Rodrigues, F. A., Travieso, G., & Boas, P. R. V. (2007). Characterization of complex networks: A survey of measurements. Advances in Physics, 56(1), 167–242.
Ding, F., Cheng, H., Si, X.-M., Liu, Y., Xiong, F., & Shen, B. (2010). Read and reply behaviors in a BBS social network. In 2010 2nd international conference on advanced computer control (ICACC) (Vol. 4, pp. 571–576). IEEE.
Erdős, P., & Rényi, A. (1960). On the evolution of random graphs. Publications of the Mathematical Institute of the Hungarian Academy of Sciences, 5, 17–61.
Estes, W. K. (1950). Toward a statistical theory of learning. Psychological Review, 57(2), 94.
Falck-Ytter, M., & Øverby, H. (2012). An empirical study of valuation and user behavior in social networking services. In World telecommunications congress (WTC) (pp. 1–6). IEEE.
Feng, Z., Cong, F., Chen, K., & Yu, Y. (2013). An empirical study of user behaviors on pinterest social network. In 2013 IEEE/WIC/ACM international joint conferences on web intelligence (WI) and intelligent agent technologies (IAT) (Vol. 1, pp. 402–409). IEEE.
Freeman, L. C. (1979). Centrality in social networks conceptual clarification. Social Networks, 1(3), 215–239.
Galuba, W., Aberer, K., Chakraborty, D., Despotovic, Z., & Kellerer, W. (2010). Outtweeting the twitterers-predicting information cascades in microblogs. In Proceedings of the 3rd conference on Online social networks (Vol. 39, pp. 1–9).
García, S., Molina, D., Lozano, M., & Herrera, F. (2009). A study on the use of non-parametric tests for analyzing the evolutionary algorithms' behaviour: A case study on the CEC2005 special session on real parameter optimization. Journal of Heuristics, 15(6), 617–644.
Girvan, M., & Newman, M. E. J. (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99(12), 7821–7826.
Golder, S. A., Wilkinson, D. M., & Huberman, B. A. (2007). Rhythms of social interaction: Messaging within a massive online network. In Communities and technologies 2007 (pp. 41–66). Springer.
Guo, L., Tan, E., Chen, S., Zhang, X., & Zhao, Y. E. (2009). Analyzing patterns of user content generation in online social networks. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 369–378). ACM.
Gyarmati, L., & Trinh, T. A. (2010). Measuring user behavior in online social networks. IEEE Network, 24(5), 26–31.
Jalali, Z. S., Rezvanian, A., & Meybodi, M. R. (2015). A two-phase sampling algorithm for social networks. In 2015 2nd international conference on knowledge-based engineering and innovation (KBEI) (pp. 1165–1169). IEEE.
Jalali, Z. S., Rezvanian, A., & Meybodi, M. R. (2016). Social network sampling using spanning trees. International Journal of Modern Physics C, 27(5), 1650052.
Janssen, J., Hurshman, M., & Kalyaniwalla, N. (2012). Model selection for social networks using graphlets. Internet Mathematics, 8(4), 338–363.
Jin, L., Chen, Y., Wang, T., Hui, P., & Vasilakos, A. V. (2013). Understanding user behavior in online social networks: A survey. IEEE Communications Magazine, 51(9), 144–150.
Khomami, M. M. D., Rezvanian, A., & Meybodi, M. R. (2016). Distributed learning automata-based algorithm for community detection in complex networks. International Journal of Modern Physics B, 30, 1650042.
Kwak, H., Lee, C., Park, H., & Moon, S. (2010). What is Twitter, a social network or a news media? In Proceedings of the 19th international conference on World wide web (pp. 591–600). ACM.
Leskovec, J., Backstrom, L., Kumar, R., & Tomkins, A. (2008). Microscopic evolution of social networks. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 462–470). ACM.
Leskovec, J., McGlohon, M., Faloutsos, C., Glance, N. S., & Hurst, M. (2007). Patterns of Cascading behavior in large blog graphs. In SDM (Vol. 7, pp. 551–556). SIAM.
Liu, H., Nazir, A., Joung, J., & Chuah, C.-N. (2013). Modeling/predicting the evolution trend of osn-based applications. In Proceedings of the 22nd international conference on world wide web (pp. 771–780). ACM.
Li, Y., Wu, C., Wang, X., & Luo, P. (2014). A network-based and multi-parameter model for finding influential authors. Journal of Informetrics, 8(3), 791–799.
Luo, P., Li, Y., Wu, C., & Zhang, G. (2015). Toward cost-efficient sampling methods. International Journal of Modern Physics C, 26(5), 1550050.
Mahdaviani, M., Kordestani, J. K., Rezvanian, A., & Meybodi, M. R. (2015). LADE: Learning automata based differential evolution. International Journal on Artificial Intelligence Tools, 24(6), 1550023.
Mofrad, M. H., Sadeghi, S., Rezvanian, A., & Meybodi, M. R. (2015). Cellular edge detection: Combining cellular automata and cellular learning automata. AEU-International Journal of Electronics and Communications, 69(9), 1282–1290.
Morales, A. J., Losada, J. C., & Benito, R. M. (2012). Users structure and behavior on an online social network during a political protest. Physica A: Statistical Mechanics and Its Applications, 391(21), 5244–5253.
Mousavian, A., Rezvanian, A., & Meybodi, M. R. (2013). Solving minimum vertex cover problem using learning automata. In Proceedings of 13th Iranian conference on fuzzy systems (IFSC 2013) (pp. 1–5).
Mousavian, A., Rezvanian, A., & Meybodi, M. R. (2014). Cellular learning automata based algorithm for solving minimum vertex cover problem. In 2014 22nd Iranian conference on electrical engineering (ICEE) (pp. 996–1000). IEEE.
Narendra, K. S., & Thathachar, M. A. L. (1989). Learning Automata: An introduction. Prentice-Hall.
Nazir, A., Raza, S., & Chuah, C.-N. (2008). Unveiling facebook: A measurement study of social network based applications. In Proceedings of the 8th ACM SIGCOMM conference on Internet measurement (pp. 43–56). ACM.
Newman, M. E. J. (2005). A measure of betweenness centrality based on random walks. Social Networks, 27(1), 39–54.
Rezvanian, A., & Meybodi, M. R. (2015a). Finding maximum clique in stochastic graphs using distributed learning automata. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 23(1), 1–31.
Rezvanian, A., & Meybodi, M. R. (2015b). Finding minimum vertex covering in stochastic graphs: A learning automata approach. Cybernetics and Systems, 46(8), 698–727.
Rezvanian, A., & Meybodi, M. R. (2016). A new learning automata-based sampling algorithm for social networks. International Journal of Communication Systems, 1–21. http://dx.doi.org/10.1002/dac.3091. In press.
Rezvanian, A., Rahmati, M., & Meybodi, M. R. (2014). Sampling from complex networks using distributed learning automata. Physica A: Statistical Mechanics and Its Applications, 396, 224–234.
Shen, J., Brdiczka, O., & Ruan, Y. (2013). A comparison study of user behavior on Facebook and Gmail. Computers in Human Behavior, 29(6), 2650–2655.
Soleimani-Pouri, M., Rezvanian, A., & Meybodi, M. R. (2012). Solving maximum clique problem in stochastic graphs using learning automata. In 2012 fourth international conference on computational aspects of social networks (CASoN) (pp. 115–119).
Soleimani-Pouri, M., Rezvanian, A., & Meybodi, M. R. (2014). Distributed learning automata based algorithm for solving maximum clique problem in stochastic graphs. International Journal of Computer Information Systems and Industrial Management Applications, 6, 484–493.
Vongsingthong, S., Boonkrong, S., Kubek, M., & Unger, H. (2015). On the distributions of user behaviors in complex online social networks. In Recent advances in information and communication technology 2015 (pp. 237–246). Springer.
Watts, D. J., & Strogatz, S. H. (1998). Collective dynamics of small-world networks. Nature, 393(6684), 440–442.
Wilcoxon, F. (1945). Individual comparisons by ranking methods. Biometrics Bulletin, 1(6), 80–83.
Yan, Q., Wu, L., & Zheng, L. (2013). Social network based microblog user behavior analysis. Physica A: Statistical Mechanics and Its Applications, 392(7), 1712–1723.
Zhong, E., Fan, W., Wang, J., Xiao, L., & Li, Y. (2012). Comsoc: Adaptive transfer of user behaviors over composite social network. In Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 696–704). ACM.
Zhu, Z., Su, J., & Kong, L. (2015). Measuring influence in online social network based on the user-content bipartite graph. Computers in Human Behavior, 52, 184–189.