Emergence of Scaling in Random Networks

Author(s): Albert-László Barabási and Réka Albert

Source: Science, New Series, Vol. 286, No. 5439 (Oct. 15, 1999), pp. 509-512
Published by: American Association for the Advancement of Science
ing systems form a huge genetic network whose vertices are proteins and genes, the chemical interactions between them representing edges (2). At a different organizational level, a large network is formed by the nervous system, whose vertices are the nerve cells, connected by axons (3). But equally complex networks occur in social science, where vertices are individuals or organizations and the edges are the social interactions between them (4), or in the World Wide Web (WWW), whose vertices are HTML documents connected by links pointing from one page to another (5, 6). Because of their large size and the complexity of their interactions, the topology of these networks is largely unknown.

Traditionally, networks of complex topology have been described with the random graph theory of Erdős and Rényi (ER) (7), but in the absence of data on large networks, the predictions of the ER theory were rarely tested in the real world. However, driven by the computerization of data acquisition, such topological information is increasingly available, raising the possibility of understanding the dynamical and topological stability of large networks.

Here we report on the existence of a high degree of self-organization characterizing the large-scale properties of complex networks. Exploring several large databases describing the topology of large networks that span fields as diverse as the WWW or citation patterns in science, we show that, independent of the system and the identity of its constituents, the probability $P(k)$ that a vertex in the network interacts with $k$ other vertices decays as a power law, following $P(k) \sim k^{-\gamma}$. This result indicates that large networks self-organize into a scale-free state, a feature unpredicted by all existing random network models. To explain the origin of this scale invariance, we show that existing network models fail to incorporate growth and preferential attachment, two key features of real networks. Using a model incorporating these two ingredients, we show that they are responsible for the power-law scaling observed in real networks. Finally, we argue that these ingredients play an easily identifiable and important role in the formation of many complex systems, which implies that our results are relevant to a large class of networks observed in nature.

Although there are many systems that form complex networks, detailed topological data is available for only a few. The collaboration graph of movie actors represents a well-documented example of a social network. Each actor is represented by a vertex, two actors being connected if they were cast together in the same movie. The probability that an actor has $k$ links (characterizing his or her popularity) has a power-law tail for large $k$, following $P(k) \sim k^{-\gamma_{\mathrm{actor}}}$, where $\gamma_{\mathrm{actor}} = 2.3 \pm 0.1$ (Fig. 1A). A more complex network with over 800 million vertices (8) is the WWW, where a vertex is a document and the edges are the links pointing from one document to another. The topology of this graph determines the Web's connectivity and, consequently, our effectiveness in locating information on the WWW (5). Information about $P(k)$ can be obtained using robots (6), indicating that the probability that $k$ documents point to a certain Web page follows a power law, with $\gamma_{\mathrm{www}} = 2.1 \pm 0.1$ (Fig. 1B) (9). A network whose topology reflects the historical patterns of urban and industrial development is the electrical power grid of the western United States, the vertices being generators, transformers, and substations and the edges being the high-voltage transmission lines between them (10). Because of the relatively modest size of the network, containing only 4941 vertices, the scaling region is less prominent but is nevertheless approximated by a power law with an exponent $\gamma_{\mathrm{power}} \approx 4$ (Fig. 1C). Finally, a rather large complex network is formed by the citation patterns of the scientific publications, the vertices being papers published in refereed journals and the edges being links to the articles cited in a paper. Recently Redner (11) has shown that the probability that a paper is cited $k$ times (representing the connectivity of a paper within the network) follows a power law with exponent $\gamma_{\mathrm{cite}} = 3$.

The above examples (12) demonstrate that many large random networks share the common feature that the distribution of their local connectivity is free of scale, following a power law for large $k$ with an exponent $\gamma$ between 2.1 and 4, which is unexpected within the framework of the existing network models. The random graph model of ER (7) assumes that we start with $N$ vertices and connect each pair of vertices with probability $p$. In the model, the probability that a vertex has $k$ edges follows a Poisson distribution $P(k) = e^{-\lambda}\lambda^{k}/k!$, where

$$\lambda = N \binom{N-1}{k} p^{k} (1-p)^{N-1-k}.$$

In the small-world model recently introduced by Watts and Strogatz (WS) (10), $N$ vertices form a one-dimensional lattice, each vertex being connected to its two nearest and next-nearest neighbors. With probability $p$, each edge is reconnected to a vertex chosen at random. The long-range connections generated by this process decrease the distance between the vertices, leading to a small-world phenomenon (13), often referred to as six degrees of separation (14). For $p = 0$, the probability distribution of the connectivities is $P(k) = \delta(k - z)$, where $z$ is the coordination number in the lattice; whereas for finite $p$, $P(k)$ still peaks around $z$, but it gets broader (15). A common feature of the ER and WS models is that the probability of finding a highly connected vertex (that is, a large $k$) decreases exponentially with $k$; thus, vertices with large connectivity are practically absent. In contrast, the power-law tail characterizing $P(k)$ for the networks studied indicates that highly connected (large $k$) vertices have a large chance of occurring, dominating the connectivity.

There are two generic aspects of real networks that are not incorporated in these models. First, both models assume that we start with a fixed number ($N$) of vertices that are then randomly connected (ER model), or reconnected (WS model), without modifying $N$. In contrast, most real world networks are open and they form by the continuous addition of new vertices to the system, thus the number of vertices $N$ increases throughout the lifetime of the network. For example, the actor network grows by the addition of new actors to the system, the WWW grows exponentially over time by the addition of new Web pages (8), and the research literature constantly grows by the publication of new papers. Consequently, a common feature of

Fig. 1. The distribution function of connectivities for various large networks. (A) Actor collaboration graph with $N = 212{,}250$ vertices and average connectivity $\langle k \rangle = 28.78$. (B) WWW, $N = 325{,}729$, $\langle k \rangle = 5.46$ (6). (C) Power grid data, $N = 4941$, $\langle k \rangle = 2.67$. The dashed lines have slopes (A) $\gamma_{\mathrm{actor}} = 2.3$, (B) $\gamma_{\mathrm{www}} = 2.1$ and (C) $\gamma_{\mathrm{power}} = 4$.


This content downloaded from on Thu, 6 Feb 2014 19:43:52 PM

All use subject to JSTOR Terms and Conditions

these systems is that the network continuously expands by the addition of new vertices that are connected to the vertices already present in the system.

Second, the random network models assume that the probability that two vertices are connected is random and uniform. In contrast, most real networks exhibit preferential connectivity. For example, a new actor is most likely to be cast in a supporting role with more established and better-known actors. Consequently, the probability that a new actor will be cast with an established one is much higher than that the new actor will be cast with other less-known actors. Similarly, a newly created Web page will be more likely to include links to well-known popular documents with already-high connectivity, and a new manuscript is more likely to cite a well-known and thus much-cited paper than its less-cited and consequently less-known peer. These examples indicate that the probability with which a new vertex connects to the existing vertices is not uniform; there is a higher probability that it will be linked to a vertex that already has a large number of connections.

We next show that a model based on these two ingredients naturally leads to the observed scale-invariant distribution. To incorporate the growing character of the network, starting with a small number ($m_0$) of vertices, at every time step we add a new vertex with $m$ ($\le m_0$) edges that link the new vertex to $m$ different vertices already present in the system. To incorporate preferential attachment, we assume that the probability $\Pi$ that a new vertex will be connected to vertex $i$ depends on the connectivity $k_i$ of that vertex, so that $\Pi(k_i) = k_i / \sum_j k_j$. After $t$ time steps, the model leads to a random network with $t + m_0$ vertices and $mt$ edges. This network evolves into a scale-invariant state with the probability that a vertex has $k$ edges, following a power law with an exponent $\gamma_{\mathrm{model}} = 2.9 \pm 0.1$ (Fig. 2A). Because the power law observed for real networks describes systems of rather different sizes at different stages of their development, it is expected that a correct model should provide a distribution whose main features are independent of time. Indeed, as Fig. 2A demonstrates, $P(k)$ is independent of time (and subsequently independent of the system size $m_0 + t$), indicating that despite its continuous growth, the system organizes itself into a scale-free stationary state.

The development of the power-law scaling in the model indicates that growth and preferential attachment play an important role in network development. To verify that both ingredients are necessary, we investigated two variants of the model. Model A keeps the growing character of the network, but preferential attachment is eliminated by assuming that a new vertex is connected with equal probability to any vertex in the system [that is, $\Pi(k) = \mathrm{const} = 1/(m_0 + t - 1)$]. Such a model (Fig. 2B) leads to $P(k) \sim \exp(-\beta k)$, indicating that the absence of preferential attachment eliminates the scale-free feature of the distribution. In model B, we start with $N$ vertices and no edges. At each time step, we randomly select a vertex and connect it with probability $\Pi(k_i) = k_i / \sum_j k_j$ to vertex $i$ in the system. Although at early times the model exhibits power-law scaling, $P(k)$ is not stationary: because $N$ is constant and the number of edges increases with time, after $T \simeq N^2$ time steps the system reaches a state in which all vertices are connected. The failure of models A and B indicates that both ingredients, growth and preferential attachment, are needed for the development of the stationary power-law distribution observed in Fig. 1.

Because of the preferential attachment, a vertex that acquires more connections than another one will increase its connectivity at a higher rate; thus, an initial difference in the connectivity between two vertices will increase further as the network grows. The rate at which a vertex acquires edges is $\partial k_i/\partial t = k_i/2t$, which gives $k_i(t) = m(t/t_i)^{0.5}$, where $t_i$ is the time at which vertex $i$ was added to the system (see Fig. 2C), a scaling property that could be directly tested once time-resolved data on network connectivity becomes available. Thus older (with smaller $t_i$) vertices increase their connectivity at the expense of the younger (with larger $t_i$) ones, leading over time to some vertices that are highly connected, a "rich-get-richer" phenomenon that can be easily detected in real networks. Furthermore, this property can be used to calculate $\gamma$ analytically. The probability that a vertex $i$ has a connectivity smaller than $k$, $P[k_i(t) < k]$, can be written as $P(t_i > m^2 t/k^2)$. Assuming that we add the vertices to the system at equal time intervals, we obtain $P(t_i > m^2 t/k^2) = 1 - P(t_i \le m^2 t/k^2) = 1 - m^2 t/[k^2 (t + m_0)]$. The probability density $P(k)$ can be obtained from $P(k) = \partial P[k_i(t) < k]/\partial k$, which over long time periods leads to the stationary solution

$$P(k) = \frac{2m^2}{k^3},$$

giving $\gamma = 3$, independent of $m$. Although it reproduces the observed scale-free distribution, the proposed model cannot be expected to account for all aspects of the studied networks. For that, we need to model these systems in more detail. For example, in the model we assumed linear preferential attachment; that is, $\Pi(k) \sim k$. However, although in general $\Pi(k)$ could have an arbitrary nonlinear form $\Pi(k) \sim k^{\alpha}$, simulations indicate that scaling is present only for $\alpha = 1$. Furthermore, the exponents obtained for the different networks are scattered between 2.1 and 4. However, it is easy to modify our model to account for exponents different from $\gamma = 3$. For example, if we assume that a fraction $p$ of the links is directed, we obtain $\gamma(p) = 3 - p$, which is supported by numerical simulations (16). Finally, some networks evolve not only by adding new vertices but by adding (and sometimes removing) connections between established vertices. Although these and other system-specific features could modify the exponent $\gamma$, our model offers the first successful mechanism accounting for the scale-invariant nature of real networks.

Growth and preferential attachment are mechanisms common to a number of complex systems, including business networks (17, 18), social networks (describing individuals or organizations), transportation networks (19), and so on. Consequently, we expect that the scale-invariant state observed in all systems for which detailed data has been available to us is a generic property of many complex networks, with applicability reaching far beyond the quoted examples. A better description of these systems would help in understanding other complex systems

Fig. 2. (A) The power-law connectivity distribution at $t = 150{,}000$ (○) and $t = 200{,}000$ (□) as obtained from the model, using $m_0 = m = 5$. The slope of the dashed line is $\gamma = 2.9$. (B) The exponential connectivity distribution for model A, in the case of $m_0 = m = 1$ (○), $m_0 = m = 3$ (□), $m_0 = m = 5$ (◇), and $m_0 = m = 7$ (△). (C) Time evolution of the connectivity for two vertices added to the system at $t_1 = 5$ and $t_2 = 95$. The dashed line has slope 0.5.
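The growth rule and the attachment probability $\Pi(k_i) = k_i/\sum_j k_j$ described in the text can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' simulation code: the $m_0$ seed vertices are assumed to start with no edges, so the first new vertex picks its targets uniformly; the function names `grow_network` and `tail_exponent` are hypothetical; and the exponent is estimated with a common maximum-likelihood approximation for discrete power-law tails rather than with the straight-line fit shown in Fig. 2A. Setting `preferential=False` gives the uniform attachment of model A.

```python
import math
import random


def grow_network(m0, m, steps, preferential=True, seed=0):
    """Return the list of vertex degrees after `steps` vertices have been added."""
    if not 1 <= m <= m0:
        raise ValueError("need 1 <= m <= m0")
    rng = random.Random(seed)
    degree = [0] * m0   # assumption: the m0 seed vertices start with no edges
    endpoints = []      # vertex i appears k_i times in this list
    for t in range(steps):
        new = m0 + t    # index of the new vertex = number of existing vertices
        if preferential and endpoints:
            # A uniform draw from `endpoints` selects vertex i with probability
            # k_i / sum_j k_j; repeat until m distinct targets are found.
            targets = set()
            while len(targets) < m:
                targets.add(rng.choice(endpoints))
        else:
            # Model A, and also the first step of the full model (all degrees
            # are still zero): pick m distinct existing vertices uniformly.
            targets = set(rng.sample(range(new), m))
        degree.append(0)
        for v in sorted(targets):
            degree[v] += 1
            degree[new] += 1
            endpoints.extend((v, new))
    return degree


def tail_exponent(degree, k_min):
    """Approximate maximum-likelihood exponent for degrees k >= k_min (needs a non-empty tail)."""
    tail = [k for k in degree if k >= k_min]
    return 1.0 + len(tail) / sum(math.log(k / (k_min - 0.5)) for k in tail)


if __name__ == "__main__":
    full = grow_network(m0=5, m=5, steps=20000)
    model_a = grow_network(m0=5, m=5, steps=20000, preferential=False)
    print("vertices:", len(full), "edges:", sum(full) // 2)
    print("largest degree, full model:", max(full))
    print("largest degree, model A:", max(model_a))
    print("tail exponent estimate, full model:", tail_exponent(full, k_min=20))
```

Keeping one list entry per edge endpoint avoids recomputing $\sum_j k_j$ at every step. The values `steps=20000` and `k_min=20` are illustrative choices and are much smaller than the runs reported in Fig. 2.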


as well, for which less topological information is currently available, including such important examples as genetic or signaling networks in biological systems. We often do not think of biological systems as open or growing, because their features are genetically coded. However, possible scale-free features of genetic and signaling networks could reflect the networks' evolutionary history, dominated by growth and aggregation of different constituents, leading from simple molecules to complex organisms. With the fast advances being made in mapping out genetic networks, answers to these questions might not be too far away. Similar mechanisms could explain the origin of the social and economic disparities governing competitive systems, because the scale-free inhomogeneities are the inevitable consequence of self-organization due to the local decisions made by the individual vertices, based on information that is biased toward the more visible (richer) vertices, irrespective of the nature and origin of this visibility.

References and Notes

1. R. Gallagher and T. Appenzeller, Science 284, 79 (1999); R. F. Service, ibid., p. 80.
2. G. Weng, U. S. Bhalla, R. Iyengar, ibid., p. 92.
3. C. Koch and G. Laurent, ibid., p. 96.
4. S. Wasserman and K. Faust, Social Network Analysis (Cambridge Univ. Press, Cambridge, 1994).
5. Members of the Clever project, Sci. Am. 280, 54 (June 1999).
6. R. Albert, H. Jeong, A.-L. Barabási, Nature 401, 130 (1999); A.-L. Barabási, R. Albert, H. Jeong, Physica A 272, 173 (1999); see also works.
7. P. Erdős and A. Rényi, Publ. Math. Inst. Hung. Acad. Sci. 5, 17 (1960); B. Bollobás, Random Graphs (Academic Press, London, 1985).
8. S. Lawrence and C. L. Giles, Science 280, 98 (1998); Nature 400, 107 (1999).
9. In addition to the distribution of incoming links, the WWW displays a number of other scale-free features characterizing the organization of the Web pages within a domain [B. A. Huberman and L. A. Adamic, Nature 401, 131 (1999)], the distribution of searches [B. A. Huberman, P. L. T. Pirolli, J. E. Pitkow, R. J. Lukose, Science 280, 95 (1998)], or the number of links per Web page (6).
10. D. J. Watts and S. H. Strogatz, Nature 393, 440 (1998).
11. S. Redner, Eur. Phys. J. B 4, 131 (1998).
12. We also studied the neural network of the worm Caenorhabditis elegans (3, 10) and the benchmark diagram of a computer chip (see http://visicad.cs. We found that $P(k)$ for both was consistent with power-law tails, despite the fact that for C. elegans the relatively small size of the system (306 vertices) severely limits the data quality, whereas for the wiring diagram of the chips, vertices with over 200 edges have been eliminated from the database.
13. S. Milgram, Psychol. Today 2, 60 (1967); M. Kochen, ed., The Small World (Ablex, Norwood, NJ, 1989).
14. J. Guare, Six Degrees of Separation: A Play (Vintage Books, New York, 1990).
15. M. Barthélemy and L. A. N. Amaral, Phys. Rev. Lett. 82, 15 (1999).
16. For most networks, the connectivity $m$ of the newly added vertices is not constant. However, choosing $m$ randomly will not change the exponent $\gamma$ (Y. Tu, personal communication).
17. W. B. Arthur, Science 284, 107 (1999).
18. Preferential attachment was also used to model evolving networks (L. A. N. Amaral and M. Barthélemy, personal communication).
19. J. R. Banavar, A. Maritan, A. Rinaldo, Nature 399, 130 (1999).
20. We thank D. J. Watts for providing the C. elegans and power grid data, B. C. Tjaden for supplying the actor data, H. Jeong for collecting the data on the WWW, and L. A. N. Amaral for helpful discussions. This work was partially supported by NSF Career Award DMR-9710998.

24 June 1999; accepted 2 September 1999

