Professional Documents
Culture Documents
Google PageRank
Google PageRank
Beat Signer
Global Information Systems Research Group
Department of Computer Science
ETH Zurich
P2 P3 P4 P6
R2 R3 R4 R6
1.5
1 1.5
1
▪ where
▪ Bi is the set of pages
that link to page Pi P3
R is an eigenvector of H
with eigenvalue 1
Vrije Universiteit Brussel, August 25, 2008 Beat Signer, signer@inf.ethz.ch 7
Matrix Representation ...
▪ We can use the power method to find R
t +1
R = HR t
0 1 2 1
For our example H = 1 0 0
0 1 2 0
0 0 C
H= and R = 0 0
1 0
0 1 2 0 1 2
C= and S = H + C =
0 1 2 1 1 2
Vrije Universiteit Brussel, August 25, 2008 Beat Signer, signer@inf.ethz.ch 9
Strongly Connected Pages (Graph)
1-d
▪ Add new transition P1 P2
probabilities between
all pages
▪ with probability d we follow
P4 P3
the hyperlink structure S
▪ with probability 1-d we
choose a random page
P5
1-d 1-d
G = (1 − d ) 1 + dS
1
R = GR
n
Vrije Universiteit Brussel, August 25, 2008 Beat Signer, signer@inf.ethz.ch 10
G = (1 − d ) 1 + dS
1
Examples n
A2
0.37
A1 A3
0.26 0.37
A2 B2
0.185 0.185
A1 A3 B1 B3
A2 B2
0.14 0.20
A1 A3 B1 B3
A2 B2
0.23 0.095
A1 A3 B1 B3
A2 B2
0.24 0.07
A1 A3 B1 B3
A2 B2
0.17 0.06
A1 A3 B1 B3
A4 P(B ) = 0.14
P( A) = 0.86
0.125
▪ Is PageRank fair?
Vrije Universiteit Brussel, August 25, 2008 Beat Signer, signer@inf.ethz.ch 19
Conclusions
▪ PageRank algorithm
▪ absolute quality of a page based on incoming links
▪ random surfer model
▪ computed as eigenvector of Google matrix G
▪ Implications for website development and SEO
▪ PageRank is just one (important) factor