Download as pdf or txt
Download as pdf or txt
You are on page 1of 6

Lecture 5: Case study—PageRank of google and image deconvolution

Problem: Rank websites?

Data: The linkage of webpages.

Example: We have A, B, C, D, E webpages

E
E s F means
F
webpage E has a hyperlink

Ǎ iB to F and so on

ěc
Model: Let Ti denote the score of node i.

In the simplest form, Google uses the following model

Ti 1 p1京 P Til 芳 PR
score from links to it
where n total number of nodes

Neil node numbers pointing to node i

Lj the number of outbound links of node j


p a parameter satisfying opal
Actually,
Ti is the probability representing the likelihood that a person randomly
clicking on links will arrive at webpage i. There are two terms in Ti :

1 pl去 The person will with probability 1 p open an arbitrary

new webpage with equal probability So he or she

opens webpage i with probability 1 p 六


will with probability p click links
p veil 芳 the person

an arbitrary link on the current webpage with


equal probability So the probability that the

person arrives webpage i is


the probability the current
is j
Pii webpage

Dhs there are Lj outbounds of j


so the conditional probability from j i

is 哥
Example: Consider the following website linkages. Find the corresponding (PR).

E 1 1
0 0 1 0
F adjacency
1 0 1 0 1
g
f
Da matrix o i o i o

Ǎ iB 1
0
1
0
0
0
0
0
0
0
0
0
0 0 0 1 0
Ǒc 0

0 卡

氟 仆们 p譬 i ii.in

8 0

0 0 0 0

e.g.chas two outbounds links to A


and B normalize
adjancymatrice
lumn size is 1cs.t.co
This is a linear equation on vector We can use
TER e.g LU decomposition

to solve it If p 0.85 then the solution is

丌 簪
8

So the ranking is

rank node score in out


1 C 2 2
0.305
2 B 0.242 3 2
3 D 0.208 2 1
4 A 0.189 3 2
5 F 0.032 1 1
6 E 0.025 0 3

Remark: 1) Why C ranks higher than B?

2) Why B ranks higher than A?

In general, Equation (PR) can be written in matrix form as

T 兴 T
where PA TERT HER
is the score vector, and A is the normalized

adjacency matrix with each column sum 1. We obtain

II PA 丌 舁T
which can be solved by LU decomposition.
Question: Prove that the solution T satisfy TT 1 and Ti 0 i 1 n

i.e., prove that the solution 丌 is indeed a probability distribution.

proof Since A is the normalized adjacency matrix with each column


sum 1 we have TA TT
We multiply T T on both side of I PA T 号 Tandgē
兴 TT
T T
T T pT AT
T
That is i T pin 1 p
T
Hence T T 1

To prove Ti 0 we can use Perron Frobenius Theorem

It is complicated and we omit the details here

Open questions: (a) How to improve the model (PR)?

(b) Any other possible application of (PR)?

Image Deconvolution

Problem: Recover the clear image from a blurred one.


Model: For simplicity, we consider 1D example, i.e., the unknown clear image is a

vector TER where xi i 1 n is level at pixel i


gray
We also assume the blurring is a moving average with weights 1 1 1

Remark: Other weights are possible, depending on physics model.

Then the observed blurred image 万 ER satis es

X i t t X i t Xi t l
i 1 n
bi 3
2

Illustration i 3iz it iitlitz at time ot

i 2 i t i i t l it2 at time 20t


i_t i i tl it2it3 at time 30t
i 2i l i i t l it2 camera sensor

For i 1 we have b xotx txz 13


For i n we have bn xn itxntxnt.tl3
where Xo xnti are not defined yet
We assume xo Xnti 0 This is called zero boundary condition

Altogether we have

At n where A

信 亨
iii
In the general case

bi Wk Xi_kt W XittWoxitW IXitltW zxitzt.n.tW kX.tk


Úxblǎkemi
品 Wix in is called convolusion
In general we can model blurring by
AT 万
where AER is a matrix representing a convolution

Remark: 1) b contains noise

2) A is very singular since the weight w̅ is non-negative and

Hence solving Ax=b directly will not give a clear image.


balance
regularization
parameter to
Methodology: We use regularization. and
the importanceT of
11AT 51liislarge
We solve 11Ax̅ 万11st
ün 1jflargerx
Tiisge.hn
llxllzissmalle toAx
keipntomaketh.eu
iregnm
image regular

and set it to
i Take gradient of the objective function
Solve AT A x̅ 5 x̅ Ō

I
Solve IATA I x̅ AT5

Symmetric positive
definite
ATAJ 1 I J ATA I
D ATAt XI IT
92 I J j TATAy x
T
y y
For any Jtō JT ATAt
HAJI发 11TH 0

Therefore we can use Cholesky decomposition to solve


ATAt I x̅ AT万

You might also like