
Chapter 1: Events and Probability

Thursday, June 9, 2010,7:00 (GMT+7)

1. Background
$\Pr(E_1 \cup E_2) = \Pr(E_1) + \Pr(E_2) - \Pr(E_1 \cap E_2)$

Applications:
- 2 independent events: $\Pr(A \cap B) = \Pr(A)\Pr(B)$
- 2 disjoint events: $\Pr(E_1 \cup E_2) = \Pr(E_1) + \Pr(E_2)$
2. Theory summary

Main topics | Content | Take note

1. Verifying Polynomial Identities

Suppose we have a program that multiplies polynomials.
Example: we ask whether
$(x + 1)(x - 2)(x + 3)(x - 4)(x + 5)(x - 6) \equiv x^6 - 7x^3 + 25$
and our program outputs $x^6 - 7x^3 + 25$.
We want to check that this result is correct. We could multiply the factors out one by one, but that is expensive, and in practice it simply reruns the old polynomial-multiplication algorithm along the same path: if that algorithm is wrong, it produces the same wrong result again.
Instead we use a randomized algorithm to answer the question: is $F(x) \equiv G(x)$?

Take note:
The algorithm here serves as an introduction to solving problems with probability. Do not dwell on:
- whether the roots of the polynomial are real or integer
- whether the sampling range is {1..100d} or {1..1000000d}

ALGORITHM 1.1:
Choose an arbitrary value x = a. If $F(a) \neq G(a)$, conclude immediately that $F(x) \not\equiv G(x)$.

PROBABILISTIC ANALYSIS:
When F(a) = G(a) we cannot yet conclude that $F(x) \equiv G(x)$, because a may be a root of the equation F(x) - G(x) = 0.
Suppose F(x) is a polynomial of degree d (and G(x) has degree at most d). Then F(x) - G(x) cannot have more than d roots. So if $F(x) \not\equiv G(x)$, among all the integers we might choose there are at most d values a for which F(a) = G(a).
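Algorithm 1.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`eval_factored`, `eval_coeffs`, `probably_identical`) are hypothetical, F is given as a list of constants c for factors (x + c), G as a coefficient list from the highest degree down, and a is sampled uniformly from {1, ..., 100d}. The list `G_expanded` is the hand-expanded product, included so that one call compares truly identical polynomials.

```python
import random

def eval_factored(constants, x):
    """Evaluate F(x) = product of (x + c) over c in constants."""
    value = 1
    for c in constants:
        value *= x + c
    return value

def eval_coeffs(coeffs, x):
    """Evaluate G(x) from coefficients listed highest degree first (Horner's rule)."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

def probably_identical(constants, coeffs, trials=10, rng=random):
    """Return False as soon as some a has F(a) != G(a); otherwise return True."""
    d = max(len(constants), len(coeffs) - 1)  # upper bound on the degree
    for _ in range(trials):
        a = rng.randint(1, 100 * d)  # uniform in {1, ..., 100d}
        if eval_factored(constants, a) != eval_coeffs(coeffs, a):
            return False  # certain: F and G are different polynomials
    return True  # wrong with probability at most (1/100)**trials

F = [1, -2, 3, -4, 5, -6]                       # (x+1)(x-2)(x+3)(x-4)(x+5)(x-6)
G_claimed = [1, 0, 0, -7, 0, 0, 25]             # x^6 - 7x^3 + 25
G_expanded = [1, -3, -41, 87, 400, -444, -720]  # the product expanded by hand
print(probably_identical(F, G_claimed))
print(probably_identical(F, G_expanded))
```

A `False` answer is always correct; only a `True` answer carries the small error probability.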

If we choose a uniformly from {1, ..., 100d}, the sample space has 100d possibilities.
--> The probability of hitting a root of F(x) - G(x) is at most d/100d = 1/100. This is also the probability that ALGORITHM 1.1 is wrong.
Run the algorithm n times independently. Applying the formula for independent events, the probability that Algorithm 1.1 fails is at most $(1/100)^n$.

2. Axioms of Probability

The sample space is the set of all possible outcomes of a random experiment.
The probability function is a mapping from the set of events to the real numbers R.
We write this function as: Pr(E) = the probability of event E.
Conditional Probability: $\Pr(E \mid F) = \frac{\Pr(E \cap F)}{\Pr(F)}$
Law of Total Probability: let E1, E2 be disjoint events ($E_1 \cap E_2 = \emptyset$) that together cover the sample space. Then for any event B:
$\Pr(B) = \Pr(B \cap E_1) + \Pr(B \cap E_2)$
=> Bayes' Law: $\Pr(E_1 \mid B) = \frac{\Pr(B \cap E_1)}{\Pr(B)} = \frac{\Pr(B \cap E_1)}{\Pr(B \cap E_1) + \Pr(B \cap E_2)}$

Take note:
- An easy way to remember conditional probability: the event F is a condition that restricts the sample space to a smaller sample space.
- Examples:
+ the sample space of the natural numbers is {0,1,2,3,...}.
+ the sample space of the natural numbers restricted to those less than 5 is {0,1,2,3,4}
+ the event "even number" is the set of even natural numbers. (An event is a subset of the sample space.)
+ two disjoint events that cover the sample space are E1 = the even natural numbers and E2 = the odd natural numbers.

3. Verifying Matrix Multiplication
Given three n*n matrices A, B and C, we need to check whether A*B ?= C.
Here A and B are matrices whose entries are only 0 and 1.
Classical algorithm: compute A*B and compare it with C. Time: $\Theta(n^3)$.

ALGORITHM(1.3): choose a random n-dimensional vector r = (r1, r2, ...., rn), with ri = 0 or 1, 1 <= i <= n.

Take note:
- The randomized algorithm runs in time $\Theta(n^2)$ and is very simple to implement. In exchange, a single run is only guaranteed to be correct with probability at least 1/2.
- After 100 runs we can be very confident in the result, because the failure probability is at most $(1/2)^{100}$, which is smaller than the probability that a bit in our computer gets flipped by mistake.
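A minimal sketch of Algorithm 1.3 in Python follows, assuming the matrices are plain lists of lists and the arithmetic is over the integers. The function names `mat_vec` and `verify_product` are hypothetical. Each run draws a random 0/1 vector r and compares A(Br) with Cr using three matrix-vector products, so it never forms the product AB.

```python
import random

def mat_vec(M, v):
    """Multiply matrix M (list of rows) by vector v: Theta(n^2) operations."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def verify_product(A, B, C, runs=20, rng=random):
    """Return False if some run finds A(Br) != Cr, else True."""
    n = len(A)
    for _ in range(runs):
        r = [rng.randint(0, 1) for _ in range(n)]  # each r_i is 0 or 1
        if mat_vec(A, mat_vec(B, r)) != mat_vec(C, r):
            return False  # certain: AB != C
    return True  # wrong with probability at most (1/2)**runs

A = [[1, 0], [1, 1]]
B = [[0, 1], [1, 1]]
C_good = [[0, 1], [1, 2]]  # the true product AB
C_bad = [[0, 1], [1, 1]]   # differs from AB in one entry
print(verify_product(A, B, C_good), verify_product(A, B, C_bad))
```

As with the polynomial test, a `False` answer is always right, and repeating the run drives down the chance of a wrong `True`.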

PROBABILISTIC ANALYSIS:

We compute A*B*r as A*(B*r) and compare it with C*r.
Case 1: $ABr \neq Cr$, which implies $AB \neq C$.
Case 2: $ABr = Cr$. In this case it is still possible that $AB \neq C$.

We compute the probability that $AB \neq C$ and $ABr = Cr$, that is, the probability that the algorithm fails.
Let D = AB - C. Then $D \neq 0$ and $Dr = 0$.
Since $D \neq 0$, there is an entry $d_{ki}$ of D with $d_{ki} \neq 0$.
Moreover Dr = 0, so row k gives $\sum_{j=1}^{n} d_{kj} r_j = 0$, hence

$r_i = -\frac{\sum_{j=1, j \neq i}^{n} d_{kj} r_j}{d_{ki}}$   (3.1)

Here all the $r_j$ are chosen at random. Suppose we have already chosen every $r_j$ (j = 1 to n) except $r_i$. The right-hand side of (3.1) then has some fixed value, which may be 0, 1 or something else. So the probability that the choice of $r_i$ satisfies equation (3.1) is at most 1/2, because $r_i$ can only take the value 0 or 1.
Hence the probability of failure in one run of ALGORITHM(1.3) is at most 1/2.
Running it k times independently gives a failure probability of at most $(1/2)^k$.

Take note:
All the $r_j$ are chosen at random (j = 1,2,...,n); only $r_i$ is left to be considered last. This technique is called deferred decision. The random values chosen earlier are treated as fixed, and the random event is brought in only at the deciding step.
Example: let x1,x2,x3,x4,x5,x6 be 6 random natural numbers (each uniformly distributed modulo 6). Compute the probability that x1 + x2 + x3 + x4 + x5 + x6 is divisible by 6.
Applying deferred decision, this probability is 1/6.

4. A Randomized Min-Cut Algorithm (Karger's Algorithm)

Given a graph G = (V,E). A cut-set is a set of edges whose removal increases the number of connected components of the graph. A Min-Cut of G is a smallest cut-set of the graph.
The problem is to find a min-cut of the graph and count the number of edges in it.
The classical algorithm has complexity n^3.

Take note:
Edge contraction of two vertices A and B merges A and B into a single vertex while keeping all of their connections to the other vertices of the graph (that is, the edges going in and out are kept).

ALGORITHM:
In each iteration we perform one edge contraction (explained in the take note) on a randomly chosen edge.
After n-2 iterations, 2 vertices remain.

Return Min-Cut = the number of edges connecting these 2 vertices
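The contraction loop can be sketched as follows. The sketch makes several assumptions: vertices are numbered 0..n-1, the graph is connected, it is given as an edge list, and merged vertices are tracked with a union-find array instead of an explicit multigraph. The names `contract_once` and `min_cut` are hypothetical. Drawing an edge from the original list and skipping it when both endpoints are already merged is the same as drawing uniformly from the edges that remain.

```python
import random

def contract_once(n, edges, rng=random):
    """One run of random contraction; returns the size of the cut it finds."""
    parent = list(range(n))

    def find(v):
        # Follow parent links to the representative of v's merged vertex.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    remaining = n
    while remaining > 2:
        u, v = rng.choice(edges)
        ru, rv = find(u), find(v)
        if ru != rv:           # skip self-loops created by earlier contractions
            parent[ru] = rv    # contract the edge (u, v)
            remaining -= 1
    # Edges whose endpoints lie in different merged vertices form the cut.
    return sum(1 for u, v in edges if find(u) != find(v))

def min_cut(n, edges, runs, rng=random):
    """Smallest cut size found over several independent runs."""
    return min(contract_once(n, edges, rng) for _ in range(runs))

# Two triangles joined by the single edge (2, 3): the min-cut has 1 edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(min_cut(6, edges, runs=200))
```

A single run can return a cut larger than the minimum, which is why the result is taken as the minimum over many runs.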

PROBABILISTIC ANALYSIS:
Let S and V - S be the two vertex sets separated by the Min-Cut. If we only contract edges inside S or inside V - S, the algorithm returns the correct result. Any contraction that removes an edge of the Min-Cut may make the result incorrect.
Let $E_i$ be the event that at iteration i we do not contract any edge of the Min-Cut. Let $F_i$ be the event that none of the first i iterations contracts an edge of the Min-Cut.
We have: $F_i = \bigcap_{j=1}^{i} E_j$
We need to compute $\Pr(F_{n-2})$.

Initially:
Let n, m be the number of vertices and the number of edges of G.
Let MC be a Min-Cut of G and c the number of edges in MC.
Suppose vertex A has the smallest degree in the graph, deg(A) = k.
Then c <= k. (4.1) (This can be proved by contradiction: the k edges at A already form a cut-set.)
By the handshaking theorem the degrees sum to 2m, and every degree is at least k, so
$m \geq \frac{nk}{2}$ (4.2)

First iteration:
Perform an edge contraction on an edge chosen at random in G.
Event E1 = the chosen edge is not in the Min-Cut.
Combining with (4.1) and (4.2):
$\Pr(E_1) = \Pr(F_1) = 1 - \frac{c}{m} \geq 1 - \frac{k}{nk/2} = 1 - \frac{2}{n}$

Remark: this bound on the failure probability does not depend on the number of edges of the graph, only on the number of vertices.

Take note:
Inequality: $1 - x \leq e^{-x}$
You can prove this inequality by differentiating.

Second iteration:
After the first iteration the graph has n-1 vertices.
Hence $\Pr(E_2 \mid F_1) \geq 1 - \frac{2}{n-1}$
Similarly, at iteration i the graph has n - i + 1 vertices, so
$\Pr(E_i \mid F_{i-1}) \geq 1 - \frac{2}{n-i+1}$

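In summary:

$\Pr(F_{n-2}) = \Pr(E_{n-2} \cap F_{n-3})$
$= \Pr(E_{n-2} \mid F_{n-3}) \Pr(F_{n-3})$
$= \Pr(E_{n-2} \mid F_{n-3}) \Pr(E_{n-3} \cap F_{n-4})$
$= \dots$
$= \Pr(E_{n-2} \mid F_{n-3}) \Pr(E_{n-3} \mid F_{n-4}) \cdots \Pr(E_2 \mid F_1) \Pr(F_1)$
$\geq \prod_{i=1}^{n-2} \left(1 - \frac{2}{n-i+1}\right) = \prod_{i=1}^{n-2} \frac{n-i-1}{n-i+1} = \frac{2}{n(n-1)}$

We take the smallest result over $n(n-1)\ln n$ independent runs of the program using ALGORITHM 1.4:
$\Pr(\text{fail}) \leq \left(1 - \frac{2}{n(n-1)}\right)^{n(n-1)\ln n} \leq e^{-2\ln n} = \frac{1}{n^2}$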
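The telescoping product and the final bound are easy to check numerically. The sketch below assumes only the formulas above; the function names `success_lower_bound` and `failure_bound` are hypothetical. It multiplies the factors exactly with `Fraction` and then evaluates the failure bound for the stated number of runs.

```python
from fractions import Fraction
from math import ceil, log

def success_lower_bound(n):
    """Product over i = 1..n-2 of (1 - 2/(n-i+1)), computed exactly."""
    prod = Fraction(1)
    for i in range(1, n - 1):
        prod *= 1 - Fraction(2, n - i + 1)
    return prod

def failure_bound(n):
    """(1 - 2/(n(n-1))) raised to ceil(n(n-1) ln n) independent runs."""
    runs = ceil(n * (n - 1) * log(n))
    return (1 - 2 / (n * (n - 1))) ** runs

n = 10
print(success_lower_bound(n), Fraction(2, n * (n - 1)))
print(failure_bound(n), 1 / n**2)
```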

3. Exercises:
http://docs.google.com/View?id=dgmqjfk5_188cq53p6ft

Chapter 2: Discrete Random Variables and Expectation


Thursday, June 9, 2010,12:00 (GMT+7)

1. Background
1.1. The inclusion-exclusion principle:
$\Pr(E_1 \cup E_2) = \Pr(E_1) + \Pr(E_2) - \Pr(E_1 \cap E_2)$

Applications:
- 2 independent events: $\Pr(A \cap B) = \Pr(A)\Pr(B)$
- 2 disjoint events: $\Pr(E_1 \cup E_2) = \Pr(E_1) + \Pr(E_2)$
1.2. Bayes' Law:
$\Pr(E_1 \mid B) = \frac{\Pr(B \cap E_1)}{\Pr(B)} = \frac{\Pr(B \cap E_1)}{\Pr(B \cap E_1) + \Pr(B \cap E_2)}$

2. Theory summary

Main topics | Content | Take note

1. Random Variables and Expectation

Random Variable: a random variable X is a mapping from the sample space $\Omega$ to the real numbers R.
Discrete Random Variable: a discrete random variable X is a random variable whose set of values is no longer all of R but a countable set.
The Expectation of a Random Variable: $E[X] = \sum_x x \Pr(X = x)$, where the sum runs over all values x that X can take.
Linearity of Expectation
X and Y are discrete random variables.
E[X + Y] = E[X] + E[Y]

Take note:
- A set S is countable if it is finite or there is a bijection between S and the natural numbers. Keep this definition in mind to follow the next part.
- Linearity of expectation is applied constantly, because it only requires the variables X and Y to be discrete (they need not be independent).
- Hint for the proof: the events ((X = x) ∩ (Y = y1)) and ((X = x) ∩ (Y = y2)) are 2 disjoint events. Hence:
$\Pr((X = x) \cap (Y = y_1)) + \Pr((X = x) \cap (Y = y_2)) = \Pr((X = x) \cap ((Y = y_1) \cup (Y = y_2)))$
Therefore:
$\sum_y \Pr((X = x) \cap (Y = y)) = \Pr(X = x)$
2. The Bernoulli and Binomial Random Variables

Bernoulli Random Variable [or indicator random variable]
Consider the outcome of an experiment:
Y = 1 if the outcome is a success
Y = 0 otherwise.
with Pr(Y = 1) = p;
E[Y] = 1 . p + 0 . (1-p) = p
Binomial Random Variable
We call X a binomial random variable with parameters n and p if:
$\Pr(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$, for k = 0, 1, ..., n
In other words, X is the number of successes in n independent trials T1, T2, ..., Tn where
Pr(T1 = 1) = Pr(T2 = 1) = ... = Pr(Tn = 1) = p
E[X] = np.

Take note:
There are 2 ways to define a random variable:
1. A definition based on logic
2. A definition based on probabilities, that is, stating the set of values and the probability of each event in that set.
The second way gives a more rigorous definition and is used more often. With a definition of the second kind we obtain the distribution of the variable.

Proof that a binomial random variable with parameters n and p has E[X] = np:
Let T1, T2, ..., Tn be the n trials.
Each Ti is a Bernoulli random variable with parameter p
=> E[Ti] = p; i = 1, 2, 3, ..., n
Applying linearity of expectation:
$E[X] = E\left[\sum_{i=1}^{n} T_i\right] = \sum_{i=1}^{n} E[T_i] = np$

3. Conditional Expectation

Conditional Expectation
Consider the part of the sample space $\Omega$ on which Z = z.
$E[Y \mid Z = z] = \sum_y y \Pr(Y = y \mid Z = z)$
is called the expectation of the random variable Y conditioned on Z = z.

Decomposition Law
$E[X] = \sum_y \Pr(Y = y) E[X \mid Y = y]$
The proof of this formula is similar to the proof of linearity of expectation.

Theorem on the expectation of an expectation:
E[Y] = E[E[Y | Z]]
Proof:
Let f(Z) = E[Y | Z]. We have:
$E[E[Y \mid Z]] = E[f(Z)] = \sum_z \Pr(Z = z) f(z) = \sum_z \Pr(Z = z) E[Y \mid Z = z] = E[Y]$
(the last equality follows from the decomposition law)

Take note:
Example: take 2 standard dice (standard meaning 6 faces, each with probability 1/6 and labelled with a different number from 1 to 6). One roll gives 2 numbers X1 and X2.
Let X = X1 + X2;
$E[X \mid X_1 = 2] = \sum_x x \Pr(X = x \mid X_1 = 2)$
Observe that 6 >= X1, X2 >= 1; here X1 = 2, so 8 >= X >= 3.
$E[X \mid X_1 = 2] = \sum_{x=3}^{8} x \Pr(X = x \mid X_1 = 2) = \sum_{x=3}^{8} x \cdot \frac{1}{6} = \frac{11}{2}$

Compare (law of total probability):
E1, E2 are disjoint events ($E_1 \cap E_2 = \emptyset$) that together cover the sample space. Then for any event B:
$\Pr(B) = \Pr(B \cap E_1) + \Pr(B \cap E_2)$
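The dice example can be checked by brute-force enumeration of the 36 equally likely outcomes. This is only a sketch; the helper `cond_expectation` is a hypothetical name introduced for the illustration. It computes E[X | X1 = 2] directly and then recovers E[X] from the decomposition law.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))  # all (X1, X2) pairs, equally likely

def cond_expectation(value, condition):
    """E[value | condition], by enumerating the outcomes that satisfy condition."""
    selected = [r for r in rolls if condition(r)]
    return Fraction(sum(value(r) for r in selected), len(selected))

def total(r):
    return r[0] + r[1]  # X = X1 + X2

e_given_2 = cond_expectation(total, lambda r: r[0] == 2)
# Decomposition law: E[X] = sum over z of Pr(X1 = z) * E[X | X1 = z]
e_total = sum(
    Fraction(1, 6) * cond_expectation(total, lambda r, z=z: r[0] == z)
    for z in range(1, 7)
)
print(e_given_2, e_total)
```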

4. The Geometric Distribution

Geometric Distribution
X is a geometric random variable with parameter p if:
$\Pr(X = n) = (1-p)^{n-1} p$, for n = 1, 2, ...

From this we can compute:
$\Pr(X \geq n) = \sum_{i \geq n} \Pr(X = i) = \sum_{i \geq n} (1-p)^{i-1} p = p(1-p)^{n-1} \sum_{i \geq 0} (1-p)^{i} = p(1-p)^{n-1} \cdot \frac{1}{1-(1-p)} = (1-p)^{n-1}$

Expectation formula for positive integer variables:
Let X be a discrete random variable that takes only positive integer values. Then:
$E[X] = \sum_{i \geq 1} \Pr(X \geq i)$

Applying this formula gives the expectation of a geometric random variable:
$E[X] = \sum_{i \geq 1} \Pr(X \geq i) = \sum_{i \geq 1} (1-p)^{i-1} = \frac{1}{1-(1-p)} = \frac{1}{p}$

Take note:
- The definition of the geometric random variable is given in the form of a probability distribution (see Chapter 1).
- In words: X is a geometric random variable with parameter p when X is the number of trials needed to obtain the first success, given that each trial succeeds with probability p.
- Compare:
1. Binomial Random Variable:
The number of trials: fixed = n
The number of successes: X
2. Geometric Random Variable:
The number of trials: X
The number of successes: fixed = 1
- Proof (expectation formula for positive integer variables):
Using the definition of expectation:
$E[X] = \sum_{j \geq 1} j \Pr(X = j) = \sum_{j \geq 1} \sum_{i=1}^{j} \Pr(X = j)$
Swapping the order of summation:
$\sum_{j \geq 1} \sum_{i=1}^{j} \Pr(X = j) = \sum_{i \geq 1} \sum_{j \geq i} \Pr(X = j) = \sum_{i \geq 1} \Pr(X \geq i)$

Extra: Coupon Collector's Problem
Problem: A box contains n types of coupons, with a very large supply of each type. Each time we draw 1 coupon. How many draws do we need in order to collect all n types?

Problem Analysis:
The problem asks for the number of draws needed to obtain n types. What, then, is the difference between having n-1 types and having n types? Once we hold 1 type, getting another type is easy: the probability that the next draw gives a new type is (n-1)/n. But once we hold n-1 types, the probability of drawing the n-th type is only 1/n. So drawing a new type does not depend on what we did before, only on the number of types collected so far. That is, the number of draws needed to go from i-1 types to i types depends only on the value of i.

Proof:
Let X_i be the number of additional draws from the moment we hold i-1 types until the moment we hold i types.
Each X_i (i=1,2,...,n) is a geometric random variable with parameter
$p_i = 1 - \frac{i-1}{n} = \frac{n-i+1}{n}$. Hence: $E[X_i] = \frac{1}{p_i} = \frac{n}{n-i+1}$
Hence:
$E[X] = E\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} E[X_i] = \sum_{i=1}^{n} \frac{n}{n-i+1}$
Setting k = n - i + 1 we get:
$E[X] = n \sum_{k=1}^{n} \frac{1}{k} = nH(n)$

Take note:
- This problem has many practical applications, so read both the analysis and the solution carefully.
- It uses the technique of a branching process that keeps no memory of earlier generations (memoryless).
- H(n) is called the harmonic number. $H(n) = \ln(n) + \Theta(1)$.
To prove this, bound the sum by the integral of f(x) = 1/x from 1 to n, using
$f(\lceil x \rceil) \leq f(x) \leq f(\lfloor x \rfloor)$
5. Application: The Expected Run-Time of Quicksort

Quicksort is a fairly simple and efficient algorithm. The crucial point in quicksort is choosing the pivot sensibly, so as to avoid the n^2 worst case of the algorithm. If we choose the pivot at random, does the algorithm become better?
To answer this question we analyze the running time of Quicksort with a randomly chosen pivot.

Probabilistic Analysis:
In Quicksort, after choosing the pivot, the work is to compare the pivot with every number in the subarray. Looked at closely, this comparison is the characteristic operation of the loop in Quicksort. So counting these comparisons gives the running time of the algorithm. Call the number of comparisons X.
Let y_1,y_2,...,y_n be the input values in sorted order.
Technique: use Bernoulli random variables. Let X_ij be the Bernoulli random variable with:
X_ij = 1 if y_i and y_j are compared at some point during the sort
X_ij = 0 otherwise.
i,j = 1,2,3...,n ; i < j;

We have:
$X = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}$
Hence:
$E[X] = E\left[\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}\right] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E[X_{ij}]$ (5.1)
Next we compute E[X_ij].
Consider the values from position i to position j:
y_i, y_i+1, ...., y_j ; (i <= j)
Clearly at some point a pivot is chosen from among these values while y_i and y_j are still in the same subarray. (This can be proved by contradiction.)
The first such pivot is equally likely to be any of these j-i+1 values.
Exactly 2 of these choices (the pivot is y_i or y_j) lead to y_i and y_j being compared.
Hence: Pr(X_ij = 1) = 2/(j-i+1)
So: E[X_ij] = 2/(j-i+1) (because X_ij is a Bernoulli random variable)
Substituting into (5.1):
$E[X] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j-i+1}$
Setting k = j-i+1:
$E[X] = \sum_{i=1}^{n-1} \sum_{k=2}^{n-i+1} \frac{2}{k}$
Swapping the order of summation:
$E[X] = \sum_{k=2}^{n} \sum_{i=1}^{n-k+1} \frac{2}{k} = \sum_{k=2}^{n} (n-k+1) \frac{2}{k} = (2n+2) \sum_{k=2}^{n} \frac{1}{k} - 2(n-1)$

Simplifying, with H(n) the harmonic number:
E[X] = (2n+2)H(n) - 4n
or $E[X] = \Theta(n \ln n)$
Exercise
Exercise 2.3:
1. Let f(x) be a convex function (f''(x) >= 0). Prove that: E[f(X)] >= f(E[X])
2. Prove that: $E[X^k] \geq (E[X])^k$
Exercise 2.6: Take 2 standard dice (standard meaning 6 faces, each with probability 1/6 and labelled with a different number from 1 to 6). Rolling the dice gives 2 numbers X1 and X2; let X = X1 + X2.
(a) Compute E[X | X1 even]
(b) Compute E[X | X1 = X2]
(c) Compute E[X1 | X = 9]
(d) Compute E[X1-X2 | X = k]
Exercise 2.7: Let X and Y be 2 independent geometric random variables with parameters p and q respectively.
(a) Compute Pr(X = Y)
(b) Compute E[max(X,Y)]
(c) Compute Pr(min(X,Y) = k)
(d) Compute $E[X \mid X \leq Y]$

Hint: You will find the memoryless property of the geometric random variable very useful.


Exercise 2.12: We draw cards from a box containing n types of cards.
(a) Compute the expectation of the number of cards we must draw until we hold all n types
(b) If we draw exactly 2n cards, what is the expected number of card types never drawn?
(c) If we draw exactly 2n cards, what is the expected number of card types drawn exactly once?
Exercise 2.22: The input is a random sequence of n numbers: a1,a2,....,an. Each number a_i is swapped with its neighbour until it reaches its sorted position.
Compute the expected number of swaps of Bubble Sort.
Hint: We say a_i and a_j are inverted if (i < j) AND (a_i > a_j).
Each swap in Bubble Sort removes one inverted pair.
Exercise 2.23: The input is a random sequence of n numbers: a1,a2,....,an
Compute the expected run-time of Linear Insertion Sort.
Hint: We say a_i is out of order if there exists a_j with (i < j) AND (a_i > a_j).
After the k-th iteration of Linear Insertion Sort, elements 1,2,...,k are all in order.

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Answers:
____________________________________________________
Exercise 2.3:
1. Let f(x) be a convex function (f''(x) >= 0). Prove that: $E[f(X)] \geq f(E[X])$
Apply the Taylor expansion around a point $\mu$:
$f(x) = f(\mu) + \frac{f'(\mu)(x-\mu)}{1!} + \frac{f''(\xi)(x-\mu)^2}{2!}$; where $\xi = \mu + c(x - \mu)$ and c is a constant in the interval (0,1).
Since $f''(\xi) \geq 0$: $f(x) \geq f(\mu) + f'(\mu)(x-\mu)$, so
$E[f(X)] \geq E[f(\mu) + f'(\mu)(X-\mu)]$ (1)
Because $f(\mu)$ and $f'(\mu)$ are constants, taking the expectation gives:
$E[f(\mu) + f'(\mu)(X-\mu)] = f(\mu) + f'(\mu)(E[X]-\mu)$ (2)
Choose $\mu = E[X]$ and combine (1) and (2) to get: $E[f(X)] \geq f(E[X])$
2. Prove that: $E[X^k] \geq (E[X])^k$

Let f(x) = x^k. We have f''(x) = k(k-1)x^(k-2) >= 0 (this holds when k is even, or when X is nonnegative);

Applying the inequality of part 1 gives the claim.
____________________________________________________
Exercise 2.6: Take 2 standard dice (standard meaning 6 faces, each with probability 1/6 and labelled with a different number from 1 to 6). Rolling the dice gives 2 numbers X1 and X2; let X = X1 + X2.
(a) Compute E[X | X1 even]
First we compute
E[X | X1] = E[(X1+X2) | X1] (linearity of conditional expectation)
= E[X1 | X1] + E[X2 | X1] = X1 + E[X2] (because X1 and X2 are independent)
= 7/2 + X1
Above we used: $E[X_2] = \sum_{x=1}^{6} x \Pr(X_2 = x) = \frac{1}{6} \sum_{x=1}^{6} x = \frac{7}{2}$
Given that X1 is even, X1 is equally likely to be 2, 4 or 6, so by the decomposition law:
$E[X \mid X_1 \text{ even}] = \sum_{x \in \{2,4,6\}} \Pr(X_1 = x \mid X_1 \text{ even}) E[X \mid X_1 = x] = \frac{1}{3}\left(\frac{7}{2} + 2\right) + \frac{1}{3}\left(\frac{7}{2} + 4\right) + \frac{1}{3}\left(\frac{7}{2} + 6\right) = \frac{15}{2}$
(b) Compute E[X | X1 = X2]
$E[X \mid X_1 = X_2] = \sum_{x=1}^{6} \Pr(X_2 = x \mid X_1 = X_2) E[X \mid (X_1 = X_2) \cap (X_2 = x)] = \sum_{x=1}^{6} \frac{1}{6} \cdot 2x = \frac{1}{3} \sum_{x=1}^{6} x = \frac{21}{3} = 7$
(c) Compute E[X1 | X = 9]
Since $1 \leq X_1, X_2 \leq 6$ and X = 9, X1 = 3,4,5,6.
$E[X_1 \mid X = 9] = \sum_{x=3}^{6} x \Pr(X_1 = x \mid X = 9)$
Using Bayes' Law: $\Pr(X_1 = x \mid X = 9) = \frac{\Pr(X_1 = x \cap X = 9)}{\Pr(X = 9)} = \frac{1/36}{4/36} = \frac{1}{4}$
Substituting: $E[X_1 \mid X = 9] = \frac{1}{4} \sum_{x=3}^{6} x = \frac{18}{4} = \frac{9}{2}$
(d) Compute E[X1-X2 | X = k]
E[X1-X2 | X = k] = E[X1 | X = k] - E[X2 | X = k] (linearity of expectation)
= 0.
This is because X1 and X2 are independent and play the same role in the expression above, so the two terms must be equal.
(A proof by contradiction is also a good approach.)
____________________________________________________
Exercise 2.7: Let X and Y be 2 independent geometric random variables with parameters p and q respectively.
(a) Compute Pr(X = Y)
$\Pr(X = Y) = \sum_{n \geq 1} \Pr(X = n \cap Y = n) = \sum_{n \geq 1} \Pr(X = n)\Pr(Y = n) = \sum_{n \geq 1} (1-p)^{n-1} p (1-q)^{n-1} q = pq \sum_{n \geq 1} ((1-p)(1-q))^{n-1} = \frac{pq}{1 - (1-p)(1-q)} = \frac{pq}{p + q - pq}$
(b) Compute E[max(X,Y)]
Since X and Y are geometric with parameters p and q, E[X] = 1/p and E[Y] = 1/q.
Let X1 be a Bernoulli random variable with
X1 = TRUE if and only if X = 1, i.e. the first trial succeeds. Pr(X1 = TRUE) = p;
X1 = FALSE otherwise.
E[max(X,Y)] = Pr(X1 = TRUE) E[max(X,Y) | X1 = TRUE] + Pr(X1 = FALSE) E[max(X,Y) | X1 = FALSE]
= p * E[Y] + (1-p) * E[max(X,Y) | X1 = FALSE] (b-1)
(because X1 = TRUE if and only if X = 1, in which case max(X,Y) = Y)
When X > 1, let X* be the number of further trials needed until the first success, so that X = X* + 1. Then
E[max(X,Y)] = p * E[Y] + (1-p) * E[max(X* + 1, Y)]
Let Y1 and Y* be the analogues of X1 and X* with X replaced by Y. Proceeding in the same way:
E[max(X,Y)] = p * E[Y] + (1-p) * ( q * E[max(X* + 1, Y) | Y1 = TRUE] + (1-q) * E[max(X* + 1, Y* + 1)] )
= p * E[Y] + (1-p) * ( q * E[X* + 1] + (1-q) * E[max(X*,Y*) + 1] ).
By the memoryless property of the geometric distribution, E[X*] = E[X] = 1/p, E[Y*] = E[Y] = 1/q and E[max(X*,Y*)] = E[max(X,Y)]. Substituting:
E[max(X,Y)] = p/q + (1-p) * ( q * (1/p + 1) + (1-q) * (E[max(X,Y)] + 1) )
Hence: $E[\max(X, Y)] = \frac{1 + \frac{p}{q} + \frac{q}{p} - p - q}{p + q - pq} = \frac{1}{p} + \frac{1}{q} - \frac{1}{p + q - pq}$

(c) Compute Pr(min(X,Y) = k)
$\Pr(\min(X, Y) = k) = \Pr(X = k \cap Y \geq k+1) + \Pr(X = k \cap Y = k) + \Pr(X \geq k+1 \cap Y = k)$
$= \Pr(X = k)\Pr(Y \geq k+1) + \Pr(X = k)\Pr(Y = k) + \Pr(X \geq k+1)\Pr(Y = k)$
$= (1-p)^{k-1} p (1-q)^{k} + (1-p)^{k-1} p (1-q)^{k-1} q + (1-p)^{k} (1-q)^{k-1} q$
(see section 4 of Chapter 2: $\Pr(Y \geq n) = (1-q)^{n-1}$)
$= (1-p)^{k-1} (1-q)^{k-1} (p + q - pq)$

(d) Compute $E[X \mid X \leq Y]$
The solution is similar to (b):
$E[X \mid X \leq Y] = \frac{1}{p + q - pq}$
____________________________________________________
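Exercise 2.12: We draw cards from a box containing n types of cards.
(a) Compute the expectation of the number of cards we must draw until we hold all n types
This is the same as the Coupon Collector's Problem in the theory summary. Result: E[X] = nH(n)
(b) If we draw exactly 2n cards, what is the expected number of card types never drawn?
Let X_i be the number of card types collected right after the i-th draw (i = 1,2,...,2n). Clearly:
X_1 = 1;
For i >= 2 there are 2 cases:
1. The next card is of a type we already hold. Then X_i = X_(i-1). Given X_(i-1), this happens with probability $\Pr(X_i = X_{i-1}) = \frac{X_{i-1}}{n}$
2. The next card is of a completely new type. Then X_i = X_(i-1) + 1. This happens with probability $\Pr(X_i = X_{i-1} + 1) = 1 - \frac{X_{i-1}}{n}$
Hence: $E[X_i \mid X_{i-1}] = X_{i-1} \cdot \frac{X_{i-1}}{n} + (X_{i-1} + 1)\left(1 - \frac{X_{i-1}}{n}\right) = 1 + X_{i-1}\left(1 - \frac{1}{n}\right)$
Taking the expectation of both sides: E[X_i] = E[E[X_i | X_i-1]] = 1 + a*E[X_i-1]; with a = 1-1/n.
Unrolling this recurrence: $E[X_{2n}] = a^{2n-1} X_1 + a^{2n-2} + \dots + a + 1$; with a = 1-1/n.
Substituting X_1 = 1 and simplifying: $E[X_{2n}] = a^{2n-1} + a^{2n-2} + \dots + a + 1 = \frac{1 - a^{2n}}{1 - a} = n\left(1 - \left(1 - \frac{1}{n}\right)^{2n}\right)$
When n is large we can use $\left(1 - \frac{1}{n}\right)^{n} \approx e^{-1}$, which gives $E[X_{2n}] \approx n(1 - e^{-2})$ types drawn at least once. The expected number of types never drawn is therefore $n - E[X_{2n}] = n\left(1 - \frac{1}{n}\right)^{2n} \approx n e^{-2}$.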

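(c) If we draw exactly 2n cards, what is the expected number of card types drawn exactly once?
The recurrence of part (b) does not carry over directly: a repeated card removes a type from the "drawn exactly once" set only if that type was still in the set. It is simpler to use indicator variables.
For each type t, let I_t = 1 if type t is drawn exactly once among the 2n draws. The number of draws of type t is binomial with parameters 2n and 1/n, so
$\Pr(I_t = 1) = \binom{2n}{1} \frac{1}{n} \left(1 - \frac{1}{n}\right)^{2n-1}$
By linearity of expectation, the expected number of types drawn exactly once is
$\sum_{t=1}^{n} E[I_t] = 2n \left(1 - \frac{1}{n}\right)^{2n-1} \approx 2n e^{-2}$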
____________________________________________________
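Exercise 2.22: The input is a random sequence of n numbers: a1,a2,....,an. Each number a_i is swapped with its neighbour until it reaches its sorted position.
Compute the expected number of swaps of Bubble Sort.
Proof:
We say a_i and a_j are inverted if (i < j) AND (a_i > a_j).
Let X_ij be the Bernoulli random variable with:
X_ij = 1 if a_i and a_j are an inverted pair
X_ij = 0 otherwise.
Assume each a_i is drawn independently and uniformly from {1,...,n}. Then for i < j:
$\Pr(X_{ij} = 1) = \sum_{k=1}^{n} \Pr(a_i = k \cap a_j < k) = \sum_{k=1}^{n} \frac{1}{n} \cdot \frac{k-1}{n} = \frac{1}{2} - \frac{1}{2n}$

Let X = the number of swaps in Bubble Sort.
In Bubble Sort the number of swaps is exactly the number of inverted pairs.
So: $X = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}$. Taking the expectation of both sides:
$E[X] = E\left[\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}\right] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E[X_{ij}]$ (Linearity of Expectation)
$E[X] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \left(\frac{1}{2} - \frac{1}{2n}\right) = \frac{n(n-1)}{2} \cdot \frac{n-1}{2n} = \frac{(n-1)^2}{4}$
If instead the input is a uniformly random permutation of n distinct numbers, then Pr(X_ij = 1) = 1/2 and E[X] = n(n-1)/4.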
____________________________________________________
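Exercise 2.23: The input is a random sequence of n numbers: a1,a2,....,an
Compute the expected run-time of Linear Insertion Sort.
Proof:
After sorting, the numbers in order are: 1,2,...,n
Suppose that before sorting the numbers 1,2,...,n are at positions x_1,x_2,...,x_n respectively. Here (x_1,x_2,...,x_n) is a permutation of (1,2,...,n), assumed uniformly random.
Moving number i from position x_i to position i takes |x_i - i| swaps. Counting this way, the total number of swaps is
$X = \sum_{i=1}^{n} |x_i - i|$
$E[X] = E\left[\sum_{i=1}^{n} |x_i - i|\right] = \sum_{i=1}^{n} E[|x_i - i|]$ (Linearity of Expectation)

$E[|x_i - i|] = \sum_{k=1}^{i-1} \Pr(x_i = k)(i - k) + \sum_{k=i+1}^{n} \Pr(x_i = k)(k - i) = \frac{1}{n}\left(\sum_{k=1}^{i-1} (i-k) + \sum_{k=i+1}^{n} (k-i)\right) = \frac{1}{n}\left(\sum_{j=1}^{i-1} j + \sum_{j=1}^{n-i} j\right)$

Hence:
$\sum_{i=1}^{n} E[|x_i - i|] = \frac{1}{n}\left(\sum_{i=1}^{n} \sum_{j=1}^{i-1} j + \sum_{i=1}^{n} \sum_{j=1}^{n-i} j\right)$
Swapping the order of summation:
$E[X] = \frac{1}{n}\left(\sum_{j=1}^{n-1} (n-j)j + \sum_{j=1}^{n-1} (n-j)j\right) = \frac{2}{n} \sum_{j=1}^{n-1} (n-j)j = \frac{2}{n}\left(n \sum_{j=1}^{n-1} j - \sum_{j=1}^{n-1} j^2\right)$
$= \frac{2}{n}\left(\frac{n^2(n-1)}{2} - \frac{(n-1)n(2n-1)}{6}\right) = \frac{n^2 - 1}{3}$
So $E[X] = \Theta(n^2)$. Strictly, the number of adjacent swaps made by insertion sort equals the number of inverted pairs, whose expectation for a uniformly random permutation is n(n-1)/4; both counts are $\Theta(n^2)$.

Chapter 5: Balls and Bins


Thursday, June 10, 2010,12:30 (GMT+7)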

1. Background
1.1. The inclusion-exclusion principle:
$\Pr(E_1 \cup E_2) = \Pr(E_1) + \Pr(E_2) - \Pr(E_1 \cap E_2)$

Applications:
- 2 independent events: $\Pr(A \cap B) = \Pr(A)\Pr(B)$
- 2 disjoint events: $\Pr(E_1 \cup E_2) = \Pr(E_1) + \Pr(E_2)$
1.2. Bayes' Law:
$\Pr(E_1 \mid B) = \frac{\Pr(B \cap E_1)}{\Pr(B)} = \frac{\Pr(B \cap E_1)}{\Pr(B \cap E_1) + \Pr(B \cap E_2)}$

1.3. Expectation
1.4. Binomial Distribution: n trials, each succeeding with probability p
1.5. Geometric Distribution: n trials, 1 success (the first success comes at trial n)
2. Theory summary

Main topics | Content | Take note

1. The Birthday Paradox

Problem:
There are 30 people in a room. What is the probability that some 2 of them have the same birthday?

Problem Analysis:
A birthday has 365 possibilities. Does it matter whether there are 30 people, 29, or only 1 or 2? With 1 person there is certainly no coincidence. With 2 people, the chance of no coincidence is the chance that person 2 was born on a different day from person 1: 364 cases out of 365 possible, so the probability is 364/365. In the same way, the probability that person i does not share a birthday with the earlier people does not depend on which days the earlier people were born, only on the value of i. This probability is:
1 - (i-1)/365

Proof:
Following the argument above, the probability that 30 people all have different birthdays is:
$\prod_{i=1}^{30} \left(1 - \frac{i-1}{365}\right) \approx 0.2937$

2. Balls into Bins

Generalizing the problem to n possible birthdays and m people, the probability that all birthdays are different is:
$\Pr = \prod_{i=0}^{m-1} \left(1 - \frac{i}{n}\right) \approx \prod_{i=0}^{m-1} e^{-i/n} = e^{-\sum_{i=0}^{m-1} i/n} = e^{-\frac{m(m-1)}{2n}}$   (1)
Above we used the formula $1 - x \approx e^{-x}$ for x close to 0.

Take note:
- Leap years and twins are not taken into account.
- In addition: compute the number of people needed in the room for the probability that some 2 people share a birthday to be 1/2. From formula (1), the value of m for which this probability equals 1/2 is given by $m^2/2n = \ln 2$, i.e. $m = \sqrt{2n \ln 2}$. This value grows only like $\sqrt{n}$, which is very small compared with one's first guess. That is why it is called a paradox.
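The numbers above are easy to reproduce. The sketch below assumes 365 equally likely birthdays. The function name `prob_all_distinct` is hypothetical. It evaluates the exact product, the exponential approximation (1), and the threshold m = sqrt(2n ln 2).

```python
from math import exp, log, sqrt

def prob_all_distinct(m, n=365):
    """Exact probability that m people have m different birthdays among n days."""
    p = 1.0
    for i in range(m):
        p *= 1 - i / n
    return p

exact = prob_all_distinct(30)
approx = exp(-30 * 29 / (2 * 365))  # formula (1) with m = 30, n = 365
m_half = sqrt(2 * 365 * log(2))     # m at which the approximation equals 1/2
print(round(exact, 4), round(approx, 4), round(m_half, 2))
```

The exact probability first drops below 1/2 at m = 23, which agrees with the approximate threshold of about 22.5.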
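Generalizing the Birthday Paradox, we build a mathematical model called balls into bins. The people correspond to balls and the birthdays correspond to bins. If we now throw m balls into n bins (assuming no ball misses), each bin ends up with some number of balls. We call maximum load the number of balls in the fullest bin.
Theorem: when n balls are thrown into n bins, the probability that the maximum load exceeds $3 \ln n / \ln \ln n$ is at most 1/n (for n sufficiently large).
Proof:
Consider the first bin. The probability that bin 1 contains at least M balls is at most:
$\binom{n}{M} \left(\frac{1}{n}\right)^M$
(choose M balls out of the n balls; each ball lands in bin 1 with probability 1/n).
$\binom{n}{M} \left(\frac{1}{n}\right)^M \leq \frac{1}{M!} \leq \left(\frac{e}{M}\right)^M$
In the second inequality we use the Taylor expansion of e^k:
$\frac{k^k}{k!} < \sum_{i \geq 0} \frac{k^i}{i!} = e^k$
So the probability that some bin contains at least M balls is at most:
$n \binom{n}{M} \left(\frac{1}{n}\right)^M \leq n \left(\frac{e}{M}\right)^M$
Substituting $M = 3 \ln n / \ln \ln n$ and rewriting everything in exponential form shows that this probability is at most 1/n.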
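A quick simulation shows how the maximum load compares with the bound $3 \ln n / \ln \ln n$. The sketch assumes every ball picks a bin uniformly and independently. The function name `max_load` is hypothetical.

```python
import random
from math import log

def max_load(m, n, rng=random):
    """Throw m balls into n bins uniformly at random; return the fullest bin's count."""
    bins = [0] * n
    for _ in range(m):
        bins[rng.randrange(n)] += 1
    return max(bins)

n = 1000
bound = 3 * log(n) / log(log(n))
loads = [max_load(n, n) for _ in range(100)]
print(max(loads), round(bound, 2))
```

In practice the observed maximum tends to sit well below the bound, which is not tight for moderate n.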
3. The Poisson Distribution

4. Application: Hashing

Problem: Set Membership
Given a set S = {s_1,s_2, ... ,s_m} that is a subset of a very large universe U.
For an arbitrary element x chosen from U, we must answer the question: "is x an element of S or not?".
This question is called the Set Membership Problem.

Take note:
To get a feeling for "very large", think of S as the set of songs on your computer and of U as the set of all songs in the world.

4.1. Chain Hashing
The most classical method is to build a hash table for searching; you can use a random hash function. This method always gives the correct answer and the search time is quite small. By the analysis in section 2, the probability that the maximum load exceeds $3 \ln n / \ln \ln n$ is at most 1/n. So the search time exceeds $\Theta(\ln n / \ln \ln n)$ with probability at most 1/n.

The drawback of this method is its memory use: the m elements of the set S may not fit in RAM.
4.2. Fingerprint Method

We define a fingerprint function as follows:
f: U -> B
where B is the set of b-bit binary numbers. Clearly B has 2^b elements. We only need to store m fingerprints of b bits each in RAM, that is, m*b bits.

Computing f(x) is exactly finding the fingerprint of x.

ALGORITHM

Compute f(x). Compare f(x) with all the f(s_i), s_i in S.

There are 2 cases:

Case 1: f(x) <> f(s_i) for every i = 1, 2, ...,m
=> x is not in S.

Case 2: there exists 1 <= i <= m with f(x) = f(s_i)
=> x is in S.

PROBABILISTIC ANALYSIS:

Case 1: f(x) <> f(s_i) for every i = 1, 2, ...,m
=> x is not in S. Suppose on the contrary that x is in S; then there must be an i with f(x) = f(s_i), 1 <= i <= m. Contradiction!

Case 2: there exists 1 <= i <= m with f(x) = f(s_i)
=> x is in S. This claim is not necessarily true, because x may not be in S and yet the fingerprint of x happens to coincide with the fingerprint of some s_i. We call this event a false positive: we mistakenly accept x.
Is the probability of a false positive large? If it is too large we should not use this algorithm.

$\Pr(\text{false positive}) = \Pr(x \notin S \cap (\exists k: f(x) = f(s_k)))$
$= \Pr(\exists i: f(x) = f(s_i) \mid x \notin S) \Pr(x \notin S)$
$= (1 - \Pr(\forall i: f(x) \neq f(s_i) \mid x \notin S)) \Pr(x \notin S)$   (4.2)

$\Pr(x \in S) \approx 0$ because S is an m-element subset of the very large set U, so $\Pr(x \notin S) \approx 1$.

$\Pr(\forall i: f(x) \neq f(s_i)) = \prod_{i=1}^{m} \Pr(f(x) \neq f(s_i)) = \prod_{i=1}^{m} (1 - \Pr(f(x) = f(s_i)))$

Each f(s_i) is a fingerprint of b bits. So the probability that f(x) = f(s_i) is only 1/2^b, for every i = 1,2, ... , m. Hence:
$\Pr(\forall i: f(x) \neq f(s_i)) = \left(1 - \frac{1}{2^b}\right)^m$

Substituting into (4.2):
$\Pr(\text{false positive}) \approx 1 - \left(1 - \frac{1}{2^b}\right)^m \approx 1 - e^{-m/2^b} \approx \frac{m}{2^b}$

In this approximation we used the formula $1 - x \approx e^{-x}$ for small x twice.
Choose b = 32, i.e. 32-bit fingerprints, and suppose the dictionary has 2^16 bad passwords, i.e. passwords that users are not allowed to choose. Then in RAM we must store: 2^16 * 4 bytes = 256KB;
$\Pr(\text{false positive}) \approx \frac{2^{16}}{2^{32}} = \frac{1}{65536}$

Take note:
- A false negative is when x is in S but we compute that x is not in S.
- Whether an error is called negative or positive reflects the effect of the wrong result on the application we are building. Here we treat S as a very important set: "better to flag by mistake than to miss".
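The password example can be checked numerically. The sketch below assumes fingerprints behave like independent uniform b-bit strings. The function names are hypothetical, and `expm1`/`log1p` are used only to keep floating-point accuracy when 2^-b is tiny.

```python
from math import expm1, log1p

def false_positive_probability(m, b):
    """1 - (1 - 2**-b)**m, evaluated stably for very small 2**-b."""
    return -expm1(m * log1p(-(2.0 ** -b)))

def storage_bytes(m, b):
    """Memory for m fingerprints of b bits each, in bytes."""
    return m * b // 8

m, b = 2 ** 16, 32
print(storage_bytes(m, b) // 1024, "KB")
print(false_positive_probability(m, b), 1 / 65536)
```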


4.3. Bloom Filter Method
As in the fingerprint method, we use a mapping f from the set S into n-bit values.
Now, instead of each element producing its own fingerprint, we keep a single string of n bits, which we call the Bloom, to store all the f(s_i): wherever f(s_i) has a bit equal to 1, we set that bit in the Bloom.
Example: a Bloom with n = 4 bits starts as 0000; after computing f(s_1) = 2 = 0010 we obtain Bloom 0010. After computing f(s_2) = 10 = 1010 we obtain Bloom 1010.
The Bloom Filter method uses one Bloom and hash functions h_i; i = 1,2,...,k
We say y is in the Bloom if all the 1-bit positions of y are set in the Bloom.
Example: y = 8 = 1000 is in Bloom 1010; y = 1001 is not in Bloom 1010.
ALGORITHM
Compute h_i(x); i = 1,2,...,k
Report x in S <=> h_i(x) is in the Bloom for every i = 1,2,...,k

PROBABILISTIC ANALYSIS:
Case 1: some h_i(x) is not in the Bloom
=> x is not in S. Suppose on the contrary that x is in S; then every h_i(x), 1 <= i <= k, was set in the Bloom when x was stored. Contradiction!
Case 2: for every i: 1 <= i <= k, h_i(x) is in the Bloom
=> x is in S. This claim is not necessarily true, because x may not be in S and yet every h_i(x) is in the Bloom. As in the fingerprint method, we call this event a false positive: we mistakenly accept x.

$\Pr(\text{false positive}) = \Pr(x \notin S \cap (\forall i: h_i(x) \text{ in Bloom}))$
$= \Pr(\forall i: h_i(x) \text{ in Bloom}) \Pr(x \notin S \mid \forall i: h_i(x) \text{ in Bloom})$
$\approx \prod_{i=1}^{k} \Pr(h_i(x) \text{ in Bloom})$   (4.3)
(because the hash functions are independent of one another, and $\Pr(x \in S) \approx 0$ since S is an m-element subset of the very large set U).
Suppose we store the Bloom as an array A[1:n].
Consider any one bit A[j]:
Bits are set in the Bloom k times per element, once for each of the k hash functions, and this is done for each of the m elements of S. Hence:
$\Pr(A[j] = 0) = \left(1 - \frac{1}{n}\right)^{mk}$
Substituting into (4.3):
$\Pr(\text{false positive}) = \left(1 - \left(1 - \frac{1}{n}\right)^{mk}\right)^k = (1 - p)^k$;
where $p = \left(1 - \frac{1}{n}\right)^{mk} \approx e^{-mk/n}$. Approximately: k = -(ln p)*n/m
With an n-bit Bloom and a set S of m elements, we need to choose k so that the false positive probability is smallest.
$\Pr(\text{false positive}) = (1 - p)^k = e^{k \ln(1-p)}$; substituting k = -(ln p)*n/m:
$\Pr(\text{false positive}) = e^{-\frac{n}{m} \ln p \ln(1-p)}$
This is minimized when ln p * ln(1-p) is maximized. The product is symmetric in p and 1-p, and it is maximized when p = 1-p = 1/2;
$\min \Pr(\text{false positive}) = e^{-(\ln 2)^2 n/m}$
when k = ln2*n/m
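Below is a minimal sketch of a Bloom filter in its usual index-based form, where each h_i(x) is a position in the bit array A, as in the analysis above. The class name, the use of SHA-256 with the index i as a salt to simulate k hash functions, and the helper names are all assumptions made for the illustration; a real implementation would likely use faster hash functions.

```python
import hashlib
from math import exp, log

class BloomFilter:
    def __init__(self, n_bits, k):
        self.n_bits = n_bits
        self.k = k
        self.bits = [False] * n_bits  # the array A

    def _positions(self, item):
        # Simulate k hash functions by salting one hash with the index i.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def __contains__(self, item):
        # True if every position is set: never a false negative.
        return all(self.bits[pos] for pos in self._positions(item))

def false_positive_rate(n_bits, m, k):
    """Approximation (1 - e^(-km/n))^k from the analysis above."""
    return (1 - exp(-k * m / n_bits)) ** k

def best_k(n_bits, m):
    """k = ln 2 * n / m, which makes p = 1/2."""
    return log(2) * n_bits / m

bf = BloomFilter(n_bits=1024, k=7)
for song in ["song-a", "song-b", "song-c"]:
    bf.add(song)
print("song-a" in bf, round(false_positive_rate(1024, 3, 7), 12))
```

In practice k must be an integer, so the value from `best_k` would be rounded.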

3. Exercises:
http://docs.google.com/View?id=dgmqjfk5_184c7sdskcv

4. This formula belongs somewhere above; if you find the right place while reading, move it there: = 1 - Pr(A[j], 1 <= j <= n: A[f(x)] = A[j])

Exercise
Exercise 5.21:
In open addressing, the hash table is implemented as an array, with no linked lists at all. Each entry of the table is either empty or holds 1 element.
You can follow the links below to learn more.
http://en.wikipedia.org/wiki/Open_addressing
http://courseweb.xu.edu.ph/courses/ics20/supplements/holte/open-addr.htm
For each key k in the table we define a probe sequence h(k,0),h(k,1) .... ,h(k,n);
n is the number of entries in the table.
To insert key k we compute h(k,0),h(k,1) .... in turn until we find an empty slot to insert k; after n failures we conclude that the table is full and nothing more can be inserted.
Searching works the same way: compute h(k,0),h(k,1) .... in turn until we find key k, or until we find an empty slot, which shows that key k is not in the table.
Assume h(k,j) can take any random value among the n entries of the table and that all the h(k,j) are independent.
After using this table to store m = n/2 elements, we receive a request to look up key k in the table.
Let X_i be the number of probes needed to insert the i-th key. Let X = max {X_i}; $1 \leq i \leq m$, the largest number of probes needed to insert any of the m keys.
(a) Prove $\Pr(X > 2 \log n) \leq 1/n$
(b) Prove that the expectation of the length of the longest probe sequence is E[X] = O(log m). Note: n = 2m.
The method above is also called Double Hashing. Example: h(k,i) = a*h(k) + b*h(i) (mod n), i.e. it uses 2 hash functions.
(c) Open addressing with Linear Probing looks like a special case of this method: h(k,i) = h(k) + i (mod n), where h(i) = i. But this is not quite so, because h(k,i) and h(k,i+1) are no longer independent of each other. Describe the effect of this difference and find a way to apply Double Hashing to compute the expectation of the length of the longest probe sequence for Open addressing/Linear Probing. (Exam, K52 - CNTT - HBK HN)

Exercise 5.22:
Suppose the list of your favourite songs is X and the list of my favourite songs is Y. We know that |X| = |Y| = n.
We build Bloom filters for the sets X and Y using m bits and k hash functions.
(a) Compute the expectation of the number of bit positions that differ between the Bloom filters of X and Y
(b) Compute $E[|X \cap Y|]$
(c) Explain why we can use this method to find people with similar tastes in music instead of comparing all the lists directly
Answer: (a) $n(2p - p^2)$ where $p = \left(1 - \frac{1}{n}\right)^{mk}$;
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Answer
Exercise 5.21: Open addressing with m = n/2 keys stored in a table of n entries, where all probes h(k,j) are independent and uniformly random. X_i is the number of probes needed to insert the i-th key, and X = max {X_i}, $1 \leq i \leq m$.
Proof:
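(a) Prove $\Pr(X > 2 \log n) \leq 1/n$
At the i-th insertion the table already holds i - 1 entries. We compute h(i,j) until we find an empty entry. So X_i is a geometric random variable with parameter
$p_i = 1 - \frac{i-1}{n}$. By this distribution: $\Pr(X_i = j) = (1-p_i)^{j-1} p_i$
Hence:
$\Pr(X_i \geq j) = \sum_{l \geq j} \Pr(X_i = l) = \sum_{l \geq j} (1-p_i)^{l-1} p_i = p_i (1-p_i)^{j-1} \sum_{l \geq 0} (1-p_i)^{l} = p_i (1-p_i)^{j-1} \cdot \frac{1}{1-(1-p_i)} = (1-p_i)^{j-1}$
Hence: $\Pr(X_i > 2 \log m) = \Pr(X_i \geq 2 \log m + 1) = (1-p_i)^{2 \log m}$; substituting $p_i = 1 - \frac{i-1}{n}$ and n = 2m (logarithms are base 2):
$\Pr(X_i > 2 \log m) = \left(\frac{i-1}{2m}\right)^{2 \log m} \leq \left(\frac{m}{2m}\right)^{2 \log m} = \frac{1}{m^2}$
By the union bound over the m keys, $\Pr(X > 2 \log m) \leq m \cdot \frac{1}{m^2} = \frac{1}{m}$. The same computation with 2 log n in place of 2 log m gives $\Pr(X_i > 2 \log n) \leq \frac{1}{n^2}$, hence $\Pr(X > 2 \log n) \leq \frac{m}{n^2} \leq \frac{1}{n}$.

(b) Prove that the expectation of the length of the longest probe sequence h(i,j) is E[X] = O(log m). Note: n = 2m.
$E[X] = \sum_{x \leq 2 \log n} x \Pr(X = x) + \sum_{x > 2 \log n} x \Pr(X = x)$
$\leq 2 \log n \sum_{x \leq 2 \log n} \Pr(X = x) + n \sum_{x > 2 \log n} \Pr(X = x)$
$E[X] \leq 2 \log n \Pr(X \leq 2 \log n) + n \Pr(X > 2 \log n) \leq 2 \log n + n \cdot \frac{1}{n} = 2 \log n + 1 = O(\log m)$
Above we used $X \leq n$, $\Pr(X \leq 2 \log n) \leq 1$ and the bound $\Pr(X > 2 \log n) \leq \frac{1}{n}$ from part (a).

Note: above we observed that X_i is a geometric random variable with parameter $p_i = 1 - \frac{i-1}{n}$. Therefore:
$E[X_{i+1}] = \frac{1}{p_{i+1}} = \frac{n}{n-i} = \frac{1}{1 - \alpha}$; where $\alpha = \frac{i}{n}$
This formula shows:
"If the table holds i keys, the expectation of the number of probes needed is 1/(1-a), where a is the ratio between the number of elements already in the table and the number of entries the table can hold"
(c) Open addressing with Linear Probing looks like a special case of this method: h(k,i) = h(k) + i (mod n), where h(i) = i. But this is not quite so, because h(k,i) and h(k,i+1) are no longer independent of each other. Describe the effect of this difference and find a way to apply Double Hashing to compute the expectation of the length of the longest probe sequence for Open addressing/Linear Probing. (Exam, K52 - CNTT - HBK HN)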
Proof:
Suppose that instead of computing h(k,0),h(k,1) .... ,h(k,n) in that order, we compute h(k,i_0),h(k,i_1) .... ,h(k,i_n) where (i_1,i_2 ,....,i_n) is a permutation of (1,2,...,n).
Call each permutation a case. So there are n! cases.
Among the n! possible orders, exactly one permutation, (i_1,i_2 ,....,i_n) = (1,2,...,n), gives Linear Probing. Call Linear Probing case_1.
A further observation is that all these permutations play equivalent roles: if we can do Linear Probing we can equally well use (3,1,2,...,n-1,n) or any other permutation, and each gives the same result when computing the expectation.
Returning to Double Hashing, define the random variable X = max {X_i}; $1 \leq i \leq m$, the largest number of probes needed to find any of the m keys.
$E[X] = \sum_{i=1}^{n!} \Pr(CASE = case_i) E[X \mid CASE = case_i]$; where CASE is a permutation of (1,2,...,n).
Because the case_i are equivalent: Pr(case_i) = 1/n! for every i = 1,2,....,n!,
and E[X|CASE = case_i] = E[X|CASE = case_1] = E[X|CASE = Linear Probing]
for every i = 1,2,....,n!.
Therefore: E[X|CASE = Linear Probing] = E[X] = O(log n);

Note:
Let X_i be the number of probes needed to insert the i-th key.
Above we observed that X_i is a geometric random variable with parameter $p_i = 1 - \frac{i-1}{n}$. Therefore:
$E[X_{i+1}] = \frac{1}{p_{i+1}} = \frac{n}{n-i} = \frac{1}{1 - \alpha}$; where $\alpha = \frac{i}{n}$
This observation holds only for Double Hashing; nothing guarantees that it holds for Linear Probing.
