
Chapter 2
Introduction to Randomized Algorithms

Advanced Algorithms (055127), September 6, 2023

Nguyen An Khuong
Faculty of Computer Science and Engineering
University of Technology, VNU-HCM

Contents

1 Introduction
2 Miller-Rabin primality test
3 QUICKSORT
4 Karger’s minimum cut (’93)
5 SELECTION problem
    Introduction
    Deterministic algorithm
    Divide-and-conquer with random pivoting
    Random sampling strategy for selection
6 Appendix: Discrete random variables
7 References and Further Reading

Deterministic algorithms vs randomized algorithms

• Deterministic:

      input → Algorithm → output

  Goal: To prove that the algorithm solves the problem correctly (always) and quickly (typically, the number of steps should be polynomial in the size of the input).
  In other words, for all inputs, the output is good.

• Randomized:

      input → Algorithm → output
                  ↑
             random bits

  In addition to the input, the algorithm takes a source of random numbers and makes random choices during execution.
  Behavior can vary even on a fixed input.
  For all inputs, with good probability, the output is good.

Randomized Algorithms vs Probabilistic Analysis of Algorithms

      input → Algorithm → output
                  ↑
             random bits

Goal: Design the algorithm and its analysis to show that its behavior is likely to be good on every input.
(The likelihood is over the random numbers only.)

• Not to be confused with the Probabilistic Analysis of Algorithms:

      random input → Algorithm → output distribution

  Here the input is assumed to be drawn from a probability distribution.
  Goal: To show that the algorithm works for most inputs.

Types of randomized algorithms: Monte Carlo vs Las Vegas

• A Monte Carlo (MC) algorithm runs for a fixed number of steps and produces an answer that is correct with good probability, say ≥ 1/3. Example: Karger’s min-cut algorithm.
• A Las Vegas (LV) algorithm always produces the correct answer; its running time is a random variable (small with good probability) whose expectation is bounded (say by a polynomial). Examples: Quicksort, hashing, ...
• These probabilities/expectations are only over the random choices made by the algorithm, independent of the input.
• Thus independent repetitions of Monte Carlo algorithms drive down the failure probability exponentially.
• It’s easy to see the following:
  1 LV ⟹ MC. Fix a time T. If the algorithm terminates before T, we output its answer; otherwise we output 1.
  2 MC does not always imply LV. The implication holds when verifying a solution can be done much faster than finding one. In that case, we test the output of the MC algorithm and stop only when a correct solution is found.

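The MC-to-LV direction can be sketched in a few lines. This is a minimal illustration, not part of the original slides: the names `las_vegas_from_monte_carlo`, `guess_index` and `is_one` and the toy array are hypothetical, and the sketch assumes a cheap, always-correct verifier is available.

```python
import random


def las_vegas_from_monte_carlo(monte_carlo, verify, rng):
    """Repeat a Monte Carlo routine until its answer passes verification.

    The returned answer is always correct (Las Vegas); only the number of
    attempts is random.
    """
    attempts = 0
    while True:
        attempts += 1
        candidate = monte_carlo(rng)
        if verify(candidate):
            return candidate, attempts


# Toy instance (hypothetical): find an index that holds a 1.
data = [0, 1] * 8


def guess_index(rng):
    # Monte Carlo step: a single uniform guess, correct with probability 1/2.
    return rng.randrange(len(data))


def is_one(i):
    # Verification is a single array lookup, much cheaper than searching.
    return data[i] == 1


index, attempts = las_vegas_from_monte_carlo(guess_index, is_one, random.Random(0))
print(index, attempts)
```

If each attempt succeeds with probability p, the number of attempts is geometric with mean 1/p, so the expected running time stays bounded whenever p is bounded away from zero.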
Advantages of randomized algorithms

• The goals are simplicity, speed (performance) and sometimes other objectives depending on the problem (e.g., for Nash Equilibrium, the problem inherently requires randomization).
• For many problems, a randomized algorithm is the simplest, the fastest, or both.
• Themes:
  • Avoiding adversarial inputs (“adversarial” may mean well-structured, i.e. natural). Randomization makes your data look random, even if it is not (think Quicksort).
  • Fingerprinting (have a quick identifier for a complex structure)
  • Load balancing (distribute workload across resources to optimize time consumption or efficiency, or to avoid overloads)
  • Sampling (estimate properties of the whole population based on a subset)
  • Symmetry breaking (e.g. what to do when two packets arrive at the same time?)
  • Probabilistic existence proofs (Erdős, Lovász, ...: if a random object satisfies a property with non-zero probability, we have just proved that such an object exists.)

Scope of randomized algorithms

• Number theoretic algorithms: Primality testing (Monte Carlo)
• Data structures: Sorting, order statistics, searching, computational geometry.
• Algebraic identities: Polynomial and matrix identity verification. Interactive proof systems.
• Mathematical programming: Faster algorithms for linear programming. Rounding linear program solutions to integer program solutions.
• Graph algorithms: Minimum spanning trees, shortest paths, minimum cuts.
• Counting and enumeration. Matrix permanent. Counting combinatorial structures.
• Parallel and distributed computing. Deadlock avoidance, distributed consensus.
• Probabilistic existence proofs: Show that a combinatorial object arises with non-zero probability among objects drawn from a suitable probability space.
• Derandomization: First devise a randomized algorithm, then argue that it can be “derandomized” to yield a deterministic algorithm.

Miller-Rabin primality test: Introduction

• Proposed in the 1970s.
• Miller and Rabin gave two versions of the same algorithm to test whether a number n is prime or not.
• Rabin’s algorithm works with a randomly chosen $a \in \mathbb{Z}_n$ and is therefore randomized, whereas Miller’s version tests deterministically all a with $1 \le a \le 4\log^2 n$.
• However, the correctness of Miller’s algorithm depends on the Extended Riemann Hypothesis.

Miller-Rabin Algorithm

• Let ψ be an automorphism in $\mathbb{Z}_n$.
• Let $n - 1 = s \times 2^t$ for odd s.
• The algorithm:
  1 Test if $n = m^j$ for some $j > 1$. If yes, output COMPOSITE.
  2 Randomly choose $a \in \mathbb{Z}_n$, $a \ne 0$.
  3 Test if $a^{n-1} \equiv 1 \pmod{n}$. If no, output COMPOSITE.
  4 Compute $u_i = a^{s \times 2^i} \bmod n$ for $0 \le i \le t$.
  5 If there is an $i \ge 1$ such that $u_i = 1$ and $u_{i-1} \ne \pm 1$, output COMPOSITE.
  6 Output PRIME.

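The core of the test (the Fermat check and the search for a nontrivial square root of 1) can be sketched as follows. This is a hedged illustration rather than the lecture's reference implementation: the function names are hypothetical, the perfect-power check of step 1 is omitted, bases are drawn from 2..n-2, and the Fermat check is folded into the squaring loop, as is common in practice.

```python
import random


def miller_rabin_round(n, a):
    """One round for odd n > 2 with base a; False means a proves n composite."""
    s, t = n - 1, 0
    while s % 2 == 0:          # write n - 1 = s * 2^t with s odd
        s //= 2
        t += 1
    u = pow(a, s, n)           # u_0 = a^s mod n
    if u == 1 or u == n - 1:
        return True
    for _ in range(t - 1):
        u = pow(u, 2, n)       # u_i = u_{i-1}^2 mod n
        if u == n - 1:
            return True        # the next square is 1, reached via -1
        if u == 1:
            return False       # u_i = 1 but u_{i-1} != +-1
    return False               # either a^(n-1) != 1 or a nontrivial root of 1


def is_probably_prime(n, rounds=20, rng=None):
    """Repeat the round; COMPOSITE (False) as soon as any round says so."""
    rng = rng or random.Random()
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    return all(miller_rabin_round(n, rng.randrange(2, n - 1)) for _ in range(rounds))


print(is_probably_prime(97), is_probably_prime(100))
```

A `True` answer can be wrong only for composite n, and each additional round multiplies that error probability by a constant factor below one.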
Correctness of Miller-Rabin Algorithm

• Observation: $\Pr_{a \in \mathbb{Z}_n}[\text{test is correct} \mid n \text{ is prime}] = 1$.
• THEOREM: $\Pr_{a \in \mathbb{Z}_n}[\text{test outputs PRIME} \mid n \text{ is composite}] \le \frac{1}{4} + \frac{1}{4} = \frac{1}{2}$.
• Hence the test is correct with probability $\ge \frac{1}{2}$.
• The probability of success can be boosted further by repeating the test a few times: the output is COMPOSITE if any single test outputs COMPOSITE, else PRIME. With k independent repetitions the error probability is at most $2^{-k}$.

Time Complexity of Miller-Rabin Algorithm

• Computing $u_0$: $\tilde{O}(\log n) \times O(\log s) = \tilde{O}(\log^2 n)$
  [repeated squaring needs $O(\log s)$ multiplications, with $s \le n$; each multiplication and reduction modulo n of $\log n$-bit numbers takes $\tilde{O}(\log n)$ time using FFT].
• Computing $u_1, u_2, \dots, u_t$: $\tilde{O}(\log^2 n)$ [$t \le \log n$; each squaring and reduction modulo n takes $\tilde{O}(\log n)$ time].
• Testing if $n = m^j$ holds for some $j > 1$ can be done in $\tilde{O}(\log^2 n)$ time.
• Hence the time complexity of the algorithm is $\tilde{O}(\log^2 n)$.

QUICKSORT: The algorithm

Our goal is to sort a sequence $S = (x_1, \dots, x_n)$ of n distinct real numbers in increasing order. We use a recursive method known as quicksort, which proceeds as follows:

Algorithm (Hoare, 1962)

1. If S has one or zero elements, return S.
2. Pick some element $x = x_i$ in S, called the pivot.
3. Reorder S in such a way that for every number $x_j \ne x$ in S, if $x_j < x$ then $x_j$ is moved to a list $S_1$, else if $x_j > x$ then $x_j$ is moved to a list $S_2$.
4. Apply this algorithm recursively to the list of elements in $S_1$ and to the list of elements in $S_2$.
5. Return the sorted list $S_1, x, S_2$.

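The five steps map directly onto a short recursive function. The sketch below makes one assumption the algorithm statement leaves open: the pivot of step 2 is drawn uniformly at random, which is the variant analysed in the running-time slides. The function name is illustrative.

```python
import random


def quicksort(seq, rng=None):
    """Sort a sequence of distinct numbers; returns a new list."""
    rng = rng or random.Random()
    if len(seq) <= 1:                           # step 1
        return list(seq)
    pivot = rng.choice(seq)                     # step 2 (random pivot: an assumption)
    smaller = [x for x in seq if x < pivot]     # step 3: the list S1
    larger = [x for x in seq if x > pivot]      # step 3: the list S2
    # steps 4 and 5: sort both lists recursively and concatenate
    return quicksort(smaller, rng) + [pivot] + quicksort(larger, rng)


print(quicksort([18, 15, 27, 13, 1, 7, 25], random.Random(0)))
```

Building new lists keeps the sketch close to the slide; an in-place partition would avoid the extra memory.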
Quicksort Demonstration

[Figure: quicksort demonstration on a given input sequence]

Quicksort’s Average Running Time Analysis

• In order to have a good “average performance,” one can randomize this algorithm by choosing each pivot uniformly at random.
• Let us compute the expectation of the number X of comparisons needed when running the randomized version of quicksort.
• Recall that the input is a sequence $S = (x_1, \dots, x_n)$ of distinct elements, and let $(y_1, \dots, y_n)$ be the same elements sorted in increasing order.
• In order to compute E(X), we decompose X as a sum of indicator variables $X_{i,j}$, with $X_{i,j} = 1$ iff $y_i$ and $y_j$ are ever compared, and $X_{i,j} = 0$ otherwise.
• Then, it is clear that $X = \sum_{j=2}^{n} \sum_{i=1}^{j-1} X_{i,j}$ and, by linearity of expectation,
  $$E(X) = \sum_{j=2}^{n} \sum_{i=1}^{j-1} E(X_{i,j}).$$
• Furthermore, since $X_{i,j}$ is an indicator variable, we have
  $E(X_{i,j}) = P(y_i \text{ and } y_j \text{ are ever compared})$.

Quicksort’s Average Running Time Analysis (cont’d)

• The crucial observation is that $y_i$ and $y_j$ are ever compared iff either $y_i$ or $y_j$ is chosen as the pivot when $\{y_i, y_{i+1}, \dots, y_j\}$ is a subset of the set of elements of the (left or right) sublist considered for the choice of a pivot.
• It remains to compute the probability that the next pivot chosen in the sublist $Y_{i,j} = \{y_i, y_{i+1}, \dots, y_j\}$ is $y_i$ (or that the next pivot chosen is $y_j$; the two probabilities are equal).
• Since the pivot is one of the values in $\{y_i, y_{i+1}, \dots, y_j\}$ and since each of these is equally likely to be chosen (by hypothesis), we have
  $$P(y_i \text{ is chosen as the next pivot in } Y_{i,j}) = \frac{1}{j-i+1}.$$
• Consequently, since $y_i$ and $y_j$ are ever compared iff either $y_i$ is chosen as a pivot or $y_j$ is chosen as a pivot, and since these two events are mutually exclusive, we have
  $$E(X_{i,j}) = P(y_i \text{ and } y_j \text{ are ever compared}) = \frac{2}{j-i+1}.$$

Quicksort’s Average Running Time Analysis (cont’d)

• It follows that
  $$E(X) = \sum_{j=2}^{n} \sum_{i=1}^{j-1} E(X_{i,j}) = \sum_{j=2}^{n} \sum_{i=1}^{j-1} \frac{2}{j-i+1}$$
  $$= 2 \sum_{j=2}^{n} \sum_{k=2}^{j} \frac{1}{k} \quad (\text{set } k = j - i + 1)$$
  $$= 2 \sum_{k=2}^{n} \sum_{j=k}^{n} \frac{1}{k}$$
  $$= 2 \sum_{k=2}^{n} \frac{n-k+1}{k} = 2(n+1) \sum_{k=2}^{n} \frac{1}{k} + 2 \sum_{k=2}^{n} (-1)$$
  $$= 2(n+1) \sum_{k=1}^{n} \frac{1}{k} - 4n.$$
• The quantity $H_n = 1 + 1/2 + 1/3 + \dots + 1/n$ is called the “nth harmonic number”, and is in the range $[\ln n, 1 + \ln n]$ (this can be seen by considering the integral $\int_1^n \frac{1}{x}\,dx$).
• Therefore,
  $$E(X) \in [\,2(n+1)\ln n - 4n,\ (2(n+1)\ln n - 4n) + (2n + 2)\,].$$
• In other words, $E(X) \in \Theta(n \ln n)$.

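The closed form can be cross-checked with exact rational arithmetic. The sketch below relies on a standard recurrence that is not on the slides and is stated here as an assumption: a uniformly random pivot costs n - 1 comparisons and leaves sublists of sizes k and n - 1 - k with probability 1/n each, so C(n) = n - 1 + (2/n) · (C(0) + ... + C(n-1)). Function names are illustrative.

```python
from fractions import Fraction


def expected_comparisons(n_max):
    """Exact C(0..n_max) from C(n) = n - 1 + (2/n) * sum_{k<n} C(k)."""
    c = [Fraction(0)]
    prefix = Fraction(0)                 # running sum C(0) + ... + C(n-1)
    for n in range(1, n_max + 1):
        c.append(Fraction(n - 1) + Fraction(2, n) * prefix)
        prefix += c[n]
    return c


def closed_form(n):
    """2(n+1) H_n - 4n, with H_n the nth harmonic number."""
    h_n = sum(Fraction(1, k) for k in range(1, n + 1))
    return 2 * (n + 1) * h_n - 4 * n


c = expected_comparisons(50)
print(all(c[n] == closed_form(n) for n in range(1, 51)))
```

Using `Fraction` avoids floating-point error, so agreement between the two computations is exact rather than approximate.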
Approximation of the nth harmonic number $H_n$

• No simple closed-form expression for $H_n := 1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{n}$ is known, but we can estimate $H_n$ as below.
• Let us consider the area under the curve $\frac{1}{x}$ when x varies from 1 to n.
• We note that $H_n - \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{n-1}$ is an overestimation of this area by rectangles.

[Figure: rectangles of heights 1, 1/2, ..., 1/(n−1) lying above the curve 1/x on [1, n]]

• So, $\int_1^n \frac{1}{x}\,dx < H_n - \frac{1}{n}$.

Approximation of the nth harmonic number $H_n$ (cont’d)

• Moreover, $H_n - 1 = \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{n}$ is an underestimation of the area.

[Figure: rectangles of heights 1/2, 1/3, ..., 1/n lying below the curve 1/x on [1, n]]

• Hence, $H_n - 1 < \int_1^n \frac{1}{x}\,dx$. Therefore,
  $$H_n - 1 < \int_1^n \frac{1}{x}\,dx < H_n - \frac{1}{n},$$
  or, $\ln n + \frac{1}{n} < H_n < \ln n + 1$.
• Also, Euler discovered this beautiful property of the harmonic number $H_n$:
  $$\lim_{n \to \infty} (H_n - \ln n) = \gamma \approx 0.57721566490153286060651209008240243104215933593992\ldots$$
• γ is called the Euler-Mascheroni constant.

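The two-sided bound and the convergence to γ are easy to observe numerically. The following is a small sanity check in floating point, not a proof; the helper name `harmonic` is illustrative.

```python
import math


def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n, summed in floating point."""
    return sum(1.0 / k for k in range(1, n + 1))


for n in (2, 10, 100, 10_000):
    h = harmonic(n)
    # ln n + 1/n < H_n < ln n + 1 for n >= 2
    assert math.log(n) + 1.0 / n < h < math.log(n) + 1.0
    print(n, h, h - math.log(n))   # the last column approaches gamma
```

The difference H_n − ln n decreases towards γ roughly like 1/(2n), which is why it takes large n to see many correct digits.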
MIN-CUT: Introduction

• Given a graph G = (V, E) with n vertices and m edges, find a global minimum cut.
• That is, find a set of vertices $S \subset V$, with $1 \le |S| \le n - 1$, which minimizes the number of edges going from this subset S to the rest of the vertices.
• We denote the cut value for set S by $|E(S, \bar{S})|$ (where $\bar{S} = V \setminus S$).
• The problem of finding a minimum cut with a given source and sink (s−t min cut) is probably more familiar.
• Dinic’s algorithm or the Ford-Fulkerson algorithm leads to a solution.
• We can compute an s−t minimum cut in $O(n^3)$.
• By running this algorithm for all (s, t) pairs, we get a solution for the global minimum cut in $O(n^5)$.

MIN-CUT: The Algorithm

Algorithm (It looks stupid, but actually it’s not that stupid.)

function Algo1
    while n > 2 do
        Choose an edge e = (u, v) uniformly at random from the remaining edges
        Contract that edge
    end while
    return the two last vertices
end function

• Contracting an edge means that we remove that edge and combine the two vertices into a super-node.
• We note that self-loops thus formed are removed but any resulting parallel edges are not removed. This means that at every step, we have a multi-graph without any self-loops.
• This algorithm will yield a cut. The minimum cut? Probably not. But maybe!
• If the graph is somewhat dense, then choosing an edge in the …
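The contraction procedure can be sketched as below. This is an illustrative implementation under stated assumptions, not the lecture's code: the graph is a connected multigraph given as an edge list over vertices 0..n-1 without self-loops, super-nodes are tracked with a label array, and each contraction costs O(n + m) rather than the O(n) assumed in the complexity discussion. The name `contract` and the example graph are hypothetical.

```python
import random


def contract(edges, n, rng):
    """Contract uniformly random edges until two super-nodes remain.

    Returns (cut size, set of vertices on the side containing vertex 0).
    """
    label = list(range(n))               # label[v] = super-node containing v
    edges = list(edges)
    remaining = n
    while remaining > 2:
        u, v = rng.choice(edges)         # uniform over remaining (parallel) edges
        lu, lv = label[u], label[v]
        label = [lu if l == lv else l for l in label]   # merge super-node lv into lu
        # remove self-loops, keep parallel edges
        edges = [(a, b) for a, b in edges if label[a] != label[b]]
        remaining -= 1
    side = {v for v in range(n) if label[v] == label[0]}
    return len(edges), side


# Two triangles joined by a single edge: the global min-cut has value 1.
graph = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
rng = random.Random(7)
runs = [contract(graph, 6, rng) for _ in range(500)]
print(min(size for size, _ in runs))
```

A single run may return any cut of the graph; only the minimum over many independent runs is likely to be a global minimum cut.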
Analyzing the probability of the correct answer

• Let’s say that the algorithm “fails” in step i if in that step, we choose to contract an edge in the minimum cut (here we suppose that there is only one minimum cut).
• For all $u \in V$, let d(u) be the degree of u. We have
  $$|E(S, \bar{S})| \le \min_{u} d(u) \le \frac{1}{n} \sum_{u \in V} d(u) = \frac{2m}{n}.$$
• So, $\frac{|E(S, \bar{S})|}{m} \le \frac{2}{n}$.
• We fail in the first step if we pick an edge in the minimum cut, an event that occurs with probability $\frac{|E(S, \bar{S})|}{m}$. Hence
  $$P(\text{fail in 1st step}) \le \frac{2}{n}$$
  $$P(\text{fail in 2nd step} \mid \text{success in 1st step}) \le \frac{2}{n-1}$$
  $$\cdots$$
  $$P(\text{fail in } i\text{th step} \mid \text{success till } (i-1)\text{th step}) \le \frac{2}{n-i+1}$$

Analyzing the probability of the correct answer (cont’d)

• Let $Z_i$ be the event that the algorithm succeeds in step i. Thus, we have
  $$P(Z_1) \ge \frac{n-2}{n}$$
  $$P(Z_2 \mid Z_1) \ge \frac{n-3}{n-1}$$
  $$\cdots$$
  $$P(Z_i \mid Z_{i-1} \cap \dots \cap Z_1) \ge \frac{n-i-1}{n-i+1}$$
• Therefore:
  $$P(\text{Success}) = P(Z_1 \cap Z_2 \cap \dots \cap Z_{n-2})$$
  $$= P(Z_1) \cdot P(Z_2 \mid Z_1) \cdots P(Z_{n-2} \mid Z_1 \cap Z_2 \cap \dots \cap Z_{n-3})$$
  $$\ge \frac{n-2}{n} \cdot \frac{n-3}{n-1} \cdot \frac{n-4}{n-2} \cdot \frac{n-5}{n-3} \cdots \frac{2}{4} \cdot \frac{1}{3}$$
  $$= \frac{1}{n} \cdot \frac{1}{n-1} \cdot \frac{2}{1} \cdot \frac{1}{1} = \frac{2}{n(n-1)}$$
  $$\ge \frac{2}{n^2}.$$

Analyzing the probability of the correct answer (cont’d)

• If n is large, this is a very poor guarantee.
• So what we do instead is run the algorithm several times and choose the best cut we find. If we run this algorithm k times, we then have:
  $$P(\text{Success at least once}) \ge 1 - \left(1 - \frac{2}{n^2}\right)^{k}$$
• A classical inequality yields $(1 - z)^{1/z} \le \frac{1}{e}$.
• We hence choose to run the algorithm k times, with $k = \frac{n^2}{2} \log \frac{1}{\delta}$ (natural logarithm).
• We get:
  $$P(\text{Success}) \ge 1 - \left(\frac{1}{e}\right)^{\log \frac{1}{\delta}} = 1 - \delta$$
• After running all these k trials, we choose the one with minimum cut value.

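The choice of k can be checked numerically. The sketch below assumes the logarithm is natural (consistent with the 1/e in the inequality) and rounds k up to an integer; the function names are illustrative.

```python
import math


def repetitions(n, delta):
    """k = ceil((n^2 / 2) * ln(1 / delta)) independent runs."""
    return math.ceil(n * n / 2 * math.log(1 / delta))


def failure_bound(n, k):
    """Upper bound (1 - 2/n^2)^k on the probability that all k runs fail."""
    return (1 - 2 / (n * n)) ** k


for n in (10, 50):
    for delta in (0.25, 0.01):
        k = repetitions(n, delta)
        print(n, delta, k, failure_bound(n, k))
```

In every row the printed bound stays below the requested δ, as the inequality predicts.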
Complexity

• At every step of the algorithm, we have to contract an edge. This takes at most O(n) time.
• We do at most n such contractions. Thus for every run, we need $O(n^2)$ time.
• We can choose δ to be, say, 1/4, and thus $k = \frac{n^2}{2} \log 4$; this ensures that within $O(n^4)$ time the algorithm succeeds with probability at least 3/4.

Min-Cut: Can we do better?

• Initial stages of the algorithm are very likely to be correct.
• In particular, the first step is wrong with probability at most 2/n.
• As we contract more edges, the failure probability goes up.
• Moreover, earlier stages take more time compared to later ones.
• Idea: Let us redistribute our iterations. Since earlier ones are more accurate and slower, why not do fewer of them at the beginning, and increasingly more as the number of edges decreases?
• Let us formalize this idea. After t steps:
  $$P(\text{Success}) \ge \frac{n-2}{n} \cdot \frac{n-3}{n-1} \cdot \frac{n-4}{n-2} \cdots \frac{n-t-1}{n-t+1} \approx \frac{(n-t)^2}{n^2}.$$
• We equate this to $\frac{1}{2}$ to get $t = n - \frac{n}{\sqrt{2}}$. Thus, we have the following algorithm: ...

A better MIN-CUT

Algorithm (Getting better)

function Algo2
    Repeat twice:
        Run contraction from n → n/√2 vertices
        Recursively call Algo2 on the resulting set of n/√2 vertices
    return the last two vertices for the best cut among the two
end function

Runtime analysis:
  $$T(n) = 2\left(O(n^2) + T\!\left(\frac{n}{\sqrt{2}}\right)\right)$$
  $$= 2cn^2 + 2 \cdot 2 \cdot c \cdot \left(\frac{n}{\sqrt{2}}\right)^2 + \cdots$$
  $$= O(n^2 \log n)$$

Success Probability

  $$p(n) = 1 - (1 - P(\text{success in one branch}))^2$$
  $$= 1 - \left(1 - \frac{1}{2}\, p\!\left(\frac{n}{\sqrt{2}}\right)\right)^2$$
  $$= p\!\left(\frac{n}{\sqrt{2}}\right) - \frac{1}{4}\, p\!\left(\frac{n}{\sqrt{2}}\right)^2$$

• To solve this recursion, we let $x = \log_{\sqrt{2}} n$. Thus, setting $f(x) = p(n)/4$ (the rescaling absorbs the factor 1/4), we get $f(x) = f(x-1) - f(x-1)^2$.
• We observe that setting $f(x) = \frac{1}{x}$ gives:
  $$f(x-1) - f(x) = \frac{1}{x-1} - \frac{1}{x} = \frac{1}{x(x-1)} \approx \frac{1}{(x-1)^2} = f(x-1)^2$$
• Hence, we have $p(n) = \Omega\!\left(\frac{1}{\log n}\right)$. Thus, to get 3/4 probability of success, we repeat $O(\log n)$ times and need $O(n^2 \log^2 n)$ time (which is much better than before, so we’re happy).

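The behaviour of the recursion can be checked by iterating it directly. The sketch below iterates p ← p − p²/4 as a function of the recursion depth x; the base case p = 1 at depth 0 is an assumption made for illustration, and the function name is hypothetical.

```python
def success_probabilities(depth):
    """p[x] for x = 0..depth under p[x] = p[x-1] - p[x-1]^2 / 4."""
    p = [1.0]                      # assumed base case at the bottom of the recursion
    for _ in range(depth):
        q = p[-1]
        p.append(q - q * q / 4)
    return p


p = success_probabilities(1000)
for x in (1, 10, 100, 1000):
    # x * p[x] stays between constants, i.e. p[x] decays like 1/x
    print(x, p[x], x * p[x])
```

With this base case one can show by induction that 1/(x+1) ≤ p[x] ≤ 4/(x+4), so the success probability at depth x = log_{√2} n is of order 1/log n.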
Extension: The number of MIN-CUTs

• Question: what is the maximum number of global min-cuts an undirected graph G can have?
• Not obvious. Consider a directed graph as follows: s together with any subset of $v_1, \dots, v_n$ constitutes a minimum s−t cut ($2^n$ cuts in total).

[Figure: example directed graph]

Theorem
An undirected graph G on n nodes has at most $\binom{n}{2}$ global min-cuts.

Proof:
• Suppose there are r global min-cuts $c_1, \dots, c_r$;
• Let $C_i$ denote the event that $c_i$ is reported, and C denote the success of the Contraction algorithm;
• For each i, we have $\Pr[C_i] \ge \frac{1}{\binom{n}{2}}$.
• Thus $\Pr[C] = \Pr[C_1 \cup \dots \cup C_r] = \sum_i \Pr[C_i] \ge \frac{r}{\binom{n}{2}}$. (Note: equality holds since all $C_i$ are disjoint.)
• We get $r \le \binom{n}{2}$, since $\frac{r}{\binom{n}{2}} \le 1$.

Selection problem

INPUT: A set of numbers $S = \{a_1, a_2, \dots, a_n\}$, and a number $k \le n$;
OUTPUT: the k-th smallest item in the general case (or the median of S as a special case).

For example, given a set S = {18, 15, 27, 13, 1, 7, 25}, the objective is the median of S.

Note:
• Brute-force: polynomial time;
• Divide-and-conquer: reduce the running time to a lower polynomial;
• A feasible strategy is to sort S first, and then report the k-th one, which takes O(n log n) time.
• A faster algorithm is possible, e.g. the deterministic linear-time algorithm (16n comparisons) by Blum et al.; divide-and-conquer becomes more powerful when combined with randomization.

A general divide-and-conquer paradigm

Algorithm Select(S, k):
    Choose an item $s_i$ from S as a pivot;
    $S^+ = \{\}$;
    $S^- = \{\}$;
    for j = 1 to n do
        if $s_j > s_i$ then
            $S^+ = S^+ \cup \{s_j\}$;
        else if $s_j < s_i$ then
            $S^- = S^- \cup \{s_j\}$;
        end if
    end for
    if $|S^-| = k - 1$ then
        return $s_i$;
    else if $|S^-| > k - 1$ then
        return Select($S^-$, k);
    else
        return Select($S^+$, $k - 1 - |S^-|$);
    end if

Perform iteration on ONLY one subset.

[Figure: a pivot splitting S into two parts, with the recursion continuing on one part only]

Intuition:
1 At first, an element $a_i$ is chosen to split S into two parts $S^+ = \{a_j : a_j \ge a_i\}$ and $S^- = \{a_j : a_j < a_i\}$.
2 We can determine whether the k-th smallest element is in $S^+$ or $S^-$.
3 Thus, we perform iteration on ONLY one subset.

How to choose a splitter?

We have the following options:
• Bad choice: select the smallest element at each iteration.
  $T(n) = T(n-1) + O(n) = O(n^2)$
• Ideal choice: select the median at each iteration.
  $T(n) = T(\frac{n}{2}) + O(n) = O(n)$
• Good choice: select a “centered” element $a_i$, i.e., $|S^+| \ge \epsilon n$ and $|S^-| \ge \epsilon n$ for a fixed $\epsilon > 0$.
  $$T(n) \le T((1-\epsilon)n) + O(n) \le cn + c(1-\epsilon)n + c(1-\epsilon)^2 n + \cdots = O(n) \qquad (1)$$

e.g.: $\epsilon = \frac{1}{4}$:

[Figure: the “centered” elements for ε = 1/4]

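For a concrete feel of the constant hidden in bound (1), the geometric series can be summed explicitly. This is a worked instance added for illustration, with c the constant from the O(n) partitioning cost:

```latex
\[
T(n) \;\le\; cn \sum_{i \ge 0} (1-\epsilon)^i \;=\; \frac{cn}{\epsilon},
\qquad\text{so for } \epsilon = \tfrac14:\quad
T(n) \;\le\; cn \sum_{i \ge 0} \left(\tfrac34\right)^i \;=\; 4cn .
\]
```

Any fixed ε > 0 therefore gives linear time; a smaller ε only increases the constant 1/ε.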
BFPRT algorithm: a linear deterministic algorithm

• We still use the idea of choosing a splitter. The ideal splitter is the median; however, finding the median is exactly our objective.
• Thus, just try to get “something close to the median”, say within $\frac{n}{4}$ from the median.
• How can we get something close to the median? Instead of finding the median of the “whole set”, find a median of a “sample”.
• But how to choose a sample? Medians again!

Median of medians algorithm [Blum, 1973]

“Median of medians” algorithm:
1: Line up the elements in groups of 5 elements;
2: Find the median of each group; (takes $O(\frac{6n}{5})$ time)
3: Find the median of medians (denoted as M); (takes $T(\frac{n}{5})$ time)
4: Use M as splitter to partition the input and call the algorithm recursively on one of the partitions.

[Figure: elements arranged in groups of 5 with the group medians and M highlighted]

Analysis:
$T(n) = T(\frac{n}{5}) + T(\frac{7n}{10}) + \frac{6n}{5}$: at most 24n comparisons.
(Here, $\frac{7n}{10}$ comes from the fact that at least $\frac{3n}{10}$ elements can be deleted by using M as the splitter.)

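The four steps can be sketched as a short recursive function. This is an illustrative version, not the comparison-optimal one counted in the analysis: it assumes distinct elements, sorts each group of 5 instead of using a fixed comparison network, and builds new lists when partitioning. The function name is hypothetical.

```python
def median_of_medians_select(items, k):
    """Return the k-th smallest (1-indexed) of distinct items, deterministically."""
    items = list(items)
    if len(items) <= 5:
        return sorted(items)[k - 1]
    # steps 1-2: groups of 5 and the median of each group
    groups = [sorted(items[i:i + 5]) for i in range(0, len(items), 5)]
    medians = [g[(len(g) - 1) // 2] for g in groups]
    # step 3: median of the medians, found recursively
    pivot = median_of_medians_select(medians, (len(medians) + 1) // 2)
    # step 4: partition around the pivot and recurse on one side only
    smaller = [x for x in items if x < pivot]
    larger = [x for x in items if x > pivot]
    if k <= len(smaller):
        return median_of_medians_select(smaller, k)
    if k == len(smaller) + 1:
        return pivot
    return median_of_medians_select(larger, k - len(smaller) - 1)


print(median_of_medians_select([18, 15, 27, 13, 1, 7, 25], 4))
```

Both recursive calls operate on strictly smaller lists, which is what the two T(·) terms of the recurrence account for.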
Randomized divide-and-conquer

Algorithm RandomSelect(S, k):
    Choose an element $s_i$ from S uniformly at random;
    $S^+ = \{\}$;
    $S^- = \{\}$;
    for j = 1 to n do
        if $s_j > s_i$ then
            $S^+ = S^+ \cup \{s_j\}$;
        else if $s_j < s_i$ then
            $S^- = S^- \cup \{s_j\}$;
        end if
    end for
    if $|S^-| = k - 1$ then
        return $s_i$;
    else if $|S^-| > k - 1$ then
        return RandomSelect($S^-$, k);
    else
        return RandomSelect($S^+$, $k - 1 - |S^-|$);
    end if

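A direct transcription of the procedure, written iteratively since only one side is ever kept, might look as follows. The sketch assumes distinct elements and excludes the pivot from both parts; the function name is illustrative.

```python
import random


def random_select(items, k, rng=None):
    """Return the k-th smallest (1-indexed) of distinct items."""
    rng = rng or random.Random()
    items = list(items)
    while True:
        pivot = rng.choice(items)                   # uniformly random splitter
        smaller = [x for x in items if x < pivot]   # the part S-
        larger = [x for x in items if x > pivot]    # the part S+
        if len(smaller) == k - 1:
            return pivot
        if len(smaller) > k - 1:
            items = smaller                         # answer lies in S-
        else:
            items, k = larger, k - 1 - len(smaller) # answer lies in S+


print(random_select([18, 15, 27, 13, 1, 7, 25], 4, random.Random(0)))
```

The result is always correct regardless of the random choices; only the number of iterations depends on them, which makes this a Las Vegas algorithm.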
Randomized divide-and-conquer (cont’d)

e.g.: $\epsilon = \frac{1}{4}$:

[Figure: the “centered” elements for ε = 1/4]

Key observation: if we choose a splitter $a_i \in S$ uniformly at random, it is easy to get a good splitter since a fairly large fraction of the elements are “centered”.

Randomized divide-and-conquer (cont’d)

Theorem
The expected running time of RandomSelect(S, k) is O(n).

Proof.
• Let $\epsilon = \frac{1}{4}$. We’ll say that the algorithm is in phase j when the size of the set under consideration is in $(n(\frac{3}{4})^{j+1}, n(\frac{3}{4})^{j}]$.
• Let X be the number of steps, and $X_j$ be the number of steps in phase j. Thus, $X = X_0 + X_1 + \cdots$.
• Consider the j-th phase. The probability of finding a centered splitter is $\ge \frac{1}{2}$ since at least half of the elements are centered. Thus, the expected number of iterations to find a centered splitter is at most 2.
• Each iteration costs at most $cn(\frac{3}{4})^j$ steps since there are at most $n(\frac{3}{4})^j$ elements in phase j. Thus, $E(X_j) \le 2cn(\frac{3}{4})^j$.
• $E(X) = E(X_0 + X_1 + \cdots) \le \sum_j 2cn(\frac{3}{4})^j = 2cn \cdot \frac{1}{1 - 3/4} = 8cn$.

A “random sampling” algorithm [Floyd & Rivest, 1975]

Basic idea: randomly sample a subset as a representation of the whole set.

Random sampling algorithm:
1: randomly sample r elements (with replacement) from $S = \{s_1, s_2, \dots, s_n\}$. Denote the r elements as R.
2: take the $(1-\delta)\frac{r}{2}$-th smallest element of R (denoted as a), and the $(1+\delta)\frac{r}{2}$-th smallest element of R (denoted as b);
3: divide S into three disjoint subsets:
     $L = \{s_i : s_i < a\}$;
     $M = \{s_i : a \le s_i \le b\}$;
     $H = \{s_i : s_i > b\}$;
4: check $|L| \le \frac{n}{2}$, $|H| \le \frac{n}{2}$, and $|M| \le c\delta n$. If not, goto Step 1.
5: return the $(\frac{n}{2} - |L|)$-th smallest of M;

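The sampling strategy can be sketched as follows. Several details are assumptions made so that the sketch is well defined for every non-empty list of distinct numbers: the median is taken to be the ⌈n/2⌉-th smallest, the sample ranks are rounded outwards, the constant c is set to 4, r = n^{3/4} and δ = n^{-1/4} are the settings used in the running-time analysis, and the check of Step 4 is replaced by the exact condition that the median's rank falls inside M. The function name is hypothetical.

```python
import math
import random


def sampling_median(items, rng=None, c=4.0):
    """Median (ceil(n/2)-th smallest) of a non-empty list of distinct numbers."""
    rng = rng or random.Random()
    n = len(items)
    target = (n + 1) // 2                        # rank of the median
    r = max(1, round(n ** 0.75))                 # sample size r = n^(3/4)
    delta = n ** -0.25                           # delta = n^(-1/4)
    while True:
        # Step 1: sample with replacement, then sort the sample
        sample = sorted(rng.choice(items) for _ in range(r))
        # Step 2: a and b bracket the sample median
        lo = max(0, math.floor((1 - delta) * r / 2) - 1)
        hi = min(r - 1, math.ceil((1 + delta) * r / 2))
        a, b = sample[lo], sample[hi]
        # Step 3: only L and M are needed to locate the median
        low = [x for x in items if x < a]
        mid = [x for x in items if a <= x <= b]
        # Step 4: the median must lie in M, and M must be small
        if len(low) < target <= len(low) + len(mid) and len(mid) <= c * delta * n:
            # Step 5: rank within M
            return sorted(mid)[target - len(low) - 1]


values = list(range(1001))
random.Random(2).shuffle(values)
print(sampling_median(values, random.Random(3)))
```

Whenever the function returns, the answer is exact, because every element of L is smaller than every element of M; a failed check only costs another pass.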
Introduction to
Example

Two requirements of $M$:

• On one side, $M$ should be LARGE enough that the median is covered by $M$ with high probability;
• On the other side, $M$ should be SMALL enough that sorting $M$ in Step 5 will not take a long time.

Time-complexity analysis

Running time:

• Step 2: $O(r \log r) = o(n)$ (sorting $R$);
• Step 3: at most $2n$ steps ($O(n) + O(|M| + |H|)$);
• Step 5: $O(\delta n \log(\delta n))$ (sorting $M$).

Setting $r = n^{3/4}$ and $\delta = n^{-1/4}$, the time bound of Step 5 becomes:

• Step 5: $O(\delta n \log(\delta n)) = o(n)$.

Total steps: $2n + o(n)$.

• The best known deterministic algorithm takes $3n$ steps, but it is too complicated.
• A lower bound: $2n$.
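To see why both sorting costs are sublinear under this choice of parameters, the substitution can be written out. The constant $c$ from the size check on $M$ is absorbed into the $O(\cdot)$ notation, and the base of the logarithm does not matter.

```latex
\begin{align*}
r \log r &= n^{3/4} \log n^{3/4} = \tfrac{3}{4}\, n^{3/4} \log n = o(n), \\
\delta n &= n^{-1/4} \cdot n = n^{3/4}, \\
\delta n \log(\delta n) &= \tfrac{3}{4}\, n^{3/4} \log n = o(n).
\end{align*}
```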

Error probability analysis I

Theorem
With probability $1 - O(n^{-1/4})$, the RandomSamplingSelect algorithm reports the median in the first pass. Thus, the running time is only $2n + o(n)$.

Three cases of failure in Step 4:

Case 1:
Define an indicator variable $x_i = 1$ when the $i$-th sample is less than the median, and $x_i = 0$ otherwise. Let $X = x_1 + \cdots + x_r$ be the number of samples that are less than the median. We have: $E(x_i) = \frac{1}{2}$ and $\sigma^2(x_i) = \frac{1}{4}$.

Error probability analysis II

$E(X) = \frac{1}{2} r$ and $\sigma^2(X) = \frac{1}{4} r$.

\begin{align}
\Pr\left(|L| \geq \tfrac{n}{2}\right) &= \Pr\left(X \leq \tfrac{1-\delta}{2}\, r\right) \tag{2} \\
&\leq \Pr\left(|X - E(X)| \geq \tfrac{\delta}{2}\, r\right) \tag{3} \\
&\leq \frac{\sigma^2(X)}{\left(\tfrac{\delta}{2}\, r\right)^2} \tag{4} \\
&= \frac{1}{\delta^2 r} = n^{-1/4} \tag{5}
\end{align}

Step (4) is Chebyshev's inequality; step (5) uses $r = n^{3/4}$ and $\delta = n^{-1/4}$.

Case 2 and Case 3 are similar and thus omitted.
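As a numerical sanity check on Case 1, the following Python sketch compares the exact binomial tail with the Chebyshev bound. It rests on the same modelling assumption as the slide (each sample falls below the median with probability exactly 1/2) and on the parameter choice $r = n^{3/4}$, $\delta = n^{-1/4}$; the function name `case1_failure_bounds` is hypothetical.

```python
import math

def case1_failure_bounds(n):
    """Return (exact binomial tail, Chebyshev bound) for Case 1."""
    r = math.ceil(n ** 0.75)
    delta = n ** -0.25
    threshold = math.floor((1 - delta) * r / 2)
    # X ~ Binomial(r, 1/2): exact Pr(X <= threshold), summed with integers.
    exact_tail = sum(math.comb(r, k) for k in range(threshold + 1)) / 2 ** r
    # Chebyshev: sigma^2(X) / (delta * r / 2)^2 with sigma^2(X) = r / 4.
    chebyshev = (r / 4) / (delta * r / 2) ** 2
    return exact_tail, chebyshev

for n in (10**3, 10**4, 10**5):
    exact, bound = case1_failure_bounds(n)
    print(f"n={n}: exact tail {exact:.4g}, Chebyshev bound {bound:.4g}")
```

The exact tail never exceeds the Chebyshev value, which is what the chain (2) to (5) claims; Chebyshev is used in the analysis because it is simple, not because it is tight.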

Random Variables

• When looking at independent replications of a binary experiment, we are usually interested not in whether each individual case is a success or a failure but in the total number of successes (or failures).
• This number is random, since it depends on the individual random outcomes, and it is consequently called a random variable.
• In this case it is a discrete-valued random variable that can take the values $0, 1, ..., n$, where $n$ is the number of replications.
• A random variable $X$ has a probability distribution that can be described using point probabilities $f_X(x) = p(X = x)$, or the cumulative distribution function $F(x) = p(X \leq x)$.
• Expected value: $\mu = E(X) = \sum_x x \cdot p(X = x)$.
• Variance: $\sigma^2 = V(X) = \sum_x (x - E(X))^2 \cdot p(X = x)$.
• Standard deviation: $\sigma = SD(X) = \sqrt{V(X)}$.
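For the number of successes in $n$ independent replications, the point probabilities and the cumulative distribution function can be tabulated directly. The Python sketch below does this with exact fractions, assuming each replication succeeds with the same probability `p`; the helper names `success_count_pmf` and `cdf` are hypothetical.

```python
from fractions import Fraction
from math import comb

def success_count_pmf(n, p):
    """Point probabilities p(X = x) for the number of successes in n independent trials."""
    p = Fraction(p)
    return {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

def cdf(pmf, x):
    """Cumulative distribution function F(x) = p(X <= x)."""
    return sum(prob for value, prob in pmf.items() if value <= x)

pmf = success_count_pmf(4, Fraction(1, 2))       # four fair coin flips
mean = sum(x * prob for x, prob in pmf.items())  # expected value E(X)
```

With four fair coin flips the point probabilities are 1/16, 4/16, 6/16, 4/16, 1/16, and the expected number of successes is 2.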
Expected Value: An Example

An insurance company charges $50 a year. Can the company make a profit? Assume that it has studied 1000 people and obtained the following table:

Outcome       Payout x    Probability p(X = x)
Death         10,000      1/1000
Disability    5,000       2/1000
Neither       0           997/1000

X, the amount paid out, is a discrete random variable. The company expects to pay each customer:

$E(X) = \$10{,}000 \cdot \frac{1}{1000} + \$5000 \cdot \frac{2}{1000} + \$0 \cdot \frac{997}{1000} = \$20$.

Variance: An Example

• Of course, the expected value of $20 will not be what happens in every case.
• There will be variability. Let's calculate!
• So $V(X) = 9980^2 \cdot \frac{1}{1000} + 4980^2 \cdot \frac{2}{1000} + (-20)^2 \cdot \frac{997}{1000} = 149{,}600$,
• and $SD(X) = \sqrt{149{,}600} \approx \$386.78$.

Comment
The company expects to pay out $20, and make $30. However, the standard deviation of $386.78 indicates that it's no sure thing. That's a pretty big spread (and risk) for an average profit of $30.
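The insurance figures can be reproduced with a few lines of Python. This is a small sketch using exact fractions; the dictionary `payouts` and the helper names `expected_value` and `variance` are illustrative choices, not part of the slides.

```python
from fractions import Fraction
from math import sqrt

# Payout (in dollars) -> probability, taken from the table in the example.
payouts = {
    10_000: Fraction(1, 1000),
    5_000: Fraction(2, 1000),
    0: Fraction(997, 1000),
}

def expected_value(dist):
    """E(X) = sum of x * p(X = x)."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """V(X) = sum of (x - E(X))^2 * p(X = x)."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

mean = expected_value(payouts)
var = variance(payouts)
sd = sqrt(var)
print(mean, var, round(sd, 2))
```

The values agree with the slides: an expected payout of 20, a variance of 149,600, and a standard deviation of about 386.78.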

References and Further Reading

Cormen, T.H., Leiserson, C.E., Rivest, R.L., and Stein, C. Introduction to Algorithms, 3rd ed. The MIT Press, 2009.
Dasgupta, S., Papadimitriou, C.H., and Vazirani, U.V. Algorithms. McGraw Hill, 2006.
Motwani, R. and Raghavan, P. Randomized Algorithms. Cambridge University Press, 1995.
Ross, S.M. Probability Models for Computer Science. Academic Press, 2008.
