
CSE 431/531: Algorithm Analysis and Design Spring 2021

Homework 2
Instructor: Shi Li Deadline: 3/12/2021

Your Name: Your Student ID:

Problems 1 2 3 4 Total
Max. Score 6 10 12 12 40
Your Score

Problem 1 Construct the Huffman code (i.e., the optimum prefix code) for the alphabet
{a, b, c, d, e, f, g} with the following frequencies:

Symbols      a   b   c   d   e   f   g
Frequencies  30  20  58  29  59  85  25

Also give the weighted length of the code (i.e., the sum, over all symbols, of the frequency
of the symbol times its encoding length).
The codes for the letters are:

Symbols      a    b     c    d    e   f   g
Frequencies  30   20    58   29   59  85  25
Codes        101  0000  001  100  11  01  0001

The weighted length of the code is

30 × 3 + 20 × 4 + 58 × 3 + 29 × 3 + 59 × 2 + 85 × 2 + 25 × 4 = 819.
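
The construction can be checked mechanically. The following is a minimal Python sketch of the standard Huffman procedure (repeatedly merge the two least frequent subtrees), assuming the frequencies from the table above. The function name huffman_code and the tie-breaking counter are illustrative choices, not part of the original solution; a different tie-breaking rule may assign different bit strings, but the weighted length stays the same.

```python
import heapq
from itertools import count

def huffman_code(freq):
    """Return a dict mapping each symbol to its bit string in an optimum prefix code."""
    tie = count()  # unique tie-breaker, so the heap never compares tree nodes
    # Heap entries are (total frequency, tie-breaker, tree).
    # A tree is either a symbol (leaf) or a pair (left subtree, right subtree).
    heap = [(f, next(tie), sym) for sym, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)   # least frequent subtree
        f2, _, t2 = heapq.heappop(heap)   # second least frequent subtree
        heapq.heappush(heap, (f1 + f2, next(tie), (t1, t2)))

    codes = {}

    def assign(tree, prefix):
        if isinstance(tree, tuple):       # internal node: recurse into both children
            assign(tree[0], prefix + "0")
            assign(tree[1], prefix + "1")
        else:                             # leaf: record the code word
            codes[tree] = prefix or "0"   # a one-symbol alphabet still needs one bit

    assign(heap[0][2], "")
    return codes

freq = {"a": 30, "b": 20, "c": 58, "d": 29, "e": 59, "f": 85, "g": 25}
codes = huffman_code(freq)
weighted_length = sum(freq[s] * len(codes[s]) for s in freq)
print(weighted_length)  # 819
```

The code lengths produced for this input are 3, 4, 3, 3, 2, 2, 4 for a through g, matching the table above.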

Problem 2 We are given two integers $1 \le k \le n$ and an array $A$ of length $n$. For every
integer $i$ in $[k, n]$, let $b_i$ be the $k$-th smallest number in $A[1..i]$. The goal of the problem
is to output $b_k, b_{k+1}, b_{k+2}, \dots, b_n$ in $O(n \log n)$ time.
For example, if $k = 4$, $n = 10$ and $A = (50, 80, 10, 30, 90, 100, 20, 40, 35, 70)$, then
$b_4 = 80$, $b_5 = 80$, $b_6 = 80$, $b_7 = 50$, $b_8 = 40$, $b_9 = 35$, $b_{10} = 35$.
Hint: use the heap data structure.
1: Q ← empty max-heap
2: for i ← 1 to k do
3:     Q.add(A[i])    ▷ The element A[i] itself is its priority value.
4: b_k ← Q.get_max()
5: for i ← k + 1 to n do
6:     if A[i] < Q.get_max() then
7:         Q.extract_max()
8:         Q.add(A[i])
9:     b_i ← Q.get_max()
10: return (b_k, b_{k+1}, ..., b_n)
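After step 3, $Q$ contains the $k$ elements $A[1..k]$, so $b_k$ is the largest number in $Q$. We
maintain the invariant that at the end of iteration $i$ of Loop 5, $Q$ contains the $k$ smallest
numbers in $A[1..i]$: if $A[i]$ is at least the current maximum of $Q$, the $k$ smallest numbers do
not change; otherwise $A[i]$ replaces the current maximum. Then the $k$-th smallest number is the
largest number in $Q$.
Each priority queue operation (get_max, extract_max, add) takes time $O(\log k)$. So
the running time of the algorithm is $O(n \log k)$, which is $O(n \log n)$ since $k \le n$.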
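
The algorithm translates almost line by line into Python. The sketch below is one possible rendering, not the reference implementation: it assumes 0-based lists, and because the standard heapq module only provides a min-heap, it stores negated values to simulate the max-heap. The function name kth_smallest_prefixes is illustrative.

```python
import heapq

def kth_smallest_prefixes(A, k):
    """Return [b_k, ..., b_n], where b_i is the k-th smallest of the first i elements of A."""
    # heapq is a min-heap, so negated values are stored to simulate a max-heap.
    heap = [-x for x in A[:k]]
    heapq.heapify(heap)
    result = [-heap[0]]                  # b_k: the largest of the first k elements
    for x in A[k:]:
        if x < -heap[0]:
            # x is among the k smallest so far; it evicts the current maximum.
            heapq.heapreplace(heap, -x)
        result.append(-heap[0])          # the maximum of the heap is the k-th smallest
    return result

print(kth_smallest_prefixes([50, 80, 10, 30, 90, 100, 20, 40, 35, 70], 4))
# [80, 80, 80, 50, 40, 35, 35]
```

Here heapreplace performs the extract-max and the add in a single $O(\log k)$ operation, which does not change the asymptotic bound.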

Problem 3 Consider the problem of scheduling a set of jobs so as to minimize the total
weighted completion time. There are $n$ jobs indexed by $[n] := \{1, 2, 3, \dots, n\}$, and each
job $j \in [n]$ has a weight $w_j \ge 0$ and processing time $p_j \ge 0$. The goal of the problem is
to output an ordering of jobs so as to minimize the total weighted completion time of all
jobs. See Figure 1 for an instance of the problem.
[Figure: three jobs with $p_1 = 1$, $p_2 = 2$, $p_3 = 3$ and $w_1 = 2$, $w_2 = 5$, $w_3 = 7$, shown on a time axis from 0 to 6.
Left schedule, order 1, 2, 3: cost = 2 × 1 + 5 × 3 + 7 × 6 = 59.
Right schedule, order 2, 3, 1: cost = 5 × 2 + 7 × 5 + 2 × 6 = 57.]

Figure 1: An instance with two different schedules. The second schedule has a better
weighted completion time than the first one, and it is the optimum schedule.

Prove that the schedule that processes the jobs $j$ in increasing order of $p_j / w_j$ is the
optimum for the instance. For convenience, you can assume that for any two different
jobs $j$ and $j'$, we have $p_j / w_j \ne p_{j'} / w_{j'}$.

Let $\pi_1, \pi_2, \dots, \pi_n$ be a permutation of the $n$ jobs. Assume it is not the order of jobs
according to the increasing order of $p_j / w_j$. We show that scheduling jobs according to $\pi$ is
not optimum.
There must be some index $i \in \{1, 2, \dots, n - 1\}$ such that
$p_{\pi_i} / w_{\pi_i} > p_{\pi_{i+1}} / w_{\pi_{i+1}}$. Let $\pi'$ be
the permutation obtained from $\pi$ by swapping $\pi_i$ and $\pi_{i+1}$. Namely, for every
$k \in [n]$, we have

$$\pi'_k = \begin{cases} \pi_k & \text{if } k \notin \{i, i+1\}, \\ \pi_{i+1} & \text{if } k = i, \\ \pi_i & \text{if } k = i + 1. \end{cases}$$

The weighted completion time of $\pi$ is

$$\mathrm{cost}(\pi) := \sum_{i=1}^{n} w_{\pi_i} \sum_{i'=1}^{i} p_{\pi_{i'}} = \sum_{j, j' \in [n] :\, \pi^{-1}(j) \le \pi^{-1}(j')} p_j w_{j'}.$$

Above, $\pi^{-1}(j)$ denotes the index $i$ such that $\pi_i = j$. Similarly, the weighted completion
time of $\pi'$ is

$$\mathrm{cost}(\pi') := \sum_{j, j' \in [n] :\, \pi'^{-1}(j) \le \pi'^{-1}(j')} p_j w_{j'}.$$

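The two sums differ only in the term for the pair $\pi_i, \pi_{i+1}$, since swapping two adjacent
jobs does not change the relative order of any other pair. So the difference between the two
quantities is

$$\mathrm{cost}(\pi') - \mathrm{cost}(\pi) = p_{\pi_{i+1}} w_{\pi_i} - p_{\pi_i} w_{\pi_{i+1}} < 0,$$

where the inequality follows from $p_{\pi_i} / w_{\pi_i} > p_{\pi_{i+1}} / w_{\pi_{i+1}}$ by multiplying
both sides by $w_{\pi_i} w_{\pi_{i+1}} > 0$.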

Therefore, $\pi'$ is a better solution than $\pi$. Namely, $\pi$ is not optimum. Therefore, any
order that does not sort jobs according to increasing order of $p_j / w_j$ is not optimum. Since
there are finitely many orders, an optimum order exists, and so it is the unique order that
sorts the jobs by increasing $p_j / w_j$.
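
As a sanity check of the claim (not a substitute for the proof), the following Python sketch compares the ratio order against a brute-force search over all orderings on the instance of Figure 1. The function names are illustrative, and the sketch assumes integer inputs with every $w_j > 0$ so that the ratios are defined; Fraction is used to compare ratios exactly.

```python
from fractions import Fraction
from itertools import permutations

def weighted_completion_time(order, p, w):
    """Total weighted completion time when the jobs run in the given order."""
    elapsed = 0
    cost = 0
    for j in order:
        elapsed += p[j]           # completion time of job j
        cost += w[j] * elapsed
    return cost

def ratio_order(p, w):
    """Jobs sorted by increasing p_j / w_j (assumes integer inputs and every w_j > 0)."""
    return sorted(p, key=lambda j: Fraction(p[j], w[j]))

# Instance of Figure 1.
p = {1: 1, 2: 2, 3: 3}
w = {1: 2, 2: 5, 3: 7}

greedy = ratio_order(p, w)
best = min(weighted_completion_time(o, p, w) for o in permutations(p))
print(greedy, weighted_completion_time(greedy, p, w), best)  # [2, 3, 1] 57 57
```

The brute-force minimum is only feasible for very small $n$; it is there to confirm on this instance that no ordering beats the ratio order.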

Problem 4 In the interval covering problem, we are given $n$ intervals $[s_1, t_1), [s_2, t_2),
\dots, [s_n, t_n)$ such that $\bigcup_{i \in [n]} [s_i, t_i) = [0, T)$. The goal of the problem is to return a
smallest-size set $S \subseteq [n]$ such that $\bigcup_{i \in S} [s_i, t_i) = [0, T)$. Design an efficient greedy
algorithm for this problem.
We make the problem slightly more general. We assume we are given $n$ intervals
$[s_1, t_1), [s_2, t_2), \dots, [s_n, t_n)$, and an interval $[A, B)$ to be covered such that
$\bigcup_{i \in [n]} [s_i, t_i) \supseteq [A, B)$. The goal of the problem is to find the smallest-size set
$S \subseteq [n]$ such that $\bigcup_{i \in S} [s_i, t_i) \supseteq [A, B)$. Notice that the original problem is a
special case of this problem.
Assume $A < B$. Let $i^* = \arg\max_{i \in [n] :\, s_i \le A} t_i$, i.e., the index of the interval covering $A$
with the largest $t$ value. We prove that

Lemma 1. There is an optimum solution $S$ that contains $i^*$.

Proof. Let $S$ be any optimum solution to the instance. If $i^* \in S$ then the lemma is proved,
and so we assume $i^* \notin S$. Then there must be some $i \in S$ such that $A \in [s_i, t_i)$, since
the point $A$ needs to be covered. By our choice of $i^*$, we have $t_{i^*} \ge t_i$. So the portion of
$[A, B)$ covered by $[s_{i^*}, t_{i^*})$ is $[A, t_{i^*})$, and the portion of $[A, B)$ covered by $[s_i, t_i)$ is $[A, t_i)$.
The former is a super-interval of the latter. Thus, if we define $S' = (S \setminus \{i\}) \cup \{i^*\}$, $S'$ is
also a valid solution to the instance. Since $|S'| = |S|$, $S'$ is also optimum. As $S'$ contains
$i^*$, the lemma follows.
Thus, in the first step of our algorithm, we can make the following irrevocable decision:
include $i^*$ in our solution. Now since $[A, t_{i^*})$ is already covered, the residual problem is
to choose the minimum number of intervals to cover $[t_{i^*}, B)$. If $t_{i^*} \ge B$, then the interval
is empty and the instance is trivial. Otherwise, the instance is still an instance of the
interval covering problem.
The pseudo-code of the algorithm is as follows:

S ← ∅
while A < B do
    i* ← arg max_{i ∈ [n] : s_i ≤ A} t_i
    S ← S ∪ {i*}
    A ← t_{i*}
return S
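A simple implementation of the algorithm can achieve running time $O(n^2)$: each iteration
finds $i^*$ with one $O(n)$ scan, and the loop runs at most $n$ times because every iteration adds a
new interval to $S$.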
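
One way to realize the simple quadratic implementation is sketched below in Python. It is an illustration under stated assumptions rather than the reference solution: intervals are given as a list of half-open pairs (s, t), indices are 0-based, and the function name greedy_interval_cover and the explicit error for uncoverable input are additions not present in the pseudo-code.

```python
def greedy_interval_cover(intervals, A, B):
    """Return indices of a minimum-size subset of intervals whose union contains [A, B).

    intervals is a list of half-open pairs (s, t). The union of all intervals is
    assumed to contain [A, B); a ValueError is raised if it does not.
    """
    chosen = []
    while A < B:
        # Among the intervals that start at or before A, take the one reaching furthest right.
        best = max(
            (i for i, (s, t) in enumerate(intervals) if s <= A),
            key=lambda i: intervals[i][1],
            default=None,
        )
        if best is None or intervals[best][1] <= A:
            raise ValueError("the intervals do not cover [A, B)")
        chosen.append(best)
        A = intervals[best][1]    # everything up to t_{i*} is now covered
    return chosen

intervals = [(0, 3), (2, 5), (1, 6), (5, 8), (6, 10), (4, 7)]
print(greedy_interval_cover(intervals, 0, 10))  # [0, 2, 4]
```

In this example only (0, 3) covers the point 0 and only (6, 10) covers the point 9, and they leave [3, 6) uncovered, so no solution with fewer than three intervals exists.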
