CSE 221 Lec05 Greedy F23
Analysis of Algorithms
Fall 23

5 - Greedy Algorithms
Contents
• ACTIVITY SELECTION
• D&C vs. DP vs. GREEDY
• SCHEDULING
• HUFFMAN CODING
THE GREEDY PARADIGM
A greedy algorithm builds up a solution piece by piece, at each step committing to the choice that looks best right now and never undoing it.
NON-EXAMPLE: GREEDY KNAPSACK?
An Unbounded Knapsack instance (capacity 10):

Item:    1   2   3   4   5
Weight:  6   2   4   3  11
Value:  20   8  14  13  35

The best we can do:
Total weight: 2 + 2 + 3 + 3 = 10
Total value: 8 + 8 + 13 + 13 = 42

Greedy approach? Here’s an idea: koalas (the weight-3, value-13 item) have the best value/weight ratio, so keep using koalas!
Total weight: 3 + 3 + 3 = 9
Total value: 13 + 13 + 13 = 39
NON-EXAMPLE: GREEDY KNAPSACK?
Our greedy approach for Unbounded Knapsack doesn’t work for all inputs
(and we showed it fails via a counterexample).
While we usually don’t say “no greedy algorithm can work,” you can often get an idea of
whether a nearsighted style of greedy decision making is suitable for a problem by
going through a few attempts at designing a greedy solution.
In this Unbounded Knapsack attempt, we saw that making the nearsighted decision of
putting in the highest value/weight-ratio object that fits at the time causes
“regret” later down the road. A nearsighted greedy decision feels inappropriate for
this problem, since it might be better to give up something early on to make room for
better choices later. That’s why DP made more sense for Unbounded Knapsack: DP
optimizes its choice by seeing each decision all the way through (via recursive
formulations) and then picking the best one.
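To make the contrast concrete, here is a minimal Python sketch (not from the slides; the names greedy_unbounded and dp_unbounded are ours) that runs the ratio-greedy heuristic and the standard Unbounded Knapsack DP on the instance above:

def greedy_unbounded(items, capacity):
    # Repeatedly take the best value/weight item that still fits.
    items = sorted(items, key=lambda wv: wv[1] / wv[0], reverse=True)
    total = 0
    for w, v in items:
        while w <= capacity:
            capacity -= w
            total += v
    return total

def dp_unbounded(items, capacity):
    # best[c] = max value achievable with capacity c
    best = [0] * (capacity + 1)
    for c in range(1, capacity + 1):
        for w, v in items:
            if w <= c:
                best[c] = max(best[c], best[c - w] + v)
    return best[capacity]

items = [(6, 20), (2, 8), (4, 14), (3, 13), (11, 35)]  # (weight, value)
print(greedy_unbounded(items, 10))  # 39 -- the "all koalas" answer
print(dp_unbounded(items, 10))      # 42 -- two weight-2s and two weight-3s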
ACTIVITY SELECTION
•An example where greedy works!
ACTIVITY SELECTION: THE TASK
Input: n activities with start times and finish times
Constraint: All activities are equally important, but you can only do 1 activity at a time!
Output: A way to maximize the number of activities you can do
[Figure: activities drawn as intervals along a time axis]
ACTIVITY SELECTION: THE TASK
In what order should you greedily add activities? Here are 3 ideas:
1) Be impulsive: choose activities in ascending order of start times
2) Avoid commitment: choose activities in ascending order of length
3) Finish fast: choose activities in ascending order of end times

[Figure: a timeline of overlapping activities: Sit outside, Eat dinner, Sleep, Work on homework, Wash dishes, Make hats, Watch TV]

Only the third one seems to work (this is just our intuition right now)!
The first two greedy approaches could lead to “regrettable” decisions, and finding a
counterexample for each confirms that.
OUR GREEDY ALGORITHM
Pick an available activity with the smallest finish time & repeat.
[Figure: the greedy choice applied step by step along the timeline]
ACTIVITY SELECTION: PSEUDOCODE
Runtime:
When activities are already sorted by finish time: O(N).
When they are not sorted by finish time: O(N log N), due to the cost of sorting.
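The pseudocode itself didn’t survive extraction; here is a minimal Python sketch of the algorithm described above (the name select_activities is ours):

def select_activities(activities):
    # activities: list of (start, finish) pairs
    activities = sorted(activities, key=lambda a: a[1])  # sort by finish time
    chosen = []
    current_end = float("-inf")
    for start, finish in activities:
        if start >= current_end:       # activity is available (no overlap)
            chosen.append((start, finish))
            current_end = finish       # greedily commit to the earliest finisher
    return chosen

print(select_activities([(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (6, 10)]))
# [(1, 4), (5, 7)]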
WHY IS IT GREEDY?
At every step we commit to the single choice that looks best right now (the activity that frees us up soonest) and never reconsider it.
DP vs. GREEDY
Like Dynamic Programming, Greedy algorithms often work for problems with
nice optimal substructure: optimal solutions to a problem are made up from optimal
solutions of sub-problems. But on top of that, greedy problems have the greedy
choice property: a locally best choice is always part of some optimal solution,
so we can safely commit to it without looking ahead.
D&C vs. DP vs. GREEDY
DIVIDE-AND-CONQUER DYNAMIC PROGRAMMING GREEDY
31
SCHEDULING
•Another (more complex) problem with a greedy solution!
SCHEDULING: THE TASK
Input: A set of n jobs. Job i takes t_i hours. For every hour until job i is done, pay c_i.
Output: An order of jobs to complete s.t. you minimize the cost.

A greedy algorithm could greedily commit to the “best” job to do first, then move on,
repeatedly picking the next “best” job. What would be the “best” job to do first?
SCHEDULING: EASIER VERSION #1
Input: A set of n tasks. Each task takes 1 hour. For every hour until task i is done, pay c_i.
Output: An order of tasks to complete s.t. you minimize the cost.

Job A          Job B          Job C          Job D
Cost/hr: 5     Cost/hr: 2     Cost/hr: 1     Cost/hr: 10

Here the best order is D, A, B, C: every hour an expensive task waits, we keep paying
its rate, so higher-cost tasks should be done first. (Cost: 1·10 + 2·5 + 3·2 + 4·1 = 30.)
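A quick brute-force check of that claim (a sketch, not from the slides):

from itertools import permutations

rates = [5, 2, 1, 10]  # cost/hr of Jobs A-D; each takes 1 hour

def total(order):
    # the task in position i (0-based) finishes after i + 1 hours
    return sum(c * (i + 1) for i, c in enumerate(order))

print(min(permutations(rates), key=total))  # (10, 5, 2, 1): most expensive first
print(total((10, 5, 2, 1)))                 # 30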
SCHEDULING: EASIER VERSION #2
Input: A set of n tasks. Task i takes t_i hours. For every hour until task i is done, pay 1 unit.
Output: An order of tasks to complete s.t. you minimize the cost.

Job A          Job B          Job C          Job D
Cost/hr: 1     Cost/hr: 1     Cost/hr: 1     Cost/hr: 1

Which jobs should we do first?
A) Do longer jobs first
B) Do shorter jobs first

Suppose A takes t_A hours and B takes t_B hours, and t_A ≥ t_B (A is longer).
Then cost(A then B) = t_A + (t_A + t_B), and cost(B then A) = t_B + (t_B + t_A).
Since t_A ≥ t_B, we know cost(A then B) ≥ cost(B then A),
so it’s cheaper to go with B (the shorter job) before A.

Basically, doing longer jobs first is a bad idea. A longer job would delay every job
that comes after it by a longer amount; shorter jobs are more attractive here because
the shortest job adds only a minimal delay for each subsequent job.
SCHEDULING: THE “BEST” JOB
But what if neither A nor B is both higher-cost and shorter than the other? Then it’s
not immediately obvious which job is “better”...
We need some way of assigning a “score” to each job, and then we can
choose the job with the best score. Higher cost should increase a job’s
score, while a longer time should decrease a job’s score.
REASONABLE ATTEMPT #1? score for job i = cost_i - time_i
REASONABLE ATTEMPT #2? score for job i = cost_i / time_i
(In both, higher cost increases the score and a longer time decreases it.)
Which one works?
Consider this example:
Job A: Cost/hr: 5, time: 3 hours
Job B: Cost/hr: 2, time: 1 hour
WRONG SCORING SCHEME: score for job i = cost_i - time_i
This says we should do Job A then Job B (scores: A = 5 - 3 = 2, B = 2 - 1 = 1).
This gives us cost: (3 · 5) + (4 · 2) = 23.

PROMISING SCORING SCHEME: score for job i = cost_i / time_i
This says we should do Job B then Job A (scores: A = 5/3, B = 2).
This gives us cost: (1 · 2) + (4 · 5) = 22.
Why does the ratio matter? For any two jobs A and B:
cost(A then B) = (t_A · c_A) + ((t_A + t_B) · c_B)
cost(B then A) = (t_B · c_B) + ((t_A + t_B) · c_A)
Subtracting, cost(A then B) - cost(B then A) = t_A · c_B - t_B · c_A,
so doing A first is at least as good exactly when c_A / t_A ≥ c_B / t_B:
the job with the higher cost/time ratio should come first.
SCHEDULING: “PSEUDOCODE”
Our greedy choice: always choose the job with the next biggest ratio:
cost (per hour until finished) / time it takes
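A minimal Python sketch of this rule (the names schedule_jobs and total_cost are ours):

def schedule_jobs(jobs):
    # jobs: list of (cost_per_hour, hours) pairs
    # Biggest cost/time ratio first.
    return sorted(jobs, key=lambda j: j[0] / j[1], reverse=True)

def total_cost(order):
    elapsed, cost = 0, 0
    for c, t in order:
        elapsed += t           # this job finishes at time `elapsed`
        cost += c * elapsed    # we paid c per hour until it finished
    return cost

jobs = [(5, 3), (2, 1)]        # Job A, Job B from the example
print(total_cost(schedule_jobs(jobs)))  # 22: B (ratio 2) before A (ratio 5/3)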
SCHEDULING: WHAT DID WE LEARN?
HUFFMAN CODING
•One more problem with a greedy solution!
OPTIMAL CODES: THE TASK
ASCII can be pretty wasteful for English sentences, where letters have varying
frequencies. If e shows up so often, maybe we should have a more efficient
way of representing it (e.g. use fewer bits to represent e)!
OPTIMAL CODES: THE TASK
16 we’re encoding
13 12
9
5
A B C D E F
Goal: Minimize the average number of bits used to encode a symbol
(with symbols weighted according to their frequencies)
54
OPTIMAL CODES: ATTEMPT 0
ATTEMPT 0: Use a fixed-length code (the ith character gets coded as i in binary)

Symbol: A    B    C    D    E    F
Code:   000  001  010  011  100  101

We should really try to get away with fewer bits for our more common symbols...
OPTIMAL CODES: ATTEMPT 1
ATTEMPT 1: Use a variable-length code (shorter codes for common characters)

Symbol: A   B   C   D   E   F
Code:   0   00  01  1   10  11

Problem: this code is ambiguous. The string 00 could mean AA or B, so it can’t be
decoded without separators. We need a code in which no codeword is a prefix of another.
OPTIMAL CODES: ATTEMPT 2
ATTEMPT 2: Use a variable-length code that is prefix-free

Symbol: A    B    C    D    E    F
Code:   00   101  110  01   111  100

What is 0011001? ACD (00 | 110 | 01)
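Decoding a prefix-free code is mechanical: read bits until the buffer matches a codeword, emit that symbol, and repeat. A small Python sketch (the name decode is ours):

def decode(bits, codes):
    # codes: symbol -> codeword, assumed prefix-free
    inverse = {c: s for s, c in codes.items()}
    out, cur = [], ""
    for b in bits:
        cur += b
        if cur in inverse:       # prefix-free: the first match is the symbol
            out.append(inverse[cur])
            cur = ""
    return "".join(out)

codes = {"A": "00", "B": "101", "C": "110", "D": "01", "E": "111", "F": "100"}
print(decode("0011001", codes))  # ACD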
A PREFIX-FREE CODE IS A TREE
We can think of a prefix-free code as a tree. A character’s encoding can be found by
tracing down from the root: by convention, left edges denote 0, and right edges
denote 1. As long as all the letters show up as leaves, the corresponding code is
prefix-free.

[Figure: the Attempt 2 code drawn as a binary tree, with leaves
A: 45 (00), D: 16 (01), F: 5 (100), B: 13 (101), C: 12 (110), E: 9 (111)]

The cost of a tree is the expected length of the encoding of a random letter
(randomness is weighted by frequency).
Cost of this tree (average leaf depth):
(2 · 0.45) + (2 · 0.16) + (3 · 0.05) + (3 · 0.13) + (3 · 0.12) + (3 · 0.09) = 2.39

Our goal (rephrased in terms of this tree):
Minimize the (weighted) average leaf depth of this binary tree!
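The same cost, computed in a couple of lines of Python:

freqs = {"A": 45, "B": 13, "C": 12, "D": 16, "E": 9, "F": 5}   # percentages
codes = {"A": "00", "B": "101", "C": "110", "D": "01", "E": "111", "F": "100"}
print(sum(freqs[s] / 100 * len(codes[s]) for s in freqs))      # 2.39 (up to float rounding)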
A PREFIX-FREE CODE IS A TREE
IDEA: Greedily build sub-trees from the bottom up, where the “greedy goal”
is to have the less frequent letters further down in the tree.
To ensure that, we’ll greedily build subtrees by “merging” the 2 nodes with the
smallest frequency counts, and then repeating until we’ve merged everything!

A “merge” between 2 nodes creates a common parent node whose key is the sum of
those 2 nodes’ frequencies. For example, merging E: 9 and C: 12 creates a parent
with key 21, with E and C as its 0- and 1-children.
HUFFMAN CODING: EXAMPLE
Greedily build subtrees by merging, starting with the 2 most infrequent letters.

Start: A: 45, B: 13, C: 12, D: 16, E: 9, F: 5
1. Merge F: 5 and E: 9 into a node with key 14.
2. Merge B: 13 and C: 12 into a node with key 25.
3. Merge the 14-node and D: 16 into a node with key 30.
4. Merge the 25-node and the 30-node into a node with key 55.
5. Merge A: 45 and the 55-node into the root, with key 100.

Note: This merging procedure guarantees that all characters will be leaves
in the tree (so the tree corresponds to a prefix-free code).

Resulting codes: A: 0, B: 100, C: 101, D: 110, E: 1110, F: 1111

Expected cost:
(0.45 · 1) + (0.13 · 3) + (0.12 · 3) + (0.16 · 3) + (0.09 · 4) + (0.05 · 4) = 2.24
HUFFMAN CODING: PSEUDOCODE
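The pseudocode slide didn’t survive extraction; here is a minimal heap-based Python sketch of the merging procedure (the name huffman is ours; exact codewords can differ from the tree above under different tie-breaking, but the expected cost is the same):

import heapq

def huffman(freqs):
    # freqs: dict symbol -> frequency. Returns dict symbol -> codeword.
    # Heap entries are (frequency, tiebreaker, tree); a tree is either a
    # symbol (leaf) or a (left, right) pair of subtrees.
    heap = [(f, i, sym) for i, (sym, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)   # the two smallest frequencies...
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, (t1, t2)))  # ...get merged
        count += 1
    codes = {}
    def walk(tree, code):
        if isinstance(tree, tuple):
            walk(tree[0], code + "0")     # left edge = 0
            walk(tree[1], code + "1")     # right edge = 1
        else:
            codes[tree] = code
    walk(heap[0][2], "")
    return codes

print(huffman({"A": 45, "B": 13, "C": 12, "D": 16, "E": 9, "F": 5}))
# {'A': '0', 'C': '100', 'B': '101', 'F': '1100', 'E': '1101', 'D': '111'}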
HUFFMAN CODING: WHAT DID WE LEARN?
References
● J. Kleinberg, E. Tardos. Algorithm Design. Pearson Education, Ch. 6.
● Some slides are updated from: CS381 Introduction to the Analysis of Algorithms.
● Some slides are updated from: Design and Analysis of Algorithms, Stanford University.
Questions