
Introduction to Algorithms November 12, 2015

Massachusetts Institute of Technology 6.006 Fall 2015


Instructors: Piotr Indyk and Ron Rivest Quiz 2

Quiz 2
• Do not open this quiz booklet until directed to do so. Read all the instructions on this page.
• When the quiz begins, write your name on every page of this quiz booklet.
• You have 50 minutes to earn 50 points. Do not spend too much time on any one problem.
Read them all first, and attack them in the order that allows you to make the most progress.
• You are allowed two 8.5" × 11" double-sided cheat sheets. No calculators or programmable
devices are permitted. No cell phones or other communications devices are permitted.
• Write your solutions in the space provided. If you need more space, use the back of the
sheet containing the problem. Pages may be separated for grading.
• Do not waste time and paper rederiving facts that we have studied. Simply cite them.
• When writing an algorithm, a clear description in English will suffice. Pseudo-code is not
required unless you find that it helps with the clarity of your presentation.
• Pay close attention to the instructions for each problem. Depending on the problem,
partial credit may be awarded for incomplete answers.

Problem   Parts   Points   Grade   Grader
Name        2        2
1           9       18
2           2       30
Total               50

Name:

Circle your recitation:


R01      R02      R03          R04            R05          R06         R07         R08                R09
Ilya R   Anak Y   Alex Jaffe   Szymon Sidor   Alex Jaffe   Joe Paggi   Alex Chen   Shalev Ben David   Matthew Chang
10AM     10AM     11AM         11AM           12PM         12PM        1PM         2PM                3PM
6.006 Quiz 2 Name 2

Problem 0. What is Your Name? [2 points] (2 parts)


(a) Flip back to the cover page. Write your name and circle your recitation section.
(b) Write your name on top of each page.
Problem 1. True or False [18 points] (9 parts)
For each of the following questions, circle either T (True) or F (False). There is no need to justify
the answers; you may include a remark regarding your interpretation of the question, or an
assumption you made while answering it, but these should not be necessary, and it is better to ask
a TA during the exam for clarification if necessary. The graders may ignore such remarks and
assumptions. Each correct answer is worth 2 points.
(a) T F Hashing with chaining has better asymptotic performance than open-address hashing
when the load factor α approaches 1.

Solution: True. The expected running time of a search is O(1 + α) with chaining, whereas
with open addressing it is O(1/(1 − α)), which grows without bound as α approaches 1.
(b) T F Consider a hash table with m slots that has had m/2 elements inserted into it thus
far. The worst case asymptotic runtime of search if this table uses chaining with
linked lists is better than the worst case asymptotic runtime of search if this table
uses open-addressing.

Solution: False. In the worst case for chaining all m/2 elements get hashed to
the same location, and the element we are searching for is in the last position
in that linked list. Likewise, our open-addressing hashing scheme has to look
at m/2 different slots before finding the inserted element. Both are incredibly
unlikely, but represent the worst possible situations.
(c) T F Given a text t of length n over a constant size alphabet, and a parameter m, there
is an O(n)-time algorithm that detects whether t has two identical substrings of
length m (a small probability of error is OK).

Solution: True. Run Karp-Rabin, and store all the hash values. Duplications can
be found using, e.g., a set/dictionary data structure.
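
The idea can be sketched in Python as follows. This is a minimal illustration, not the course's reference code: the function name, the base 256, and the modulus 2^61 − 1 are assumptions chosen for the example. A real implementation would pick the base or modulus at random to obtain the small error probability, since equal hashes are reported as a match without verification.

```python
def has_repeated_substring(t: str, m: int, base: int = 256,
                           mod: int = (1 << 61) - 1) -> bool:
    """Report whether two length-m windows of t share a rolling hash value.

    Hash collisions are not verified, so a false positive is possible
    (the problem statement allows a small probability of error).
    """
    n = len(t)
    if m <= 0 or m > n:
        return False
    high = pow(base, m - 1, mod)      # weight of the character leaving the window
    h = 0
    for ch in t[:m]:                  # hash of the first window
        h = (h * base + ord(ch)) % mod
    seen = {h}                        # hash values of all windows so far
    for i in range(m, n):
        h = (h - ord(t[i - m]) * high) % mod   # drop the oldest character
        h = (h * base + ord(t[i])) % mod       # append the newest character
        if h in seen:
            return True
        seen.add(h)
    return False
```

Each window costs O(1) arithmetic plus one expected O(1) set operation, so the whole scan takes expected O(n) time.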
(d) T F We can use Dijkstra’s algorithm on graphs with negative edge weights by adding
some large constant C to every edge, running Dijkstra’s algorithm and then subtracting
C from every edge in the result.

Solution: False. For example, the shortest s-t path for the graph on the left is
⟨s, u, t⟩. However, after adding C = 4 to the weights, we obtain the graph on the
right, whose shortest s-t path is ⟨s, t⟩.

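[Figure: two graphs on the vertices s, u, t. Left: w(s, u) = −1, w(u, t) = −3, w(s, t) = −2.
Right, after adding C = 4 to every edge: w(s, u) = 3, w(u, t) = 1, w(s, t) = 2.]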
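
The counterexample can be checked numerically. The sketch below assumes the edge weights w(s, u) = −1, w(u, t) = −3, w(s, t) = −2 for the left-hand graph (an assumption about which label belongs to which edge in the figure); the helper names are illustrative only.

```python
# Left-hand graph; the assignment of weights to edges is read off the figure
# and is an assumption of this sketch.
weights = {("s", "u"): -1, ("u", "t"): -3, ("s", "t"): -2}
paths = [["s", "u", "t"], ["s", "t"]]   # the only two s-t paths

def path_weight(path, w):
    """Sum of edge weights along consecutive vertices of path."""
    return sum(w[(a, b)] for a, b in zip(path, path[1:]))

def best_path(w):
    """The lighter of the two s-t paths under weight function w."""
    return min(paths, key=lambda p: path_weight(p, w))

C = 4
shifted = {edge: wt + C for edge, wt in weights.items()}
```

Adding C to every edge raises the weight of a path by C times its number of edges, so paths with more edges are penalized more; that is why the ordering of the two paths flips.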

(e) T F Consider DFS on a directed graph; note that DFS has an outer loop that ensures
that each vertex is visited, and will produce one DFS tree for each vertex found
by this loop to be previously unvisited (that vertex is the root of the new DFS
tree). True or False: the number of DFS trees produced may depend on the order
that vertices are considered in the outer loop.

Solution: True. A graph with only two vertices s, t and a single directed edge
s → t will yield two DFS trees if t is considered before s, but only one if s is
considered before t.
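
A small sketch of this example, assuming an adjacency-list dictionary and a hypothetical helper `count_dfs_trees` that takes the outer-loop order as a parameter:

```python
def count_dfs_trees(adj, order):
    """Run DFS with the outer loop following `order`; return the number of trees."""
    visited = set()

    def visit(u):
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                visit(v)

    trees = 0
    for u in order:
        if u not in visited:   # u becomes the root of a new DFS tree
            trees += 1
            visit(u)
    return trees

adj = {"s": ["t"], "t": []}    # single directed edge s -> t
```

Calling `count_dfs_trees(adj, ["s", "t"])` gives one tree, while `count_dfs_trees(adj, ["t", "s"])` gives two.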

(f) T F Let G = (V, E) be a simple directed graph with positive edge weights, and
s, t ∈ V be two fixed vertices. Suppose that the shortest-path weights δ(u, v)
for every pair of nodes u and v are already computed. If a new edge with positive
weight is later added to the graph, then it is possible to determine, in constant
time, whether the value of δ(s, t) is changed.

Solution: True. Let (u, v) be the newly added edge and w(u, v) its weight. We only need
to check whether some s-t path through (u, v) beats the old shortest path. Since δ(s, u)
and δ(v, t) cannot have changed (due to positive weights), we compare the old value of
δ(s, t) with δ(s, u) + w(u, v) + δ(v, t); δ(s, t) changes if and only if the latter is
strictly smaller.
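
As a sketch, the check is a single comparison. The representation of the precomputed weights as a dictionary of dictionaries `delta` (with `inf` for unreachable pairs) and the function name are assumptions made for this example.

```python
from math import inf

def st_distance_changes(delta, s, t, u, v, w_uv):
    """True iff adding edge (u, v) with positive weight w_uv lowers delta[s][t]."""
    return delta[s][u] + w_uv + delta[v][t] < delta[s][t]

# Precomputed all-pairs weights for the graph s -> a (5), a -> t (5).
delta = {
    "s": {"s": 0, "a": 5, "t": 10},
    "a": {"s": inf, "a": 0, "t": 5},
    "t": {"s": inf, "a": inf, "t": 0},
}
```

For instance, adding the edge (s, t) with weight 3 changes δ(s, t), while adding the edge (a, s) with weight 1 does not.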

(g) T F Given as input a directed weighted graph with arbitrary real edge weights and
source vertex s, when Bellman-Ford terminates we have for all vertices v that
d[v] < ∞ if and only if there is some path from s to v in G.

Solution: True. This is Corollary 24.3 in the text.



(h) T F It is possible to find single source shortest paths, or detect a negative weight
cycle, in a graph with exactly one negative weight edge, in O((V + E) log V ).

Solution: True. We are looking for a shortest path from the source s to every other
node in the graph. Suppose that (u, v) is the negative weight edge in G. Let G′
be the graph obtained by removing (u, v) from G. We run Dijkstra’s algorithm
twice on G′, once with source node s (computing shortest path weights δ(s, x))
and once with source node v (computing shortest path weights δ(v, x)). To find
the shortest path weight from s to a particular node x we consider two cases. If the
shortest path does not use (u, v), then the result is δ(s, x). Otherwise, the result is
δ(s, u) + w(u, v) + δ(v, x). Since we don’t know a priori which case applies, we pick
the smaller of the two. Finally, we need to check whether there is a negative weight
cycle; that happens if and only if δ(v, u) + w(u, v) < 0.
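
The procedure can be sketched as below, with a binary-heap Dijkstra giving the O((V + E) log V) bound. The adjacency-list format, the function names, and the convention of returning `None` when a negative cycle exists are assumptions of this sketch; both Dijkstra runs are on the graph without the negative edge.

```python
import heapq
from math import inf

def dijkstra(adj, source):
    """Standard binary-heap Dijkstra; adj maps vertex -> list of (neighbor, weight)."""
    dist = {x: inf for x in adj}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, x = heapq.heappop(heap)
        if d > dist[x]:
            continue                     # stale heap entry
        for y, w in adj[x]:
            if d + w < dist[y]:
                dist[y] = d + w
                heapq.heappush(heap, (d + w, y))
    return dist

def sssp_one_negative_edge(adj, s, neg_edge):
    """Shortest-path weights from s, or None if there is a negative cycle."""
    u, v, w_uv = neg_edge
    # Graph without the single negative edge (u, v).
    g_prime = {x: [(y, w) for y, w in nbrs if not (x == u and y == v)]
               for x, nbrs in adj.items()}
    from_s = dijkstra(g_prime, s)
    from_v = dijkstra(g_prime, v)
    if from_v[u] + w_uv < 0:             # cycle v ~> u -> v has negative weight
        return None
    # Either avoid (u, v) entirely, or use it exactly once.
    return {x: min(from_s[x], from_s[u] + w_uv + from_v[x]) for x in adj}

graph = {"s": [("a", 2), ("u", 5)], "u": [("v", -4)],
         "v": [("a", 0), ("t", 1)], "a": [("t", 5)], "t": []}
cyclic = {"s": [("u", 1)], "u": [("v", -3)], "v": [("u", 2)]}
```

Using the edge at most once suffices: when there is no negative cycle, a shortest path never needs to repeat a vertex, so it cannot traverse (u, v) twice.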

(i) T F MD5 is a collision-resistant cryptographic hash function.

Solution: False.

Problem 2. Short-ish Answer Problems [30 points] (2 parts)

(a) Close Pair on a Line [15 points]


Construct a data structure that, for an integer R, maintains a set S of distinct integers
in the range {0, ..., n^3} and supports the operation CP-INSERT(S, x). This
operation adds a new number x to the set S, and returns YES if there is a pair
of distinct numbers y, z ∈ S such that |y − z| ≤ R; otherwise, it returns NO. Note that
once CP-INSERT returns YES, it will continue doing so in the future, since the data
structure does not support deletions.
Show how to implement this data structure so that CP-INSERT can be performed in
expected amortized constant time. You can assume (and do not need to implement) a
standard hash table data structure supporting INSERT and SEARCH operations, each
taking expected amortized constant time.
Hint: Suppose that you split the range {0, ..., n^3} into blocks of length R. What
happens if two numbers in S belong to the same block?

Solution: Split the range into blocks of size R. For each inserted number x, search
the hash table using the block index B(x) = ⌊x/R⌋ as the key. If there is already an
element in the hash table with B(x) as the key, the answer is YES. If not, insert x into
the hash table using B(x) as the key, and then compare x to the numbers stored with
keys B(x) − 1 and B(x) + 1 (if they exist). If any of those numbers is within R of
x, the answer is YES. Otherwise, the answer is NO.
Notes:
• Instead of storing x in a hash table with B(x) as the key, one could store
x directly in an array A[0 ... n^3/R] at position B(x). However, this would
require initializing A, which could take a lot of time. Therefore
this solution received only partial credit.
• It is important to look up blocks B(x) − 1 and B(x) + 1, in addition to B(x). It
could be the case that two numbers that are very close belong to two different
blocks, say B(x) and B(x) + 1.
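
A minimal Python sketch of this solution follows, using a `dict` as the hash table. The class and method names are hypothetical, R ≥ 1 is assumed, and the `found` flag is an addition of this sketch that implements the remark that the answer stays YES once it has been YES. `True` and `False` stand for YES and NO.

```python
class ClosePairDetector:
    """Tracks whether any two inserted integers are within R of each other."""

    def __init__(self, R):
        self.R = R            # assumed to be an integer >= 1
        self.blocks = {}      # block index -> the one number stored in that block
        self.found = False    # becomes True permanently after the first YES

    def cp_insert(self, x):
        if self.found:
            return True
        b = x // self.R                       # block index B(x)
        if b in self.blocks:                  # same block: difference is below R
            self.found = True
            return True
        self.blocks[b] = x
        for nb in (b - 1, b + 1):             # neighbors may still hold a close number
            y = self.blocks.get(nb)
            if y is not None and abs(x - y) <= self.R:
                self.found = True
        return self.found
```

Until the first YES, every block holds at most one number, so each call performs a constant number of dictionary operations.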

(b) Reachability on a Tree [15 points]


Let T be a rooted tree with n vertices. Note that T is a simple connected directed
graph: each node has an outgoing edge to each of its children, and the root r is the
only node without a parent. You may assume that we have a pointer to r. See Figure 1
for an example.

Figure 1: Example input tree T .

Propose a data structure that determines reachability between any pair of vertices.
That is, given a query IS-REACHABLE(s, t), the data structure should answer TRUE
if and only if there is a directed path from s to t on this tree. For full credit your
data structure should be constructed in O(V ) time, and it should answer each query in
constant time.
Hint: Use DFS in the data structure construction phase.

Solution: We preprocess by running a DFS on this tree starting from r, while recording
all the start times and finish times; this step takes O(V + E) = O(V) time since
on a tree, |E| = |V| − 1. Observe also that the input tree T is exactly our DFS tree.
There exists an s-t path if and only if s is an ancestor of t. The ancestor-descendant
relationship can be checked using the following condition: start-time[s] < start-time[t]
and finish-time[t] < finish-time[s]. This condition can be verified in O(1) time.
(See Figure 22.5 in the text for an example.)
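
A sketch of the construction and query, assuming the tree is given as a dictionary mapping each node to a list of its children (leaves may be omitted). The function names are illustrative, and the DFS is written iteratively only to avoid Python's recursion limit on deep trees.

```python
def build_times(children, root):
    """Iterative DFS from root; returns (start, finish) timestamp dictionaries."""
    start, finish = {}, {}
    clock = 0
    start[root] = clock
    clock += 1
    stack = [(root, iter(children.get(root, ())))]
    while stack:
        node, it = stack[-1]
        for child in it:                  # descend into the next unvisited child
            start[child] = clock
            clock += 1
            stack.append((child, iter(children.get(child, ()))))
            break
        else:                             # no children left: node is finished
            finish[node] = clock
            clock += 1
            stack.pop()
    return start, finish

def is_reachable(start, finish, s, t):
    """Strict interval nesting from the solution; a node is not its own ancestor."""
    return start[s] < start[t] and finish[t] < finish[s]

tree = {"r": ["a", "b"], "a": ["c"], "b": [], "c": []}
start, finish = build_times(tree, "r")
```

With the strict inequalities as stated, a query with s = t answers FALSE; treating a vertex as reachable from itself would need a separate equality check.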

Common mistakes.
There are many correct solutions that do not meet the running time requirements for
this problem. One possible approach is to explicitly store answers in some format,
such as a reachability matrix, a set of reachable nodes from each node, or a set of
reachable s-t pairs. These allow O(1) query time, but the data structure themselves
may contain as many as Θ(V 2 ) entries, which takes Θ(V 2 ) time to construct. Data
structures that use lists may yield worse query time.
Another approach is to keep a set of ancestors for each node. The problem with this
approach is that the set of ancestors can only be updated as we perform DFS,
and if we were to store this set at each node, it would take O(set size) to copy the set,
bringing the complexity up to O(V^2).
A related idea is to associate each node with a prime number, and compute the product
from the root to each node as an encoding of its ancestor list. Then reachability can be
checked by divisibility, but the numbers can be as large as Θ(V ) bits, where arithmetic
takes super-constant time.
Some other approaches require searching through a path or a subtree in the query step.
A common one is to reverse the edges of the graph during the construction step, so
that the list of ancestors can be computed in O(V ) time for each query.
There are some approaches using topological sorting, which is closely related to
checking the finish time on each node. This is not sufficient, but may lead to a
solution equivalent to the proposed model solution.

Quiz 2 Score Distribution

[Histogram: x-axis Score (0 to 50), y-axis Frequency (0 to 25).]
The histogram above shows the distribution of Quiz 2 scores of all students who took this quiz,
before any regrading.

• n = 243

• mean ≈ 28.08

• s.d. ≈ 6.69

• mode = 29

• median = 28
