Lecture 41


Lecture 41: P = NP?

I will have extra office hours after class today (1-3pm).
To be eligible to present Monday, your team must send an email with your URL before midnight tonight.

DNA Helix Photomosaic from cover of Nature, 15 Feb 2001 (made by Eric Lander)

CS150: Computer Science
University of Virginia Computer Science
David Evans
http://www.cs.virginia.edu/evans

Pegboard Problem

Lecture 41: P = NP? 2


Pegboard Problem
- Input: a configuration of n pegs on a
cracker barrel style pegboard
- Output: if there is a sequence of jumps
that leaves a single peg, output that
sequence of jumps. Otherwise, output
false.

How hard is the Pegboard Problem?

Problems and Procedures
• To know an O(f) bound for a problem, we need to find a Θ(f) procedure that solves it
– The sorting problem is O(n log n) since we know a procedure that solves it in Θ(n log n)
• To know an Ω(f) bound for a problem, we need to prove that there is no procedure faster than Θ(f) that solves it
– We proved sorting is Ω(n log n) by reasoning about the number of decisions needed
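The decision-counting argument for sorting can be made concrete with a small sketch. It assumes each decision is a yes/no comparison, so d decisions can tell apart at most 2^d of the n! possible orderings. The function name min_decisions is made up for this illustration.

```python
import math

def min_decisions(n):
    """Fewest yes/no decisions that can distinguish all n! orderings of n items."""
    orderings = math.factorial(n)
    # ceil(log2(orderings)), computed exactly with integer arithmetic
    return (orderings - 1).bit_length()

# Compare the lower bound with n * log2(n) for a few sizes.
for n in (4, 16, 64):
    print(n, min_decisions(n), round(n * math.log2(n)))
```

The two printed columns grow at the same rate, which is the sense in which sorting by comparisons is Ω(n log n).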


How much work is the
Pegboard Problem?
• Upper bound (O): O(n!)
Try all possible permutations
• Lower bound (Ω): Ω(n)
Must at least look at every peg
• Tight bound (Θ): No one knows!
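A minimal sketch of the "try everything" upper bound follows. The board geometry is an assumption, not something the slides specify: a triangular board whose hole (r, c) is column c of row r, with 5 rows by default. The names make_holes, legal_jumps and solve_pegboard are hypothetical.

```python
# Assumed geometry: in (row, col) coordinates a peg has up to six neighbors.
JUMP_DIRECTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (1, 1), (-1, -1)]

def make_holes(rows=5):
    """All hole positions on a triangular board with the given number of rows."""
    return {(r, c) for r in range(rows) for c in range(r + 1)}

def legal_jumps(pegs, holes):
    """Yield (source, jumped-over, destination) for every legal jump."""
    for (r, c) in sorted(pegs):
        for dr, dc in JUMP_DIRECTIONS:
            over = (r + dr, c + dc)
            dest = (r + 2 * dr, c + 2 * dc)
            if over in pegs and dest in holes and dest not in pegs:
                yield (r, c), over, dest

def solve_pegboard(pegs, holes):
    """Return a list of (source, destination) jumps leaving one peg, or False."""
    pegs = frozenset(pegs)
    if len(pegs) == 1:
        return []
    # Exhaustive search: try every legal jump, then recurse on the new board.
    for src, over, dest in legal_jumps(pegs, holes):
        rest = solve_pegboard((pegs - {src, over}) | {dest}, holes)
        if rest is not False:
            return [(src, dest)] + rest
    return False

holes = make_holes()
print(solve_pegboard({(0, 0), (1, 0)}, holes))
```

The search is correct but explores every jump sequence in the worst case, which is why it only gives an upper bound on the work.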
Complexity Class P
“Tractable”
Class P: problems that can be solved in a polynomial (O(n^k) for some constant k) number of steps by a deterministic TM.

Easy problems like sorting, making a photomosaic using duplicate tiles, simulating the universe are all in P.



Complexity Class NP
Class NP: Problems that can be solved in
a polynomial number of steps by a
nondeterministic TM.

Omnipotent: If we could try all possible solutions at once, we could identify the solution in polynomial time.
Omniscient: If we had a magic guess-correctly procedure that makes every decision correctly, we could devise a procedure that solves the problem in polynomial time.


NP Problems
• Can be solved by just trying all possible
answers until we find one that is right
• Easy to quickly check if an answer is right
– Checking an answer is in P
• The pegboard problem is in NP
We can easily try ~n! different answers
We can check if a guess is correct in O(n)
(check all n jumps are legal)

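Checking a guessed sequence of jumps might look like the sketch below. It assumes a triangular board whose hole (r, c) is column c of row r (the slides do not fix a coordinate system), and the name check_jumps is hypothetical. Each jump is checked with a constant number of set operations, so the whole check is linear in the number of jumps.

```python
# Assumed geometry: a jump moves two holes along one of six directions.
JUMP_VECTORS = {(0, 2), (0, -2), (2, 0), (-2, 0), (2, 2), (-2, -2)}

def check_jumps(pegs, jumps, rows=5):
    """Return True if the jumps are all legal and leave exactly one peg."""
    holes = {(r, c) for r in range(rows) for c in range(r + 1)}
    pegs = set(pegs)
    for src, dest in jumps:
        step = (dest[0] - src[0], dest[1] - src[1])
        if step not in JUMP_VECTORS:
            return False
        over = (src[0] + step[0] // 2, src[1] + step[1] // 2)
        # The moving peg and the jumped-over peg must both be present.
        if src not in pegs or over not in pegs:
            return False
        # The landing hole must exist on the board and be empty.
        if dest not in holes or dest in pegs:
            return False
        pegs -= {src, over}
        pegs.add(dest)
    return len(pegs) == 1

print(check_jumps({(0, 0), (1, 0)}, [((0, 0), (2, 0))]))
```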


Is the Pegboard Problem in P?
No one knows!

We can’t find an O(n^k) solution.
We can’t prove one doesn’t exist.


Orders of Growth
[Chart: log n, n, n log n, n^2, n^3, and 2^n plotted for n up to 9 (vertical axis 0 to 1200). Labels: insertsort, quicksort, simulating universe, pegboard puzzle (2^n < n!).]

Orders of Growth
[Chart: log n, n, n log n, n^2, n^3, and 2^n plotted for n up to 14 (vertical axis 0 to 70000). Labels: insertsort, quicksort, simulating universe, pegboard puzzle.]

Orders of Growth
[Chart: log n, n, n log n, n^2, n^3, and 2^n plotted for n up to 19 (vertical axis 0 to 1200000). Labels: simulating universe (“tractable”), pegboard puzzle (“intractable”).]

I do nothing that a man of unlimited funds, superb physical endurance, and maximum scientific knowledge could not do.
– Batman (may be able to solve intractable problems, but computer scientists can only solve tractable ones)

Intractable Problems
[Chart, log-log scale: n!, 2^n, n^2, and n log n plotted for n = 2 to 64 (vertical axis 100 to 1E+30). Labels: P, today, 2022, time since “Big Bang”.]

Moore’s Law Doesn’t Help
• If the fastest procedure to solve a problem is Θ(2^n) or worse, Moore’s Law doesn’t help much.
• Every doubling in computing power increases the solvable problem size by 1.
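The "size grows by 1" claim can be checked with a short sketch. It assumes computing power is measured as a budget of steps, and that the procedures take exactly 2^n and n^2 steps. The function names and the starting budget are made up for the illustration.

```python
import math

def largest_exponential_n(step_budget):
    """Largest n with 2**n <= step_budget (step_budget is a positive int)."""
    return step_budget.bit_length() - 1

def largest_quadratic_n(step_budget):
    """Largest n with n**2 <= step_budget, for comparison."""
    return math.isqrt(step_budget)

budget = 10**9  # hypothetical number of steps we can afford today
for doublings in range(4):
    b = budget * 2**doublings
    print(doublings, largest_exponential_n(b), largest_quadratic_n(b))
```

Each doubling adds exactly 1 to the solvable size for the 2^n procedure, while the n^2 procedure gains a factor of about 1.4 each time.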


Complexity Classes
Class P: problems that can be solved in polynomial time by a deterministic TM.
Easy problems like simulating the universe are all in P.
Class NP: problems that can be solved in polynomial time by a nondeterministic TM. Includes all problems in P and some problems possibly outside P like the Pegboard puzzle.


Problem Classes if P ≠ NP:
[Diagram: nested regions, with the Θ(n) class inside P and P inside NP.]
Find Best: Θ(n)
Sorting: Θ(n log n)
Simulating Universe: O(n^3)
Pegboard: O(n!) and Ω(n)
How many problems are in the Θ(n) class? infinite
How many problems are in P but not in the Θ(n) class? infinite
How many problems are in NP but not in P? infinite


Problem Classes if P = NP:
[Diagram: a single region P containing the Θ(n) class.]
Find Best: Θ(n)
Sorting: Θ(n log n)
Simulating Universe: O(n^3)
Pegboard: Θ(n^k)
How many problems are in the Θ(n) class? infinite
How many problems are in P but not in the Θ(n) class? infinite
How many problems are in NP but not in P? 0


P = NP?
• Is P different from NP: is there a problem in NP that is not also in P?
– If there is one, there are infinitely many
• Is the “hardest” problem in NP also in P?
– If it is, then every problem in NP is also in P
• The most famous unsolved problem in computer science and math
– Listed first on Millennium Prize Problems
– $1M + an automatic A+ in this course


NP-Complete
• NP-Complete: is the class of problems that are the hardest problems in NP
• Cook and Levin proved that 3SAT was NP-Complete (1971)
– If 3SAT can be transformed into a different problem in NP in polynomial time, then that problem must also be NP-complete.
– Pegboard ⇔ 3SAT
• Either all NP-complete problems are tractable (in P) or none of them are!


NP-Complete Problems
• Easy way to solve by trying all possible
guesses
• If given the “yes” answer, quick (in P) way to
check if it is right
• If given the “no” answer, no quick way to
check if it is right
– No solution (can’t tell there isn’t one)
– No way (can’t tell there isn’t one)
This part is hard to prove: requires showing
you could use a solution to the problem to
solve a known NP-Complete problem.



Pegboard Problem
- Input: a configuration of n pegs on a
cracker barrel style pegboard
- Output: if there is a sequence of jumps
that leaves a single peg, output that
sequence of jumps. Otherwise, output
false.

If given the sequence of jumps, easy (O(n)) to check it is correct. If not, hard to know if there is a solution.


Most Important Science/Technology Races
1930-40s: Decryption, Nazis vs. British
Winner: British
Reason: Bletchley Park had computers (and Alan Turing), Nazis didn’t
1940s: Atomic Bomb, Nazis vs. US
Winner: US
Reason: Heisenberg miscalculated, US had better physicists, computers, resources
1960s: Moon Landing, Soviet Union vs. US
Winner: US
Reason: Many, better computing was a big one
1990s-2001: Sequencing Human Genome
Human Genome Race
Francis Collins (Director of public National Center for Human Genome Research) vs. Craig Venter (President of Celera Genomics)
(Picture from UVa Graduation 2001)

Francis Collins:
• UVa CLAS 1970
• Yale PhD
• Tenured Professor at U. Michigan

Craig Venter:
• San Mateo College
• Court-martialed
• Denied tenure at SUNY Buffalo
Reading the Genome

Whitehead Institute, MIT



Gene Reading Machines
• One read: about 700 base pairs
• But…don’t know where they are on
the chromosome
Read 3: TACCCGTGATCCA
Read 2: TCCAGAATAA
Read 1: ACCAGAATACC
Actual Genome: AGGCATACCAGAATACCCGTGATCCAGAATAAGC


Genome Assembly
Read 1 ACCAGAATACC
Read 2 TCCAGAATAA
Read 3 TACCCGTGATCCA

Input: Genome fragments (but without knowing where they are from)
Output: The full genome
Genome Assembly
Read 1 ACCAGAATACC
Read 2 TCCAGAATAA
Read 3 TACCCGTGATCCA

Input: Genome fragments (but without knowing where they are from)
Output: The smallest genome sequence such that all the fragments are substrings.


Common Superstring
Input: A set of n substrings and a
maximum length k.
Output: A string that contains all the
substrings with total length ≤ k, or no
if no such string exists.
ACCAGAATACC
TCCAGAATAA
TACCCGTGATCCA
k = 26: ACCAGAATACCCGTGATCCAGAATAA


Common Superstring
Input: A set of n substrings and a
maximum length k.
Output: A string that contains all the
substrings with total length ≤ k, or no
if no such string exists.
ACCAGAATACC
TCCAGAATAA
TACCCGTGATCCA
k = 25: Not possible
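The two outcomes above (a superstring exists for a maximum length of 26, none for 25) can be reproduced with a brute-force sketch. This is an illustration only: it tries every ordering of the pieces, so it takes time exponential in the number of pieces. The names overlap, merge_in_order and common_superstring are hypothetical.

```python
from itertools import permutations

def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def merge_in_order(order):
    """Glue the pieces together in the given order, reusing overlapping letters."""
    result = order[0]
    for piece in order[1:]:
        if piece in result:
            continue  # already covered, nothing to append
        result += piece[overlap(result, piece):]
    return result

def common_superstring(pieces, k):
    """Return a superstring of length <= k, or False if this search finds none."""
    best = min((merge_in_order(p) for p in permutations(pieces)), key=len)
    return best if len(best) <= k else False

reads = ["ACCAGAATACC", "TCCAGAATAA", "TACCCGTGATCCA"]
print(common_superstring(reads, 26))
print(common_superstring(reads, 25))
```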


Common Superstring
• In NP:
– Easy to verify a “yes” solution: just
check the letters match up, and count
the superstring length
• In NP-Complete:
– Similar to Pegboard Puzzle!
– Could transform Common Superstring
problem instance into Pegboard Puzzle
instance!

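Verifying a “yes” answer is the easy direction. A minimal sketch, with a hypothetical function name, checks that every piece appears in the candidate and that the candidate is short enough:

```python
def verify_superstring(pieces, candidate, k):
    """True if candidate has length <= k and contains every piece as a substring."""
    return len(candidate) <= k and all(piece in candidate for piece in pieces)

reads = ["ACCAGAATACC", "TCCAGAATAA", "TACCCGTGATCCA"]
print(verify_superstring(reads, "ACCAGAATACCCGTGATCCAGAATAA", 26))
```

The check only scans the candidate once per piece, so it runs in polynomial time.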


Shortest Common Superstring
Input: A set of n substrings
Output: The shortest string that contains all the substrings.

ACCAGAATACC
TCCAGAATAA
TACCCGTGATCCA
Shortest common superstring: ACCAGAATACCCGTGATCCAGAATAA


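Shortest Common Superstring
Also is NP-Complete:

def scsuperstring(pieces):
    # commonSuperstring(pieces, k) is the Common Superstring procedure:
    # it returns a superstring of length <= k, or a false value if none exists.
    maxlen = sum(len(piece) for piece in pieces)
    # Concatenating all the pieces always works, so k = maxlen must be tried too.
    for k in range(1, maxlen + 1):
        result = commonSuperstring(pieces, k)
        if result:
            return result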


Human Genome
• 3 Billion base pairs
• 600-700 bases per read
• ~8X coverage required

> (/ (* 8 3000000000) 650)
36923076 12/13
• So, n ≈ 37 Million sequence fragments
• Celera used 27.2 Million reads (but could get more than 700 bases per read)
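The same estimate in Python, as a sketch that keeps the exact fraction the way the Scheme expression does. The constant names are made up, and 650 is the bases-per-read value used in the expression above.

```python
from fractions import Fraction

GENOME_BASES = 3_000_000_000
COVERAGE = 8
BASES_PER_READ = 650  # value used in the slide's calculation

reads = Fraction(COVERAGE * GENOME_BASES, BASES_PER_READ)
whole, remainder = divmod(reads.numerator, reads.denominator)
print(f"{whole} {remainder}/{reads.denominator}")  # 36923076 12/13
```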


Give up?
No way to solve an NP-Complete problem (best known solutions being O(2^n) for n ≈ 20 Million)
[Chart, log-log scale: 2^n compared with time since “Big Bang”, for n = 2 to 64 (vertical axis 1 to 1E+30).]


Approaches
• Human Genome Project (Collins)
– Change problem: Start by producing a
genome map (using biology, chemistry,
etc) to have a framework for knowing
where the fragments should go
• Celera Solution (Venter)
– Approximate: we can’t guarantee finding
the shortest possible, but we can develop
clever algorithms that get close most of
the time in O(n log n)

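To give a feel for the "approximate" approach, here is a sketch of a simple greedy heuristic: repeatedly merge the two strings with the largest overlap. This is not Celera's algorithm, it does not guarantee the shortest result, and this naive version is much slower than O(n log n). The names overlap and greedy_superstring are hypothetical.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_superstring(pieces):
    """Merge the best-overlapping pair until a single string remains."""
    # Drop duplicates and pieces already contained in another piece.
    strings = sorted(p for p in set(pieces)
                     if not any(p != q and p in q for q in pieces))
    while len(strings) > 1:
        best_k, best_i, best_j = -1, None, None
        for i, a in enumerate(strings):
            for j, b in enumerate(strings):
                if i != j and overlap(a, b) > best_k:
                    best_k, best_i, best_j = overlap(a, b), i, j
        merged = strings[best_i] + strings[best_j][best_k:]
        strings = [s for idx, s in enumerate(strings)
                   if idx not in (best_i, best_j)] + [merged]
    return strings[0] if strings else ""

reads = ["ACCAGAATACC", "TCCAGAATAA", "TACCCGTGATCCA"]
print(greedy_superstring(reads))
```

On the three example reads the greedy result happens to match the 26-letter superstring shown earlier, but on other inputs it can be longer than the optimum.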


Result: Draw?

Venter Collins
President Clinton announces Human Genome Sequence
essentially complete, June 26, 2000



Charge
• Presentation qualification before
midnight tonight
