3 Sort
• Input:
• Output:
Structure of data
Why Study Sorting Algorithms?
• There are a variety of situations that we can
encounter
– Do we have randomly ordered keys?
– Are all keys distinct?
– How large is the set of keys to be ordered?
– Need guaranteed performance?
Some Definitions
• Internal Sort
– The data to be sorted is all stored in the
computer’s main memory.
• External Sort
– Some of the data to be sorted might be stored in
some external slower device.
• In Place Sort
– The amount of extra space required to sort the
data is constant, independent of the input size.
Stability
• A STABLE sort preserves relative order of records with equal
keys
Sorted on first key:
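A small sketch of what stability means in practice. The records and their field layout are made up for illustration; the only library behavior relied on is that Python's built-in sorted() is stable.

```python
# Hypothetical records of the form (first_key, second_key).
records = [("b", 2), ("a", 2), ("c", 1), ("a", 1)]

# Sort on the first key only. Because sorted() is stable, the two records
# with first key "a" keep their original relative order: ("a", 2) stays
# ahead of ("a", 1).
by_first_key = sorted(records, key=lambda rec: rec[0])

print(by_first_key)
```

An unstable sort would be free to emit ("a", 1) before ("a", 2), since it only promises that the keys come out in order.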
INSERTION SORT
• Idea: like sorting a hand of playing cards
– Start with an empty left hand and the cards facing
down on the table.
– Remove one card at a time from the table, and insert
it into the correct position in the left hand
• compare it with each of the cards already in the hand, from
right to left
– The cards held in the left hand are sorted
• these cards were originally the top cards of the pile on the
table
Insertion Sort
input array: 5 2 4 6 1 3
[Figure: at each step the array is divided into a sorted part (left) and an unsorted part (right).]
INSERTION-SORT
INSERTION-SORT(A)
1. for j ← 2 to n
2.     do key ← A[j]
3.        ▹ insert A[j] into the sorted sequence A[1 . . j-1]
4.        i ← j - 1
5.        while i > 0 and A[i] > key
6.            do A[i + 1] ← A[i]
7.               i ← i – 1
8.        A[i + 1] ← key
Invariant: at the start of each iteration of the for loop, the elements in
A[1 . . j-1] are in sorted order
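The pseudocode can be sketched in Python as follows. This is one possible translation, not the slides' own code: it uses 0-based indexing, so the outer loop starts at index 1 and the inner test becomes i >= 0. The function name is an assumption.

```python
def insertion_sort(a):
    """Sort the list a in place and return it (0-based INSERTION-SORT)."""
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # Shift every element of the sorted prefix that is larger than key
        # one position to the right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        # Drop key into the gap that the shifting opened up.
        a[i + 1] = key
    return a


print(insertion_sort([5, 2, 4, 6, 1, 3]))
```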
Proving Loop Invariants
• Proving loop invariants works like induction
• Initialization (base case):
– It is true prior to the first iteration of the loop
• Maintenance (inductive step):
– If it is true before an iteration of the loop, it remains true before the
next iteration
• Termination:
– When the loop terminates, the invariant gives us a useful property
that helps show that the algorithm is correct
– Stop the induction when the loop terminates
Loop Invariant for Insertion Sort
• Initialization:
– Just before the first iteration, j = 2:
the subarray A[1 . . j-1] = A[1]
(the element originally in A[1]) is
trivially sorted.
Loop Invariant for Insertion Sort
• Maintenance:
– the while inner loop moves A[j -1], A[j -2], A[j -3],
and so on, by one position to the right until the proper
position for key (which has the value that started out
in A[j]) is found
– At that point, the value of key is placed into this
position.
Loop Invariant for Insertion Sort
• Termination:
– The outer for loop ends when j = n + 1 ⇒ j - 1 = n
– Replace n with j-1 in the loop invariant:
• the subarray A[1 . . n] consists of the elements
originally in A[1 . . n], but in sorted order
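The three steps of the invariant argument can also be checked mechanically. The sketch below is an illustration only: it is a 0-based insertion sort with assertions added at the points where the invariant is claimed to hold, and the function name is an assumption.

```python
def insertion_sort_checked(a):
    """Insertion sort that asserts the loop invariant on every iteration."""
    original = list(a)
    for j in range(1, len(a)):
        # Initialization / maintenance: at the start of each iteration the
        # prefix a[0..j-1] holds the elements originally there, now sorted.
        assert a[:j] == sorted(original[:j])
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    # Termination: the loop has run out of values of j, so the invariant
    # now covers the whole array.
    assert a == sorted(original)
    return a


print(insertion_sort_checked([5, 2, 4, 6, 1, 3]))
```

Passing assertions on a few inputs is not a proof, but a failing one points directly at the step of the argument that is wrong.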
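Running time of insertion sort, where the c's are the per-line costs of INSERTION-SORT and t_j is the number of times the while loop test is executed for that value of j:
\[
T(n) = c_1 n + c_2 (n-1) + c_4 (n-1) + c_5 \sum_{j=2}^{n} t_j + c_6 \sum_{j=2}^{n} (t_j - 1) + c_7 \sum_{j=2}^{n} (t_j - 1) + c_8 (n-1)
\]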
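Best Case Analysis
• The array is already sorted
– In the test “while i > 0 and A[i] > key”, A[i] ≤ key the first time the while loop test is run
(when i = j - 1)
– t_j = 1
\[
T(n) = c_1 n + c_2 (n-1) + c_4 (n-1) + c_5 (n-1) + c_8 (n-1) = an + b
\]
a linear function of n
Worst Case Analysis
• The array is in reverse sorted order
– key is compared with every element of A[1 . . j-1], so t_j = j
\[
T(n) = c_1 n + c_2 (n-1) + c_4 (n-1) + c_5 \left( \frac{n(n+1)}{2} - 1 \right) + c_6 \frac{n(n-1)}{2} + c_7 \frac{n(n-1)}{2} + c_8 (n-1) = an^2 + bn + c
\]
a quadratic function of n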
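Comparisons and Exchanges in Insertion Sort
• Worst case: ≈ n²/2 comparisons (the while loop test, executed Σ_{j=2}^{n} t_j times)
• Worst case: ≈ n²/2 exchanges (the loop body ending in i ← i – 1, cost c7, executed Σ_{j=2}^{n} (t_j - 1) times)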
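The best-case and worst-case counts can be observed directly. The sketch below is an instrumented 0-based insertion sort; the function name and the choice to count only key comparisons (A[i] > key) and shifts are assumptions made for illustration. It does not count the extra failing i > 0 test, so its comparison count in the worst case is the sum of (t_j - 1) rather than the sum of t_j.

```python
def insertion_sort_counts(values):
    """Return (key_comparisons, shifts) for insertion-sorting a copy of values."""
    a = list(values)
    comparisons = 0
    shifts = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0:
            comparisons += 1      # one evaluation of a[i] > key
            if a[i] <= key:
                break
            a[i + 1] = a[i]       # one shift (the "exchange" in the slides)
            shifts += 1
            i -= 1
        a[i + 1] = key
    return comparisons, shifts


n = 6
print(insertion_sort_counts(range(1, n + 1)))   # already sorted: n - 1 comparisons, 0 shifts
print(insertion_sort_counts(range(n, 0, -1)))   # reverse sorted: n(n-1)/2 of each
```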
BUBBLE SORT
• Idea:
– Repeatedly pass through the array
– Swap adjacent elements that are out of order
[Figure: array 8 4 6 9 2 3 1 with positions 1, 2, 3, ..., n and index markers i and j.]
Example
Pass i = 1 (j moves from right to left; the smallest element, 1, bubbles to the front):
8 4 6 9 2 3 1
8 4 6 9 2 1 3
8 4 6 9 1 2 3
8 4 6 1 9 2 3
8 4 1 6 9 2 3
8 1 4 6 9 2 3
1 8 4 6 9 2 3
Array at the start of each later pass:
i = 2: 1 8 4 6 9 2 3
i = 3: 1 2 8 4 6 9 3
i = 4: 1 2 3 8 4 6 9
i = 5: 1 2 3 4 8 6 9
i = 6: 1 2 3 4 6 8 9
i = 7: 1 2 3 4 6 8 9
BUBBLESORT(A)
1. for i ← 1 to length[A]
2.     do for j ← length[A] downto i + 1
3.         do if A[j] < A[j - 1]
4.             then exchange A[j] ↔ A[j - 1]
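A possible Python rendering of BUBBLESORT, shown as a sketch rather than a reference implementation. It uses 0-based indices, so j runs from the last index down to i + 1; the function name is an assumption.

```python
def bubble_sort(a):
    """Sort the list a in place and return it (0-based BUBBLESORT)."""
    n = len(a)
    for i in range(n):
        # Scan from the right end towards position i, pulling the smallest
        # remaining element leftwards one swap at a time.
        for j in range(n - 1, i, -1):
            if a[j] < a[j - 1]:
                a[j], a[j - 1] = a[j - 1], a[j]
    return a


print(bubble_sort([8, 4, 6, 9, 2, 3, 1]))
```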
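Bubble-Sort Running Time
BUBBLESORT(A)                                  cost
for i ← 1 to length[A]                         c1
    do for j ← length[A] downto i + 1          c2
        do if A[j] < A[j - 1]                  c3
            then exchange A[j] ↔ A[j - 1]      c4
Comparisons: ≈ n²/2
Exchanges: ≈ n²/2 (worst case)
\[
T(n) = c_1 (n+1) + c_2 \sum_{i=1}^{n} (n-i+1) + c_3 \sum_{i=1}^{n} (n-i) + c_4 \sum_{i=1}^{n} (n-i)
     = \Theta(n) + (c_2 + c_3 + c_4) \sum_{i=1}^{n} (n-i)
\]
Since \sum_{i=1}^{n} (n-i) = n(n-1)/2, T(n) = Θ(n²).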
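Example
Selection sort on 8 4 6 9 2 3 1. Each row shows the array after one more pass has exchanged the smallest remaining element into place; the last row is the final sorted array:
8 4 6 9 2 3 1
1 4 6 9 2 3 8
1 2 6 9 4 3 8
1 2 3 9 4 6 8
1 2 3 4 9 6 8
1 2 3 4 6 9 8
1 2 3 4 6 8 9
1 2 3 4 6 8 9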
SELECTION-SORT(A)
1. n ← length[A]
2. for j ← 1 to n - 1
3.     do smallest ← j
4.        for i ← j + 1 to n
5.            do if A[i] < A[smallest]
6.                then smallest ← i
7.        exchange A[j] ↔ A[smallest]
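SELECTION-SORT translates to Python almost line for line. The version below is a sketch with 0-based indices and an assumed function name.

```python
def selection_sort(a):
    """Sort the list a in place and return it (0-based SELECTION-SORT)."""
    n = len(a)
    for j in range(n - 1):
        # Find the index of the smallest element in a[j..n-1].
        smallest = j
        for i in range(j + 1, n):
            if a[i] < a[smallest]:
                smallest = i
        # One exchange per pass puts that element in its final position.
        a[j], a[smallest] = a[smallest], a[j]
    return a


print(selection_sort([8, 4, 6, 9, 2, 3, 1]))
```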
Analysis of Selection Sort
SELECTION-SORT(A)                        cost   times
n ← length[A]                            c1     1
for j ← 1 to n - 1                       c2     n
    do smallest ← j                      c3     n - 1
       for i ← j + 1 to n                c4     \sum_{j=1}^{n-1} (n - j + 1)
           do if A[i] < A[smallest]      c5     \sum_{j=1}^{n-1} (n - j)
               then smallest ← i         c6     \sum_{j=1}^{n-1} (n - j)
       exchange A[j] ↔ A[smallest]       c7     n - 1
≈ n²/2 comparisons
≈ n exchanges
Then, sort the two subarrays and combine them to form the solution for
the whole array.
MergeSort(A, p, r)
1. If p >= r
2.     return;
3. q = floor((p+r)/2)
4. MergeSort(A, p, q)
5. MergeSort(A, q+1, r)
6. Merge(A, p, q, r)
Merge(A, p, q, r)
Create L ← A[p..q] and M ← A[q+1..r]
n1 = q - p + 1; n2 = r - q;
i = 0; j = 0; k = p;
while (i < n1 and j < n2)
{
    if (L[i] <= M[j])
    {
        A[k] = L[i];
        i = i + 1;
    }
    else
    {
        A[k] = M[j];
        j = j + 1;
    }
    k = k + 1;
}
while (i < n1)
{
    A[k] = L[i];
    i = i + 1; k = k + 1;
}
while (j < n2)
{
    A[k] = M[j];
    j = j + 1; k = k + 1;
}
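MergeSort and Merge can be sketched in Python as below. The function names and the default arguments on merge_sort are assumptions added for convenience; indices p, q, r are inclusive, as in the pseudocode. The base case is p >= r, so a one-element range returns immediately.

```python
def merge(a, p, q, r):
    """Merge the sorted runs a[p..q] and a[q+1..r] (inclusive) in place."""
    left = a[p:q + 1]
    right = a[q + 1:r + 1]
    i = j = 0
    k = p
    while i < len(left) and j < len(right):
        # "<=" takes from the left run on ties, which keeps the sort stable.
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1
        k += 1
    # Copy whatever remains of either run.
    while i < len(left):
        a[k] = left[i]
        i += 1
        k += 1
    while j < len(right):
        a[k] = right[j]
        j += 1
        k += 1


def merge_sort(a, p=0, r=None):
    """Sort a[p..r] in place and return a."""
    if r is None:
        r = len(a) - 1
    if p >= r:
        return a
    q = (p + r) // 2
    merge_sort(a, p, q)
    merge_sort(a, q + 1, r)
    merge(a, p, q, r)
    return a


print(merge_sort([8, 4, 6, 9, 2, 3, 1]))
```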
• Worst Case Time Complexity [Big-O]: O(n log n)
• Best Case Time Complexity [Big-omega]: Ω(n log n)
• Average Time Complexity [Big-theta]: Θ(n log n)
• Space Complexity: O(n)
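QUICKSORT
/* low --> Starting index, high --> Ending index */
QuickSort(arr[], low, high)
if (low < high)
{
    // pi is the partitioning index; arr[pi] is now at its final position
    pi = Partition(arr, low, high);
    QuickSort(arr, low, pi - 1);   // sort the elements before pi
    QuickSort(arr, pi + 1, high);  // sort the elements after pi
}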
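The slides do not show Partition, so the sketch below has to assume one. It uses the Lomuto scheme with the last element as pivot, which is only one of several common choices; the function names and default arguments are likewise assumptions.

```python
def partition(arr, low, high):
    """Lomuto partition (assumed scheme): pivot is arr[high].

    Returns the pivot's final index; smaller-or-equal elements end up to
    its left and larger elements to its right.
    """
    pivot = arr[high]
    i = low - 1
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1


def quick_sort(arr, low=0, high=None):
    """Sort arr[low..high] in place and return arr."""
    if high is None:
        high = len(arr) - 1
    if low < high:
        pi = partition(arr, low, high)
        quick_sort(arr, low, pi - 1)
        quick_sort(arr, pi + 1, high)
    return arr


print(quick_sort([8, 4, 6, 9, 2, 3, 1]))
```

With this pivot choice an already sorted input is the worst case, since every partition is maximally unbalanced.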
Comp 122
Binary Heap
• Array viewed as a nearly complete binary tree.
– Physically – linear array.
– Logically – binary tree, filled on all levels (except possibly the lowest).
• Map from array elements to tree nodes and vice versa
– Root – A[1]
– Left[ i ] – A[2i]
– Right[ i ] – A[2i+1]
– Parent[ i ] – A[⌊i/2⌋]
• length[A] – number of elements in array A.
• heap-size[A] – number of elements in heap stored in
A.
– heap-size[A] ≤ length[A]
Heap Property (Max and Min)
• Max-Heap
– For every node excluding the root,
value is at most that of its parent: A[parent[i]] ≥ A[i]
• Largest element is stored at the root.
• In any subtree, no values are larger than the value
stored at subtree root.
• Min-Heap
– For every node excluding the root,
value is at least that of its parent: A[parent[i]] ≤ A[i]
• Smallest element is stored at the root.
• In any subtree, no values are smaller than the value
stored at subtree root
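The max-heap property is easy to test on an array. The sketch below keeps the slides' 1-based node numbering (parent of node i is node ⌊i/2⌋) on top of a 0-based Python list, so node i lives at a[i - 1]; the function name is an assumption.

```python
def is_max_heap(a):
    """Return True if the list a, read as nodes 1..n, satisfies the max-heap property."""
    n = len(a)
    for i in range(2, n + 1):          # every node except the root
        parent = i // 2                # Parent[i] = floor(i / 2)
        if a[parent - 1] < a[i - 1]:   # violates A[parent[i]] >= A[i]
            return False
    return True


print(is_max_heap([26, 24, 20, 18, 17, 19, 13, 12, 14, 11]))
print(is_max_heap([1, 2, 3]))
```

Swapping the comparison to `>` gives the corresponding min-heap check.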
Heaps – Example
Max-heap as an array:
value: 26 24 20 18 17 19 13 12 14 11
index:  1  2  3  4  5  6  7  8  9 10
Max-heap as a binary tree:
              26
        24          20
     18    17    19    13
   12 14  11
HeapSort(A)
1. BuildMaxHeap(A)
2. for i ← length[A] downto 2
3.     do exchange A[1] ↔ A[i]
4.        heap-size[A] ← heap-size[A] – 1
5.        MaxHeapify(A, 1)
Building a heap
• Use MaxHeapify to convert an array A into a max-heap.
• How?
• Call MaxHeapify on each element in a bottom-up manner.
BuildMaxHeap(A)
1. heap-size[A] ← length[A]
2. for i ← ⌊length[A]/2⌋ downto 1
3.     do MaxHeapify(A, i)
MaxHeapify(A, i)
Assumption: the subtrees rooted at Left(i) and Right(i) are max-heaps.
1. l ← left(i)
2. r ← right(i)
3. if l ≤ heap-size[A] and A[l] > A[i]
4.     then largest ← l
5.     else largest ← i
6. if r ≤ heap-size[A] and A[r] > A[largest]
7.     then largest ← r
8. if largest ≠ i
9.     then exchange A[i] ↔ A[largest]
10.         MaxHeapify(A, largest)
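The three heap procedures fit together as in the following sketch. It is a 0-based adaptation, so the children of index i are 2i + 1 and 2i + 2 rather than 2i and 2i + 1, and heap_size is passed as a parameter instead of being stored on the array. All function names are assumptions.

```python
def max_heapify(a, i, heap_size):
    """Sift a[i] down until the subtree rooted at i is a max-heap.

    Assumes the subtrees rooted at i's children are already max-heaps.
    """
    left = 2 * i + 1
    right = 2 * i + 2
    largest = i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


def build_max_heap(a):
    """Turn the list a into a max-heap, working bottom-up from the last internal node."""
    for i in range(len(a) // 2 - 1, -1, -1):
        max_heapify(a, i, len(a))


def heap_sort(a):
    """Sort the list a in place and return it."""
    build_max_heap(a)
    for end in range(len(a) - 1, 0, -1):
        # Move the current maximum to its final slot, then shrink the heap.
        a[0], a[end] = a[end], a[0]
        max_heapify(a, 0, end)
    return a


print(heap_sort([26, 24, 20, 18, 17, 19, 13, 12, 14, 11]))
```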
Running Time of BuildMaxHeap
• Loose upper bound:
– Cost of a MaxHeapify call × No. of calls to MaxHeapify
– O(lg n) × O(n) = O(n lg n)
• Tighter bound:
– Cost of a call to MaxHeapify at a node depends on the
height, h, of the node – O(h).
– Height of most nodes is much smaller than lg n.
– Height h of nodes ranges from 0 to ⌊lg n⌋.
– No. of nodes of height h is at most ⌈n/2^(h+1)⌉
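Running Time of BuildMaxHeap
Tighter Bound for T(BuildMaxHeap)
\[
T(\text{BuildMaxHeap}) = \sum_{h=0}^{\lfloor \lg n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil O(h) = O\left( n \sum_{h=0}^{\lfloor \lg n \rfloor} \frac{h}{2^h} \right)
\]
Using (A.8), \sum_{k=0}^{\infty} k x^k = x / (1 - x)^2 for |x| < 1, with x = 1/2:
\[
\sum_{h=0}^{\infty} \frac{h}{2^h} = \frac{1/2}{(1 - 1/2)^2} = 2
\]
Therefore
\[
O\left( n \sum_{h=0}^{\lfloor \lg n \rfloor} \frac{h}{2^h} \right) = O\left( n \sum_{h=0}^{\infty} \frac{h}{2^h} \right) = O(n)
\]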
Running Time for MaxHeapify
MaxHeapify(A, i)
1. l ← left(i)
2. r ← right(i)
3. if l ≤ heap-size[A] and A[l] > A[i]
4.     then largest ← l
5.     else largest ← i
6. if r ≤ heap-size[A] and A[r] > A[largest]
7.     then largest ← r
8. if largest ≠ i
9.     then exchange A[i] ↔ A[largest]
10.         MaxHeapify(A, largest)
Time to fix node i and its children = Θ(1)
PLUS
Time to fix the subtree rooted at one of i’s children = T(size of subtree at largest)
Running Time for MaxHeapify(A, n)
• T(n) = T(size of subtree at largest) + Θ(1)
• size of subtree at largest ≤ 2n/3 (worst case occurs when the last row
of tree is exactly half full)
• T(n) ≤ T(2n/3) + Θ(1) ⇒ T(n) = O(lg n)
• Alternately, MaxHeapify takes O(h) where h is the
height of the node where MaxHeapify is applied
Heapsort – Example
26 24 20 18 17 19 13 12 14 11
1 2 3 4 5 6 7 8 9 10
26
24 20
18 17 19 13
12 14 11
Algorithm Analysis
• In-place
• Not Stable
Heap Procedures for Sorting
• MaxHeapify O(lg n)
• BuildMaxHeap O(n)
• HeapSort O(n lg n)