Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 18

Searching

COMP 103 #10


Menu

• Cost of ArraySet
• Simple Searching
• Binary Search

• Assignment 4 – Due Next Wednesday

COMP103 – 2006/T3 2
Cost of ArraySet

• ArraySet uses same data structure as ArrayList


• does not have to keep items in order

• Operations are:
n
• contains(item)
• add(item) ← always add at the end
• remove(item) ← don’t need to shift down – just move last
item down

• What are the costs?


• contains:
• remove:
• add:

COMP103 – 2006/T3 COMP 103 11:3


ArrayList/ArraySet costs
ArrayList (duplicates permitted):
• get, set, O(1)
• remove, add (at i) O(n)
• add (at end) O(1) (average) O(n) (worst)

ArraySet (no duplicates):


• When adding, we need to check the element does not exist
already.
• contains, add, remove: O(n)
• All additional cost in the searching.

COMP103 – 2006/T3 4
Duplicate Detection
Add ‘Dog’

Bee Dog Ant Fox Gnu Eel Cat

Add ‘Pig’

7
8

Bee Dog Ant Fox Gnu Eel Cat Pig

Question:
• How can we speed up the
search?
COMP103 – 2006/T3 5
How else can we search?
• Phone Book
• Data is ordered, so it is
• Faster to find items, because…
• We know roughly how far to look
Add ‘Cat’

Ant Bee Cat Dog Eel Fox Gnu Pig

COMP103 – 2006/T3 6
An improvement, but

• This is still an O(n) search.


• Think about how you search a phone
book…
• Exercise: Write down where your eyes
scan when looking for G & W in the list:

AAABBBCCCDDDDEEEEFFFGGGHHHIIIJJJJJJJJJKKKKK
KLLLLLMMMMMNNOOOPPQQRRRSSSTTTTTUUVVV
VVWXXXYZZZ

COMP103 – 2006/T3 7
A better strategy…

• We can model a search pattern on the


human search pattern.
• Idea:
• Look at the middle.
• is it before or after?
• if before, look at the middle of the first half,
• if after, look at the middle of the second half.
• This is called binary search (because we
are spliting the list into two parts).

COMP103 – 2006/T3 8
Making ArraySet faster.
• Binary Search: Finding “Eel”
8 0+7/2 4+4/2 =4 =5
= 3 4+7/2

Ant Bee Cat Dog Eel Fox Gnu Pig


0 1 2 3 4 5 6 7 8

• If the items are sorted (“ordered”), then we can


search fast
• Look in the middle:
• if item is middle item ⇒ return
• if item is before middle item ⇒ look in left half
• if item is after middle item ⇒ look in right half

COMP103 – 2006/T3 9
Binary Search
public boolean contains(E item){
Comparable<E> value = (Comparable<E>) item;
int low = 0;  // min possible index of item
int high = count-1; // max possible index of item
// item in [low .. high] (if
present)
while (low <= high){
    int mid  =  (low + high) / 2;   
    if (value.equals(data[mid]) = 0) // item is present
        return true;
    if (value.compareTo(data[mid]) < 0) // item in [low .. mid-1]
        high = mid - 1;    // item in [low .. high]
    else  // item in [mid+1 .. high]
        low = mid + 1; // item in [low .. high]
  }
 return false;    // item in [low .. high] and low > high, 
// therefore item not present
         
}
COMP103 – 2006/T3 10
An Alternative
Return the index of where the item ought to be, whether present or not.
public int findIndex(Object item){
Comparable<E> value = (Comparable<E>) item;
int low = 0;     
int high  =  count;      // index in [low .. high]
while (low < high){
    int mid  =  (low + high) / 2;
    if (value.compareTo(data[mid]) > 0) // index in [mid+1 .. high]
        low = mid + 1;          // index in [low .. high]  low <=
high
    else                    // index in [low .. mid]
        high = mid;        // index in [low .. high],
low<=high
}
return low;    // index in [low .. high] and low = high 
// therefore index = low
}

COMP103 – 2006/T3 11
Exercise
• Calculate all the mid and return values from the
following searches using findIndex(…):
9 Find ‘Cat’

Ant Bee Cat Dog Eel Fox Gnu Pig Zbu


0 1 2 3 4 5 6 7 8

7 Find ‘Bug’

Ant Bee Cat Dog Eel Fox Gnu


0 1 2 3 4 5 6 7 8

8 Find ‘Orc’

Ant Bee Cat Dog Eel Fox Gnu Orc


0 1 2 3 4 5 6 7 8

COMP103 – 2006/T3 12
Another Exercise
• Write a recursive binary search
below!
public int findIndex(E item, int low, int high){
Comparable<E> value = (Comparable<E>)item;
if(low >= high)
return low;
int mid = (low + high) / 2;
if (value.compareTo(data[mid]) > 0)
return recFind(itm, mid + 1, high);
else
return recFind(itm, low, mid);

COMP103 – 2006/T3 13
Binary Search: Cost
• What is the cost of searching if n items in set?
• key step = ? (look at the code)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

• Iteration Size of range


1 n
2 n/2
3 n/4

k 1

COMP103 – 2006/T3 14
Log2(n ) :
The number of times you can divide a set of n things in half.
lg(1000) =10, lg(1,000,000) = 20, lg(1,000,000,000) =30
Every time you double n, you add one step to the cost!

• Arises all over the place in analysing algorithms


“Divide and Conquer” algorithms (easier to solve small
problems):
Problem

Solve Solve

Solution

COMP103 – 2006/T3 15
ArraySet with Binary
Search
ArraySet: unordered
• All cost in the searching: O(n)
• contains: O(n )
• add: O(n )
• remove: O(n )

ArraySet: with Binary Search


• Binary Search is fast: O(log(n ))
• contains: O(log(n ))
• add: O(log(n )) O(n )
• remove: O(log(n )) O(n )

• All the cost is in keeping it sorted!!!!

COMP103 – 2006/T3 16
Making SortedArraySet fast
• If you have to call add() and/or remove() many items,
then SortedArraySet is no better than ArraySet (we’ll see
how to do better later in the course)
• Both O(n )
• Either pay to search
• Or pay to keep it in order

• If you only have to construct the set once, and then many
calls to contains(),
then SortedArraySet is much better than ArraySet.
• SortedArraySet contains() is O(log(n )

• But, how do you construct the set fast?


• A separate constructor.

COMP103 – 2006/T3 17
Alternative Constuctor
public SortedArraySet(Collection<E> c){

// Make space
count=c.size();
data = (E[]) new Object[count];

// Put collection into a list and sort


List<E> temp = new ArrayList<E>(c);
Collections.sort(temp, new ComparableComparator());

// Put sorted list into the data array


int i=0;
for (E item : temp)
      data[i++] = item;
}

• How do you actually sort?  Topic of the next Lecture…

COMP103 – 2006/T3 18

You might also like