
A Journey of a Thousand Miles

Iteration as a strategy for solving computational problems

Searching and Sorting

The Linear Search Algorithm

The Insertion Sort Algorithm

Scalability

Best Case, Worst Case

Explorations in Computing
2012 John S. Conery

About the Title

The chapter title is part of a quote from Tao Te Ching, by Lao-Tzu:

"A journey of a thousand miles begins with a single step."

Search

Searching is a common operation in many different applications

in iTunes, enter a string in the search box

the program shows tunes that have that string in the song title, album title, artist name, or other fields

Search (cont'd)

Some other examples:

on-line dictionaries and catalogs

the "find" command in a word processor or text editor

[Screenshots: Dictionary.app on Mac OS X systems; "Find" command in TextMate]

Defining Search

The examples on the previous slides were similar to searches in everyday life

look for a book, either on a bookshelf at home or in a library

find a name in a phone book or a word in a dictionary

search a file drawer to find customer information or student records

What these problems have in common:

we have a large collection of items

we need to search the collection to find a single item that matches a certain condition (e.g. name of book, name of a person)

Other Kinds of Searches

Not all searches fit this paradigm of looking for a specified item

Perform a search using different information

files may be organized by student name, but we need to search by file contents, e.g. find students whose advisor is Prof. X

Search for items matching a general description

select all customer records where the customer's name starts with "A"

find all customers with overdue accounts

"I'll know it when I see it" searches

look for a good book to read (bookcase, bookstore, library, ...)

scan a menu at a restaurant

Search Algorithms

As in real life, there are many variations and many different types of search algorithms

search a list to see if it contains a specified number

search a list to find the largest number

find all numbers in a list that start with 541

find strings of digits that match a pattern (e.g. xxx-xx-xxxx)

search an abstract space of solutions (e.g. best move in chess or shortest tour)

machines can be taught to scan images for interesting items (no exact representation of the item to search for)

"I'll know it when I see it" -- data mining for interesting or unusual patterns

[Image credit: solarsystem.nasa.gov]

Sorting

Sorting -- reorganizing information so it's in a particular order -- is closely related to searching

[Screenshot caption: Click a column name to sort by that field]

Iterative Algorithms

The goal for this chapter: study two new algorithms based on iteration

Build on the main idea from Chapter 3:

repeating (iterating) a series of small steps can lead to the solution of important problems

Search algorithm: linear search

Sort algorithm: insertion sort

The next chapter looks at more sophisticated techniques for both searching and sorting

Linear Search

Linear search is the simplest, most straightforward search strategy

As the name implies, the idea is to start at the beginning of a collection and compare items one after another

Some terminology:

the item we are looking for is known as the key

this type of search is also sometimes called a scan

if the key is not found the search fails

Linear Search in Ruby

In Chapter 3 we saw that Ruby uses arrays to hold collections of data

>> a = [8, 0, 9, 2, 7, 5, 3]
=> [8, 0, 9, 2, 7, 5, 3]
>> a.class
=> Array

The include? method does a linear search to see if an item is in an array

>> a.include?(7)
=> true
>> a.include?(4)
=> false

Array Indices

In many situations we want to know where an item was found

Instead of simply returning true or false we want a method to say something like "the thing you are looking for is in the second location in the array"

an array location is known as an address or an index

Computer scientists start labeling at 0

if an array has n items the addresses are 0 to n - 1

Index Expressions

We can access any item in an array using an index expression

if a is an array the expression a[i] means "the item at location i in the array"

pronounced "a sub i", from the mathematical notation $a_i$

>> a = ["apple", "lime", "kiwi", "orange", "ugli"]
>> a[3]
=> "orange"
>> a[3] = "tangerine"
=> "tangerine"
>> a
=> ["apple", "lime", "kiwi", "tangerine", "ugli"]
>> a[5]
=> nil

Index Expressions (cont'd)

An index expression gives us an alternative syntax for accessing the items at the front or end of an array

>> a = ["apple", "lime", "kiwi", "orange", "ugli"]
>> a.first
=> "apple"
>> a[0]
=> "apple"
>> a.last
=> "ugli"
>> a[-1]
=> "ugli"
>> a[a.length-1]
=> "ugli"

A Project

To investigate the linear search algorithm, we'll write our own versions of the include? and index methods

contains?(a,x)

return true if array a contains item x

return false if x is not in a

search(a,x)

return the location of item x if x is in a

return nil if x is not in a

Conditional Expressions

A useful construct in Ruby is a conditional expression

One way to write a conditional expression is to attach a modifier to a statement

the modifier consists of the keyword if followed by a Boolean expression

There are many ways to write conditions -- we'll see others later

>> a.each { |x| puts x }
8
0
9
2
7
5
3
=> [8, 0, 9, 2, 7, 5, 3]

>> a.each { |x| puts x if x % 2 == 0 }
8
0
2
=> [8, 0, 9, 2, 7, 5, 3]
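As a small illustration of the if modifier, the sketch below collects the even items instead of printing them. The variable name evens is an assumption made for this example, not something from the slides.

```ruby
a = [8, 0, 9, 2, 7, 5, 3]

# evens is a hypothetical name used only for this illustration
evens = []

# the append runs only for items that satisfy the modifier's condition
a.each { |x| evens << x if x % 2 == 0 }

p evens   # prints [8, 0, 2]
```

The statement before the modifier is skipped whenever the Boolean expression is false, which is why the odd items never reach evens.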

Writing the contains? Method

To implement our contains? method we can use each to iterate over the array

each will visit every item, in order

put a return statement in the block executed by each

attach a modifier so the return is executed as soon as the item is found

def contains?(a,k)                        # the array to search and the key to look for
  a.each { |x| return true if x == k }    # break out of the loop early if the key matches an item in the array
  return false                            # executed only if each gets through the entire array without finding k
end

contains? (cont'd)

Let's try it out on an array of numbers

>> include IterationLab
=> Object
>> a = [8, 0, 9, 2, 7, 5, 3]
=> [8, 0, 9, 2, 7, 5, 3]
>> contains?(a, 7)
=> true
>> contains?(a, 4)
=> false

Writing the search Method

We want our search method to return the location of the match

One way to write search: use an iterator named each_with_index

For this project, though, we'll write it using a while loop

easier to attach a probe to count steps

reuse the outline later, for insertion sort

Optional project:

get a copy of the source for contains?

save it in a file named search.rb

learn about each_with_index, use it to implement search

>> Source.checkout("contains?", "search.rb")
Saved a copy of ...
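One possible shape for the each_with_index version is sketched below. The method name search_with_index is hypothetical, chosen so it does not clash with the search method in RubyLabs; this is not the RubyLabs implementation.

```ruby
# Hypothetical name; a sketch of search written with each_with_index.
def search_with_index(a, k)
  # each_with_index passes both the item and its location to the block
  a.each_with_index { |x, i| return i if x == k }
  nil   # reached only if no item matched the key
end

colors = ["tan", "mint", "salmon", "chocolate", "wheat"]
p search_with_index(colors, "salmon")   # prints 2
p search_with_index(colors, "orange")   # prints nil
```

The structure mirrors contains?: the return inside the block leaves the method as soon as a match is found, and the last line supplies the result for an unsuccessful search.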

search (cont'd)

The basic plan is to use a variable named i to hold an index value

Use a while loop to compare a[i] with the item we're looking for

return i if a[i] matches the item

otherwise add 1 to i and repeat

while i < a.length
  return i if a[i] == k
  i += 1                    # Note: i += 1 means "add 1 to i"
end

search (cont'd)

Here is a listing of the complete method

the while loop from the previous slide is on lines 3 to 6

i will (if necessary) have every value from 0 to a.length-1

notice return values: either i or nil

>> Source.listing("search")
1:  def search(a, k)
2:    i =
...
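The listing of search breaks off after its second line in this copy. The sketch below is a reconstruction under two assumptions: that i starts at 0 (the slides say i takes values from 0 to a.length-1) and that the method ends by returning nil (the slides say the return values are either i or nil). It is not copied from the RubyLabs source.

```ruby
# Reconstruction under the assumptions stated above.
def search(a, k)
  i = 0                      # assumed starting index
  while i < a.length
    return i if a[i] == k    # found the key: report its location
    i += 1
  end
  return nil                 # the loop ended without a match
end

a = ["tan", "mint", "salmon", "chocolate", "wheat"]
p search(a, "salmon")   # prints 2
p search(a, "orange")   # prints nil
```

Laid out this way, the while loop occupies lines 3 to 6 of the method and the comparison sits on line 4, which agrees with the line numbers the slides refer to.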

search (cont'd)

Trying out the method:

>> a = TestArray.new(5, :colors)
=> ["tan", "mint", "salmon", "chocolate", "wheat"]
>> search(a, "salmon")
=> 2
>> search(a, "orange")
=> nil

TestArray is a class in the RubyLabs module

Other options are :cars, :fruit, :words, :elements

Performance

How many comparisons will the linear search algorithm make as it searches through an array with n items?

another way to phrase it: how many iterations will our Ruby method make?

For an unsuccessful search:

compare every item before returning false or nil

i.e. make n comparisons

For a successful search, anywhere between 1 and n

search may get lucky and find the item in the first location

similarly, it might be in the last location

expect, on average: n / 2 comparisons
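These counts can be checked with a small experiment. The sketch below uses a hypothetical method, search_count, that returns the number of comparisons along with the location; it stands in for the RubyLabs counting probe and is not part of that library.

```ruby
# Hypothetical helper: linear search that also reports how many
# comparisons it made. Returns [location_or_nil, comparisons].
def search_count(a, k)
  comparisons = 0
  i = 0
  while i < a.length
    comparisons += 1
    return [i, comparisons] if a[i] == k
    i += 1
  end
  [nil, comparisons]
end

a = [8, 0, 9, 2, 7, 5, 3]

# unsuccessful search: every one of the n = 7 items is compared
_, fail_count = search_count(a, 4)
p fail_count   # prints 7

# successful searches: average over looking up each item once
total = a.sum { |k| search_count(a, k)[1] }
p total.to_f / a.length   # prints 4.0
```

When every item is equally likely to be the key, the exact average is (n + 1) / 2, which is 4.0 for n = 7 and approaches n / 2 as n grows.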

Experiments with search

We can attach a probe to one of the lines in the method to show the state of the search at each step

A method named brackets will print an array, putting [ ] around some of the items

brackets(a,i) makes a string showing every item in a

puts [ before a[i] and ] at the end

>> a
=> ["tan", "mint", "salmon", "chocolate", "wheat"]
>> puts brackets(a, 2)
tan mint [salmon chocolate wheat]
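To make the behavior of brackets concrete, here is a minimal stand-in. The name brackets_sketch and the single-space separator are assumptions; the real RubyLabs method may format its output differently.

```ruby
# Hypothetical stand-in for the RubyLabs brackets method.
# Puts "[" before a[i] and "]" after the last item.
def brackets_sketch(a, i)
  left  = a[0...i].map(&:to_s)    # items before location i
  right = a[i..-1].map(&:to_s)    # items from location i to the end
  (left + ["[" + right.join(" ") + "]"]).join(" ")
end

colors = ["tan", "mint", "salmon", "chocolate", "wheat"]
puts brackets_sketch(colors, 2)   # prints tan mint [salmon chocolate wheat]
puts brackets_sketch(colors, 0)   # prints [tan mint salmon chocolate wheat]
```

Calling it with the current value of i on each iteration shows the bracketed region shrinking from the left as the search proceeds.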

Experiments with search (cont'd)

From the listing on a previous slide:

4:    return i if a[i] == k

Attach a probe to line 4, telling Ruby to print brackets around the region that has not been searched yet:

>> Source.probe( "search", 4, "puts brackets(a,i)" )
=> true
>> trace { search(a, "chocolate") }
[tan mint salmon chocolate wheat]
tan [mint salmon chocolate wheat]
tan mint [salmon chocolate wheat]
tan mint salmon [chocolate wheat]
=> 3

Can you see the unsearched region getting smaller at each step?

Do you see why the algorithm terminated when it did?

Do you see why the result of calling search(a,"chocolate") is 3?

Experiments with search (cont'd)

Here is the trace of an unsuccessful search:

>> trace { search(a, "orange") }
[tan mint salmon chocolate wheat]
tan [mint salmon chocolate wheat]
tan mint [salmon chocolate wheat]
tan mint salmon [chocolate wheat]
tan mint salmon chocolate [wheat]
=> nil

How many comparisons will be made in an unsuccessful search of an array with n items?

If you want to do some more experiments:

>> a = TestArray.new(10, :cars)
=> ["saturn", "ferrari", "bmw", ... "honda"]

Try Your Own Experiments

If you don't specify a type of string in a call to TestArray.new you get an array of numbers:

>> a = TestArray.new(10)
=> [27, 5, 33, 39, 12, 51, 19, 20, 64, 9]

Call a.random to get a random number from a TestArray

a.random(:success) will return a number guaranteed to be in a

a.random(:fail) will return a number that is not in a

Optional project:

attach a counter on line 4 (pass :count to the method that sets a probe)

make TestArrays of varying size (100, 1000, or even 10,000 numbers)

how many comparisons are made by an unsuccessful search?

how many (on average) by a successful search?

count { search(a, a.random(:success)) }

Another Optional Project

Write a method named max that will find the largest item in an array

here is an outline to get you started -- fill in the parts indicated by question marks

there will be more than one line in the body of the loop

def max(a)
  x = a[0]
  i = 1
  while ??
    ??
  end
  return x
end

This method is another variation on the theme of linear search

x will be the largest item seen so far

On each iteration update x to be the current item if that item is greater than x

Note: call Source.checkout("search", "max.rb") to get a copy of the search method to use as a template for your method

Recap: Linear Search

The linear search algorithm looks for an item in an array

start at the beginning (a[0], or "the left")

compare each item, moving systematically to the right (i += 1)

Variations:

return true as soon as the item is found

return the location of the item as soon as it is found

scan all items to look for the largest

Performance when searching an array of n items:

do n comparisons when the search is unsuccessful

expect an average of n/2 comparisons for a successful search

Sorting

The search algorithms shown on the previous slides are examples of linear algorithms

start at the beginning of a collection

systematically progress through the collection, all the way to the end if necessary

A similar strategy can be used to sort the items in an array

The next set of slides will introduce a very simple sorting algorithm known as insertion sort

Basic idea: pick up an item, find the place it belongs, insert it back into the array

Move to the next item and repeat

Insertion Sort

The important property of the insertion sort algorithm: at any point in this algorithm part of the array is already sorted

The item we currently want to find a place for will be called the key

items to the left of the key are already sorted

the goal on each iteration is to insert the key at its proper place in the sorted part

Example (shown at right):

when it is time to find a place for the "J" in this hand the portion to the left is sorted

Insertion Sort

Here is a more precise statement of the insertion sort algorithm

1. the initial key is the second item in the array (the "Q" in this example)
2. use your left hand to pick up the key
3. scan left until you find an item lower than the one in your left hand, or the front of the array, whichever comes first
4. insert the key back into the array at this location
5. the new key is the item to the right of the location of the previous key
6. go back to step 2

This new version is precise enough that we can organize a Ruby method that will implement this algorithm

Insertion Sort in Ruby

Here is the insertion sort method, with a few things still left to fill in:

def isort(a)
  i = 1
  while i < a.length
    key = a[i]
    remove key from a
    j = location for key
    insert key at a[j+1]
    i += 1
  end
  return a
end

Example:

when i = 3 key will be "J"

j will be set to 0 (since a[0] = 9 is the first location with a card smaller than "J")

"J" will be inserted at location 1

A description like this that is part Ruby and part English is known as pseudocode
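The pseudocode can be turned into running Ruby in several ways; the version below is one sketch. The name isort_sketch is hypothetical, and the choice of delete_at to remove the key is an assumption made here rather than the approach taken in RubyLabs. Like the RubyLabs listing, it sorts a copy so the caller's array is left alone.

```ruby
# Hypothetical name; one way to fill in the English parts of the pseudocode.
def isort_sketch(array)
  a = array.clone                  # work on a copy of the input
  i = 1
  while i < a.length
    key = a.delete_at(i)           # "remove key from a"
    j = i - 1
    # "j = location for key": scan left past every item larger than key
    j -= 1 while j >= 0 && key < a[j]
    a.insert(j + 1, key)           # "insert key at a[j+1]"
    i += 1
  end
  a
end

p isort_sketch([0, 8, 9, 5, 7, 2, 3])   # prints [0, 2, 3, 5, 7, 8, 9]
```

If no smaller item is found the scan stops with j equal to -1, so the key is inserted at location 0, the front of the array.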

Pseudocode vs Real Code

When algorithms require more than a few lines of Ruby we will use pseudocode in the lecture slides

The algorithms have all been fully implemented in a RubyLabs module

you can call Source.listing or Source.checkout if you want to see the "gory details"

>> Source.listing("isort")
1:  def isort(array)
2:    a = array.clone        # make a copy of the input
3:    i = 1
4:    while i < a.length
5:      ...                  # see slides below
6:      i += 1
7:    end
8:    return a
9:  end
=> true

Insertion Sort Example

The following pictures show an example of how insertion sort works (using a list of numbers instead of cards)

when i is 3 the positions to the left (0 through 2) have been sorted

the first statements in the body of the loop set key to 5 and remove it from a

Insertion Sort Example (cont'd)

the algorithm looks to the left for a location to put the 5

j is set to 0 since a[0] is the first item smaller than 5

on the next iteration, i will be 4, and the sorted region has grown by 1 to include all items in locations 0 to 3

Helper Method

The operations of removing the item from a[i], scanning left, and re-inserting are implemented in a method called insert_left

programmers call these special-purpose methods "helper methods"

not intended to be used on its own, but only during a sort

def insert_left(a, i)
  x = a.slice!(i)                   # remove a[i]
  j = i-1                           # scan from here
  while j >= 0 && less(x, a[j])
    j = j-1
  end
  a.insert(j+1, x)                  # insert x
end

You do not need to understand the details of this code

The "big picture":

a call to insert_left(a,i) moves a[i] somewhere to the left

this method uses iteration -- in fact it's a type of linear search
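To see a single call in action, the sketch below repeats insert_left together with the less helper it depends on, then applies it to the numeric example from the slides at the moment when i is 3. The array contents are taken from the isort trace in these slides; the variable name hand is made up for this example.

```ruby
# less compares two items; insert_left calls it on every step of its scan
def less(x, y)
  return x < y
end

def insert_left(a, i)
  x = a.slice!(i)                    # remove a[i]
  j = i - 1                          # scan from here
  while j >= 0 && less(x, a[j])
    j = j - 1
  end
  a.insert(j + 1, x)                 # insert x
end

hand = [0, 8, 9, 5, 7, 2, 3]         # locations 0 through 2 are already sorted
insert_left(hand, 3)                 # find a place for the 5
p hand                               # prints [0, 5, 8, 9, 7, 2, 3]
```

The scan compares 5 with 9, then 8, then 0, and stops at 0 because 5 is not less than 0, so the 5 lands at location 1 and the sorted region now covers locations 0 through 3.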

Nested Loops

At first glance it might seem that insertion sort is a linear algorithm like search and max

it has a while loop that progresses through the array from left to right

But it's important to note what is happening in insert_left

the step that finds the proper location for the current item is also a loop

it scans left from location i, going all the way back to 0 if necessary

def isort(a)
  i = 1
  while i < a.length
    ...
    while j >= 0 && less(x, a[j])     # the while loop in insert_left
      ...
    end
    ...
  end
  return a
end

An algorithm with one loop inside another is said to have nested loops

Nested Loops (cont'd)

The diagram at right helps visualize how many comparisons are in isort

a dot in a square indicates a potential comparison

for any value of i, the inner loop might have to compare key to values from i - 1 all the way down to 0

The number of dots in this diagram is the sum of the row counts, $1 + 2 + \cdots + (n-1)$ for a diagram with $n$ rows

In general, for an array with n items, the potential number of comparisons is $\frac{n(n-1)}{2}$

Snap off the empty top row, leaving n*(n-1) cells, half of which contain dots...
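The count can also be written out as a sum. For each value of i from 1 to n-1 the inner loop makes at most i comparisons, so the worst case is the sum below. The check with n = 7 uses the seven-item array from the isort trace as an illustrative size; the diagram itself may show a different one.

```latex
\sum_{i=1}^{n-1} i \;=\; 1 + 2 + \cdots + (n-1) \;=\; \frac{n(n-1)}{2}

% worked check for n = 7:
1 + 2 + 3 + 4 + 5 + 6 \;=\; 21 \;=\; \frac{7 \times 6}{2}
```

This matches the picture of snapping off the empty top row: the remaining $n(n-1)$ cells split evenly into cells with dots and cells without.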

Experiments with isort

We can use the brackets method to watch the progress of isort

From the listing:

4:    while i < a.length
5:      insert_left(a, i)      # find a place for a[i]
6:      i += 1
7:    end

Attach a probe on line 5, to tell Ruby to print brackets around part of a just before calling insert_left:

>> Source.probe("isort", 5, "puts brackets(a,i)")
=> true

Experiments with isort (cont'd)

Using trace to call isort to sort the array shown on the previous slides:

>> a
=> [0, 8, 9, 5, 7, 2, 3]
>> trace { isort(a) }
0 [8 9 5 7 2 3]
0 8 [9 5 7 2 3]
0 8 9 [5 7 2 3]
0 5 8 9 [7 2 3]
0 5 7 8 9 [2 3]
0 2 5 7 8 9 [3]
=> [0, 2, 3, 5, 7, 8, 9]

As in search, the left bracket moves steadily to the right (i increases by 1 on each iteration)

Can you see how the part to the left of i is sorted?

Can you see that the sorted portion grows on each iteration, until finally the entire array is sorted?

Try Your Own Experiments

The isort method works for strings, too:

>> a = TestArray.new(5, :cars)
=> ["citroen", "infiniti", "acura", "rolls-royce", "opel"]
>> trace { isort(a) }
citroen [infiniti acura rolls-royce opel]
citroen infiniti [acura rolls-royce opel]
acura citroen infiniti [rolls-royce opel]
acura citroen infiniti rolls-royce [opel]
=> ["acura", "citroen", "infiniti", "opel", "rolls-royce"]

It might be easier to see if you use arrays of numbers

>> a = TestArray.new(10)
=> [60, 49, 50, 29, 30, 25, 42, 31, 48, 65]

Counting Comparisons

The helper method that compares two items is named less

>> Source.listing("less")
1:  def less(x, y)
2:    return x < y
3:  end
=> true

Attach a counting probe to line 2:

>> Source.probe("less", 2, :count)
=> true

Counting Comparisons (cont'd)

Count the comparisons made when sorting the array of car names:

>> count { isort(a) }
=> 6

Does that seem right?

>> trace { isort(a) }
citroen [infiniti acura rolls-royce opel]
citroen infiniti [acura ro...

Counting Comparisons (cont'd)

There are 5 items in the array of cars

The algorithm will do 4 iterations, so there will be a minimum of 4 comparisons

The maximum number of comparisons is $\frac{5 \times 4}{2} = 10$

6 is somewhere in between these two extremes
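The three counts can be reproduced without RubyLabs. In the sketch below a global counter inside less stands in for the counting probe, and count_comparisons is a hypothetical replacement for the RubyLabs count method; isort and insert_left follow the shape shown in the slides.

```ruby
$comparisons = 0   # global counter, standing in for the :count probe

def less(x, y)
  $comparisons += 1
  x < y
end

def insert_left(a, i)
  x = a.slice!(i)
  j = i - 1
  j -= 1 while j >= 0 && less(x, a[j])
  a.insert(j + 1, x)
end

def isort(array)
  a = array.clone
  i = 1
  while i < a.length
    insert_left(a, i)
    i += 1
  end
  a
end

# Hypothetical helper: reset the counter, run the block, report the count.
def count_comparisons
  $comparisons = 0
  yield
  $comparisons
end

cars = ["citroen", "infiniti", "acura", "rolls-royce", "opel"]
p count_comparisons { isort(cars) }                # prints 6
p count_comparisons { isort(cars.sort) }           # prints 4
p count_comparisons { isort(cars.sort.reverse) }   # prints 10
```

The already-sorted input needs one comparison per iteration, while the reversed input forces every scan to run all the way back to location 0.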

Challenge

Explain how many comparisons will be made in the general case (for any arbitrary size array) when isort is passed an array that is already sorted:

>> a = TestArray.new(100)
=> [142, 617, 826, ... 949, 550]
>> count { isort(a.sort) }
=> ??
>> count { isort(a.sort.reverse) }
=> ??

You can predict exactly how many comparisons will be made without running isort

The challenge is to explain why the predicted number of comparisons will be made

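Estimating the Number of Comparisons

The formula for the worst case number of comparisons is $\frac{n(n-1)}{2} = \frac{n^2}{2} - \frac{n}{2}$

For small arrays we should probably compute the exact answer: for example, with $n = 5$ it is $\frac{5 \times 4}{2} = 10$

For large arrays the $\frac{n}{2}$ term doesn't add very much, and we can get a good estimate by just computing $\frac{n^2}{2}$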
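A quick calculation shows how the estimate behaves as n grows. The method names worst_case_exact and worst_case_estimate and the sample sizes are choices made for this sketch.

```ruby
# Exact worst-case count: n(n-1)/2
def worst_case_exact(n)
  n * (n - 1) / 2
end

# Estimate that keeps only the dominant term: n^2 / 2
def worst_case_estimate(n)
  n * n / 2.0
end

[5, 100, 10_000].each do |n|
  exact = worst_case_exact(n)
  estimate = worst_case_estimate(n)
  # percent by which the estimate exceeds the exact count
  overshoot = 100.0 * (estimate - exact) / exact
  puts "n = #{n}: exact = #{exact}, estimate = #{estimate}, overshoot = #{overshoot.round(2)}%"
end
```

The gap between the two values is always n/2, so the relative error works out to 1/(n-1): large for five items, negligible for ten thousand.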

Big-Oh Notation

Because the $n^2$ term in the equation is the dominant term when n is large, we can use it to estimate the number of comparisons

Computer scientists use the notation $O(n^2)$ to mean "for large n the number of comparisons will be roughly proportional to $n^2$"

pronounced "oh of n-squared"

or sometimes "big oh of n-squared"

There is a precise definition of what it means for an algorithm to be $O(n)$ or $O(n^2)$

for our purpose we'll just use the notation informally

for isort, the notation means "on the order of $n^2$ comparisons"

Scalability

The fact that the number of comparisons grows as the square of the array size may not seem important

for small to moderate size arrays it is not a big deal

but execution time will start to be a factor for larger arrays

We'll revisit this idea after looking at more sophisticated sorting algorithms in the next chapter

Recap: Insertion Sort

The insertion sort algorithm is another example of iteration

It uses nested loops -- one loop inside another one

The outer loop has the same structure as the iteration in linear search

an array index i ranges from 1 up to n-1

at any time, the items to the left of i are sorted

the inner loop moves a[i] to its proper location in the sorted region

the size of the sorted region grows on each iteration

The number of comparisons can be as small as n-1

In the worst case the number of comparisons is roughly $\frac{n^2}{2}$

Summary

This set of slides introduced two new algorithms based on iteration

Simple linear search scans an array from left to right

A straightforward sorting algorithm known as insertion sort also involves a scan from left to right

an important difference: there is a second loop inside the main loop

the inner loop scans back from right to left to find the place for an item

New technology introduced for this topic:

array index

conditional execution (modifiers with the keyword if)

pseudocode
