
Big O Notation with Searching & Sorting

What is Big O Notation?


Big O notation is used as a tool to describe the growth rate of a function in
terms of the number of instructions that need to be processed (time complexity) or
the amount of memory required (space complexity).
This allows different algorithms to be compared in terms of their efficiency.
Note: Two algorithms can have the same Big O notation but have wildly different
execution times in practice. This is because Big O describes the growth rate of the
complexity, not the actual complexity itself.
Time Complexity
Time complexity refers to the growth in the number of instructions executed (and therefore the
time taken) as the length of the array to be searched increases.

Space Complexity
Space complexity refers to the growth in the size of memory space that is required as the length of
the array is increased.

Algorithm Performance
There are a number of factors that affect the performance of searching and sorting algorithms.
Some algorithms perform well with high-entropy (random) data, while others work better when
the data is already partially sorted in some manner. This means that no one algorithm works best
in every situation, and the nature of the data being sorted needs to be considered.
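The point about the nature of the data can be sketched with a quick timing experiment (not part of the original notes; sizes and repeat counts are arbitrary choices): insertion sort does far less work on data that is already sorted than on random data, because its inner shifting loop rarely runs.

```python
import random
import timeit

def insertion_sort(arr):
    # Standard insertion sort: shift larger elements right, drop the
    # current value into its place.
    for index in range(1, len(arr)):
        cur_val = arr[index]
        cur_pos = index
        while cur_pos > 0 and arr[cur_pos - 1] > cur_val:
            arr[cur_pos] = arr[cur_pos - 1]
            cur_pos -= 1
        arr[cur_pos] = cur_val
    return arr

random_data = [random.randint(0, 10000) for _ in range(1000)]
sorted_data = sorted(random_data)

# Sort a fresh copy each run so earlier runs don't pre-sort the input
t_random = timeit.timeit(lambda: insertion_sort(random_data[:]), number=3)
t_sorted = timeit.timeit(lambda: insertion_sort(sorted_data[:]), number=3)

print("Random data:   ", t_random)
print("Pre-sorted data:", t_sorted)
```

On pre-sorted input the inner loop never executes, so the run is close to O(n) in practice even though the worst case is O(n²).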
import turtle, math

t = turtle.Turtle()

# Draw the axes
t.penup()
t.goto(-110, 120)
t.write("Time Complexity - Big O", font=("Arial", 14, "normal"))

t.penup()
t.goto(-100, -100)
t.pendown()
t.goto(-100, 100)
t.goto(-100, -100)
t.goto(100, -100)
t.penup()
t.goto(-160, 0)
t.write("Complexity")
t.penup()
t.goto(-50, -120)
t.write("Length of Array")

# Plot O(1)
t.penup()
n = 1
t.pencolor("yellow")
for i in range(200):
    t.goto(n - 100, 1 - 100)  # Offset all x,y values by -100 to fit on screen
    t.pendown()
    n += 1
t.penup()
t.goto(100, -110)
t.pendown()
t.write("O(1)")

# Plot O(log(n))
t.penup()
t.pencolor("red")
n = 1
for i in range(200):
    t.goto(n - 100, math.log(n) - 100)
    t.pendown()
    n += 1
t.write("O(log(n))")

# Plot O(n)
t.penup()
n = 1
t.pencolor("pink")
for i in range(160):
    t.goto(n - 100, n - 100)
    t.pendown()
    n += 1
t.write("O(n)")

# Plot O(n log(n))
t.penup()
n = 1
t.pencolor("green")
for i in range(45):
    t.goto(n - 100, (n * math.log(n)) - 100)
    t.pendown()
    n += 1
t.write("O(n log(n))")

# Plot O(n squared)
n = 1
t.penup()
t.pencolor("blue")
for i in range(14):
    t.goto(n - 100, (n * n) - 100)
    t.pendown()
    n += 1
t.write("O(n^2)")
Bubble Sort
Time Complexity: O(n²) – Space Complexity: O(1)
Pretty much the worst sorting algorithm ever, mostly just used in
teaching as a comparison tool. There are almost no legitimate use
cases where it is the most efficient algorithm.
import time

def bubbleSort(arr):
    while True:
        changed = False
        for i in range(len(arr) - 1):
            if arr[i] > arr[i+1]:
                arr[i], arr[i+1] = arr[i+1], arr[i]  # Easy python way of swapping two items
                changed = True
                print(arr)     # Only needed for demonstration
                time.sleep(1)  # Only needed for demonstration
        if not changed:
            return arr

srtd = bubbleSort([5,4,2,6,33,22,100,2,99])

print(srtd)
Insertion Sort
Time Complexity: O(n²) – Space Complexity:
O(1)
import time

def insertion_sort(arr):
    for index in range(1, len(arr)):
        cur_val = arr[index]
        cur_pos = index

        while cur_pos > 0 and arr[cur_pos - 1] > cur_val:
            arr[cur_pos] = arr[cur_pos - 1]
            cur_pos = cur_pos - 1
            print(arr)     # For debugging/demonstration
            time.sleep(1)  # For debugging/demonstration

        arr[cur_pos] = cur_val

arr = [5,8,1,12,99,34,23,7,100,13,44,72,19]
print(arr)
insertion_sort(arr)
Merge Sort
Time Complexity: O(n log(n)) – Space Complexity: O(n)
Note: You are not required to know how to code a merge sort
algorithm for your A-level exam; it is included here for
reference/comparison as it is an n log(n) algorithm, rather than an n²
algorithm.
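For reference, a merge sort can be sketched as follows (one possible implementation, written to match the style of the other examples; the variable names are our own):

```python
def merge_sort(arr):
    # Base case: a list of 0 or 1 items is already sorted
    if len(arr) <= 1:
        return arr

    # Recursively sort each half
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])

    # Merge the two sorted halves back together
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # One of these is empty;
    merged.extend(right[j:])  # the other holds the leftovers
    return merged

print(merge_sort([5,4,2,6,33,22,100,2,99]))
```

The O(n) space complexity comes from the `merged` lists and slices: unlike bubble and insertion sort, merge sort (in this form) does not sort in place.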
Linear Search
Time Complexity: O(n) – Space Complexity O(1)
Linear search is a simple searching algorithm that iterates through an array from left to right
looking for a target item.
• If the target is found then it returns the index (location) of the item.
• If the target is not in the array then usually -1 is returned (or None/Null depending on the
implementation).
• If multiple instances of the target are found, the index of the leftmost instance is returned.
• Linear search is generally considered quite inefficient, as the whole list sometimes has to
be searched; however, it has an advantage over binary search in that it works on both sorted and
unsorted arrays.
def linearSearch(arr, target):
    for i in range(len(arr)):
        if arr[i] == target:
            return i
    return -1

l = [44,5,3,2,7,8,10]
print(linearSearch(l, 99))
Binary Search
Time Complexity: O(log n) – Space Complexity O(1)
Binary search is a divide and conquer searching algorithm that can only be performed on a
sorted list.
On each iteration the middle item of the array is checked to see if it is a match; if it is, its
index is returned, otherwise half the array is disregarded and the remaining half is searched
in the same manner.
Binary search is highly efficient in its execution; however, its application is limited to sorted
arrays. This means that in most cases the list has to be pre-sorted first. For arrays with
static contents that works well, but for arrays where the values change frequently any
efficiency gains are lost due to the time taken to pre-sort the data.

def binarySearch(arr, elem, start=0, end=None):
    if end is None:
        end = len(arr) - 1
    if start > end:
        return -1  # Not found: -1 rather than False, so a match at index 0 isn't treated as "not found"

    mid = (start + end) // 2

    if elem == arr[mid]:
        return mid
    if elem < arr[mid]:
        return binarySearch(arr, elem, start, mid - 1)

    return binarySearch(arr, elem, mid + 1, end)

testArr = [1,2,3,4,56,78,100]

target = int(input("Which integer are you searching for? "))

location = binarySearch(testArr, target)

if location != -1:
    print("Target location Index: ", location)
else:
    print("Target not in array")
SORTING ALGOS
A beginner's guide to Big O Notation
Big O notation is used in Computer Science to describe the performance or complexity of an algorithm. Big O
specifically describes the worst-case scenario, and can be used to describe the execution time required or the space
used (e.g. in memory or on disk) by an algorithm.
Anyone who’s read Programming Pearls or any other Computer Science books and doesn’t have a grounding in
Mathematics will have hit a wall when they reached chapters that mention O(N log N) or other seemingly bizarre
syntax. Hopefully this article will help you gain an understanding of the basics of Big O and Logarithms.
As a programmer first and a mathematician second (or maybe third or fourth) I found the best way to understand Big
O thoroughly was to produce some examples in code. So, below are some common orders of growth along with
descriptions and examples where possible.
O(1)
O(1) describes an algorithm that will always execute in the same time (or space) regardless of the
size of the input data set.
bool IsFirstElementNull(IList<String> elements)
{
    return elements[0] == null;
}
O(N)
O(N) describes an algorithm whose performance will grow linearly and in direct proportion to the
size of the input data set. The example below also demonstrates how Big O favours the worst-case
performance scenario; a matching string could be found during any iteration of the for loop and
the function would return early, but Big O notation will always assume the upper limit where the
algorithm will perform the maximum number of iterations.
bool ContainsValue(IEnumerable<string> elements, string value)
{
    foreach (var element in elements)
    {
        if (element == value) return true;
    }
    return false;
}
O(N²)
O(N²) represents an algorithm whose performance is directly proportional to the square of the size of the input
data set. This is common with algorithms that involve nested iterations over the data set. Deeper nested iterations
will result in O(N³), O(N⁴) etc.
bool ContainsDuplicates(IList<string> elements)
{
    for (var outer = 0; outer < elements.Count; outer++)
    {
        for (var inner = 0; inner < elements.Count; inner++)
        {
            // Don't compare with self
            if (outer == inner) continue;

            if (elements[outer] == elements[inner]) return true;
        }
    }
    return false;
}
Logarithms
Logarithms are slightly trickier to explain so I’ll use a common example:
Binary search is a technique used to search sorted data sets. It works by selecting the
middle element of the data set, essentially the median, and compares it against a target
value. If the values match, it will return success. If the target value is higher than the
value of the probe element, it will take the upper half of the data set and perform the
same operation against it. Likewise, if the target value is lower than the value of the
probe element, it will perform the operation against the lower half. It will continue to halve
the data set with each iteration until the value has been found or until it can no longer
split the data set.
This type of algorithm is described as O(log N). The iterative halving of data sets
described in the binary search example produces a growth curve that peaks at the
beginning and slowly flattens out as the size of the data sets increase e.g. an input data
set containing 10 items takes one second to complete, a data set containing 100 items
takes two seconds, and a data set containing 1,000 items will take three seconds.
Doubling the size of the input data set has little effect on its growth as after a single
iteration of the algorithm the data set will be halved and therefore on a par with an input
data set half the size. This makes algorithms like binary search extremely efficient when
dealing with large data sets.
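This flattening curve can be demonstrated by counting probes instead of timing. The sketch below (our own illustration, using the same halving strategy as the binary search earlier) counts how many middle-element checks are needed as the array grows; the count rises roughly like log₂(n).

```python
def binary_search_probes(arr, target):
    # Iterative binary search that returns the number of probes
    # (middle-element comparisons) rather than the index.
    start, end, probes = 0, len(arr) - 1, 0
    while start <= end:
        probes += 1
        mid = (start + end) // 2
        if arr[mid] == target:
            return probes
        if target < arr[mid]:
            end = mid - 1
        else:
            start = mid + 1
    return probes  # Target absent: still only ~log2(n) probes

# Multiplying the array size by 10 adds only a few probes each time
for n in [10, 100, 1000, 1000000]:
    arr = list(range(n))
    print(n, "items ->", binary_search_probes(arr, n - 1), "probes")
```

A million-item array needs at most about 20 probes, which is why doubling the input barely moves the curve.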
In conclusion
Hopefully this has helped remove some of the mystery around Big O notation and
many of the common growth functions. A grasp of Big O is an important tool when
dealing with algorithms that need to operate at scale, allowing you to make the
correct choices and acknowledge trade-offs when working with different data sets.
