
Algorithms and Data Structures - Niet-Numerieke Algoritmen
An introduction

Lubos Omelina, Bart Jansen

2023-2024

Lubos Omelina, Bart Jansen Algo Data 2023-2024 1 / 259


Preliminaries



Team / Contact Details

Silvia Zaccardi, PL9.1, szaccard@etrovub.be
Katka Kostkova, PL9.1, kkostkova@etrovub.be
Dr. Lubos Omelina, PL9.1, lomelina@etrovub.be
Prof. Bart Jansen, PL9.1, bjansen@etrovub.be



Expected knowledge

You know very basic algebra.


You know the basics of JAVA and can write simple programs in JAVA.
... Or you start working on that immediately.



Summary

The goal of this course is to truly understand a set of basic data structures
and algorithms, by implementing them yourself in concrete computer
programs. These algorithms and data structures are the building blocks for
more complex algorithms and data structures in other courses and are
required in almost all day to day programming.



Broader Semester Objectives

The goal is to learn how to write computer programs in one semester.


This is very ambitious. This involves learning to reason very analytically.
This involves learning a basic level of JAVA and Python.



Expected efforts

You program as much as possible.
This means that from now on you program every day until the end of the
semester!!!



Course Organization

24 contact hours Seminar, Exercises or Practicals


24 contact hours Lecture
36 contact hours Independent or External Form of Study



Reminder

You program as much as possible.
36 hours of effort in 6 weeks is 6 hours per week. This is close to one
hour per day!!!



The same, but in other words ...

This course is about programming skills and insight, not about knowledge.
There is almost nothing to memorize.
At the written exam you will solve exercises like those in the exercise
sessions.
"Open book" means that we provide correct starting code.
I cannot teach you to program or give you insight into data
structures in 24 hours. You have to learn this by practicing a lot.



Course Topics

Some theoretical concepts.


Searching in arrays, building linked and doubly linked lists, searching
in linked and doubly linked lists.
Methods for sorting arrays and lists.
Stacks, queues and priority queues.
Methods to represent trees, for searching trees and to balance trees.
Methods to represent graphs and to search graphs.



Organization of the exercise sessions

In the computer room, we write computer programs.


You can work at your own speed, not everyone has the same amount
of programming experience.
However, I expect you to have a minimal basic knowledge of JAVA.
Every week, we start with a new topic. So, you should complete on your
own what you did not finish.
These assignments do not have a unique correct answer! You can
verify the correctness of your program by investigating the output of
your program.



Course Material

1 Slides used in class. See Canvas.


2 The code I provide (See Canvas) and the code you write
3 The course text. See Canvas.
4 Your course notes. Without your notes, the material might not be
understandable.
5 Any JAVA book. (I have Big Java by Cay Horstmann)
6 If you find it helpful, get a book on algorithms and data structures in
JAVA.



Reference book

A reference book far beyond the scope of this course is:


Introduction to Algorithms, Second Edition by Thomas H. Cormen,
Charles E. Leiserson, Ronald L. Rivest and Clifford Stein
This is the type of book you can take with you the next ten years.
This is not the type of book which will help you understand basic problems
with JAVA.





Exam

The exam consists of two parts:


1 A written exam (= 60%). You get questions on the data structures
and algorithms studied in class or on SIMILAR ones.
2 A project, with oral discussion (= 40%). You get questions on YOUR
implementation of a small program that is based on the material seen
in class.



Project

The project will force you to think about the proper selection and usage of
the different algorithms and data structures. You have two options:
1 We provide you a description of the assignment. We have it split in
four parts that match with the progress of the course.
2 You design your own project. More freedom, but less structure.
Individual discussions to set goals. A game
There will be 3 project deadlines:
1 1st intermediate deadline: 26.11.2023
2 2nd intermediate deadline: 17.12.2023
3 final deadline - one week before the written exam



Written exam: 9.7 on average

[Figure: histogram of the written exam scores]



Project: 11.4 on average

[Figure: histogram of the project scores]



Warning

Week after week, we will build more complex code, based on previous
code.
The advantage is that you will be able to build some complex data
structures and algorithms very soon.
The danger is that you get lost soon!
The required programming skills and the concepts get more difficult
at a very fast pace.
It is your duty and not mine to do everything you can not to get lost!
Be proactive, work hard, ask for help, ...
If you get lost in this semester (this actually happens for many
students), do not wait for a miracle but do something about it.



Algorithms? Data structures?

Algorithms are recipes to compute something:
- How to test whether a number is prime or not?
- How to search for a given word in a text?
- How to find the shortest path between 2 cities in a graph?
- ...
Example: The mathematical definition of prime numbers does not tell
you how to test whether a given number is prime in an efficient way.
So, this course will teach you a basic repertoire of methods to solve
simple problems.



Prime numbers

Theorem 1
A number p ∈ N with p > 1 is prime if it can only be divided by 1 and itself.

This suggests we should check whether p can be divided by
2, 3, 4, 5, 6, ..., p − 1.
This is actually stupid ... but a truly efficient solution is not obvious.
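
A first, modest improvement over testing every candidate up to p − 1 can be sketched as follows. The class name PrimeCheck and the method isPrime are hypothetical names chosen for this illustration. The sketch relies on the fact that a composite p = a × b with a ≤ b always has a divisor a with a × a ≤ p, so candidates beyond the square root of p need not be tested. This is still not an efficient primality test for very large numbers.

```java
// Hypothetical helper for illustration; not part of the course code.
class PrimeCheck {
    // Trial division: test candidate divisors d only while d * d <= p.
    static boolean isPrime(int p) {
        if (p < 2) return false;              // 0 and 1 are not prime
        for (long d = 2; d * d <= p; d++) {   // long avoids overflow of d * d
            if (p % d == 0) return false;     // found a divisor other than 1 and p
        }
        return true;
    }

    public static void main(String[] args) {
        for (int p = 1; p <= 20; p++) {
            if (isPrime(p)) System.out.println(p);
        }
    }
}
```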



Why we study algorithms and data structures

Fundamental recipes of computer science.
Understanding is necessary to use them as building blocks and to
extend them yourself.
We look at their implementation and not only at their use.



Some JAVA preliminaries



The very first JAVA program

public class MyFirstJavaProgram {

    /* This is my first java program.
     * This will print 'Hello World' as the output
     * This is an example of multi-line comments.
     */

    public static void main(String[] args) {
        // This is an example of single line comment
        /* This is also an example of single line comment. */
        System.out.println("Hello World");
    }
}



Variables and numbers

public class MySecondJavaProgram {

    public static void main(String[] args) {
        // for now we put all code here and don't ask why

        int x = 5;
        int y = 6;
        int z = x + y;
        double j = 1.21 + z;

        System.out.println(z);
        System.out.println(j);
    }
}



Conditions

public class MyThirdJavaProgram
{
    public static void main(String[] args)
    {
        // for now we put all code here and don't ask why
        int x = 5;
        if (x < 6)
        {
            System.out.println("Too small");
        }
        else
        {
            System.out.println("Large enough");
        }
    }
}



Repetitions

public class MyFourthJavaProgram
{
    public static void main(String[] args)
    {
        // for now we put all code here and don't ask why
        for (int x = 0; x < 10; x++)
        {
            System.out.println(x);
        }
    }
}



Repetitions (2)

public class MyFifthJavaProgram
{
    public static void main(String[] args)
    {
        // for now we put all code here and don't ask why
        int x = 10;
        while (x > 6)
        {
            x = x - 1; // update x, otherwise the loop never ends
            System.out.println(x);
        }
    }
}



Functions / methods

public class MySixthJavaProgram
{
    public static int sum(int x, int y)
    {
        return x + y;
    }

    public static void main(String[] args)
    {
        int result = sum(1, 2);
        System.out.println(result);
    }
}



Arrays

public class MySeventhJavaProgram
{
    public static void main(String[] args)
    {
        int data[] = new int[3];
        data[0] = 3;
        data[1] = 6;
        data[2] = 9;
        System.out.println(data[0]);
    }
}



Vectors


One of the simplest data structures.
Extends the functionality of the array with various useful methods:
- adding elements
- accessing the elements
- removing all elements
- removing specific elements
- ...



Functionality

getFirst: returns the first element of the vector.


getLast: returns the last element of the vector.
addFirst: adds an element to the front.
addLast: adds an element to the end.
get: gets an element at a given position.
size: returns the nr of elements in the vector.
contains: verifies whether the vector contains a given element or not.
removeFirst: removes the first element.
removeLast: removes the last element.
...



A simple Vector implementation

public class Vector
{
    private int data[];
    private int count;

    // outline only: the method bodies follow on the next slides
    public Vector(int capacity) { }

    public int size() { }

    public int get(int index) { }

    public void addFirst(int item) { }
}



A simple Vector implementation

public class Vector
{
    private int data[];
    private int count;

    public Vector(int capacity)
    {
        data = new int[capacity];
        count = 0;
    }

    public int size()
    {
        return count;
    }

    ...
}



A simple Vector implementation

public class Vector
{
    ...

    public int get(int index)
    {
        return data[index];
    }

    public void addFirst(int item)
    {
        // shift all stored elements one position to the right
        // (there is no capacity check: a full vector causes an exception)
        for (int i = count; i > 0; i--) {
            data[i] = data[i - 1];
        }
        data[0] = item;
        count++;
    }
}



Example for using the Vector

public class TestClass
{
    public static void main(String[] args)
    {
        Vector numbers = new Vector(10);
        numbers.addFirst(1);
        numbers.addFirst(2);
        numbers.addFirst(3);

        System.out.println(numbers.get(0)); // prints 3, the element added last
        System.out.println(numbers.get(1)); // prints 2
    }
}



Binary Search
A Divide and Conquer approach

When the vector is sorted, searching for a given element can be done
more efficiently.
First check whether the element is supposed to be in the first or the
second half of the vector.
This way only half of the vector needs to be searched.
In the relevant half of the vector, check again whether the element
should be in its first or its second half.
...
How to make sure a Vector is sorted?
- Implement addSorted(...) method.
- Remove other add* methods.
- Rename Vector class to SortedVector.
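
One possible shape of such a SortedVector is sketched below. This is only an illustration under assumptions: the slides name addSorted and SortedVector but do not show their code, so the insertion strategy (shifting larger elements to the right, as addFirst does) and the exception thrown when the vector is full are choices made here.

```java
// Illustrative sketch; the actual course implementation may differ.
class SortedVector {
    private int[] data;
    private int count;

    SortedVector(int capacity) {
        data = new int[capacity];
        count = 0;
    }

    int size() { return count; }

    int get(int index) { return data[index]; }

    // Inserts item so that data[0..count-1] stays in ascending order.
    void addSorted(int item) {
        if (count == data.length) {
            throw new IllegalStateException("vector is full");
        }
        int i = count;
        // Shift every element larger than item one position to the right.
        while (i > 0 && data[i - 1] > item) {
            data[i] = data[i - 1];
            i--;
        }
        data[i] = item;
        count++;
    }
}
```

Because addSorted is the only way to add elements, the vector is sorted at all times, which is exactly the precondition binary search needs.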





Binary Search Implementation

public boolean binarySearch(int key)
{
    int start = 0;
    int end = count - 1;
    while (start <= end)
    {
        int middle = (start + end + 1) / 2;
        if (key < data[middle]) end = middle - 1;
        else if (key > data[middle]) start = middle + 1;
        else return true;
    }
    return false;
}



Some Remarks

1 Simple, yet useful. Vector will be the base for building more complex
structures.
2 Implementation is limited to a vector containing integers.
3 We need generic data structures in order to build a reusable tool set.
4 This can be done using Object, Comparable, Generics, ...



Mathematical Induction, recursion and time complexities



What is mathematical induction?

A technique to prove properties of natural numbers P(n) with n ∈ N.


1 Prove P(0) (base case)
2 Prove ∀n ∈ N : P(n) ⇒ P(n + 1) (induction step)



Mathematical Induction

Theorem 2
$\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$

Proof.
1 For n = 1. Trivial
2 (induction step)
$\sum_{i=1}^{n+1} i = (n+1) + \frac{n(n+1)}{2} = \frac{2n+2+n^2+n}{2} = \frac{(n+1)(n+2)}{2}$



Mathematical Induction

Theorem 3
Any positive integer greater than one can be written as a product of
primes.
OR more formally
For any integer n ≥ 2 there exist primes $p_1, \dots, p_m$ such that
$n = p_1 \cdots p_m$



Mathematical Induction

Proof.
1 n is prime. A trivial base case.
2 n is not prime. Hence, it is a product of two numbers, say n1 and n2,
and neither n1 nor n2 is 1.
n1 and n2 are smaller than n, hence the (strong) induction hypothesis
says that n1 and n2 can be written as a product of primes.
Hence, n = n1 n2 is also a product of primes.
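
The structure of this proof translates directly into a recursive procedure, sketched below. The class PrimeFactors and its methods are hypothetical names introduced for this illustration only. The base case returns a prime as its own factorization. Otherwise n is split into n1 and n2 and both are factorized recursively, which mirrors the use of the induction hypothesis.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration of the induction proof as a recursive method.
class PrimeFactors {
    // Returns the smallest divisor d of n with 1 < d < n, or -1 if n is prime.
    static int properDivisor(int n) {
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) return d;
        }
        return -1;
    }

    // Returns primes whose product is n, for n >= 2.
    static List<Integer> factorize(int n) {
        if (n < 2) throw new IllegalArgumentException("n must be at least 2");
        List<Integer> primes = new ArrayList<>();
        int n1 = properDivisor(n);
        if (n1 == -1) {
            primes.add(n);                 // base case: n is prime
        } else {
            int n2 = n / n1;               // n = n1 * n2, both smaller than n
            primes.addAll(factorize(n1));  // induction hypothesis on n1
            primes.addAll(factorize(n2));  // induction hypothesis on n2
        }
        return primes;
    }
}
```

The recursion terminates because n1 and n2 are strictly smaller than n, the same argument that makes the induction valid.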



Remark

Obviously, the base case does not have to be P(0) ...



More formally

Definition 1 (Axiom of induction)


∀P[[P(0) ∧ ∀k ∈ N(P(k) ⇒ P(k + 1))] ⇒ ∀n ∈ N[P(n)]]



Why is mathematical induction mentioned in this course?

It is the foundation for recursion: from mathematical induction to
structural induction.
It is a good technique to prove properties of algorithms.



Towards recursion

Definition 2 (Factorial)
$n! = \prod_{i=0}^{n-1} (n - i)$

Example: 5! = 5 × 4 × 3 × 2 × 1 = 120



An iterative version of factorial

int factorial(int n)
{
    int result = 1; // int, to match the return type
    for (int i = 0; i < n; i++)
    {
        result *= (n - i);
    }
    return result;
}



Another definition of factorial

Definition 3 (Factorial)

$n! = \begin{cases} 1 & \text{if } n = 0 \\ n \times (n-1)! & \text{otherwise} \end{cases}$



Factorial example

5! = 5 × 4!
= 5 × 4 × 3!
= 5 × 4 × 3 × 2!
= 5 × 4 × 3 × 2 × 1!
= 5 × 4 × 3 × 2 × 1 × 0!
= 5 × 4 × 3 × 2 × 1 × 1
= 120



Recursion

Definition 4
A function is recursive if it is defined in terms of itself.



A recursive version of factorial

int factorial(int n)
{
    if (n == 0) return 1;
    else return n * factorial(n - 1);
}



The greatest common divisor

Theorem 4 (gcd)
The greatest common divisor (gcd) of two positive integers is the
largest integer that divides evenly into both of them. (divides evenly =
the remainder of the division is zero)

For example, the greatest common divisor of 102 and 68 is 34 since both
102 and 68 are multiples of 34, but no integer larger than 34 divides
evenly into 102 and 68.



The greatest common divisor

Theorem 5 (Euclid’s algorithm)


If p > q, the gcd of p and q is the same as the gcd of q and p mod q.



The greatest common divisor

public int gcd(int p, int q)
{
    while (q != 0)
    {
        int temp = q;
        q = p % q;
        p = temp;
    }
    return p;
}



The greatest common divisor

public int gcd(int p, int q)
{
    if (q == 0) return p;
    else return gcd(q, p % q);
}



What does this function do?

public int mcCarthy(int n)
{
    if (n > 100)
        return n - 10;
    return mcCarthy(mcCarthy(n + 11));
}



Towards time complexities
An introductory example

Suppose we have an array of numbers and we are checking whether the
array contains a specific element ...

public boolean contains(int obj)
{
    for (int i = 0; i < count; i++)
    {
        if (data[i] == obj) return true;
    }
    return false;
}

How many comparisons do we need to do before we can find the element?



Time Complexity of the contains method
The number of comparisons

If the array has N elements, T(N) is the number of required comparisons.
The best case: T(N) = 1
The worst case: T(N) = N
The average case: T(N) = (1 + 2 + ... + N) / N = (N+1) / 2
On average, searching in an array that is twice as large will take
twice as much time.
We say the contains method is linear in time.
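
The average case can also be checked empirically. The following sketch uses a hypothetical class ContainsCost (not part of the course code) that counts the comparisons made by the same linear search and averages them over one successful search per element. It assumes every stored element is equally likely to be searched for.

```java
// Counts comparisons of a linear search; names are illustrative only.
class ContainsCost {
    // Number of comparisons needed to find obj (data.length if it is absent).
    static int comparisons(int[] data, int obj) {
        int steps = 0;
        for (int i = 0; i < data.length; i++) {
            steps++;
            if (data[i] == obj) return steps;
        }
        return steps;
    }

    // Average number of comparisons over all n successful searches.
    static double averageComparisons(int n) {
        int[] data = new int[n];
        for (int i = 0; i < n; i++) data[i] = i;
        long total = 0;
        for (int key = 0; key < n; key++) total += comparisons(data, key);
        return (double) total / n;
    }

    public static void main(String[] args) {
        System.out.println(averageComparisons(100)); // (100 + 1) / 2 = 50.5
        System.out.println(averageComparisons(200)); // (200 + 1) / 2 = 100.5
    }
}
```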



Time Complexities (CL 3.1: Asymptotic Notation)
Definitions

Definition 5 (big-Oh)
T(n) is O(F(n)) if there are positive constants c and n0 such that
T(n) ≤ cF(n) when n ≥ n0.

Definition 6 (big-Omega)
T(n) is Ω(F(n)) if there are positive constants c and n0 such that
T(n) ≥ cF(n) when n ≥ n0.

Definition 7 (big-Theta)
T (n) is Θ(F (n)) if and only if T (n) is O(F (n)) and T (n) is Ω(F (n)).



Time Complexities
Examples

[Figure: running time versus input size n (from 1 to 10) for O(2^n), O(n^2), O(n log n), O(n) and O(log n)]



Time Complexities
Examples

[Figure: running time versus input size n (from 1 to 10) for O(n log n), O(n) and O(log n)]



Time Complexities
Examples

$T(n) = 3n^2 + 2n + 7 = O(n^2)$ (quadratic).
$T(n) = n + n + n = O(n)$ (linear).
$T(n) = 7 = O(1)$ (constant).
$T(n) = 2^n + 7n^2 + 11 = O(2^n)$ (exponential).
$T(n) = 2 \log(n) + 1 = O(\log(n))$ (logarithmic).
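
As a worked instance of the big-Oh definition, the quadratic example can be verified by exhibiting explicit constants. The choice c = 4 and n0 = 4 below is one possible choice; any larger values work as well.

```latex
\begin{align*}
T(n) &= 3n^2 + 2n + 7 \\
     &\le 3n^2 + n^2 && \text{because } 2n + 7 \le n^2 \text{ for all } n \ge 4 \\
     &= 4n^2 .
\end{align*}
% Hence T(n) \le c\,F(n) with F(n) = n^2, c = 4 and n_0 = 4, so T(n) = O(n^2).
```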



Time Complexities
Logarithmic bases

Theorem 6 (The base doesn't matter)
For any constant B > 1, $\log_B(n) = O(\log(n))$.

Proof.
$T(n) = \log_B(n) = \frac{\log_2(n)}{\log_2(B)} \le c F(n)$ for $c = 1/\log_2(B)$ and $F(n) = \log_2(n)$,
hence $T(n) = O(\log_2(n))$.



Time Complexities for the Vector class

The time complexity of binary search is intuitively clear: O(log(n)).
If at each step the length of the vector is divided by two, then the
length of the vector becomes 1 in log(n) steps.

We should be more explicit and formal.



A generic approach for Divide And Conquer Algorithms
CL 4: Divide and Conquer

Assume the total size of the problem is n.


1 The algorithm splits the problem into a pieces, each of size n/b.
(Please note that a is not necessarily equal to b, as the subproblems
could for instance overlap.)
2 The algorithm solves the problem on each of these a pieces. (this is
the recursion step)
3 The algorithm aggregates the solutions to the a smaller pieces into a
single solution.



Divide And Conquer Algorithms

[Figure: recursion tree in which the problem is split into pieces of size n/b, and so on, down to pieces of size 1]



Time Complexity of Divide And Conquer Algorithms

Theorem 7
The time complexity of divide and conquer algorithms is defined by
the following recurrence relation: T(n) = a T(n/b) + f(n)

where f (n) is the cost of aggregating the a different simpler problems.



Time Complexity of Divide And Conquer Algorithms

We assume that we stop dividing the problem into smaller problems
when the problem reaches size 1.
We assume that n is a power of b.

The consequence of the second assumption is that the subproblems of
size 1 are always obtained in an integer number of division steps
(because the problem size is divided by a factor b in each step).
Hence, both assumptions together result in the fact that the
algorithm always stops the recursion in an integer number of steps.



A theorem for the Time Complexity of Divide And Conquer
Algorithms
CL 4.5: The master theorem for solving recurrences.

Theorem 8
Under the above stated conditions, the solution of the recurrence
relation
$T(n) = a T(n/b) + f(n)$
is given by
$T(n) = \sum_{i=0}^{\log n / \log b} a^i f(n/b^i)$



CL 4.6: Proof of the master theorem for solving
recurrences - more formal.

Proof.
At the start of the algorithm (level 0), there is a single problem of
size n (or put differently $a^0$ pieces of size $n/b^0$ each).
At level 1, the problem is divided into a pieces of size n/b each (or
put differently $a^1$ pieces of size $n/b^1$ each).
Each of these a smaller pieces will be broken down into smaller pieces.
Consequently, at level i, there will be $a^i$ pieces, each of size $n/b^i$.
This process of dividing the problem into smaller problems continues
until the problems reach size 1. But after how many steps is this
exactly?
At the lowest level i, we have that $n/b^i = 1$, so $n = b^i$, so
$i = \log_b n = \log_2 n / \log_2 b$.



[Proof Continued]
Once the problem is broken into a series of problems of size 1, these
smaller problems become trivial to solve (e.g. sorting a list of a single
element or finding an element in a list of a single element).
Now as specified by step 3 of the divide and conquer paradigm, the
solutions to the simpler problems need to be aggregated into a
solution of the actual problem.
At level i, the size of each of the pieces is $n/b^i$ and so the cost of
aggregating these pieces can be generally referred to as $f(n/b^i)$ for
level i.
Putting this all together, the total cost of solving a problem according
to the divide and conquer paradigm is the sum of the costs over all
levels i.
At level i, the cost is the product of the aggregation cost on the
pieces $f(n/b^i)$ with the number of pieces at the i-th level $a^i$.
So, in total we obtain: $T(n) = \sum_{i=0}^{\log n / \log b} a^i f(n/b^i)$.
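
The agreement between the recurrence and the level-by-level sum can be checked numerically. The sketch below is an illustration under the assumptions stated above (n is a power of b, recursion stops at size 1), plus one additional assumption made here: the cost of a problem of size 1 is taken to be f(1), which matches the last term of the sum. The class and method names are hypothetical.

```java
import java.util.function.LongUnaryOperator;

// Compares the recurrence with the closed-form sum; names are illustrative.
class DivideAndConquerCost {
    // Direct evaluation of T(n) = a T(n/b) + f(n), with T(1) = f(1).
    static long recurrence(long n, int a, int b, LongUnaryOperator f) {
        if (n == 1) return f.applyAsLong(1);
        return a * recurrence(n / b, a, b, f) + f.applyAsLong(n);
    }

    // Sum over the levels i = 0 .. log_b(n) of a^i * f(n / b^i).
    static long levelSum(long n, int a, int b, LongUnaryOperator f) {
        long total = 0;
        long pieces = 1;   // a^i, the number of pieces at level i
        long size = n;     // n / b^i, the size of each piece at level i
        while (true) {
            total += pieces * f.applyAsLong(size);
            if (size == 1) break;   // lowest level reached
            pieces *= a;
            size /= b;
        }
        return total;
    }

    public static void main(String[] args) {
        // a = 1, b = 2, f(n) = 1: both lines print log2(1024) + 1 = 11
        System.out.println(recurrence(1024, 1, 2, x -> 1));
        System.out.println(levelSum(1024, 1, 2, x -> 1));
    }
}
```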



Back to Binary Search

Theorem 9
The time complexity of the Binary Search algorithm on a sorted vector
is O(log n).

Proof.
At each level, there is one comparison to be made for deciding whether to
go to the left or to the right and in each step the size of the subproblems
is reduced by a factor of 2.

$T(n) = T(n/2) + 1$ (1)

So, this is the recurrence relation from theorem 7, with a = 1, b = 2 and
f(n) = 1.
According to theorem 8, the solution to this recurrence relation is given by
$T(n) = \sum_{i=0}^{\log n} 1 = \log n + 1 = O(\log n)$
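
The bound log n + 1 can be observed by counting loop iterations. The sketch below copies the loop of binarySearch into a hypothetical class BinarySearchSteps (a name chosen here) and returns the number of iterations instead of a boolean. Searching for a key smaller than every element forces the search to keep halving until nothing is left.

```java
// Instrumented copy of the binary search loop; for illustration only.
class BinarySearchSteps {
    // Returns how many loop iterations the search for key performs.
    static int steps(int[] data, int key) {
        int start = 0;
        int end = data.length - 1;
        int iterations = 0;
        while (start <= end) {
            iterations++;
            int middle = (start + end + 1) / 2;
            if (key < data[middle]) end = middle - 1;
            else if (key > data[middle]) start = middle + 1;
            else return iterations;
        }
        return iterations;
    }

    // Builds the sorted array 0, 1, ..., n - 1.
    static int[] range(int n) {
        int[] data = new int[n];
        for (int i = 0; i < n; i++) data[i] = i;
        return data;
    }

    public static void main(String[] args) {
        System.out.println(steps(range(16), -1));   // log2(16) + 1 = 5
        System.out.println(steps(range(1024), -1)); // log2(1024) + 1 = 11
    }
}
```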



Time Complexities of the other operations

getFirst: O(1)
getLast: O(1)
addFirst: O(n)
addLast: O(1)
get: O(1)
size: O(1)
contains: O(n)
removeFirst: O(n)
removeLast: O(1)
...



Linked Lists

CL 10.2: Linked Lists



Linked List
public class LinkedList
{
    private class ListElement
    {
        private Object el1;
        private ListElement el2;

        public ListElement(Object el, ListElement nextElement)
        {
            el1 = el;
            el2 = nextElement;
        }

        public ListElement(Object el)
        {
            this(el, null);
        }

        public Object first()
        {
            return el1;
        }

        public ListElement rest()
        {
            return el2;
        }

        public void setFirst(Object value)
        {
            el1 = value;
        }

        public void setRest(ListElement value)
        {
            el2 = value;
        }
    }

    private ListElement head;
AddFirst: Adding a new element at the beginning of the
list



AddFirst: 3 crucial steps

public void addFirst(Object o)
{
    ListElement newElement = new ListElement(o); // step 1
    newElement.setRest(head);                    // step 2
    head = newElement;                           // step 3
}



AddFirst: The same 3 steps done at once

public void addFirst(Object o)
{
    head = new ListElement(o, head);
}



Linked List: retrieving the n-th element

public Object get(int n)
{
    ListElement d = head;
    while (n > 0)
    {
        d = d.rest();
        n--;
    }
    return d.el1;
}



Time Complexities
At least in our most naive version

getFirst: O(1)
getLast: O(n)
addFirst: O(1)
addLast: O(n)
get: O(n)
size: O(1)
contains: O(n)
removeFirst: O(1)
removeLast: O(n)
...
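
The contrast between addFirst (O(1)) and addLast (O(n)) in this naive version can be seen in a compact sketch. The class SimpleLinkedList and its Node type are hypothetical stand-ins for the LinkedList and ListElement classes of the slides. The addLast method shown is an assumption about how a naive version without a tail reference would work.

```java
// Minimal singly linked list; a simplified stand-in for the course class.
class SimpleLinkedList {
    private static class Node {
        Object value;
        Node next;
        Node(Object value, Node next) { this.value = value; this.next = next; }
    }

    private Node head;
    private int count;

    // O(1): only the head reference changes.
    void addFirst(Object o) {
        head = new Node(o, head);
        count++;
    }

    // O(n): the whole list is walked to find the last element.
    void addLast(Object o) {
        Node fresh = new Node(o, null);
        if (head == null) {
            head = fresh;
        } else {
            Node d = head;
            while (d.next != null) d = d.next;
            d.next = fresh;
        }
        count++;
    }

    // O(n): follows n links starting at the head.
    Object get(int n) {
        Node d = head;
        while (n > 0) {
            d = d.next;
            n--;
        }
        return d.value;
    }

    // O(1): the count is maintained on every insertion.
    int size() { return count; }
}
```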



Stacks and Queues



Stacks: push
CL 10.1: Stacks and Queues



Stacks: pop



Stacks

Push: New items are stored at the end
Pop: Items are removed from the end



A Stack based on a Vector

public class Stack
{
    private Vector data;

    public Stack()
    {
        data = new Vector();
    }

    public void push(Object o){}
    public Object pop(){}
    public Object top(){}
    public int size(){}
    public boolean empty(){}
}
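
One way the empty method bodies could be filled in is sketched below. To keep the example self-contained it uses java.util.ArrayList as a stand-in for the course's own generic Vector, which is an assumption. The class name ArrayStack and the exception thrown on an empty stack are also choices made here.

```java
import java.util.ArrayList;

// Stack sketch; ArrayList replaces the course's Vector for self-containment.
class ArrayStack {
    private final ArrayList<Object> data = new ArrayList<>();

    // New items are stored at the end.
    void push(Object o) {
        data.add(o);
    }

    // Removes and returns the item at the end.
    Object pop() {
        if (data.isEmpty()) throw new IllegalStateException("stack is empty");
        return data.remove(data.size() - 1);
    }

    // Returns the item at the end without removing it.
    Object top() {
        if (data.isEmpty()) throw new IllegalStateException("stack is empty");
        return data.get(data.size() - 1);
    }

    int size() { return data.size(); }

    boolean empty() { return data.isEmpty(); }
}
```

Both push and pop only touch the end of the underlying storage, which is why no elements need to be shifted.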



Queue

Push: New elements are added at the back.


Pop: Elements are removed from the front of the queue
(Or the other way around)



A Queue based on a vector

public class Queue
{
    private Vector data;

    public Queue()
    {
        data = new Vector();
    }

    public void push(Object o){}
    public Object pop(){}
    public Object top(){}
    public int size(){}
    public boolean empty(){}
}



Time Complexities
At least in our most naive version

Stack (both vector and linked list based)
- Push: O(1)
- Pop: O(1)
Queue (both vector and linked list based)
- Push: O(1)
- Pop: O(n)
Queue (alternative)
- Push: O(n)
- Pop: O(1)



Remark

The queue only implements two basic methods (push and pop), but
their time complexities are very different.
This is the case for both the vector and the linked list based
implementation.
See later for O(1) push and pop methods.



Priority Queue


Similar to the Stack and Queue data structure, the priority queue is a
simple structure supporting the push and pop methods.
The pop method does not return the topmost element in the
structure, but the element with the highest priority.
As a consequence, the push method does not only store the data, but
also the priority.



Priority Queues: Two different approaches

The first approach maintains the data structure sorted on the priority.
The push method needs to insert the element at the right position in an
already sorted data structure, to ensure that the structure remains sorted
(O(n)). As the element with the highest priority is always at the first
/ last position, the pop remains very simple (O(1)).
The second approach implements the opposite logic: the push
operation is very simple (as in the Stack and Queue implementation it
just stores the new data at the end or at the beginning), at the price
of a more complex pop operation. As the data is now stored unsorted,
the pop operation must check the entire data structure to identify the
element with the highest priority. In this approach, the push operation
has a time complexity of O(1), while the pop operation has a time
complexity of O(n).
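
The second approach can be sketched as follows. The class UnsortedPriorityQueue is a hypothetical name, java.util.ArrayList stands in for the course's own data structures, and the convention that a larger number means a higher priority is an assumption made for this illustration.

```java
import java.util.ArrayList;

// Unsorted priority queue: cheap push, linear-time pop. Illustration only.
class UnsortedPriorityQueue {
    private final ArrayList<Object> elements = new ArrayList<>();
    private final ArrayList<Integer> priorities = new ArrayList<>();

    // O(1) amortized: the pair is simply appended at the end.
    void push(Object o, int priority) {
        elements.add(o);
        priorities.add(priority);
    }

    // O(n): the whole structure is scanned for the highest priority.
    Object pop() {
        if (elements.isEmpty()) throw new IllegalStateException("queue is empty");
        int best = 0;
        for (int i = 1; i < priorities.size(); i++) {
            if (priorities.get(i) > priorities.get(best)) best = i;
        }
        priorities.remove(best);       // best is an int, so this removes by index
        return elements.remove(best);
    }

    boolean empty() { return elements.isEmpty(); }
}
```

Swapping the roles (sorted insertion in push, removal at one end in pop) gives the first approach, with the complexities exchanged.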



Remark

This data structure only implements two basic methods (push and
pop), but their time complexities are very different.
This is the case for both described approaches.
This is the case for both the vector and the linked list based
implementation.
See later for O(log n) push and pop methods.



A sorted Linked List



Adding elements in a sorted linked list
Basic Idea

// THIS CODE IS NOT CORRECT JAVA CODE
// BUT IT ILLUSTRATES THE LOGIC OF THE ALGORITHM

public void addSorted(Object o)
{
    // an empty list, add element in front
    if (head == null) head = new ListElement(o, null);
    else if (o < head.first())
    {
        // we have to add the element in front
        head = new ListElement(o, head);
    }
    else
    {
        // we have to find the first element which is bigger
        ListElement d = head;
        while ((d.rest() != null) && (o > d.rest().first()))
        {
            d = d.rest();
        }
        ListElement next = d.rest();
        d.setRest(new ListElement(o, next));
    }
    count++;
}



Adding elements in a sorted linked list
More correct

public void addSorted(Comparable o)
{
    // an empty list, add element in front
    if (head == null) head = new ListElement(o, null);
    // first() returns an Object, so a cast to Comparable is needed
    else if (((Comparable) head.first()).compareTo(o) > 0)
    {
        // we have to add the element in front
        head = new ListElement(o, head);
    }
    else
    {
        // we have to find the first element which is bigger
        ListElement d = head;
        while ((d.rest() != null) &&
               (((Comparable) d.rest().first()).compareTo(o) < 0))
        {
            d = d.rest();
        }
        ListElement next = d.rest();
        d.setRest(new ListElement(o, next));
    }
    count++;
}

Lubos Omelina, Bart Jansen Algo Data 2023-2024 105 / 259


import java.util.Comparator;

public class PriorityQueue
{
    private class PriorityPair implements Comparable
    {
        public Object element;
        public Object priority;

        public PriorityPair(Object element, Object priority)
        {
            this.element = element;
            this.priority = priority;
        }

        public int compareTo(Object o)
        {
            PriorityPair p2 = (PriorityPair) o;
            return ((Comparable) priority).compareTo(p2.priority);
        }
    }

    private LinkedList data;

    public PriorityQueue()
    {
        data = new LinkedList();
    }

    public void push(Object o, int priority)
    {
        // make a pair of o and priority
        // add this pair to the sorted linked list.
    }

    public Object pop()
    {
        // add your code here
    }
}


Circular Vectors



Circular Vector
Logical versus physical layout



AddFirst of 1 and 0



Operations

addFirst
  - Add the new element at position "start - 1".
  - Unless start is 0: then we need to add at the end of the vector, i.e. at
    "capacity - 1".
  - start = (start + capacity - 1) % capacity
addLast
  - Add the new element at position "start + count".
  - When start + count ≥ capacity, then we need to add at the beginning
    of the vector.
  - Store the element at (start + count) % capacity
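
The index arithmetic above can be sketched as follows. This is only an illustration under assumptions: the class name RingSketch, the helper physicalStart and the choice to throw an exception on a full vector are not part of the course code.

```java
// Hypothetical sketch of the circular index arithmetic; names are illustrative.
class RingSketch {
    private final Object[] data;
    private int start = 0;   // physical index of the logical first element
    private int count = 0;   // number of stored elements

    RingSketch(int capacity) { data = new Object[capacity]; }

    void addFirst(Object element) {
        if (count == data.length) throw new IllegalStateException("full");
        // step one position back, wrapping from 0 to capacity - 1
        start = (start + data.length - 1) % data.length;
        data[start] = element;
        count++;
    }

    void addLast(Object element) {
        if (count == data.length) throw new IllegalStateException("full");
        // first free position after the last element, wrapping around
        data[(start + count) % data.length] = element;
        count++;
    }

    // Translate a logical index into a physical one.
    Object get(int logicalIndex) {
        return data[(start + logicalIndex) % data.length];
    }

    int physicalStart() { return start; }
}
```

With capacity 4, calling addLast(2), addFirst(1), addFirst(0) moves the physical start from 0 to 3 and then to 2, while the logical order stays 0, 1, 2.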



Implementation
public class CircularVector
{
    private Object data[];
    private int first;
    private int count;
    private int capacity;

    public CircularVector(int capacity)
    {
        count = 0;
        first = 0;
        data = new Object[capacity];
        this.capacity = capacity;
    }

    public int size()
    {
        return count;
    }

    public void addFirst(Object element){}
    public void addLast(Object element){}
    public Object getFirst(){}
    public Object getLast(){}
    public void removeFirst(){}
    public void removeLast(){}
    public void print(){}
}


Implementation

public void removeFirst()
{
    if (count > 0)
    {
        first = (first + 1) % capacity;
        count--;
    }
}

public void print()
{
    System.out.print("[");
    for (int i = 0; i < count; i++)
    {
        // capacity is a field, not a method
        int index = (first + i) % capacity;
        System.out.print(data[index] + " ");
    }
    System.out.println("]");
}


Time Complexities - Circular Vectors

addFirst: O(1)
addLast: O(1)
removeFirst: O(1)
removeLast: O(1)



Remember ...

public class Queue
{
    private Vector data;

    public Queue()
    {
        data = new Vector();
    }

    public void push(Object o){}
    public Object pop(){}
    public Object top(){}
    public int size(){}
    public boolean empty(){}
}


... and compare with ...

public class Queue
{
    private CircularVector data;

    public Queue()
    {
        data = new CircularVector();
    }

    public void push(Object o){}
    public Object pop(){}
    public Object top(){}
    public int size(){}
    public boolean empty(){}
}


Double Linked Lists



Double Linked List

Adding and removing elements at the end of a linked list is not
efficient.
By keeping a pointer to the last element, adding an element can be
done in constant time.
To make removal more efficient as well, we need a pointer to the
second-to-last element,
AND we need to be able to update it to the previous element.
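
A minimal sketch of why a pointer to the previous element makes removal at the end a constant-time operation. The class TinyDoubleList and its int payload are assumptions made for the example; it is not the DoubleLinkedList class of the course.

```java
// Hypothetical sketch: a doubly linked list of ints with O(1) removeLast.
class TinyDoubleList {
    private static class Node {
        int value;
        Node next, previous;
        Node(int value) { this.value = value; }
    }

    private Node head, tail;
    private int count = 0;

    void addLast(int value) {
        Node n = new Node(value);
        if (tail == null) { head = n; }
        else { tail.next = n; n.previous = tail; }
        tail = n;
        count++;
    }

    // O(1): the previous pointer gives the new tail without a traversal.
    int removeLast() {
        if (tail == null) throw new IllegalStateException("empty list");
        int value = tail.value;
        tail = tail.previous;
        if (tail == null) head = null;   // the list became empty
        else tail.next = null;           // detach the old last element
        count--;
        return value;
    }

    int size() { return count; }
}
```

In a singly linked list the same operation would have to walk from the head to find the second-to-last element, which costs O(n).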



Double Linked List

We have a linked list which we traverse from left to right.
Simultaneously, we store the same data in a linked list we traverse
from right to left.
Code should make sure both lists are kept up-to-date in an efficient
manner.




Double Linked List

public class DoubleLinkedList
{
    private class DoubleLinkedListElement {...}

    private int count;
    private DoubleLinkedListElement head;
    private DoubleLinkedListElement tail;

    public DoubleLinkedList()
    {
        head = null;
        tail = null;
        count = 0;
    }

    public Object getFirst() {...}
    public Object getLast() {...}
    public int size() {...}
    public void addFirst(Object value) {...}
    public void addLast(Object value) {...}
}


Double Linked List
private class DoubleLinkedListElement
{
    private Object data;
    private DoubleLinkedListElement nextElement;
    private DoubleLinkedListElement previousElement;

    public DoubleLinkedListElement(Object v, DoubleLinkedListElement next,
                                   DoubleLinkedListElement previous)
    {
        data = v;
        nextElement = next;
        previousElement = previous;
        // link the neighbours back to this new element
        if (nextElement != null) nextElement.previousElement = this;
        if (previousElement != null) previousElement.nextElement = this;
    }
    public DoubleLinkedListElement(Object v)
    {
        this(v, null, null);
    }
    public DoubleLinkedListElement previous()
    {
        return previousElement;
    }
    public Object value()
    {
        return data;
    }
    public DoubleLinkedListElement next()
    {
        return nextElement;
    }
    public void setNext(DoubleLinkedListElement value)
    {
        nextElement = value;
    }
}


Double Linked List
public Object getFirst()
{
    return head.value();
}
public Object getLast()
{
    return tail.value();
}
public int size()
{
    return count;
}
public void addFirst(Object value)
{
    head = new DoubleLinkedListElement(value, head, null);
    if (tail == null) tail = head;
    count++;
}
public void addLast(Object value)
{
    tail = new DoubleLinkedListElement(value, null, tail);
    if (head == null) head = tail;
    count++;
}
public void print()
{
    DoubleLinkedListElement d = head;
    System.out.print("(");
    while (d != null)
    {
        System.out.print(d.value() + " ");
        d = d.next();
    }
    System.out.println(")");
}


Binary Search Trees



Binary Search Tree

A Binary Search Tree is a binary tree in which for each node K all the
nodes in the left subtree of K are smaller than the node K and all the
nodes in the right subtree of K are larger than K .
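
The definition can be turned into a small checker. The sketch below is an assumption-laden illustration (class BstCheck, int keys, no duplicates) and is independent of the Tree class used in the course.

```java
// Hypothetical sketch: verify the binary search tree property recursively.
class BstCheck {
    static class Node {
        int key;
        Node left, right;
        Node(int key, Node left, Node right) {
            this.key = key; this.left = left; this.right = right;
        }
    }

    // Every key in the subtree must lie strictly between low and high
    // (null means "no bound on this side").
    static boolean isBst(Node n, Integer low, Integer high) {
        if (n == null) return true;
        if (low != null && n.key <= low) return false;
        if (high != null && n.key >= high) return false;
        return isBst(n.left, low, n.key) && isBst(n.right, n.key, high);
    }

    static boolean isBst(Node root) { return isBst(root, null, null); }
}
```

Note that comparing each node only with its direct children is not enough: a key 8 placed in the left subtree of 7 violates the definition even if it sits correctly relative to its own parent 4.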



Binary Search Tree
            100
           /   \
          7     200
        /   \
       4     11
      /     /  \
     1     9    12
          / \
         8   10
          \
           8,5


Binary Search Tree
import java.util.Comparator;

public class Tree
{
    private class TreeNode implements Comparable
    {
        protected Comparable value;
        protected TreeNode leftNode;
        protected TreeNode rightNode;
        protected TreeNode parentNode;

        public TreeNode(Comparable v, TreeNode left, TreeNode right, TreeNode parent)
        {
            value = v;
            leftNode = left;
            rightNode = right;
            parentNode = parent;
        }
        public TreeNode(Comparable v)
        {
            this(v, null, null, null);
        }

        public TreeNode getLeftTree() { return leftNode; }
        public TreeNode getRightTree() { return rightNode; }
        public TreeNode getParent() { return parentNode; }
        public Comparable getValue() { return value; }

        public int compareTo(Object arg0)
        {
            TreeNode node2 = (TreeNode) arg0;
            return value.compareTo(node2.value);
        }
    }


Binary Search Tree
public class TreeNode implements Comparable
{
    protected Comparable value;
    protected TreeNode leftNode;
    protected TreeNode rightNode;
    protected TreeNode parentNode;

    public TreeNode(Comparable v, TreeNode left, TreeNode right, TreeNode parent)
    {
        value = v;
        leftNode = left;
        rightNode = right;
        parentNode = parent;
    }
    public TreeNode(Comparable v)
    {
        this(v, null, null, null);
    }

    public TreeNode getLeftTree() { return leftNode; }
    public TreeNode getRightTree() { return rightNode; }
    public TreeNode getParent() { return parentNode; }
    public Comparable getValue() { return value; }

    public int compareTo(Object arg0)
    {
        TreeNode node2 = (TreeNode) arg0;
        return value.compareTo(node2.value);
    }
}


Searching for given element in the binary search tree

The tree is empty, so the element is not present in the tree.
The root of the tree equals the element, so we have found the
element in the tree.
The root of the tree is bigger than the element. This means that if
the element is part of the tree, it should be in the left subtree. So, we
should continue the search in the left subtree.
The root of the tree is smaller than the element. This means that if
the element is part of the tree, it should be in the right subtree. So,
we should continue the search in the right subtree.


Binary Search Tree

private boolean findNode(Comparable element,
                         TreeNode current)
{
    if (current == null) return false;
    else if (element.compareTo(current.value) == 0)
        return true;
    else if (element.compareTo(current.value) < 0)
    {
        return findNode(element, current.getLeftTree());
    }
    else return findNode(element, current.getRightTree());
}

public boolean find(Comparable element)
{
    return findNode(element, root);
}


Binary Search Tree
public void insert(Comparable element)
{
    insertAtNode(element, root, null);
}

private void insertAtNode(Comparable element,
                          TreeNode current,
                          TreeNode parent)
{
    // if the node we check is empty
    if (current == null)
    {
        TreeNode newNode = new TreeNode(element);
        // the current node is empty, but we have a parent
        if (parent != null)
        {
            if (element.compareTo(parent.value) < 0) parent.leftNode = newNode;
            else parent.rightNode = newNode;
        }
        else root = newNode;
        newNode.parentNode = parent;
    }
    else if (element.compareTo(current.value) == 0)
    {
        // we do not care
    }
    // if the element is smaller than current, go left
    else if (element.compareTo(current.value) < 0)
    {
        insertAtNode(element, current.getLeftTree(), current);
    }
    // if the element is bigger than current, go right
    else insertAtNode(element, current.getRightTree(),
                      current);
}


Binary Search Tree: Deleting 7
            100
           /   \
          7     200
        /   \
       4     11
      /     /  \
     1     9    12
          / \
         8   10
          \
           8,5




Binary Search Tree: Deleting 7

            100
           /   \
          8     200
        /   \
       4     11
      /     /  \
     1     9    12
          / \
       8,5   10




Binary Search Tree

Deleting a node z (with parent q) from a binary search tree:
(a) z has no left child: z is replaced by its right child r (which may be NIL).
(b) z has a left child l but no right child: z is replaced by l.
(c) z has two children and its successor y is its right child: y replaces z
    and keeps its own right child x; l becomes the left child of y.
(d) z has two children and its successor y lies deeper in the right subtree r:
    y is first replaced by its own right child x, then y replaces z and
    takes over l and r as its children.


Finding the smallest element in a subtree

public TreeNode minNode(TreeNode current)
{
    if (current == null) return null;
    else if (current.getLeftTree() == null) return current;
    else return minNode(current.getLeftTree());
}


Replacing a subtree with another subtree

private void transplant(TreeNode oldNode, TreeNode newNode)
{
    if (oldNode.parentNode == null) root = newNode;
    else if (oldNode.parentNode.getLeftTree() == oldNode)
    {
        oldNode.parentNode.leftNode = newNode;
    }
    else oldNode.parentNode.rightNode = newNode;
    if (newNode != null) newNode.parentNode = oldNode.parentNode;
}


Remove an element from a binary search tree
public void remove(Comparable element)
{
    removeNode(element, root);
}

private void removeNode(Comparable element, TreeNode current)
{
    if (current == null) return;
    else if (element.compareTo(current.value) == 0)
    {
        if (current.leftNode == null) transplant(current, current.rightNode);
        else if (current.rightNode == null) transplant(current, current.leftNode);
        else
        {
            TreeNode y = minNode(current.rightNode);
            if (y.parentNode != current)
            {
                transplant(y, y.rightNode);
                y.rightNode = current.rightNode;
                y.rightNode.parentNode = y;
            }
            transplant(current, y);
            y.leftNode = current.leftNode;
            y.leftNode.parentNode = y;
        }
    }
    else if (element.compareTo(current.value) < 0)
    {
        removeNode(element, current.getLeftTree());
    }
    else removeNode(element, current.getRightTree());
}


Time Complexities

For a tree of depth H, findNode, insertAtNode and removeNode are
performed in O(H).
If we assume that insertions are purely random, then the depth of a
binary search tree will be O(log n) on average.
Although this is intuitively clear, it is far from obvious to prove.
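
The effect of the insertion order on the depth can be observed with a small experiment. The sketch below is an illustration only: the class BstDepthDemo, the int keys and the fixed random seed are assumptions, and the printed depth for the shuffled case depends on the seed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: compare the depth of a BST built from sorted
// keys with one built from the same keys in random order.
class BstDepthDemo {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Iterative insert (duplicates are ignored); returns the root.
    static Node insert(Node root, int key) {
        if (root == null) return new Node(key);
        Node current = root;
        while (true) {
            if (key == current.key) return root;
            if (key < current.key) {
                if (current.left == null) { current.left = new Node(key); return root; }
                current = current.left;
            } else {
                if (current.right == null) { current.right = new Node(key); return root; }
                current = current.right;
            }
        }
    }

    // Number of nodes on the longest path from n down to a leaf.
    static int depth(Node n) {
        return n == null ? 0 : 1 + Math.max(depth(n.left), depth(n.right));
    }

    static int depthAfterInserting(List<Integer> keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return depth(root);
    }

    public static void main(String[] args) {
        List<Integer> keys = new ArrayList<>();
        for (int i = 0; i < 1000; i++) keys.add(i);
        System.out.println("sorted:   " + depthAfterInserting(keys));
        Collections.shuffle(keys, new Random(42));
        System.out.println("shuffled: " + depthAfterInserting(keys));
    }
}
```

Inserting the keys in sorted order degenerates the tree into a list, so H equals n; a shuffled order typically gives a depth that is a small multiple of log2(n).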



Tree Traversals, state space search



How to traverse a (binary search) tree

A tree traversal is a method that visits each element in the tree and
performs some operation on the element.
Print all elements
Call a method on each element
Double each node value if it is a number
...


A recursive method to print the tree

private void printNode(TreeNode n)
{
    if (n != null)
    {
        System.out.println(n.value);
        printNode(n.leftNode);
        printNode(n.rightNode);
    }
}

public void print2()
{
    printNode(root);
}


The same, but using a stack to handle the recursion

public void recPrint()
{
    Stack t = new Stack();
    t.push(root);
    while (!t.empty())
    {
        TreeNode n = (TreeNode) t.pop();

        System.out.println(n.value);

        if (n.getRightTree() != null) t.push(n.getRightTree());
        if (n.getLeftTree() != null) t.push(n.getLeftTree());
    }
}


From depth first to breadth first

public void recPrint()
{
    Queue t = new Queue();
    t.push(root);
    while (!t.empty())
    {
        TreeNode n = (TreeNode) t.pop();

        System.out.println(n.value);

        if (n.getRightTree() != null) t.push(n.getRightTree());
        if (n.getLeftTree() != null) t.push(n.getLeftTree());
    }
}
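
To see that only the agenda type changes between the two traversals, the following sketch runs the same loop with a stack and with a queue. It is an illustration under assumptions: it uses java.util.ArrayDeque instead of the course's own Stack and Queue classes, and the class TraversalOrderDemo is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: same loop, stack versus queue as the agenda.
class TraversalOrderDemo {
    static class Node {
        int value;
        Node left, right;
        Node(int value, Node left, Node right) {
            this.value = value; this.left = left; this.right = right;
        }
    }

    // useStack = true gives depth first, false gives breadth first.
    static List<Integer> traverse(Node root, boolean useStack) {
        List<Integer> visited = new ArrayList<>();
        if (root == null) return visited;
        ArrayDeque<Node> agenda = new ArrayDeque<>();
        agenda.addLast(root);
        while (!agenda.isEmpty()) {
            // a stack takes the newest entry, a queue the oldest
            Node n = useStack ? agenda.removeLast() : agenda.removeFirst();
            visited.add(n.value);
            if (useStack) {
                // push right first so that left is popped first
                if (n.right != null) agenda.addLast(n.right);
                if (n.left != null) agenda.addLast(n.left);
            } else {
                if (n.left != null) agenda.addLast(n.left);
                if (n.right != null) agenda.addLast(n.right);
            }
        }
        return visited;
    }
}
```

For the tree with root 4, children 2 and 6, and leaves 1, 3 and 5, the stack version visits 4, 2, 1, 3, 6, 5 and the queue version visits 4, 2, 6, 1, 3, 5.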



What if we want to do something else than printing?

public void traverse(TreeAction action)
{
    Queue t = new Queue();
    t.push(root);
    while (!t.empty())
    {
        TreeNode n = (TreeNode) t.pop();
        action.run(n);
        if (n.getLeftTree() != null) t.push(n.getLeftTree());
        if (n.getRightTree() != null) t.push(n.getRightTree());
    }
}


The TreeAction class

public abstract class TreeAction
{
    public abstract void run(TreeNode n);
}

Subclass from this one for each operation you want to perform.


Java allows anonymous classes to handle this

public void print()
{
    traverse(new TreeAction()
    {
        public void run(TreeNode n)
        {
            // put your code here
        }
    });
}


State space Search



State Space Search

As long as the agenda is not empty:
The current state is the first one in the agenda.
If the current state is the goal, we have found the goal.
Otherwise, we generate all successor states, and add them to the
agenda.
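
The loop described above can be sketched on a toy state space. Everything in this example is an assumption made for illustration: states are integers, the successors of s are s + 1 and s * 2, the agenda is a queue (breadth first), and a limit keeps the state space finite.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: the agenda loop on a toy state space.
// States are integers; the successors of s are s + 1 and s * 2.
class AgendaSearchDemo {
    // Returns true if goal is reachable from start without exceeding limit.
    static boolean search(int start, int goal, int limit) {
        ArrayDeque<Integer> agenda = new ArrayDeque<>();
        Set<Integer> visited = new HashSet<>();
        agenda.addLast(start);
        visited.add(start);
        while (!agenda.isEmpty()) {
            int current = agenda.removeFirst();   // first state in the agenda
            if (current == goal) return true;     // goal test
            int[] successors = { current + 1, current * 2 };
            for (int next : successors) {
                // visited.add returns false if next was seen before
                if (next <= limit && visited.add(next)) agenda.addLast(next);
            }
        }
        return false;
    }
}
```

Replacing removeFirst by removeLast turns the queue into a stack and the search into depth first, without touching the rest of the loop.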



Types of search algorithms (see the Artificial Intelligence
course)

Uninformed search (blind search)
  - Depth first search: the agenda is a stack.
  - Breadth first search: the agenda is a queue.
Informed search (heuristic search)
  - A*: the agenda is a priority queue.


State Space Search, an example

Can we place 8 queens on a chessboard, such that no queen can take
any other queen?
Or in general, can we place n queens on an n-by-n board?
The board is the state.
A successor is a state with one more queen added to it.
The goal is reached when there are n queens.


The eight queens problem

[Figure: a chessboard with queens placed on it.]


Example: The n-queens problem
public class NQueens {

    public NQueens(int n)
    {
        LinkedList visitedList = new LinkedList();
        ChessBoard state = new ChessBoard(n);
        Stack todoList = new Stack();
        todoList.push(state);
        while (!todoList.empty())
        {
            ChessBoard currentState = (ChessBoard) todoList.pop();
            visitedList.addFirst(currentState);
            if (currentState.goalReached())
            {
                System.out.println("GOAL REACHED");
                currentState.print();
                return;
            }
            else
            {
                LinkedList successorStates = currentState.successors();
                for (int i = 0; i < successorStates.size(); i++)
                {
                    ChessBoard next = (ChessBoard) successorStates.get(i);
                    if (!visitedList.find(next)) todoList.push(next);
                }
            }
        }
    }
}


Example: The n-queens problem

public class ChessBoard implements Comparable {

    private int size;
    private boolean board[][];
    private int count = 0;

    public ChessBoard(int n){}

    public ChessBoard(ChessBoard anotherBoard){}

    public boolean get(int i, int j){}

    public int getCount(){}

    public void set(int i, int j, boolean value){}

    public LinkedList successors(){}

    public boolean goalReached(){}

    public void print(){}

    private boolean canAdd(int x, int y){}

    public int compareTo(Object arg0) {}
}


Example: The n-queens problem

public ChessBoard(int n)
{
    size = n;
    board = new boolean[n][n];
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            board[i][j] = false;
}

public ChessBoard(ChessBoard anotherBoard)
{
    size = anotherBoard.size;
    board = new boolean[anotherBoard.size][anotherBoard.size];
    for (int i = 0; i < size; i++)
        for (int j = 0; j < size; j++)
            board[i][j] = anotherBoard.get(i, j);
    count = anotherBoard.count;
}


Example: The n-queens problem

public boolean get(int i, int j)
{
    return board[i][j];
}

public int getCount()
{
    return count;
}

public void set(int i, int j, boolean value)
{
    if (board[i][j] && !value) count--;
    else if (!board[i][j] && value) count++;
    board[i][j] = value;
}


Example: The n-queens problem

public boolean goalReached()
{
    return count == size;
}

public void print()
{
    for (int i = 0; i < size; i++)
    {
        for (int j = 0; j < size; j++)
        {
            System.out.print(board[i][j]);
            System.out.print(" ");
        }
        System.out.println();
    }
}


Example: The n-queens problem

public int compareTo(Object arg0) {
    ChessBoard another = (ChessBoard) arg0;
    for (int i = 0; i < size; i++)
    {
        for (int j = 0; j < size; j++)
        {
            if (board[i][j] != another.board[i][j]) return 1;
        }
    }
    return 0;
}


Example: The n-queens problem

public LinkedList successors()
{
    LinkedList result = new LinkedList();
    for (int i = 0; i < size; i++)
    {
        for (int j = 0; j < size; j++)
        {
            if (canAdd(i, j))
            {
                ChessBoard child = new ChessBoard(this);
                child.set(i, j, true);
                result.addFirst(child);
            }
        }
    }
    return result;
}




Example: The n-queens problem
public boolean canAdd(int x, int y)
{
    if (board[x][y]) return false;
    else
    {
        // if there is already a queen on the same row, return false
        for (int col = 0; col < size; col++)
            if (board[x][col]) return false;
        // if there is already a queen on the same col, return false
        for (int row = 0; row < size; row++)
            if (board[row][y]) return false;
        // if there is already a queen on any of the two diagonals, return false
        int r = x;
        int c = y;
        while ((r < size) && (c < size))
            if (board[r++][c++]) return false;
        r = x - 1; c = y + 1;
        while ((r >= 0) && (c < size))
            if (board[r--][c++]) return false;
        r = x + 1; c = y - 1;
        while ((r < size) && (c >= 0))
            if (board[r++][c--]) return false;
        r = x - 1; c = y - 1;
        while ((r >= 0) && (c >= 0))
            if (board[r--][c--]) return false;
        // else return true
        return true;
    }
}


Some Remarks

It works!
But we are a little bit naive in our implementation:
  - Finding a state in the visited list takes time linear in the number of
    visited states.
  - The board is 2D; comparing two n-by-n boards requires up to n^2
    comparisons, O(n^2).
  - To generate a successor, the board is copied to a new board (O(n^2)).
  - Generating the successors of a board also requires quite some checks along
    the horizontal, vertical and diagonal lines.
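
One possible way to reduce these costs, shown here only as a sketch and not as the course solution, is to place exactly one queen per row and store the board as a one-dimensional array of column indices. The class CompactQueens and its method names are assumptions made for the example.

```java
// Hypothetical sketch: n-queens with one queen per row, stored as an
// int array (queenColumn[row] = column), so no 2D board is copied.
class CompactQueens {
    // Can a queen go on (row, col) given the queens in rows 0..row-1?
    static boolean canAdd(int[] queenColumn, int row, int col) {
        for (int r = 0; r < row; r++) {
            int c = queenColumn[r];
            if (c == col) return false;                     // same column
            if (Math.abs(c - col) == row - r) return false; // same diagonal
        }
        return true;
    }

    // Depth-first search; returns the number of complete placements.
    static int countSolutions(int[] queenColumn, int row) {
        int n = queenColumn.length;
        if (row == n) return 1;
        int total = 0;
        for (int col = 0; col < n; col++) {
            if (canAdd(queenColumn, row, col)) {
                queenColumn[row] = col;
                total += countSolutions(queenColumn, row + 1);
            }
        }
        return total;
    }

    static int countSolutions(int n) {
        return countSolutions(new int[n], 0);
    }
}
```

Because queens are added row by row, the same board can never be reached along two different paths, so no visited list is needed, and each conflict check costs O(n) instead of a scan over rows, columns and four diagonals of a 2D array.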



More advanced binary search trees



Two approaches for improving retrieval efficiency

Guarantee that those elements that are accessed often are close to
the root.
  - Keep the BST as is; when accessing elements, move them up.
  - Example: Splay Trees
Use one of various approaches to guarantee that the tree is more or less
balanced.
  - If all paths have equal length, the length of the path H is log2(n).
  - Example: Red-Black Trees


Towards Splay Trees: Left and Right Rotations

        y                          x
       / \     Right rotation     / \
      x   C    ------------->    A   y
     / \       <-------------       / \
    A   B      Left rotation       B   C


Rotations

private void rotateLeft(TreeNode x)
{
    TreeNode y = x.rightNode;
    x.rightNode = y.leftNode;
    if (y.leftNode != null) y.leftNode.parentNode = x;
    y.parentNode = x.parentNode;
    if (x.parentNode == null) root = y;
    else if (x == x.parentNode.leftNode) x.parentNode.leftNode = y;
    else x.parentNode.rightNode = y;
    y.leftNode = x;
    x.parentNode = y;
}


Rotations

private void rotateRight(TreeNode y)
{
    TreeNode x = y.leftNode;
    y.leftNode = x.rightNode;
    if (x.rightNode != null) x.rightNode.parentNode = y;
    x.parentNode = y.parentNode;
    if (y.parentNode == null) root = x;
    else if (y == y.parentNode.rightNode) y.parentNode.rightNode = x;
    else y.parentNode.leftNode = x;
    x.rightNode = y;
    y.parentNode = x;
}


Splay Trees

Upon access of the node x, left and right rotations are used to move
x up to the root.
If x is accessed in the near future, its access will be more efficient.
Each rotation maintains the BST properties and can be done in
constant time.
Each rotation moves the node x one level up in the tree. If the tree
has a depth H, then the extra cost of splaying on each access is
O(H), which is O(log2(n)) as long as the tree is balanced.


Example 1: splaying the x

[Figure (a): x is the left child of p and p is the left child of g. Rotating g
to the right and then p to the right moves x to the top.]


Example 2: splaying the x

[Figure (b): x is the right child of p and p is the left child of g. Rotating p
to the left and then g to the right moves x to the top, with p and g as its
children.]


Case 0: x has a parent p but no grandparent

x is right child: rotate p to the left
x is left child: rotate p to the right


Case 1: p is left child of g

x is left child: rotate g to the right, then rotate p to the right
x is right child: rotate p to the left, then rotate g to the right


Case 2: p is right child of g

x is right child: rotate g to the left, then rotate p to the left
x is left child: rotate p to the right, then rotate g to the left


Splay Trees

public class SplayTree extends BinarySearchTree
{
    public Comparable find(Comparable element)
    {
        TreeNode current = findNode(element, root);
        if (current != null) splay(current);
        return current;
    }
    private void rotateLeft(TreeNode x){}
    private void rotateRight(TreeNode y){}
    private void splay(TreeNode splayNode){}
}


private void splay(TreeNode splayNode)
{
    TreeNode parent, grandParent;
    while ((parent = splayNode.parentNode) != null)
    {
        if ((grandParent = parent.parentNode) == null)
        {
            if (splayNode.parentNode.leftNode == splayNode) rotateRight(parent);
            else rotateLeft(parent);
        }
        else
        {
            if (parent.parentNode.leftNode == parent)
            {
                if (splayNode.parentNode.leftNode == splayNode)
                {
                    rotateRight(grandParent); rotateRight(parent);
                }
                else
                {
                    rotateLeft(parent); rotateRight(grandParent);
                }
            }
            else
            {
                if (splayNode.parentNode.rightNode == splayNode)
                {
                    rotateLeft(grandParent); rotateLeft(parent);
                }
                else
                {
                    rotateRight(parent); rotateLeft(grandParent);
                }
            }
        }
    }
}


Splay Trees

In a splay tree we do not care about balancing the tree.
We only make the access to recently accessed elements faster.
Actually, a perfectly balanced tree can be converted into a highly
degenerate tree.


Red-Black Trees

Definition 8 (Red-Black Tree)


A red-black tree is a binary search tree that satisfies the following
red-black properties:
1 Every node is either red or black.
2 The root is black.
3 Every leaf (NIL) is black.
4 If a node is red, then both children are black.
5 For each node, all simple paths from the node to descendant
leaves contain the same number of black nodes.



Red-Black Trees: Ensuring that the tree is well balanced

Definition 9 (black-height)
The black-height of a node x, written bh(x), in a red-black tree is the
number of black nodes on any simple path from, but not including, x down to
a leaf.
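
A small sketch of how the black-height could be computed while checking properties 4 and 5 is shown below. It assumes a hypothetical immutable `Node` type with int keys, in which a `null` child plays the role of a black NIL leaf. It does not check the binary search tree ordering.

```java
// Hypothetical checker; null children stand for the black NIL leaves.
final class RedBlackCheck {
    static final class Node {
        final int key;
        final boolean red;
        final Node left, right;
        Node(int key, boolean red, Node left, Node right) {
            this.key = key;
            this.red = red;
            this.left = left;
            this.right = right;
        }
    }

    static boolean isRed(Node n) {
        return n != null && n.red;
    }

    // Returns bh(x): the number of black nodes below x (x excluded, NIL included),
    // or -1 if property 4 or property 5 is violated in the subtree rooted at x.
    static int blackHeight(Node x) {
        if (x == null) return 0; // a NIL leaf has black-height 0
        if (x.red && (isRed(x.left) || isRed(x.right))) return -1; // property 4
        int l = blackHeight(x.left);
        int r = blackHeight(x.right);
        if (l < 0 || r < 0) return -1;
        // A black child (including NIL) adds one black node to the path.
        int viaLeft = l + (isRed(x.left) ? 0 : 1);
        int viaRight = r + (isRed(x.right) ? 0 : 1);
        return viaLeft == viaRight ? viaLeft : -1; // property 5
    }

    // Properties 2, 4 and 5 together (1 and 3 hold by construction here).
    static boolean isRedBlack(Node root) {
        return !isRed(root) && blackHeight(root) >= 0;
    }
}
```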



Red-Black Trees: An example tree

[Figure (a): an example red-black tree. Every node is labelled with its
black-height (3 at the root) and the NIL leaves are drawn explicitly.
Keys per level: 26; 17, 41; 14, 21, 30, 47; 10, 16, 19, 23, 28, 38;
7, 12, 15, 20, 35, 39; 3.]



Lemma 1
The subtree rooted at any node x contains at least $2^{bh(x)} - 1$ internal
nodes.

Proof.
The proof is by induction on the height h(x) of the node x.
If h(x) = 0, then x is a leaf. Then bh(x) = 0, so $2^{bh(x)} - 1 = 0$ and the
statement holds.
If however h(x) > 0, then x is an internal node with two children. Each child
has a black-height of either bh(x) or bh(x) − 1, depending on whether its colour
is red or black. So, the black-height of each child is at least bh(x) − 1.
As the height of each child is smaller than h(x), we apply the induction
hypothesis and find that the subtree of each child contains at least
$2^{bh(x)-1} - 1$ internal nodes. So, the subtree rooted at x has at least
$(2^{bh(x)-1} - 1) + (2^{bh(x)-1} - 1) + 1 = 2^{bh(x)} - 1$ internal nodes.



Lemma 2
A red-black tree with n internal nodes has a height of at most $2 \log_2(n + 1)$.

Proof.
Let h be the height of the tree. According to red-black property 4, at least
half of the nodes on any simple path from the root to a leaf, not including
the root, are black. Hence, the black-height of the root must be at least
h/2, so by Lemma 1 $n \geq 2^{h/2} - 1$, hence $n + 1 \geq 2^{h/2}$, and consequently
$\log_2(n + 1) \geq h/2$ and $h \leq 2 \log_2(n + 1)$.



public class RedBlackTree
{
    public enum TreeNodeColor {Red, Black}

    protected class ColouredTreeNode implements Comparable {}

    protected ColouredTreeNode root;

    private void rotateLeft(ColouredTreeNode x)
    {}

    private void rotateRight(ColouredTreeNode y)
    {}

    public void insert(Comparable element)
    {
        ColouredTreeNode newNode = new ColouredTreeNode(element, TreeNodeColor.Red);
        insertAtNode(newNode, root, null);
        fixUp(newNode);
    }

    protected void insertAtNode(ColouredTreeNode newNode,
            ColouredTreeNode current, ColouredTreeNode parent)
    {}

    private void fixUp(ColouredTreeNode z)
    {}
}



protected class ColouredTreeNode implements Comparable
{
    protected TreeNodeColor color;
    protected Comparable value;
    protected ColouredTreeNode leftNode;
    protected ColouredTreeNode rightNode;
    protected ColouredTreeNode parentNode;

    public ColouredTreeNode(Comparable value, TreeNodeColor color)
    {
        this.value = value;
        this.color = color;
    }

    public String toString()
    {
        return value.toString() + " : " + color;
    }

    public int compareTo(Object arg0)
    {
        ColouredTreeNode node2 = (ColouredTreeNode) arg0;
        return value.compareTo(node2.value);
    }
}



Red Black Tree: Inserting an element

We keep the same insert method as in the original binary search tree.
The node we add is coloured red.
After insertion, we are guaranteed we still have a binary search tree.
Which red-black tree properties can be violated by inserting this way?



Red Black Tree: Possible violations to red-black tree
properties after inserting a node z

1 Every node is either red or black. OK
2 The root is black. NOT OK (if z is the root)
3 Every leaf is black. OK
4 If a node is red, then both children are black. NOT OK
(if the parent of z is also red)
5 For each node, all simple paths from the node to descendant leaves
contain the same number of black nodes. OK



Red Black Tree: Possible violations to red-black tree
properties after inserting a node z

If the binary search tree meets all the red-black tree properties except
property 2 (the root is black), then simply recolouring the root
black fixes it. This never violates any of the other red-black
tree properties.
If the binary search tree violates the fourth property (if a node is red
then both its children are black), then we first solve this violation.
  - This solution should not violate red-black properties 1, 3 and 5.
  - If the solution violates red-black property 2, then see above.



Example: Insert 4 and colour it red

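[Figure (a): tree with root 11; 11 has children 2 and 14; 2 has children
1 and 7; 7 has children 5 and 8; 14 has right child 15. The new node
z = 4 is the left child of 5; its uncle y is 8. Case 1 applies.]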



4 is red so 5 and 8 must become black, 7 becomes red.
But now 7 and 2 are red ...

[Figure (b): the same tree after recolouring. Now z = 7 and its uncle
y = 14. Case 2 applies.]



7 and 14 must become black, but 14 is already black, so
we can’t recolor easily. Left rotate “the problem 7” to
move it higher up in the tree

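[Figure (c): after the left rotation, 11 has children 7 and 14; 7 has
children 2 and 8; 2 has children 1 and 5; 5 has left child 4; 14 has right
child 15. Now z = 2 and its uncle y = 14. Case 3 applies.]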



Right rotate “the problem 7” to move it even higher up in
the tree. Recolor once it becomes root

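[Figure (d): after the right rotation, 7 is the root with children z = 2
and 11; 2 has children 1 and 5; 5 has left child 4; 11 has children 8 and
14; 14 has right child 15.]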



Case 1: z is red, its parent is red, its uncle y is red. Recolor

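[Figure: Case 1 before and after recolouring.
(a) C has children A and D (the uncle y); z = B is the right child of A.
(b) C has children B and D (the uncle y); z = A is the left child of B.
In both variants the parent of z and the uncle y become black, C becomes
red, and C is the new z. The subtrees α, β, γ, δ, ε keep their positions.]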



Case 2: z is red, its parent is red, its uncle y is black. We
can’t recolor but we rotate.

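[Figure: Cases 2 and 3.
Case 2 (left): C has children A and δ (the uncle y); z = B is the right
child of A. A left rotation around A gives Case 3.
Case 3 (middle): C has children B and δ (the uncle y); z = A is the left
child of B. Recolouring and a right rotation around C give the result.
Result (right): B is the root of the subtree with children A (z) and C;
A has subtrees α and β, C has subtrees γ and δ.]

Case 2: z is a right child. Case 3: z is a left child.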



private void fixUp(ColouredTreeNode z)
{
    // Loop while z has a red parent (property 4 is violated) and a grandparent exists.
    while ((z != null) && (z.parentNode != null) && (z.parentNode.parentNode != null)
            && (z.parentNode.color == TreeNodeColor.Red))
    {
        if (z.parentNode == z.parentNode.parentNode.leftNode)
        {
            // y is the uncle of z; a missing (null) uncle counts as a black leaf.
            ColouredTreeNode y = z.parentNode.parentNode.rightNode;
            if ((y != null) && (y.color == TreeNodeColor.Red))
            {
                // Case 1: recolour and continue from the grandparent.
                z.parentNode.color = TreeNodeColor.Black;
                y.color = TreeNodeColor.Black;
                z.parentNode.parentNode.color = TreeNodeColor.Red;
                z = z.parentNode.parentNode;
            }
            else
            {
                if (z == z.parentNode.rightNode)
                {
                    // Case 2: rotate so that z becomes a left child.
                    z = z.parentNode;
                    rotateLeft(z);
                }
                // Case 3: recolour and rotate around the grandparent.
                z.parentNode.color = TreeNodeColor.Black;
                z.parentNode.parentNode.color = TreeNodeColor.Red;
                rotateRight(z.parentNode.parentNode);
            }
        }
        else
        {
            // exactly the same, with left and right swapped ...
        }
    }
    if (z == root) root.color = TreeNodeColor.Black;
}



Red Black Trees in summary

Trees are guaranteed to be balanced (height at most 2 log2(n + 1)).
Find in O(log n).
Insert is a find plus a fix along the path from the inserted element
up to the root. The fix involves recolouring and rotations. O(log n).
Remove is somewhat similar, just a bit more complicated ;-) O(log n).



Graphs



Terminology

Vertex (vertices) : node


Edge: connection between two vertices.
Edges can have a weight: weighted graphs and unweighted graphs
So any graph G = (V , E ) is fully defined by the sets V of vertices
and E of edges
For weighted graphs, there is also a weight function w : E → R, so
w (u, v ) is the weight of the edge (u, v ) ∈ E .



More terminology

Edges can have a direction (marked by an arrow): directed and
undirected graphs.
Two vertices i and j are adjacent if they are connected by an edge
(i, j). The edge (i, j) is said to be incident on the vertices i and j.
The degree of a vertex in an undirected graph is the number of edges
incident on it.
In a directed graph, the out-degree is the number of edges leaving it,
the in-degree is the number of edges entering it. The degree of a
vertex in a directed graph is hence the sum of its in-degree and its
out-degree.



Even more terminology

A path is a sequence of adjacent vertices.


If there is a path from every vertex to every other vertex, the graph is
connected.
Any path starting from a given node that terminates in the same
node, is called a cycle. Cyclic and Acyclic graphs.
Directed graphs without cycles are often referred to as directed
acyclic graphs or DAG.
A subset of a graph is called a subgraph. The graph G' = (V', E') is
a subgraph of G = (V, E) if V' ⊆ V and E' ⊆ E.



A graph and its matrix representation

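(a) [Figure: a directed graph on the vertices 1 to 6.]

(b) Its adjacency matrix (row = source vertex, column = destination vertex):

      1 2 3 4 5 6
  1   0 1 0 1 0 0
  2   0 0 0 0 1 0
  3   0 0 0 0 1 1
  4   0 1 0 0 0 0
  5   0 0 0 1 0 0
  6   0 0 0 0 0 1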



Adjacency matrix representation

Definition 10 (Adjacency matrix)
If the vertices of some graph G = (V, E) are numbered
1, 2, ..., |V|, then the adjacency matrix representation is a matrix
$A = (a_{ij})$ of size |V| by |V| where
$a_{ij} = \begin{cases} w(i, j) & \text{if } (i, j) \in E \\ 0 & \text{otherwise} \end{cases}$



A matrix class

public class Matrix
{
    // some appropriate private members.

    public Matrix(int nrNodes)
    {
        // allocate an N-by-N matrix where N = nrNodes
        // all elements are initially 0
    }

    public void set(int row, int col, Comparable weight)
    {
        // store the weight at the given row and column.
    }

    public Comparable get(int row, int col)
    {
        // return the weight at the given row and column.
    }
}



Adjacency matrix representation of a graph

public class MatrixGraph
{
    private Matrix data;

    public MatrixGraph(int nrNodes)
    {
        data = new Matrix(nrNodes);
    }

    public void addEdge(int from, int to, double w)
    {
        data.set(from, to, w);
    }

    public double getEdge(int from, int to)
    {
        return (Double) data.get(from, to);
    }
}
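
As a concrete illustration, the sketch below stores the example graph from the earlier slide in a plain `double[][]`. The class name `MatrixGraphSketch`, the helper methods `outDegree` and `inDegree`, and the choice of weight 1 for every edge are assumptions made for this example; the course's `Matrix` class is not used.

```java
// Hypothetical stand-alone variant of MatrixGraph backed by a double[][].
final class MatrixGraphSketch {
    // weight[i][j] == 0 means "no edge from i+1 to j+1".
    private final double[][] weight;

    MatrixGraphSketch(int nrNodes) {
        weight = new double[nrNodes][nrNodes];
    }

    // Vertices are numbered 1..n as on the slides, hence the -1.
    void addEdge(int from, int to, double w) {
        weight[from - 1][to - 1] = w;
    }

    double getEdge(int from, int to) {
        return weight[from - 1][to - 1];
    }

    // Number of non-zero entries in the row of v.
    int outDegree(int v) {
        int count = 0;
        for (double w : weight[v - 1]) if (w != 0) count++;
        return count;
    }

    // Number of non-zero entries in the column of v.
    int inDegree(int v) {
        int count = 0;
        for (double[] row : weight) if (row[v - 1] != 0) count++;
        return count;
    }

    // The directed example graph on the vertices 1 to 6, every edge with weight 1.
    static MatrixGraphSketch example() {
        MatrixGraphSketch g = new MatrixGraphSketch(6);
        int[][] edges = {{1, 2}, {1, 4}, {2, 5}, {3, 5}, {3, 6}, {4, 2}, {5, 4}, {6, 6}};
        for (int[] e : edges) g.addEdge(e[0], e[1], 1.0);
        return g;
    }
}
```

Looking up an edge is O(1), but the matrix always needs |V| × |V| entries, and computing a degree scans a full row or column.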



Edge list representation

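(a) [Figure: the same directed graph on the vertices 1 to 6.]

(b) Its edge lists (each vertex followed by the destinations of its
outgoing edges):

  1: 2, 4
  2: 5
  3: 6, 5
  4: 2
  5: 4
  6: 6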



Edge list representation
public class Node implements Comparable
{
    private Comparable info;
    private Vector edges;

    public Node(Comparable label)
    {
        info = label;
        edges = new Vector();
    }

    public void addEdge(Edge e)
    {
        edges.addLast(e);
    }

    public int compareTo(Object o)
    {
        // two nodes are equal if they have the same label
        Node n = (Node) o;
        return n.info.compareTo(info);
    }

    public Comparable getLabel()
    {
        return info;
    }
}



Edge list representation

private class Edge implements Comparable
{
    private Node toNode;

    public Edge(Node to)
    {
        toNode = to;
    }

    public int compareTo(Object o)
    {
        // two edges are equal if they point
        // to the same node.
        // this assumes that the edges are
        // starting from the same node !!!
        Edge n = (Edge) o;
        return n.toNode.compareTo(toNode);
    }
}



Edge list representation
public class Graph
{
    private class Node implements Comparable
    {
        private Comparable info;
        private Vector edges;

        public Node(Comparable label)
        {
            info = label;
            edges = new Vector();
        }

        public void addEdge(Edge e)
        {
            edges.addLast(e);
        }

        public int compareTo(Object o) { ... }

        public Comparable getLabel()
        {
            return info;
        }
    }

    private class Edge implements Comparable
    {
        private Node toNode;

        public Edge(Node to)
        {
            toNode = to;
        }

Edge list representation

private Node findNode(Comparable nodeLabel)
{
    Node res = null;
    for (int i = 0; i < nodes.size(); i++)
    {
        Node n = (Node) nodes.get(i);
        // compare the labels by value; == would only compare references
        if (n.getLabel().compareTo(nodeLabel) == 0)
        {
            res = n;
            break;
        }
    }
    return res;
}

This is almost the find method on the vector ... put the right responsibility
in the right data structure.



Remember ...

public boolean contains(int obj)
{
    for (int i = 0; i < count; i++)
    {
        if (data[i] == obj) return true;
    }
    return false;
}



Better would be ...

public Comparable contains(Comparable obj)
{
    for (int i = 0; i < count; i++)
    {
        // return the stored element, not the probe object that was passed in
        if (data[i].compareTo(obj) == 0) return data[i];
    }
    return null;
}



Better would be ...

protected Node findNode(Comparable nodeLabel)
{
    return (Node) nodes.contains(new Node(nodeLabel));
}



Finding a path between two vertices as a depth first graph
traversal

public boolean findPath(Comparable nodeLabel1,
        Comparable nodeLabel2)
{
    Node startState = findNode(nodeLabel1);
    Node endState = findNode(nodeLabel2);
    Stack toDoList = new Stack();
    toDoList.push(startState);
    while (!toDoList.empty())
    {
        Node current = (Node) toDoList.pop();
        if (current == endState)
            return true;
        else
        {
            for (int i = 0; i < current.edges.size(); i++)
            {
                Edge e = (Edge) current.edges.get(i);
                toDoList.push(e.toNode);
            }
        }
    }
    return false;
}



Some Graph algorithms and problems



Depth first search, depth first traversals, ...

The DFS on the previous slide does not have a visited list.
Just as in the tree search, it can easily be added, but it is time
consuming to check the visited list at each step.
Keep a boolean in each node to mark whether it was visited before or
not.
Recursive version
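
Before the recursive version, here is a hedged sketch of the stack-based path search with a visited check added. It uses `java.util` collections and integer vertex labels instead of the course's `Node`, `Edge` and `Stack` classes, and a `HashSet` instead of a boolean per node; all of these are assumptions made to keep the example self-contained.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: iterative depth-first path search with a visited set.
final class DfsPathSketch {
    // adj.get(u) holds the destinations of the edges leaving u.
    static boolean findPath(Map<Integer, List<Integer>> adj, int start, int end) {
        Set<Integer> visited = new HashSet<>();
        Deque<Integer> toDo = new ArrayDeque<>();
        toDo.push(start);
        while (!toDo.isEmpty()) {
            int current = toDo.pop();
            if (current == end) return true;
            // add returns false if current was expanded before: skip it.
            if (!visited.add(current)) continue;
            for (int next : adj.getOrDefault(current, List.of())) {
                if (!visited.contains(next)) toDo.push(next);
            }
        }
        return false;
    }

    // The directed example graph on the vertices 1 to 6 from the edge list slide.
    static Map<Integer, List<Integer>> exampleGraph() {
        Map<Integer, List<Integer>> adj = new HashMap<>();
        adj.put(1, List.of(2, 4));
        adj.put(2, List.of(5));
        adj.put(3, List.of(6, 5));
        adj.put(4, List.of(2));
        adj.put(5, List.of(4));
        adj.put(6, List.of(6));
        return adj;
    }
}
```

The example graph contains the cycle 2, 5, 4, 2, so a search for an unreachable vertex only terminates because expanded vertices are skipped.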



Depth First Graph Traversal

private void DFS(Node current)
{
    current.visited = true;
    for (int i = 0; i < current.edges.size(); i++)
    {
        Edge e = (Edge) current.edges.get(i);
        Node next = (Node) e.toNode;
        if (!next.visited) DFS(next);
    }
}

public void DFS()
{
    for (int i = 0; i < nodes.size(); i++)
    {
        Node current = (Node) nodes.get(i);
        current.visited = false;
    }
    for (int i = 0; i < nodes.size(); i++)
    {
        Node current = (Node) nodes.get(i);
        if (!current.visited) DFS(current);
    }
}



Time Complexity of DFS

Set all nodes to unvisited: O(|V|)
How many calls to DFS are there?
For each node, if it is not visited yet, then visit it.
  - For each edge in the adjacency list, if not visited yet, visit the
    destination of the edge.
So, the number of DFS calls is bounded by the total number of entries in the
adjacency lists of all the nodes:
$\sum_{v \in V} |adj(v)| = O(|E|)$
Time complexity of a full depth first traversal is O(|V| + |E|).



Topological Sorting

Definition 11 (Topological Sorting)


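A topological sort of a DAG G = (V, E) is a linear ordering of all its
vertices such that if G contains an edge (u, v), then u appears before v
in the ordering.

DFS gives this ordering in reverse: each node is printed only after all
nodes reachable from it, so reading the printed list backwards yields a
topological sort (see how they are printed)!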
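
One way to read a topological order off a depth-first traversal is sketched below. Each vertex is recorded when its recursive call finishes and is pushed to the front of a list, so a vertex always ends up before everything reachable from it. The class name, the integer labels and the use of `java.util` collections are assumptions for this example, and the input is assumed to be acyclic.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: topological sort of a DAG via DFS finishing order.
final class TopoSortSketch {
    static List<Integer> topologicalOrder(Map<Integer, List<Integer>> adj) {
        Deque<Integer> order = new ArrayDeque<>();
        Set<Integer> visited = new HashSet<>();
        for (int v : adj.keySet()) {
            if (!visited.contains(v)) dfs(adj, v, visited, order);
        }
        // Iterating the deque from its head gives the reversed finishing order.
        return new ArrayList<>(order);
    }

    private static void dfs(Map<Integer, List<Integer>> adj, int u,
                            Set<Integer> visited, Deque<Integer> order) {
        visited.add(u);
        for (int v : adj.getOrDefault(u, List.of())) {
            if (!visited.contains(v)) dfs(adj, v, visited, order);
        }
        // u is finished only after all vertices reachable from it.
        order.push(u);
    }
}
```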



private void DFS(Node current)
{
    current.visited = true;
    for (int i = 0; i < current.edges.size(); i++)
    {
        Edge e = (Edge) current.edges.get(i);
        Node next = (Node) e.toNode;
        if (!next.visited) DFS(next);
    }
    System.out.println(current.info);
}

public void DFS()
{
    for (int i = 0; i < nodes.size(); i++)
    {
        Node current = (Node) nodes.get(i);
        current.visited = false;
    }
    for (int i = 0; i < nodes.size(); i++)
    {
        Node current = (Node) nodes.get(i);
        if (!current.visited) DFS(current);
    }
}



Cycle checking

If there is a cycle, there does not exist a topological sorting.


So, cycle checking in O(|V | + |E |).



Definition 12
Back edges are those edges (u,v) connecting a vertex u to an ancestor
v in a depth-first tree. We consider self-loops, which may occur in
directed graphs, to be back edges.



Theorem 10
A directed graph G is acyclic if and only if a depth-first search of G yields
no back edges.

Proof.
Assume G has a cycle c. Define v to be the first vertex in c discovered in
the DFS, and let (u, v) be the edge in c that enters v. The DFS from v will
explore all vertices reachable from v, including u, so u becomes a descendant
of v in the depth-first tree. So, (u, v) is a back edge.
Conversely, assume the DFS results in a back edge (u, v). Then v is an
ancestor of u in the depth-first tree. So, there is a path from v to u in G.
This path, together with (u, v), forms a cycle.
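
The theorem translates directly into a cycle check: during the DFS, an edge that leads to a vertex whose recursive call is still in progress is a back edge. The sketch below is one possible rendering with integer labels and `java.util` collections; the class name and the three-state encoding are assumptions for this example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a directed graph has a cycle iff DFS finds a back edge.
final class CycleCheckSketch {
    private static final int IN_PROGRESS = 1; // call still on the recursion stack
    private static final int DONE = 2;        // call has returned

    static boolean hasCycle(Map<Integer, List<Integer>> adj) {
        Map<Integer, Integer> state = new HashMap<>(); // absent means unvisited
        for (int v : adj.keySet()) {
            if (!state.containsKey(v) && dfs(adj, v, state)) return true;
        }
        return false;
    }

    private static boolean dfs(Map<Integer, List<Integer>> adj, int u,
                               Map<Integer, Integer> state) {
        state.put(u, IN_PROGRESS);
        for (int v : adj.getOrDefault(u, List.of())) {
            Integer s = state.get(v);
            if (s == null) {
                if (dfs(adj, v, state)) return true;
            } else if (s == IN_PROGRESS) {
                // v is an ancestor of u (or u itself): (u, v) is a back edge.
                return true;
            }
        }
        state.put(u, DONE);
        return false;
    }
}
```

Every vertex and every edge is handled once, so the check runs in O(|V| + |E|).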



Towards cycle checking in adjacency matrix notation

In a directed acyclic graph with n vertices, there can be at most
n(n − 1)/2 edges.
In an acyclic graph with n vertices, the longest path can contain at
most n − 1 edges.



Towards cycle checking in adjacency matrix notation

Let A be the adjacency matrix of G.
If trace(A) > 0, then there is at least one self-loop (i.e. a cycle of
length 1).
If trace(A ∗ A) > 0, then there is at least one cycle of length 2.
If $trace(A^k) > 0$, then there is at least one closed path of length k,
and hence a cycle of length at most k.
Consequently, a graph with n nodes has no cycles if $trace(A^k) = 0$ for
all k from 1 up to n.



Why is that?

Consider C = A × B, then $c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$.
So in $A^2$, the entry at position (i, j) is $\sum_{k=1}^{n} a_{ik} a_{kj}$.
This entry is larger than zero if there is a node k such that there is an
edge (i, k) and an edge (k, j)
(which is a path of length 2).
So, in $A^2$, an element on the diagonal is larger than zero if there is a
node with a path of length two to itself.
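
The trace argument can be tried out with a few lines of code. The sketch below assumes an unweighted graph given as a 0/1 matrix of `long` values and checks the powers of A for k = 1 up to n; the class and method names are made up for this example. The entries count paths, so they can overflow for large or dense graphs.

```java
// Hypothetical sketch: cycle detection via the traces of powers of a 0/1 adjacency matrix.
final class TraceCycleSketch {
    static long[][] multiply(long[][] a, long[][] b) {
        int n = a.length;
        long[][] c = new long[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                for (int k = 0; k < n; k++)
                    c[i][j] += a[i][k] * b[k][j];
        return c;
    }

    static long trace(long[][] m) {
        long sum = 0;
        for (int i = 0; i < m.length; i++) sum += m[i][i];
        return sum;
    }

    // Returns the smallest k in 1..n with trace(A^k) > 0, or 0 if there is none,
    // in which case the graph is acyclic.
    static int shortestCycleLength(long[][] a) {
        long[][] power = a;
        for (int k = 1; k <= a.length; k++) {
            if (trace(power) > 0) return k;
            power = multiply(power, a);
        }
        return 0;
    }
}
```

With n matrix multiplications of O(n^3) each, this check costs O(n^4), which is far more than the O(|V| + |E|) of the DFS-based check.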



Shortest path algorithms

We are looking for the path with the smallest total weight.
We are assuming there is no heuristic available (otherwise A* could be used).
We focus on algorithms for finding Single-Source shortest paths, i.e.
algorithms that find the shortest path to all reachable nodes from a
given node.



Shortest path algorithms

Theorem 11 (Subpaths of a shortest path are also shortest paths)


Let $p = v_0 \ldots v_k$ be the shortest path from $v_0$ to $v_k$ and let
$p_{ij} = v_i \ldots v_j$ with $0 \leq i \leq j \leq k$ be the subpath from $v_i$ to $v_j$; then $p_{ij}$
is the shortest path from $v_i$ to $v_j$.

Proof.
p can be written as $v_0 \xrightarrow{p_{0i}} v_i \xrightarrow{p_{ij}} v_j \xrightarrow{p_{jk}} v_k$ and hence
$w(p) = w(p_{0i}) + w(p_{ij}) + w(p_{jk})$. Assume there is a path $p'_{ij}$ from $v_i$ to $v_j$
such that $w(p'_{ij}) < w(p_{ij})$; then $p' = v_0 \xrightarrow{p_{0i}} v_i \xrightarrow{p'_{ij}} v_j \xrightarrow{p_{jk}} v_k$ is a path
from $v_0$ to $v_k$ with a weight $w(p') = w(p_{0i}) + w(p'_{ij}) + w(p_{jk}) < w(p)$.
This contradicts the assumption that p is the shortest path from $v_0$ to $v_k$.



Shortest path algorithms

Theorem 12 (Shortest paths and cycles)


A shortest path cannot have cycles.



Relaxation Based Algorithms

Many shortest path algorithms are based on a relaxation principle:


To each vertex v ∈ V a number v .d is associated, representing the
current estimated distance from the source vertex s to the current
vertex v .
Initially, v.d = ∞ for all nodes v ∈ V, except for s.d = 0.
The process of relaxing an edge (u, v ) consists of testing whether we
can improve the shortest path to v found so far by going through u
and, if so, updating v .d.



Relaxation

[Figure: relaxing an edge (u, v) with weight 2 and u.d = 5.
(a) v.d = 9 before RELAX(u, v, w) and v.d = 7 afterwards.
(b) v.d = 6 before and after, since 6 ≤ 5 + 2.]

RELAX(u, v, w)
1 if v.d > u.d + w(u, v)
2     v.d = u.d + w(u, v)
3     v.π = u



The Bellman-Ford Algorithm

BELLMAN-FORD(G, w, s)
1 INITIALIZE-SINGLE-SOURCE(G, s)
2 for i = 1 to |G.V| − 1
3     for each edge (u, v) ∈ G.E
4         RELAX(u, v, w)
5 for each edge (u, v) ∈ G.E
6     if v.d > u.d + w(u, v)
7         return FALSE
8 return TRUE
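
A compact Java rendering of this pseudocode could look as follows. It is only a sketch under a few assumptions: vertices are numbered 0 to n − 1, edges are given as `{u, v, w}` triples with integer weights, the predecessor π is left out, and `null` is returned where the pseudocode returns FALSE.

```java
import java.util.Arrays;

// Hypothetical sketch of Bellman-Ford on an edge list.
final class BellmanFordSketch {
    // Returns the distances from s, or null if a negative-weight cycle
    // is reachable from s.
    static double[] shortestPaths(int n, int[][] edges, int s) {
        // INITIALIZE-SINGLE-SOURCE: every estimate is infinite except s.d = 0.
        double[] d = new double[n];
        Arrays.fill(d, Double.POSITIVE_INFINITY);
        d[s] = 0;

        // |V| - 1 passes, each relaxing every edge once.
        for (int i = 1; i <= n - 1; i++) {
            for (int[] e : edges) relax(d, e[0], e[1], e[2]);
        }

        // If an edge can still be relaxed, there is a negative-weight cycle.
        for (int[] e : edges) {
            if (d[e[1]] > d[e[0]] + e[2]) return null;
        }
        return d;
    }

    // RELAX(u, v, w) without the predecessor update.
    static void relax(double[] d, int u, int v, double w) {
        if (d[v] > d[u] + w) d[v] = d[u] + w;
    }
}
```

For the edges 0→1 (4), 0→2 (1), 2→1 (2), 1→3 (−2) and 2→3 (5) with source 0, the resulting distances are 0, 3, 1 and 1.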



Proving the Bellman-Ford Algorithm

Lemma 3 (The path-relaxation property)

If $p = v_0 \ldots v_k$ is a shortest path from $s = v_0$ to $v_k$, and we relax the
edges of p in the order $(v_0, v_1), (v_1, v_2), \ldots, (v_{k-1}, v_k)$, then
$v_k.d = \delta(s, v_k)$, where $\delta(s, v_k)$ is the weight of the shortest path from s to
$v_k$.



Proving the Bellman-Ford Algorithm

Theorem 13
Let G = (V, E) be a weighted, directed graph with source s and weight
function w : E → R, and assume that G contains no
negative-weight cycles that are reachable from s. Then, after the
|V| − 1 iterations of the for loop of lines 2–4 of BELLMAN-FORD, we
have v.d = δ(s, v) for all vertices v that are reachable from s.



Proof.
Consider any vertex v that is reachable from s, and let $p = v_0 \ldots v_k$, where
$v_0 = s$ and $v_k = v$, be any shortest path from s to v. Because shortest
paths are simple, p has at most |V| − 1 edges, and so k ≤ |V| − 1. Each
of the |V| − 1 iterations of the for loop of lines 2–4 relaxes all |E| edges.
Among the edges relaxed in the i-th iteration, for i = 1, ..., k, is $(v_{i-1}, v_i)$.
By the path-relaxation property, therefore, $v.d = v_k.d = \delta(s, v_k) = \delta(s, v)$.



The Bellman-Ford Algorithm

This is a straightforward approach that runs in O(|V| · |E|).
The longest path (without a cycle) has |V| − 1 edges.
It repeats a relaxation of all edges |V| − 1 times.
At each step i, at least the edge $(v_{i-1}, v_i)$ is relaxed.
(So, if we could traverse the nodes in "the right order", we would
need far fewer passes.)



DAG-SHORTEST-PATHS(G, w, s)
1 topologically sort the vertices of G
2 INITIALIZE-SINGLE-SOURCE(G, s)
3 for each vertex u, taken in topologically sorted order
4     for each vertex v ∈ G.Adj[u]
5         RELAX(u, v, w)
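
The pseudocode above can be sketched in Java as follows. The sketch assumes vertices numbered 0 to n − 1, edges given as `{u, v, w}` triples with integer weights, and an acyclic input; the topological order is obtained from the finishing order of a DFS. The class name and signatures are made up for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: single-source shortest paths in a DAG.
final class DagShortestPathsSketch {
    static double[] shortestPaths(int n, int[][] edges, int s) {
        // Build adjacency lists: adj.get(u) holds the edges leaving u.
        List<List<int[]>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) adj.get(e[0]).add(e);

        // Line 1: topological sort (finished vertices are pushed to the front).
        Deque<Integer> order = new ArrayDeque<>();
        boolean[] visited = new boolean[n];
        for (int u = 0; u < n; u++) {
            if (!visited[u]) dfs(adj, u, visited, order);
        }

        // Line 2: initialise the distance estimates.
        double[] d = new double[n];
        Arrays.fill(d, Double.POSITIVE_INFINITY);
        d[s] = 0;

        // Lines 3 to 5: relax every edge exactly once, in topological order.
        for (int u : order) {
            for (int[] e : adj.get(u)) {
                if (d[e[1]] > d[u] + e[2]) d[e[1]] = d[u] + e[2];
            }
        }
        return d;
    }

    private static void dfs(List<List<int[]>> adj, int u,
                            boolean[] visited, Deque<Integer> order) {
        visited[u] = true;
        for (int[] e : adj.get(u)) {
            if (!visited[e[1]]) dfs(adj, e[1], visited, order);
        }
        order.push(u);
    }
}
```

Every edge is relaxed only once, so the running time is O(|V| + |E|) instead of the O(|V| · |E|) of Bellman-Ford.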



Sorting



Insertion Sort on Linked Lists

A very simple idea.


Traverse the list and for each element, insert it at the right position in
a new list.
Almost no code due to addSorted on Linked List.
The list is copied rather than sorted in place.
$O(n^2)$.
Not good, but it is a start.



Insertion Sort: Simple, a naive version

public LinkedList insertionSort()
{
    LinkedList result = new LinkedList();
    ListElement d = head;
    while (d != null)
    {
        result.addSorted(d.first());
        d = d.rest();
    }
    return result;
}



Insertion Sort: on vector, in place and more efficient

We consider the vector to have two parts: a first part that is already
sorted and a second part that still needs to be sorted.
Start with the first element in the first part and the second part to be
the rest.
Take all elements from the still to be sorted part and move them one
by one in the sorted part.



Insertion Sort: on vector, in place and more efficient

[Figure: insertion sort on the vector 5 2 4 6 1 3 (positions 1 to 6);
the contents of the vector at each step:]
(a) 5 2 4 6 1 3
(b) 2 5 4 6 1 3
(c) 2 4 5 6 1 3
(d) 2 4 5 6 1 3
(e) 1 2 4 5 6 3
(f) 1 2 3 4 5 6



Insertion Sort: on vector and in place!

private void insertSorted(Comparable o, int last)
{
    if (isEmpty()) data[0] = o;
    else
    {
        int i = last;
        while ((i >= 0) && (data[i].compareTo(o) > 0))
        {
            data[i + 1] = data[i];
            i--;
        }
        data[i + 1] = o;
    }
}

public void insertionSort()
{
    for (int i = 1; i < count; i++)
    {
        insertSorted(data[i], i - 1);
    }
}



Tree Sort

Was explained when introducing BST


Traverse the list, for each element, insert it in a BST.
Traverse the BST and insert each element in a new Linked List
If the tree is balanced: n · O(log n) + n · O(1) = O(n log n)



Tree Sort

public LinkedList treeSort()
{
    Tree sortTree = list2Tree();
    final LinkedList result = new LinkedList();
    sortTree.traverseInOrder(new TreeAction()
    {
        public void run(Tree.TreeNode n)
        {
            // addFirst is O(1), but it reverses the order of the traversal:
            // if the values are visited in ascending order, the resulting
            // list is in descending order.
            result.addFirst(n.getValue());
        }
    });
    return result;
}

public Tree list2Tree()
{
    Tree sortTree = new Tree();
    ListElement d = head;
    while (d != null)
    {
        sortTree.insert(d.first());
        d = d.rest();
    }
    return sortTree;
}



Merge Sort: A divide and conquer approach

Split the vector in two parts: the first half and the second half.
Sort the first half and sort the second half.
  - This is done by applying merge sort to the first half and to the second
    half.
  - If a vector has no elements or one element, the sorting is trivial.
Merge both sorted halves together.



Sorting [5 2 4 7 1 3 2 6]

sorted sequence
1 2 2 3 4 5 6 7

merge

2 4 5 7 1 2 3 6

merge merge

2 5 4 7 1 3 2 6

merge merge merge merge

5 2 4 7 1 3 2 6
initial sequence



Merging two sorted vectors
static Vector merge(Vector v1, Vector v2)
{
    int lengthC = v1.size() + v2.size();
    Vector result = new Vector();
    int indA = 0;
    int indB = 0;
    int indC = 0;
    while ((indC < lengthC) && (indA < v1.size()) &&
           (indB < v2.size()))
    {
        if (v1.get(indA).compareTo(v2.get(indB)) < 0)
            result.set(indC++, v1.get(indA++));
        else result.set(indC++, v2.get(indB++));
    }

    for (; indA < v1.size(); indA++)
    {
        result.set(indC++, v1.get(indA));
    }
    for (; indB < v2.size(); indB++)
    {
        result.set(indC++, v2.get(indB));
    }
    result.count = lengthC;

    return result;
}



Merge Sort

static Vector mergeSortAux(Vector v, int from, int to)
{
    if (to > from)
    {
        int half = (from + to) / 2;
        Vector v1 = mergeSortAux(v, from, half);
        Vector v2 = mergeSortAux(v, half + 1, to);
        Vector result = merge(v1, v2);
        return result;
    }
    else
    {
        Vector result = new Vector();
        result.addLast(v.get(from));
        return result;
    }
}

static Vector mergeSort(Vector v)
{
    // an empty vector is already sorted; mergeSortAux needs at least one element
    if (v.count == 0) return new Vector();
    return mergeSortAux(v, 0, v.count - 1);
}



Time Complexity of Merge Sort

Theorem 14
The time complexity of merge sort is O(n log n).

Proof.
The number of comparisons in merge sort is given by
T(n) = 2T(n/2) + n − 1
Hence, T(n) = aT(n/b) + f(n) with a = 2, b = 2 and f(n) = n − 1.
According to theorem 8, the solution to this recurrence relation is
given by $T(n) = \sum_{i=0}^{\log n / \log b} a^i f(n/b^i) = \sum_{i=0}^{\log n} 2^i (n/2^i - 1)$.



Time Complexity of Merge Sort

$T(n) = \sum_{i=0}^{\log n} 2^i (n/2^i - 1)$
$= \sum_{i=0}^{\log n} (n - 2^i)$
$= n(\log n + 1) - (2n - 1)$
$= n \log n - n + 1$
$= O(n \log n)$

because $\sum_{i=0}^{\log n} 2^i = 2^0 + 2^1 + \ldots + 2^{\log n} = 1 + \ldots + n/2 + n = 2n - 1$
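
As a quick sanity check of the closed form, the recurrence can be unrolled by hand for n = 8. This assumes base-2 logarithms and T(1) = 0, since a vector with one element needs no comparisons.

```latex
\begin{align*}
T(1) &= 0 \\
T(2) &= 2T(1) + 1 = 1 \\
T(4) &= 2T(2) + 3 = 5 \\
T(8) &= 2T(4) + 7 = 17 \\
n \log_2 n - n + 1 &= 8 \cdot 3 - 8 + 1 = 17 \quad \text{for } n = 8
\end{align*}
```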



Can we do better than O(n log n) ?

It depends . . .
In Comparison Sort (i.e. determining the sorted order by comparing
elements), we cannot do better.
Why is this the lower bound on comparison sorting?
This slide seems to imply that one can sort without comparing
elements ?!?!?



Comparison Sort

In order to sort a sequence of n elements, we make a series of
pair-wise comparisons to determine the appropriate sequence of
elements.
This sequence of comparisons can be seen as a binary tree (see next
slide), in which each internal node is a comparison.
The leaves of the tree are the possible orderings of the n input
elements.
A sequence of n elements has n! different permutations, so there
must be at least n! leaves in the tree.



Comparison Sort

1:2
≤ >
2:3 1:3
≤ > ≤ >
〈1,2,3〉 1:3 〈2,1,3〉 2:3
≤ > ≤ >
〈1,3,2〉 〈3,1,2〉 〈2,3,1〉 〈3,2,1〉



A lower bound for comparison sort

Theorem 15
Any decision tree that sorts n elements has a height of Ω(n log n).

Proof.
Consider a tree of height h that sorts n elements. Since there are n!
permutations of n elements, each representing a distinct sorted order,
the tree must have at least n! leaves.
Since a binary tree of height h has no more than $2^h$ leaves, we have
that $n! \leq 2^h$ and hence:
$h \geq \log(n!) \geq \log((n/e)^n) = n \log n - n \log e = \Omega(n \log n)$

The second inequality uses Stirling's formula: $n! \approx (n/e)^n \sqrt{2\pi n}$

Beyond Comparison Sort

If we know that we are sorting a vector containing only natural
numbers . . .
and if we know that the largest number that can occur is K . . .
we can do better!



Counting Sort

We keep a vector of counts with one entry per possible value (length K + 1
if 0 can occur) that will hold the number of times each
number occurs in the sequence.
We traverse the entire vector once and update the counts.
From the counts, we compute the position of each
element in the sorted vector.
We traverse the vector once more and copy all elements to these
computed positions.
O(n + K) !!!
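
The steps above can be sketched as follows, assuming the input is an `int[]` whose values lie between 0 and a known maximum K (called `maxValue` here). The class and method names are made up for this example.

```java
// Hypothetical sketch of counting sort for values in the range 0..maxValue.
final class CountingSortSketch {
    static int[] countingSort(int[] input, int maxValue) {
        // Step 1: count how often each value occurs.
        int[] count = new int[maxValue + 1];
        for (int x : input) count[x]++;

        // Step 2: prefix sums, so that count[v] is the number of elements <= v,
        // i.e. one past the last position of value v in the sorted output.
        for (int v = 1; v <= maxValue; v++) count[v] += count[v - 1];

        // Step 3: copy every element to its computed position. Walking
        // backwards keeps equal elements in their original relative order.
        int[] output = new int[input.length];
        for (int i = input.length - 1; i >= 0; i--) {
            output[--count[input[i]]] = input[i];
        }
        return output;
    }
}
```

No two elements are ever compared with each other, which is why the Ω(n log n) lower bound for comparison sorts does not apply.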

