Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 33

Data Structures- Recap

What is the "Data Structure" ?


 Ways to represent data
Why data structure ?
 To design and implement large-scale computer
system
 Have proven correct algorithms
 The art of programming
How to master in data structure ?
 practice, discuss, and think
System Life Cycle
Summary
 RADRCV
Requirements
 What inputs, functions, and outputs
Analysis
 Break the problem down into manageable pieces
 Top-down approach
 Bottom-up approach
System Life Cycle(Cont.)
Design
 Create abstract data types and the algorithm
specifications, language independent
Refinement and Coding
 Determining data structures and algorithms
Verification
 Developing correctness proofs, testing the
program, and removing errors
Verification
Correctness proofs
 Prove program mathematically
 time-consuming and difficult to develop for large
system
Testing
 Verify that every piece of code runs correctly
 provide data including all possible scenarios

Error removal
 Guarantee no new errors generated

Notes
 Select a proven correct algorithm is important
 Initial tests focus on verifying that a program runs correctly, then
reduce the running time
What is the module all about?

Design and Analysis of Algorithms

What is an Algorithm?
What is the module all about?
The theoretical study of design and analysis of
computer algorithms

Basic goals for an algorithm:


• always correct
• always terminates
• This class: performance
 Performance often draws the line between
what is possible and what is impossible.
Analysis of Algorithms

Input Algorithm Output

An algorithm is a step-by-step procedure for


solving a problem in a finite amount of time.
Design and Analysis of Algorithms

• Analysis: predict the cost of an algorithm in terms of


resources and performance

• Design: design algorithms which minimize the cost


Algorithm Specification
Definition
 An algorithm is a finite set of instructions that, if followed,
accomplishes a particular task. In addition, all algorithms
must satisfy the following criteria:
(1)Input.
(2)Output
(3)Definiteness
(4)Finiteness
(5)Effectiveness.

Meaning of the above criterions?????


Algorithm Specification
Definition
 An algorithm is a finite set of instructions that, if followed,
accomplishes a particular task. In addition, all algorithms must satisfy
the following criteria:
(1)Input. There are zero or more quantities that are externally supplied.
(2)Output. At least one quantity is produced.
(3)Definiteness. Each instruction is clear and unambiguous.
(4)Finiteness. If we trace out the instructions of an algorithm, then for
all cases, the algorithm terminates after a finite number of steps.
(5)Effectiveness. Every instruction must be basic enough to be carried
out, in principle, by a person using only pencil and paper. It is not
enough that each operation be definite as in (3); it also must be
feasible.
Describing Algorithms
Natural language
 English, Chinese

 Instructions must be definite and effectiveness


Graphic representation
 Flowchart

 work well only if the algorithm is small and simple


Pseudo language
 Readable

 Instructions must be definite and effectiveness

Combining English and C++, Python


Translating a Problem into an Algorithm
Problem
 Devise a program that sorts a set of n>= 1

integers.

How can we generate an algorithm?


Translating a Problem into an Algorithm
Problem
 Devise a program that sorts a set of n>= 1 integers

Step I - Concept
 From those integers that are currently unsorted, find the

smallest and place it next in the sorted list


Step II - Algorithm
 for (i= 0; i< n; i++) {

Examine list[i] to list[n-1] and suppose that the


smallest integer is list[min];
Interchange list[i] and list[min];
}
Translating a Problem into an Algorithm(Cont.)
Step III - Coding

void sort(int *a, int n)


{
for (i= 0; i< n; i++)
{
int j= i;
for (int k= i+1; k< n; k++){
if (a[k ]< a[ j]) j= k;
int temp=a[i]; a[i]=a[ j]; a[ j]=temp;
}
}
Correctness Proof

Theorem
 Function sort(a, n) correctly sorts a set of n>= 1

integers. The result remains in a[0], ..., a[n-1]


such that a[0]<= a[1]<=...<=a[n-1].
Proof:
For i= q, following the execution of line 6-11, we
have a[q]<= a[r], q< r< =n-1.
For i> q, observing, a[0], ..., a[q] are unchanged.
Hence, increasing i, for i= n-2, we have
a[0]<= a[1]<= ...<=a[n-1]
Analysis of algorithms
Issues:
 correctness
 time efficiency
 space efficiency
 optimality

Approaches:
 theoretical analysis
 empirical analysis
16
Running Time
Most algorithms transform best case

input objects into output average case


worst case
objects. 120

The running time of an 100


algorithm typically grows

Running Time
80
with the input size.
60
Average case time is often
difficult to determine. 40

We focus on the worst case 20

running time. 0
1000 2000 3000 4000
 Easier to analyze Input Size
 Crucial to applications such as
games, finance and robotics
Analysis of Algorithms 17
Experimental Studies
Write a program 9000

implementing the 8000

algorithm 7000

Run the program with 6000

Time (ms)
inputs of varying size and 5000
composition 4000
Use a method like 3000
System.currentTimeMillis() to 2000
get an accurate measure
1000
of the actual running time
0
Plot the results 0 50 100
Input Size

Analysis of Algorithms 18
Limitations of Experiments
It is necessary to implement the
algorithm, which may be difficult
Results may not be indicative of the
running time on other inputs not included
in the experiment.
In order to compare two algorithms, the
same hardware and software
environments must be used

Analysis of Algorithms 19
Theoretical Analysis
Uses a high-level description of the
algorithm instead of an implementation
Characterizes running time as a function
of the input size, n.
Takes into account all possible inputs
Allows us to evaluate the speed of an
algorithm independent of the
hardware/software environment
Analysis of Algorithms 20
Theoretical analysis of time
efficiency
Time efficiency is analyzed by determining
the number of repetitions of the basic
operation as a function of input size
Basic operation: the operation that
contributes most towards the running time
of the algorithm
input size

T(n) ≈ copC(n)
running time execution time Number of times
for basic operation basic operation is
executed 21
Input size and basic operation examples

Problem Input size measure Basic operation

Searching for key in a Number of list’s items, i.e.


Key comparison
list of n items n

Multiplication of two Matrix dimensions or total


??
matrices number of elements

Checking primality of a n’size = number of digits


??
given integer n (in binary representation)

Typical graph problem #vertices and/or edges ??

22
Input size and basic operation examples

Problem Input size measure Basic operation

Searching for key in a Number of list’s items, i.e.


Key comparison
list of n items n

Multiplication of two Matrix dimensions or total Multiplication of two


matrices number of elements numbers

Checking primality of a n’size = number of digits


Division
given integer n (in binary representation)

Visiting a vertex or
Typical graph problem #vertices and/or edges
traversing an edge

23
Empirical analysis of time efficiency

Select a specific (typical) sample of inputs

Use physical unit of time (e.g., milliseconds)


or
Count actual number of basic operation’s
executions

Analyze the empirical data

24
Best-case, average-case, worst-case
For some algorithms efficiency depends on form of input:
Worst case: Cworst(n) – maximum over inputs of size n

Best case: Cbest(n) – minimum over inputs of size n

Average case: Cavg(n) – “average” over inputs of size n

 Number of times the basic operation will be executed on


typical input
 Expected number of basic operations considered as a
random variable under some assumption about the
probability distribution of all possible inputs
25
Pseudocode
High-level description Example: find max
of an algorithm element of an array
More structured than Algorithm arrayMax(A, n)
English prose Input array A of n integers
Less detailed than a Output maximum element of A
program
Preferred notation for currentMax  A[0]
describing algorithms for i  1 to n  1 do
Hides program design if A[i]  currentMax then
issues currentMax  A[i]
return currentMax

Analysis of Algorithms 26
Pseudocode Details
Control flow Method call
 if … then … [else …] var.method (arg [, arg…])
 while … do … Return value
 repeat … until … return expression
 for … do … Expressions
 Indentation replaces braces  Assignment
(like  in Java)
Method declaration  Equality testing
Algorithm method (arg [, arg…]) (like  in Java)
Input … n2 Superscripts and other
Output … mathematical
formatting allowed

Analysis of Algorithms 27
The Random Access Memory
(RAM) Model
A CPU

An potentially unbounded
bank of memory cells, 2
1
each of which can hold an 0
arbitrary number or
character
Memory cells are numbered and accessing
any cell in memory takes unit time.
Analysis of Algorithms 28
Primitive Operations
Basic computations
Examples:
performed by an algorithm  Evaluating an
Identifiable in pseudocode expression
Largely independent from the  Assigning a value
to a variable
programming language  Indexing into an
Assumed to take a constant array
amount of time in the RAM  Calling a method
model  Returning from a
method

Analysis of Algorithms 30
Growth Rate of Running Time
Changing the hardware/ software
environment
 Affects T(n) by a constant factor, but
 Does not alter the growth rate of T(n)
The linear growth rate of the running
time T(n) is an intrinsic property of
algorithm arrayMax

Analysis of Algorithms 31
Seven Important Functions
Seven functions that
often appear in 1E+29
algorithm analysis: 1E+27 Cubic
 Constant  1 1E+25 Quadratic
1E+23
 Logarithmic  log n 1E+21 Linear
 Linear  n 1E+19
1E+17
 N-Log-N  n log n
T(n)
1E+15
 Quadratic  n2 1E+13
 Cubic  n3 1E+11
1E+9
 Exponential  2n 1E+7
1E+5
In a log-log chart, the 1E+3
1E+1
slope of the line 1E-1
corresponds to the 1E-1 1E+2 1E+5 1E+8

growth rate of the n


function
Analysis of Algorithms 32
Constant Factors
1E+25
The growth rate is 1E+23
Quadratic

not affected by 1E+21


Quadratic
Linear
1E+19
 constant factors or 1E+17 Linear
 lower-order terms 1E+15
T(n)
1E+13
Examples 1E+11
 102n  105 is a linear 1E+9
1E+7
function 1E+5
 105n2  108n is a 1E+3
quadratic function 1E+1
1E-1
1E-1 1E+2 1E+5 1E+8
n

Analysis of Algorithms 33
Math you need to Review
Summations
Logarithms and Exponents
properties of logarithms:
logb(xy) = logbx + logby
logb (x/y) = logbx - logby
logbxa = alogbx
logba = logxa/logxb
properties of exponentials:
a(b+c) = aba c
Proof techniques abc = (ab)c
Basic probability ab /ac = a(b-c)
b = a logab
bc = a c*logab

Analysis of Algorithms 34

You might also like