Chapter 1

Data Structures and Algorithms
(CoSc2092)
Chapter contents
▪Introduction to Data structures
✓Classification of data structure
✓Abstract Data Types
✓Abstraction
▪Algorithms
✓Properties of algorithms
✓Algorithm Analysis Concept
✓Complexity Analysis
▪Asymptotic analysis
Introduction
▪What is Data Structure?
✓It is a set of procedures to define, store, access and manipulate data
✓It is the organized collection of data
✓It is the logical or mathematical model of a particular organization
of data
▪ Program = Data Structure + Algorithm
✓Data Structure is the way data is organized in a computer’s memory so
that it can be used effectively and efficiently.
✓Algorithm is the sequence of computational steps to solve a problem.
▪ Data Structure is a language construct that the programmer has
defined in order to implement an abstract data type.
The easiest way to understand DS
▪Let’s understand data structures with the help of real-life examples.
▪Examples:
✓In a dictionary, words are arranged alphabetically, so you can
quickly and efficiently search for and find a term.
✓A business cash-in/cash-out statement arranges data in orderly
columns, so, as with certain data structures, it is straightforward to
aggregate and extract the data.
Why Data Structure is Important?
▪ Data structures organize data => more efficient programs.
▪ More powerful computers => more complex applications.
▪ More complex applications demand more calculations.
▪ Complex computing tasks are unlike our everyday experience
▪ The choice of data structure and algorithm can make the
difference between a program running in a few seconds or many
days.
▪ Almost nothing in computing can be done without data.
Why Data Structure is Important?
▪ Computer Science deals with storing, organizing and retrieving data
effectively.
▪ A programmer must get data from the user or from other sources and use
it.
▪ Data structures are an essential ingredient in creating fast and powerful algorithms.
▪ For this reason, every software program or system uses data
structures.
▪ Each data structure has costs and benefits.
▪ Rarely is one data structure better than another in all situations.
▪ A data structure requires:
✓Space for each data item it stores,
✓Time to perform each basic operation,
✓Programming effort.
Classification of Data Structures
▪ There are two types of data structure available for programming purposes:
✓Primitive data structure: a fundamental type built into the language (e.g. int, char,
float) that stores a single value of only one type.
✓Non-primitive data structure: a structure built from primitive types, often user-defined, that
stores a collection of values, possibly of different types, in a single entity (e.g. arrays,
structures, linked lists, stacks, queues, trees).
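A minimal C++ sketch of the two categories above. The names Student and averageMark are hypothetical and exist only for illustration: the int total is a primitive value, while Student groups values of different types in a single entity.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record type, used only for illustration.
// Non-primitive: groups values of different types in one entity.
struct Student {
    std::string name;        // text
    int age;                 // whole number
    std::vector<int> marks;  // a collection of whole numbers
};

// Returns the average of a student's marks.
double averageMark(const Student& s) {
    assert(!s.marks.empty());  // an average needs at least one mark
    int total = 0;             // primitive: a single int value
    for (int m : s.marks) {
        total += m;
    }
    return static_cast<double>(total) / s.marks.size();
}
```

For a student with marks 70, 80 and 90, averageMark returns 80.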
Abstraction
▪ An abstract view of a data structure provides only the interface to
which the data structure must adhere.
▪ The interface does not give any specific details about how something should be
implemented or in what programming language.
▪ Abstraction: the technique of hiding the internal details from the user and
showing only the necessary details.
▪ Encapsulation: the technique of combining the data and the member
functions that operate on it in a single unit.
What is Abstract Data Type?
▪ An Abstract Data Type is composed of:
✓A collection of data / set of values
✓A set of operations on data
▪ The specification of an ADT indicates
✓What the ADT operations do, not how to implement them
▪ Implementation of an ADT
✓Includes choosing a particular data structure
✓A data structure is a construct that can be defined in a programming language to
store a collection of data.
What is ADT?
▪ ADT consists of an abstract data structure and operations
▪ Specifies:
✓What can be stored in the ADT
✓What operations can be done on/by the ADT
▪ E.g. if we are going to model employees of an organization:
✓ADT stores employees,
• keeping their relevant attributes and
• discarding irrelevant attributes.
✓ADT supports
• hiring, firing, retiring, … operations.
▪ An ADT tells what is to be done, and a data structure tells how it is to be
done.
▪ In other words, we can say that ADT gives us the blueprint while data
structure provides the implementation part.
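The blueprint/implementation split can be sketched in C++ as follows. This is only one possible design under stated assumptions: the names Employee, EmployeeRegistry and VectorRegistry are hypothetical, only two attributes are kept, and employee ids are assumed to be positive.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Only the attributes relevant to the model are kept (hypothetical choice).
struct Employee {
    int id;
    std::string name;
};

// The ADT: states WHAT can be done, not HOW it is done.
class EmployeeRegistry {
public:
    virtual ~EmployeeRegistry() = default;
    virtual void hire(const Employee& e) = 0;
    virtual bool fire(int id) = 0;  // true if an employee was removed
    virtual std::size_t count() const = 0;
};

// One possible data structure behind the ADT: a dynamic array.
class VectorRegistry : public EmployeeRegistry {
public:
    void hire(const Employee& e) override {
        assert(e.id > 0);  // assumed convention: ids are positive
        staff_.push_back(e);
    }

    bool fire(int id) override {
        for (auto it = staff_.begin(); it != staff_.end(); ++it) {
            if (it->id == id) {
                staff_.erase(it);
                return true;
            }
        }
        return false;
    }

    std::size_t count() const override { return staff_.size(); }

private:
    std::vector<Employee> staff_;  // hidden implementation detail
};
```

Code written against EmployeeRegistry keeps working if VectorRegistry is later replaced by, say, a linked-list or tree-based implementation.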
▪ There are lots of formalized and standard Abstract data types
✓such as Stacks, Queues, Trees, etc.
What is Algorithm?
▪ It is a process or a set of rules required to perform calculations or
other problem-solving operations, especially by a computer.
▪ The formal definition:
✓An algorithm is a finite set of instructions carried out in a
specific order to perform a specific task.
✓It is not the complete program or code; it is just the solution (logic) of a
problem, which can be represented as an informal description,
a flowchart, or pseudocode.
Properties of an Algorithm
▪ The following are the characteristics of an algorithm:
✓ Input
✓ Output
✓ Finiteness
✓ Definiteness
✓ Sequence
✓ Feasibility
✓ Correctness
✓ Language Independent
Why do we need Algorithms?
▪ We need algorithms for the following reasons:
✓Scalability:
• Algorithms help us understand scalability.
• When we have a big real-world problem, we need to break it down into small
steps to analyze it easily.
✓Performance:
• Real-world problems are not always easily broken down into smaller steps.
• If a problem can be easily broken into smaller steps, the problem is
feasible.
Importance of Algorithms
▪ Theoretical importance:
✓When a real-world problem is given to us, we break the problem into
small modules.
✓To break down the problem, we should know all the theoretical aspects.
▪ Practical importance:
✓Theory is incomplete without practical
implementation.
✓So, the importance of algorithms is both theoretical and
practical.
▪ Issues of Algorithms
▪ The following issues arise while designing an algorithm:
✓How to design algorithms
✓How to analyze algorithm efficiency
Algorithm Analysis Concept
▪ It refers to the process of determining the amount of computing time and
storage space required by different algorithms.
▪ It is a process of predicting the resource requirement of algorithms in a
given environment.
▪ Main resources are:
✓Running Time
✓Memory Usage
✓Communication Bandwidth
Algorithm Analysis
▪ An algorithm can be analyzed at two levels:
✓First, before implementing the algorithm, and
✓Second, after implementing the algorithm.
▪ The two kinds of analysis are:
✓A priori analysis:
• Theoretical analysis of an algorithm, done before implementing
the algorithm. Factors such as processor speed are assumed to be
constant and to have no effect on the result.
✓A posteriori analysis:
• Practical analysis of an algorithm.
• It is carried out by implementing the algorithm in a
programming language.
• It measures how much running time and space
the algorithm takes.
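A minimal sketch of the practical (after-implementation) style of analysis, assuming a simple summation as the algorithm under test. The function names sumAll and timeSumMicros are hypothetical; the timing uses the standard std::chrono::steady_clock, and the measured value will differ from machine to machine and from run to run.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// The algorithm being measured: sums all elements.
long long sumAll(const std::vector<int>& data) {
    long long total = 0;
    for (int v : data) {
        total += v;
    }
    return total;
}

// Runs sumAll on n ones and returns the elapsed wall-clock time in microseconds.
long long timeSumMicros(int n) {
    assert(n >= 0);
    std::vector<int> data(static_cast<std::size_t>(n), 1);
    auto start = std::chrono::steady_clock::now();
    volatile long long result = sumAll(data);  // volatile keeps the call from being optimized away
    auto stop = std::chrono::steady_clock::now();
    (void)result;
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling timeSumMicros with growing values of n gives an empirical picture of how the running time scales on one particular machine.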
Complexity Analysis of Algorithm
▪ Complexity Analysis of Algorithm is the systematic study of the cost of
computation measured either:
✓in time units or
✓in operations performed, or
✓in the amount of storage space required
▪ The goal is to have a meaningful measure that permits comparison of
algorithms independent of operating platform.
Complexity Analysis of Algorithm
▪ The performance of an algorithm can be measured by two factors:
✓Time complexity:
• The amount of time required to complete the execution.
• Is usually expressed in big O notation, an asymptotic
notation for representing time complexity.
• Is mainly calculated by counting the number of steps needed to finish the
execution.
✓Space complexity:
• The amount of space required to solve a problem and produce an output.
• Like time complexity, space complexity is also expressed in big O
notation.
How to measure the efficiency of algorithms?
▪ There are two approaches, Empirical & Theoretical.
▪ Empirical:
✓Programming the competing algorithms and trying them on different instances.
▪ Theoretical:
✓Determining mathematically the quantity of resources
(execution time, memory space, etc.) needed by each algorithm.
▪ However, it is difficult to use actual clock-time as a consistent
measure of an algorithm’s efficiency, because clock-time can vary
based on many things. For example:
✓Specific processor speed, Current processor load
✓Specific data for a particular run of the program (Input Size, Input
Properties)
✓Operating Environment
Cont...
▪ An algorithm’s performance depends on internal and external
factors.
▪ Internal:
✓The algorithm’s efficiency, in terms of:
• Time required to run
• Space (memory storage) required to run
▪ External:
✓Factors outside the algorithm that affect its performance:
• Size of the input to the algorithm
• Speed of the computer on which it is run
• Quality of the compiler
Cont...
▪ What metric should be used to judge the performance of an algorithm?
✓Length of the program (lines of code)
✓Ease of programming (bugs, maintenance)
✓Memory required
✓Running time
▪ Running time is the dominant standard.
✓Quantifiable and easy to compare
▪ An algorithm may run differently depending on:
✓The hardware platform (PC, processor speed)
✓The programming language (C, Java, C++)
✓The programmer (you, me)
Phases of Complexity analysis
▪ Complexity analysis involves two distinct phases.
▪ Algorithm Analysis:
✓Analysis of the algorithm or data structure to produce a function T(n)
that describes the algorithm in terms of the operations performed in order
to measure the complexity of the algorithm.
▪ Order of Magnitude Analysis:
✓Analysis of the function T(n) to determine the general complexity
category to which it belongs.
Complexity analysis Rules
▪ We assume an arbitrary time unit.
▪ Execution of one of the following operations takes time 1:
✓Assignment Operation
✓Single Input/output Operation
✓Single Boolean Operation
✓Single Arithmetic Operation
✓Function Return
Cont...
▪ Running time of a selection statement (if, switch) is:
✓The time for the condition evaluation + The maximum of the running times
for the individual clauses in the selection.
▪ Loops:
✓Running time for a loop is equal to:
• The running time for the statements inside the loop * number of iterations.
✓The total running time of a statement inside a group of nested loops is:
• The running time of the statements multiplied by the product of the sizes of all
the loops.
✓For nested loops, analyze inside out.
✓Always assume that the loop executes the maximum number of iterations
possible.
▪ Running time of a function call is:
✓1 for setup + the time for any parameter calculations + the time required
for the execution of the function body.
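The nested-loop rule above can be checked with a small C++ sketch. The function name innerExecutions and its parameters are hypothetical; it simply runs two nested loops and counts how many times the innermost statement executes, which should equal the product of the two loop sizes.

```cpp
#include <cassert>

// Runs two nested loops and counts how often the innermost statement executes.
long long innerExecutions(int outerSize, int innerSize) {
    assert(outerSize >= 0 && innerSize >= 0);
    long long executions = 0;
    for (int i = 0; i < outerSize; ++i) {      // outer loop: outerSize iterations
        for (int j = 0; j < innerSize; ++j) {  // inner loop: innerSize iterations each time
            ++executions;                      // the statement being counted
        }
    }
    return executions;
}
```

With both sizes equal to n, the innermost statement runs n * n times, which is why such a loop nest contributes a term proportional to n^2 to T(n).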
Example: Time Units to Compute
void func()
{
    int x = 0;                          // 1 for the first assignment statement: x=0;
    int i = 0;                          // 1 for the second assignment statement: i=0;
    int j = 1;                          // 1 for the third assignment statement: j=1;
    int n;                              // declaration only, not counted
    cout << "Enter an Integer value";   // 1 for the output statement
    cin >> n;                           // 1 for the input statement
    while (i < n)                       // first while loop: n+1 tests
    {
        x++;                            // n loops of 2 units for the two
        i++;                            // increment (addition) operations
    }
    while (j < n)                       // second while loop: n tests
    {
        j++;                            // n-1 increments
    }
}
T(n) = 1+1+1+1+1 + (n+1) + 2n + n + (n-1) = 5n+5 = O(n)
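As a sanity check on the count above, the sketch below replays func() for a given n and adds one unit for every assignment, input/output statement, loop test and increment. The function name countUnits is hypothetical, the input and output statements are only counted (not performed), and the closed form 5n + 5 is assumed to apply for n >= 1.

```cpp
#include <cassert>

// Replays func() for a given n and counts time units using the chapter's rules.
long long countUnits(int n) {
    assert(n >= 1);        // the closed form 5n + 5 assumes n >= 1
    long long units = 0;
    int x = 0; units++;    // first assignment
    int i = 0; units++;    // second assignment
    int j = 1; units++;    // third assignment
    units++;               // output statement (counted, not performed)
    units++;               // input statement (counted, not performed)
    while (true) {
        units++;           // one test of i < n
        if (!(i < n)) break;
        x++; units++;      // increment of x
        i++; units++;      // increment of i
    }
    while (true) {
        units++;           // one test of j < n
        if (!(j < n)) break;
        j++; units++;      // increment of j
    }
    (void)x;               // x is only incremented, never read
    return units;
}
```

For example, countUnits(10) gives 55, which matches 5n + 5 with n = 10.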
Asymptotic Analysis
▪ When analyzing the running time of an algorithm, we want three
things:
✓The analysis should be machine-independent.
✓It should hold for all possible inputs.
✓It should be general and not programming language-specific.
▪ However, we do not perform an exact analysis; we perform an approximate
one. We are interested in the increase in the running time with
the increase in the input size.
▪ Thus, we are only interested in the ‘growth of the function’, which
we can measure and analyze through asymptotic analysis.
Asymptotic Analysis
▪ The term asymptotic means approaching a value or curve
arbitrarily closely (i.e., as some sort of limit is taken).
▪ Asymptotic analysis:
✓It is a general methodology to compare or to find the efficiency of any
algorithm.
✓We measure the efficiency in terms of the growth of the function.
• The growth of any function depends on how much the running time is increasing
with the increase in the size of the input.
▪ Asymptotic analysis of an algorithm refers to defining the
mathematical boundary of its run-time performance.
Asymptotic Analysis
▪ The time required by an algorithm falls under three types.
✓Best Case:
• Minimum time required for program execution.
✓Average Case:
• Average time required for program execution.
✓Worst Case
• Maximum time required for program execution.
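The three cases can be illustrated with linear search, used here as an assumed example rather than one taken from the slides. The hypothetical function linearSearch reports how many comparisons it made: one in the best case (key at the front) and data.size() in the worst case (key at the end or absent).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Linear search that also reports how many comparisons it made.
// Returns the index of key, or -1 if key is absent.
int linearSearch(const std::vector<int>& data, int key, int& comparisons) {
    comparisons = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        ++comparisons;
        if (data[i] == key) {
            return static_cast<int>(i);
        }
    }
    // A miss inspects every element: this is the worst case.
    assert(comparisons == static_cast<int>(data.size()));
    return -1;
}
```

On a list of six elements, finding the first element costs 1 comparison, while finding the last element or searching for a missing key costs 6.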
Asymptotic Notation
▪ Asymptotic Notation:
✓Whenever we want to analyze an algorithm, we need to
calculate its complexity.
✓But the calculated complexity of an algorithm does not give
the exact amount of resources required.
✓So instead of the exact amount of resources, we represent the
complexity in a general form (notation) that captures the basic nature
of the algorithm.
✓We use that general form (notation) for the analysis process.
✓So asymptotic notation is a language that allows us to analyze an
algorithm's running time by identifying its behavior as the input size for
the algorithm increases.
✓It is a mathematical representation of the algorithm's complexity.
Asymptotic Notation
▪ In asymptotic notation:
✓When we want to represent the complexity of an algorithm, we use only
the most significant terms in the complexity of that algorithm and ignore
least significant terms in the complexity of that algorithm
• Here, complexity can be Space Complexity or Time Complexity.
▪ Example: T(n) = c_k n^k + c_{k-1} n^{k-1} + c_{k-2} n^{k-2} + … + c_1 n + c_0
✓too complicated
✓too many terms
✓Difficult to compare two expressions, each with 10 or 20 terms
▪ Do we really need that many terms?
Asymptotic Notation
▪ Keep just one term!
✓the fastest growing term (dominates the runtime)
▪ No constant coefficients are kept
✓Constant coefficients affected by machines, languages, etc.
▪ Asymptotic behavior (as n gets large) is determined entirely by the leading
term.
✓Example. T(n) = 10n^3 + n^2 + 40n + 800
• If n = 1,000, then T(n) = 10,001,040,800
• The error is only about 0.01% if we drop all but the leading n^3 term.
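The arithmetic in this example can be reproduced with a short C++ sketch. The function names fullT, leadingT and errorPercent are hypothetical; the polynomial is the one from the example, and the error is measured relative to the full value of T(n).

```cpp
#include <cassert>

// T(n) = 10n^3 + n^2 + 40n + 800, the example polynomial from the text.
double fullT(double n) { return 10 * n * n * n + n * n + 40 * n + 800; }

// Only the leading term, 10n^3.
double leadingT(double n) { return 10 * n * n * n; }

// Relative error, in percent, caused by dropping the lower-order terms.
double errorPercent(double n) {
    assert(n > 0);
    return (fullT(n) - leadingT(n)) / fullT(n) * 100.0;
}
```

At n = 1,000 the dropped terms add up to 1,040,800 out of 10,001,040,800, which is roughly 0.01%. At n = 10 the same terms account for more than 11% of T(n), so the approximation only becomes good as n grows.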
Asymptotic Notation
▪ We mainly use three types of asymptotic notation to represent
the complexity of an algorithm: Big-Oh (O), Big-Omega (Ω) and Big-Theta (Θ).
Asymptotic Notation
▪ Big-Oh Notation (O):
✓is used to define an upper bound on the time
complexity of an algorithm.
✓indicates the maximum time required by an algorithm over all
input values.
✓is commonly used to describe the worst case of an algorithm's time complexity.
▪ Consider function f(n) as the time complexity of an algorithm and g(n)
as its most significant term.
✓If there exist constants c > 0 and n0 >= 1 such that f(n) <= c.g(n) for all n >= n0,
✓then we can represent f(n) as O(g(n)).
✓f(n) = O(g(n))
Asymptotic Notation
▪ Example:
▪ Consider the following f(n) and g(n)
✓f(n) = 3n + 2
✓g(n) = n
▪ If we want to represent f(n) as O(g(n)) then it must satisfy
✓f(n) <= c.g(n) for all n >= n0, for some constants c > 0 and n0 >= 1
✓f(n) <= c.g(n) ⇒ 3n + 2 <= cn
▪ The above condition is TRUE for
✓c = 4 and all n >= 2.
▪ By using Big-Oh notation we can represent the time complexity as follows:
✓3n + 2 = O(n)
Asymptotic Notation
▪ Big-Omega Notation (Ω):
✓is used to define a lower bound on the time
complexity of an algorithm.
✓indicates the minimum time required by an algorithm over all
input values.
✓is commonly used to describe the best case of an algorithm's time complexity.
✓If there exist constants c > 0 and n0 >= 1 such that f(n) >= c.g(n) for all n >= n0,
✓then we can represent f(n) as Ω(g(n)).
✓f(n) = Ω(g(n))
Asymptotic Notation
▪ Example:
▪ Consider the following f(n) and g(n)
✓f(n) = 3n + 2
✓g(n) = n
▪ If we want to represent f(n) as Ω(g(n)) then it must satisfy
✓f(n) >= c.g(n) for all n >= n0, for some constants c > 0 and n0 >= 1
✓f(n) >= c.g(n) ⇒ 3n + 2 >= cn
✓This holds for c = 1 and all n >= 1.
▪ By using Big-Omega notation we can represent the time complexity as
follows:
✓3n + 2 = Ω(n)
Asymptotic Notation
▪ Big-Theta Notation (Θ):
✓is used to define a tight bound (both an upper and a lower bound) on the time
complexity of an algorithm.
✓indicates that the time required by an algorithm grows at the same rate as g(n) for
all sufficiently large input values.
✓is often loosely associated with the average case, although strictly it bounds the growth rate from both sides.
✓If there exist constants c1 > 0, c2 > 0 and n0 >= 1 such that c1.g(n) <= f(n) <= c2.g(n) for all n >= n0,
✓then we can represent f(n) as Θ(g(n)).
✓f(n) = Θ(g(n))
Asymptotic Notation
▪ Example:
▪ Consider the following f(n) and g(n).
✓f(n) = 3n + 2
✓g(n) = n
▪ If we want to represent f(n) as Θ(g(n)) then it must satisfy
✓c1.g(n) <= f(n) <= c2.g(n) for all n >= n0, for some constants c1 > 0, c2 > 0 and n0 >= 1
✓c1.g(n) <= f(n) <= c2.g(n) ⇒ c1n <= 3n + 2 <= c2n
✓This holds for c1 = 1, c2 = 4 and all n >= 2.
▪ By using Big-Theta notation we can represent the time complexity as
follows:
✓3n + 2 = Θ(n)
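The constants chosen in the Big-Oh, Big-Omega and Big-Theta examples can be spot-checked numerically. The sketch below is not a proof, since it only tests a finite range of n; the function names f, g and sandwiched, and the upper limit of the range, are assumptions made for illustration.

```cpp
#include <cassert>

// f(n) = 3n + 2 and g(n) = n, as in the worked examples.
long long f(long long n) { return 3 * n + 2; }
long long g(long long n) { return n; }

// Checks c1*g(n) <= f(n) <= c2*g(n) for every n in [n0, limit].
bool sandwiched(long long c1, long long c2, long long n0, long long limit) {
    assert(c1 > 0 && c2 > 0 && n0 >= 1);
    for (long long n = n0; n <= limit; ++n) {
        if (c1 * g(n) > f(n) || f(n) > c2 * g(n)) {
            return false;
        }
    }
    return true;
}
```

With c1 = 1, c2 = 4 and n0 = 2 the check passes for every tested n. Starting from n0 = 1 it fails, because f(1) = 5 is greater than 4 * g(1) = 4, which is why the examples require n >= 2 for the upper bound.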
End of Chapter!
