
Introduction - Data structure

Lecture-Module1

Data Structure

Data structures in computer science deal with the problem of structuring or organizing data in a computer so that it can be used efficiently and processed by algorithms.

To a certain extent, data structure ideas are independent of the language or notation used to express data structures. Often a carefully chosen data structure will allow a more efficient algorithm to be used.

Cont

Data organized in different ways requires different algorithms (for example, searching). The types of data structures expressible in a given programming language are determined by the features of that language. Data structures are implemented using the data types, references and operations on them provided by a programming language.
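As a small illustration of how the organization of data determines the search algorithm, the sketch below contrasts a linear search, which works on data in any order, with a binary search, which is only valid when the data is kept sorted. The function names and the sample values are hypothetical and chosen only for this example.

```python
from bisect import bisect_left


def linear_search(items, target):
    """Scan the elements one by one; works for data in any order."""
    for index, value in enumerate(items):
        if value == target:
            return index
    return -1


def binary_search(sorted_items, target):
    """Locate target by repeated halving; requires sorted data."""
    index = bisect_left(sorted_items, target)
    if index < len(sorted_items) and sorted_items[index] == target:
        return index
    return -1


unsorted = [42, 7, 19, 3, 25]       # hypothetical sample data
ordered = sorted(unsorted)          # [3, 7, 19, 25, 42]
print(linear_search(unsorted, 19))  # 2
print(binary_search(ordered, 19))   # 2
```

Linear search inspects up to n elements, while binary search needs about log2(n) comparisons, but only because the list is sorted first.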

Cont

Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to certain tasks. In the design of many types of programs, the choice of data structures is a primary design consideration for the quality and performance of the final result.

After the data structures are chosen, the algorithms to be used often become relatively obvious.

Algorithm

An algorithm is a step-by-step procedure, or set of instructions, for solving a problem by a computer. Alternatively, it is a finite set of instructions which, if followed, accomplishes a particular task. Algorithms and data structures are two important building blocks of a program. Many data structures have associated algorithms for working on that particular data structure.

Cont

An algorithm must satisfy the following criteria:

Input:
There are zero or more quantities supplied externally.

Output:
At least one quantity is produced.

Definiteness:
Each instruction is clear and unambiguous.

Finiteness:
It terminates after a finite number of steps.

Effectiveness:
Each instruction is basic enough to be carried out manually.

Cont

A program is an implementation of an algorithm in a computer programming language. The basic difference between an algorithm and a program is that an algorithm can be written in a pseudo language, whereas a program is written in a programming language (PL). Also, a program may be non-terminating (for example, an operating system), whereas an algorithm must terminate.

Cont

The study of algorithms can be divided into four distinct areas:
How to devise algorithms
How to express algorithms
How to validate algorithms
How to analyze algorithms

Devising Algorithm

It requires the study of various design techniques such as divide and conquer (binary search, merge sort) and dynamic programming (based on a sequence of decisions).

Dynamic programming is not a specific algorithm, but rather a programming technique.
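To make the divide and conquer technique concrete, here is a minimal merge sort sketch: the list is divided into two halves, each half is sorted recursively, and the sorted halves are merged. The function name and the sample input are illustrative assumptions, not part of the lecture.

```python
def merge_sort(values):
    """Return a new sorted list using divide and conquer."""
    # Base case: a list of zero or one element is already sorted.
    if len(values) <= 1:
        return list(values)

    # Divide: split the list into two halves and sort each recursively.
    middle = len(values) // 2
    left = merge_sort(values[:middle])
    right = merge_sort(values[middle:])

    # Combine: merge the two sorted halves into one sorted list.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sort([38, 27, 43, 3, 9, 82, 10]))  # [3, 9, 10, 27, 38, 43, 82]
```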

Expressing Algorithm

Good algorithms are expressed using the principles of structured programming.

In structured programming, each block of code has one entry point and one exit point.

Validation of Algorithm

Algorithms are validated for correctness on all possible data. For this purpose test data is used. Test data should include correct, incorrect and exceptional cases.

Analysis of Algorithm

Analysis of an algorithm is the study of its behavior pattern or performance profile. Performance is measured in terms of the computing time and the storage space the algorithm requires on a machine.

Data Structures and Algorithms


Algorithm: an outline, the essence of a computational procedure, step-by-step instructions
Program: an implementation of an algorithm in some programming language
Data structure: the organization of data needed to solve the problem

Top Down Design Model


In the top-down model for evolving a system, an overview of the system is formulated, without going into detail for any part of it. Each part of the system is then refined by designing it in more detail.

Each new part may then be refined again, defining it in yet more detail until the entire specification is detailed enough to validate the model.

Top Down Concept in Problem Solving


This design model can also be applied while developing an algorithm. It refers to the successive (stepwise) division (refinement) of a problem (task) into subproblems (subtasks). Refinement is applied until we reach the stage where the subtasks can be carried out directly.

Top Down Design

[Figure: Main Task with subtask1, subtask2 and subtask3]

Bottom-up Design

In bottom-up design individual parts of the system are specified in detail. The parts are then linked together to form larger components, which are in turn linked until a complete system is formed. Strategies based on this bottom-up information flow seem potentially necessary and sufficient because they are based on the knowledge of all variables that may affect the elements of the system.

Bottom Up Design

[Figure: Main Task with subtask1, subtask2 and subtask3]

Cont

Top-down programming is a programming style, the mainstay of traditional procedural languages, in which design begins by specifying complex pieces and then dividing them into successively smaller pieces. Eventually, the components are specific enough to be coded and the program is written.

This is the exact opposite of the bottom-up programming approach, which is common in object-oriented languages such as C++ or Java, where objects are identified first.

Example

Given total sales of commodities in 100 districts, find the largest difference between sales of any two districts.

1st level division
1. Get the values of sales of 100 districts
2. Find the largest difference
3. Display the desired result

Steps 1 and 3 are fairly simple. We have to elaborate step 2 further.

Expansion of step 2
2.1 Compute the maximum sale in Max
2.2 Compute the minimum sale in Min
2.3 Find the difference = Max - Min

Here step 2.3 is simple, but we have to refine steps 2.1 and 2.2 further.

Cont

Refinement of step 2.1

Let S_1, S_2, ..., S_100 be the sale values. Mathematically, we can write
Max_i = MAX(Max_{i-1}, S_i), i = 2, 3, ..., 100
Here Max_i is the maximum of sales 1 to i.
MAX is a function that calculates the maximum of two values.
Initially Max_1 = S_1.
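The refinement can be carried one level further into code. The sketch below follows the same top-down breakdown: one function per subtask, with the maximum computed by the recurrence given in the refinement of step 2.1 and the minimum computed symmetrically. The function names and the short sample list are assumptions made for this example; the lecture's problem uses 100 districts.

```python
def compute_max(sales):
    """Step 2.1: Max_1 = S_1, then Max_i = MAX(Max_{i-1}, S_i)."""
    current_max = sales[0]
    for sale in sales[1:]:
        current_max = max(current_max, sale)
    return current_max


def compute_min(sales):
    """Step 2.2: the same recurrence with MIN in place of MAX."""
    current_min = sales[0]
    for sale in sales[1:]:
        current_min = min(current_min, sale)
    return current_min


def largest_difference(sales):
    """Step 2: the largest difference is maximum minus minimum."""
    return compute_max(sales) - compute_min(sales)


sample_sales = [120, 340, 95, 410, 275]  # hypothetical data, 5 districts
print(largest_difference(sample_sales))  # 315
```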

Structured Programming

Structured Programming (SP) is a technique for organizing and coding programs in which a hierarchy of modules is used. Each module has a single entry point and a single exit point.

Control is passed downward through the structure without unconditional branches to higher levels of the structure. Structured programming disallows the use of goto. This technique should be used at every level of refinement.

Three types of control flow in SP


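Sequential (sequence)
entry -> T1 -> T2 -> exit

Cont.
Selection (test)
if cond then task1: when cond is true (Y), task1 is executed; when it is false (N), task1 is skipped.
if cond then task1 else task2: when cond is true (Y), task1 is executed; when it is false (N), task2 is executed.

Iteration (repetition)
The condition is tested: on Y the loop continues, on N the loop exits.
Examples: while, do-while, repeat, etc.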

Programming Paradigm

A programming paradigm provides the view that the programmer has of the execution of the program. For instance, in object-oriented programming, programmers can think of a program as a collection of interacting objects.

In functional programming, a program can be thought of as a sequence of stateless function evaluations.

Cont

The relationship between programming paradigms and programming languages can be complex, since a programming language can support multiple paradigms. For example, C++ is designed to support elements of procedural programming, object-based programming, object-oriented programming, and generic programming.

In C++, one can write a purely procedural program, a purely object-oriented program, or a program that contains elements of both paradigms.

Unstructured Programming

Unstructured programming is a programming paradigm where all code is contained in a single continuous block. This is contrary to structured programming, where programmatic tasks are split into smaller sections (known as functions or subroutines) that can be called whenever they are required.

Unstructured programming languages have to rely on execution flow statements such as goto, used in many languages to jump to a specified section of code.

Cont

Unstructured source code is difficult to read and debug.

Unstructured programming is still used in some scripting languages.

Assembly language is mostly an unstructured language, because the underlying machine code never has structure.

Correctness of Algorithms

An algorithm is correct if, for any legal input, it terminates and produces the desired output.
Automatic proof of correctness is not possible in general.

But there are practical techniques and rigorous formalisms that help to reason about the correctness of algorithms.

Assertions

To prove correctness we associate a number of assertions (statements about the state of the execution) with specific checkpoints in the algorithm. E.g., A[1], ..., A[k] form an increasing sequence.
Preconditions: assertions that must be valid before the execution of an algorithm
Postconditions: assertions that must be valid after the execution of an algorithm

Loop Invariants

Invariants: assertions that are valid any time they are reached. We must show three things about loop invariants:
Initialization: it is true prior to the first iteration
Maintenance: if it is true before an iteration, it remains true before the next iteration
Termination: when the loop terminates, the invariant gives a useful property to show the correctness of the algorithm

Code for sorting a list: insertion sort

for j <- 2 to length(A) {
    key <- A[j]
    i <- j-1
    while i > 0 and A[i] > key {
        A[i+1] <- A[i]
        i <- i-1
    }
    A[i+1] <- key
}

Example of Loop Invariants


Invariant: at the start of each iteration of the for loop, A[1..j-1] consists of the elements originally in A[1..j-1], in sorted order.
Initialization: j = 2, the invariant trivially holds because A[1] is a sorted array.
Maintenance: the inner while loop moves elements A[j-1], A[j-2], ..., A[j-k] one position right without changing their order. Then the former A[j] element is inserted into the k-th position so that A[k-1] <= A[k] <= A[k+1]. So sorted array A[1..j-1] + A[j] gives sorted array A[1..j].
Termination: the loop terminates when j = n+1. Then the invariant states: A[1..n] consists of the elements originally in A[1..n], in sorted order.
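The invariant can also be checked mechanically while the algorithm runs. The following is a minimal sketch of insertion sort with the loop invariant written as an assert. It uses 0-based indexing, so the sorted prefix is a[0..j-1] rather than A[1..j-1], and it keeps a copy of the input only to check the invariant; both choices are assumptions of this example.

```python
def insertion_sort(a):
    """Sort list a in place, asserting the loop invariant at each step."""
    original = list(a)  # kept only so the invariant can be checked

    for j in range(1, len(a)):
        # Invariant: a[0..j-1] holds the elements originally in
        # a[0..j-1], in sorted order.
        assert a[:j] == sorted(original[:j])

        key = a[j]
        i = j - 1
        # Shift elements larger than key one position to the right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key

    # Termination: the invariant now covers the whole list.
    assert a == sorted(original)
    return a


print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```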

Problem 1: Find the roots of a quadratic equation of the form a*x^2 + b*x + c = 0

Formula:
x1 = [(-b) + sqrt(b^2 - 4ac)] / (2a)
x2 = [(-b) - sqrt(b^2 - 4ac)] / (2a)

Algorithm:
input a, b, c;
d = b*b - 4*a*c;
if (d >= 0) then {
    e = sqrt(d);
    r1 = (-b + e) / (2*a);
    r2 = (-b - e) / (2*a);
    i1 = 0; i2 = 0;
}
else {
    e = sqrt(-d);
    r1 = (-b) / (2*a);
    r2 = r1;
    i1 = e / (2*a);
    i2 = -e / (2*a);
}
output r1, r2, i1, i2
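A runnable version of the quadratic algorithm might look like the sketch below. Each root is returned as a (real part, imaginary part) pair, mirroring r1, i1 and r2, i2. The function name and the check for a = 0 are additions for this example and are not part of the lecture's algorithm.

```python
import math


def quadratic_roots(a, b, c):
    """Return the two roots as (real, imaginary) pairs."""
    if a == 0:
        # Assumption: a = 0 is rejected because the equation is not quadratic.
        raise ValueError("a must be non-zero")

    d = b * b - 4 * a * c  # discriminant
    if d >= 0:
        # Real roots: the imaginary parts are zero.
        e = math.sqrt(d)
        return ((-b + e) / (2 * a), 0.0), ((-b - e) / (2 * a), 0.0)

    # Complex conjugate roots: same real part, opposite imaginary parts.
    e = math.sqrt(-d)
    real = -b / (2 * a)
    return (real, e / (2 * a)), (real, -e / (2 * a))


print(quadratic_roots(1, -3, 2))  # ((2.0, 0.0), (1.0, 0.0))
print(quadratic_roots(1, 2, 5))   # ((-1.0, 2.0), (-1.0, -2.0))
```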

Problem 2: Write an algorithm to find the factorial of n

Formula:
fact = n * (n-1) * (n-2) * ... * 2 * 1

Algorithm:
input n;
i = 0; fact = 1;        (initialization)
while (i < n) {         (looping)
    i = i + 1;
    fact = i * fact;
}
output fact

Execution
Let n = 4; i = 0; fact = 1;
while loop
0 < 4: i = 1; fact = 1 * 1 = 1;
1 < 4: i = 2; fact = 2 * 1 = 2;
2 < 4: i = 3; fact = 3 * 2 = 6;
3 < 4: i = 4; fact = 4 * 6 = 24;
4 < 4 is false, so exit the while loop;
output fact = 24
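The factorial algorithm translates almost line for line into code. In the sketch below the loop condition is taken to be i < n, which is an assumption that makes the loop run exactly n times; the rejection of negative input is also an addition for this example.

```python
def factorial(n):
    """Compute n! with the counting loop from the algorithm."""
    if n < 0:
        # Assumption: negative input is rejected; the lecture does not cover it.
        raise ValueError("n must be non-negative")

    i = 0
    fact = 1
    while i < n:
        i = i + 1
        fact = i * fact
    return fact


print(factorial(4))  # 24
```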

Problem 3: Decimal to octal conversion

Algorithm:
input num;
quot = num; i = 0;
repeat
    rem = quot mod 8;
    i = i + 1;
    octal(i) = rem;
    quot = quot div 8;
until (quot = 0)
output octal(j), j = i down to 1

Execution
num = 245
quot = 245; i = 0;
repeat
rem = 5; i = 1; octal(i) = 5; quot = 30;
rem = 6; i = 2; octal(i) = 6; quot = 3;
rem = 3; i = 3; octal(i) = 3; quot = 0;
exit of repeat
Output: 365 in base 8
Hence (245)10 = (365)8
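The repeat-until loop of the conversion algorithm has no direct Python equivalent, so the sketch below emulates it with a loop that tests the exit condition at the end of the body. The function name and the choice to return the digits as a string are assumptions of this example.

```python
def to_octal(num):
    """Convert a non-negative integer to its octal digit string."""
    if num < 0:
        raise ValueError("num must be non-negative")

    quot = num
    digits = []  # plays the role of octal(1), octal(2), ...
    while True:
        rem = quot % 8
        digits.append(rem)
        quot = quot // 8
        if quot == 0:  # the "until (quot = 0)" test
            break

    # Output the digits from the last one produced down to the first.
    return "".join(str(digit) for digit in reversed(digits))


print(to_octal(245))  # 365
```

Because the body runs at least once, an input of 0 yields the single digit 0.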

Problem 4: Reversing integer digits (3542 -> 2453)

Algorithm:
input num;
rev = 0;
while (num > 0) {
    rev = rev*10 + num mod 10;
    num = num div 10;
}
output rev

Execution
num = 245; rev = 0;
while loop
rev = 0 * 10 + 5 = 5; num = 24;
rev = 5 * 10 + 4 = 54; num = 2;
rev = 54 * 10 + 2 = 542; num = 0;
exit while
output rev as 542
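The digit-reversal loop maps directly onto integer division and remainder. A minimal sketch, with a hypothetical function name:

```python
def reverse_digits(num):
    """Reverse the decimal digits of a non-negative integer."""
    rev = 0
    while num > 0:
        rev = rev * 10 + num % 10  # append the last digit of num to rev
        num = num // 10            # drop the last digit of num
    return rev


print(reverse_digits(3542))  # 2453
print(reverse_digits(1200))  # 21
```

The second call shows a consequence of working with integers: trailing zeros of the input become leading zeros of the result and are lost.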

Problem 5: Find the maximum of 100 numbers (+ve integers)

Algorithm:
Here each number is read inside the loop, so the numbers read are not stored and are no longer available after the loop ends.
max = 0; i = 0;
while (i < 100) {
    i = i + 1;
    input num;
    if (num > max) then max = num;
}
output max
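The same idea can be sketched with any source that yields numbers one at a time. The version below takes an iterable instead of reading 100 values from input, which is an assumption made so the example is self-contained; only the running maximum is kept, never the numbers themselves.

```python
def maximum_of_stream(numbers):
    """Return the largest of a stream of positive integers."""
    current_max = 0  # valid starting value because all inputs are positive
    for num in numbers:
        if num > current_max:
            current_max = num
    return current_max


# An iterator can be consumed only once, like values typed in by a user.
print(maximum_of_stream(iter([17, 4, 98, 23])))  # 98
```

Starting from 0 relies on the numbers being positive; for arbitrary integers the first value read would be a safer initial maximum.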
