10 Intro Data StructuresUpdated
— http://courses.cs.vt.edu/cs3114/Spring09/book.pdf
● Lecture Notes
— http://www.cis.upenn.edu/~matuszek/cit594-2014/
2
Learning Outcomes
3
Introduction to data structures
efficiently by algorithms
● Data representations and their associated operations
4
Need for data structures (1/2)
— E.g. the total space available to store the data (main memory
and disk)
— E.g. the time allowed to perform each subtask
5
Need for data structures (2/2)
6
Choosing a data structure to solve a problem
particular problem?
● By first analyzing the problem to determine the performance goals
7
Selecting the right data structure (1)
8
Selecting the right data structure (2)
● Three concerns
9
Selecting the right data structure (3)
— Are all data items inserted into the data structure at the
beginning, or are insertions interspersed with other
operations?
— Can data items be deleted?
— Are all data items processed in some well-defined order, or is
search for specific data items allowed?
● Interspersing insertions with other operations (such as deletion) and
supporting search for data both require more complex data
representations.
10
Example: A bank
A bank must support many types of transactions with its customers, but let's
consider a simple model where customers wish to open accounts, close
accounts, and add money or withdraw money from accounts.
The typical customer opens and closes accounts far less often than he/she
accesses the account. Customers are willing to wait many minutes while
accounts are created or deleted but are typically not willing to wait more than a
brief time for individual account transactions such as a deposit or withdrawal.
These observations can be considered as informal specifications for the time
constraints on the problem.
It is common practice for banks to provide human tellers or ATMs to support
customer access to account balances and updates such as deposits and
withdrawals. Special service representatives are typically provided (during
restricted hours) to handle opening and closing accounts. Teller and ATM
transactions are expected to take little time. Opening or closing an account can
take much longer.
(Continued)
11
Example: A bank (2)
ATM transactions do not modify the database significantly. For simplicity,
assume that if money is added or removed, this transaction simply changes the
value stored in an account record.
Adding a new account to the database is allowed to take several minutes.
Deleting an account need have no time constraint, because from the customer’s
point of view all that matters is that all the money be returned (equivalent to a
withdrawal). From the bank’s point of view, the account record might be removed
from the database system after business hours, or at the end of the monthly
account cycle.
When considering the choice of data structure to use in the database system
that manages customer accounts, we see that a data structure that has little
concern for the cost of deletion, but is highly efficient for search and moderately
efficient for insertion, should meet the resource constraints imposed by this
problem. Records are accessible by unique account number.
One data structure that meets these requirements is the hash table.
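As a minimal sketch of the bank example, the following assumes Java's `java.util.HashMap` stands in for the hash table. The `AccountStore` class and its method names are hypothetical and only illustrate the requirement profile described above: fast search by account number, moderately fast insertion, and deletion with no particular time constraint.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical account store keyed by unique account number.
class AccountStore {
    // Maps account number -> balance in cents.
    private final Map<Integer, Long> balances = new HashMap<>();

    // Insertion: opening an account starts it with a zero balance.
    void open(int accountNumber) {
        balances.putIfAbsent(accountNumber, 0L);
    }

    // Search: look up the record by account number.
    long balance(int accountNumber) {
        Long current = balances.get(accountNumber);
        if (current == null) {
            throw new IllegalArgumentException("no such account");
        }
        return current;
    }

    // A deposit only changes the value stored in the account record.
    void deposit(int accountNumber, long cents) {
        balances.put(accountNumber, balance(accountNumber) + cents);
    }

    // A withdrawal also only changes the stored value.
    void withdraw(int accountNumber, long cents) {
        long current = balance(accountNumber);
        if (current < cents) {
            throw new IllegalArgumentException("insufficient funds");
        }
        balances.put(accountNumber, current - cents);
    }

    // Deletion: remove the record and return the remaining money.
    long close(int accountNumber) {
        long remaining = balance(accountNumber);
        balances.remove(accountNumber);
        return remaining;
    }
}
```

Deposits and withdrawals go through a single lookup by key, which is the operation the time constraints care about most.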
12
Primitive Types VS References
● Java’s types are divided into primitive types and reference types
● The primitive types are boolean, byte, char, short, int, long, float and
double
● All non-primitive types are reference types.
● A primitive-type variable can store exactly one value of its declared type
at a time.
● For example, an int variable can store one whole number (such as 7) at
a time. When another value is assigned to that variable, its previous value is
replaced.
● Primitive-type instance variables are initialized by default.
●Variables of types byte, char, short, int, long, float and double are
initialized to 0, and variables of type boolean are initialized to false.
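The default initialization described above can be checked with a small sketch. The `Defaults` class and its field names are hypothetical; the `String` field is included only to contrast with a reference type, whose instance-variable default is `null`.

```java
// Hypothetical class used only to show default values of instance variables.
class Defaults {
    int count;      // numeric primitive: defaults to 0
    double ratio;   // numeric primitive: defaults to 0.0
    char letter;    // char: defaults to the character with code 0
    boolean flag;   // boolean: defaults to false
    String name;    // reference type: defaults to null
}

class DefaultsDemo {
    public static void main(String[] args) {
        Defaults d = new Defaults();
        System.out.println(d.count);
        System.out.println(d.ratio);
        System.out.println((int) d.letter);
        System.out.println(d.flag);
        System.out.println(d.name);

        // A primitive variable holds one value at a time:
        // assigning 9 replaces the 7 stored before.
        int x = 7;
        x = 9;
        System.out.println(x);
    }
}
```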
13
Primitive Types VS References
● Programs use variables of reference types to store the locations of
objects in memory
14
Data Types
● A type is a collection of values (e.g. boolean, integer)
●A data item is a piece of information whose value is drawn from a type. A data
item is said to be a member of a type.
● Composite type
15
Abstract Data Type
● What does ‘abstract’ mean?
— From Latin: to ‘pull out’—the essentials
— To defer or hide the details
● Abstraction emphasizes essentials and defers the details, making
order to drive it
— What’s the car’s interface?
— What’s the implementation?
● Hiding the details of implementation is called encapsulation (data hiding)
16
E.g. Float
● You don't need to know much about how floating-point arithmetic works
to use float
●Indeed, the details can vary depending on processor, even virtual
coprocessor
● But the compiler hides all the details from you; some numeric ADTs are
built-in
● All you need to know is the syntax and meaning of the operators +, -, *, /,
etc.
17
ADT = Properties + Operations
● An ADT describes a set of objects sharing the same properties and
behaviors
● The properties of an ADT are its data (representing the internal state of
each object)
— double d;
● The behaviors of an ADT are its operations or functions (operations on
each instance)
— sqrt(d) / 2;
● Thus, an ADT couples its data and operations
● OOP emphasizes data abstraction
18
Abstract Data Type and Data Structure
●An abstract data type (ADT) is the realization of a data type as a software
component. The interface of the ADT is defined in terms of a type and a set
of operations on that type. The behavior of each operation is determined by
its inputs and outputs.
●An ADT does not specify how the data type is implemented. These
implementation details are hidden from the user of the ADT and protected
from outside access (a concept referred to as encapsulation).
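The split between an ADT and a data structure that implements it can be sketched in Java as an interface plus a class. The names `IntStack` and `ArrayIntStack` are hypothetical; the array-based representation is just one possible choice and is hidden from users of the interface.

```java
import java.util.Arrays;

// The ADT: a type and a set of operations, with no implementation details.
interface IntStack {
    void push(int value);
    int pop();
    boolean isEmpty();
}

// One possible data structure that implements the ADT.
class ArrayIntStack implements IntStack {
    // Hidden representation: callers cannot touch these fields.
    private int[] items = new int[4];
    private int size = 0;

    @Override
    public void push(int value) {
        if (size == items.length) {
            // Grow the backing array when it is full.
            items = Arrays.copyOf(items, size * 2);
        }
        items[size++] = value;
    }

    @Override
    public int pop() {
        if (size == 0) {
            throw new IllegalStateException("stack is empty");
        }
        return items[--size];
    }

    @Override
    public boolean isEmpty() {
        return size == 0;
    }
}
```

Code written against `IntStack` keeps working if `ArrayIntStack` is later replaced by, say, a linked implementation, which is the point of encapsulation.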
19
Array
●An array
— is the fundamental contiguously-allocated data structure
20
Arrays (contd)
Advantages
● Quick insertion
— We normally insert at the end of the array
● Constant time access given the index
— Because the index of each element maps directly to a
particular memory address, we can access arbitrary data items
instantly provided we know the index.
● Space efficiency
Disadvantages
● We cannot adjust the size in the middle of a program (but we can use
the concept of dynamic array to circumvent this problem)
● Slow search
● Slow deletion
— Deleting an element from the middle of the array requires a
lot of effort to readjust the positions of the elements that
come after the deleted element
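The array trade-offs listed above can be sketched with a small fixed-capacity wrapper. `FixedIntArray` and its method names are hypothetical and exist only to make the cost of each operation visible.

```java
// Hypothetical fixed-capacity array wrapper illustrating the trade-offs.
class FixedIntArray {
    private final int[] data;
    private int count = 0;

    FixedIntArray(int capacity) {
        data = new int[capacity];
    }

    // Quick insertion: write at the end, no shifting needed.
    void append(int value) {
        if (count == data.length) {
            // The size cannot be adjusted once the array is created.
            throw new IllegalStateException("array is full");
        }
        data[count++] = value;
    }

    // Constant-time access given the index.
    int get(int index) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return data[index];
    }

    // Slow search: every element may have to be inspected.
    int indexOf(int target) {
        for (int i = 0; i < count; i++) {
            if (data[i] == target) {
                return i;
            }
        }
        return -1;
    }

    // Slow deletion: later elements are shifted one slot to the left.
    void removeAt(int index) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        for (int i = index; i < count - 1; i++) {
            data[i] = data[i + 1];
        }
        count--;
    }

    int size() {
        return count;
    }
}
```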
21
Strings
● List of characters
22
LinkedLists
●Linked lists are dynamic data structures whereby one item in the list
points to the next item in the list, using pointers.
●Pointers are connections that hold the pieces of a linked list together
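A minimal sketch of a singly linked list follows, where each node holds a reference (Java's form of a pointer) to the next node. The class names `ListNode` and `SinglyLinkedList` are hypothetical.

```java
// One item in the list; `next` is the link that holds the list together.
class ListNode {
    final int value;
    ListNode next;

    ListNode(int value, ListNode next) {
        this.value = value;
        this.next = next;
    }
}

class SinglyLinkedList {
    private ListNode head; // null when the list is empty
    private int size = 0;

    // Insert at the front: only one link changes, nothing is shifted.
    void addFirst(int value) {
        head = new ListNode(value, head);
        size++;
    }

    // Search by following the links from the head.
    boolean contains(int value) {
        for (ListNode n = head; n != null; n = n.next) {
            if (n.value == value) {
                return true;
            }
        }
        return false;
    }

    // Remove the first occurrence by relinking around it.
    boolean remove(int value) {
        ListNode previous = null;
        for (ListNode n = head; n != null; previous = n, n = n.next) {
            if (n.value == value) {
                if (previous == null) {
                    head = n.next;
                } else {
                    previous.next = n.next;
                }
                size--;
                return true;
            }
        }
        return false;
    }

    int size() {
        return size;
    }
}
```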
23
Linked lists
● Types of LinkedLists:
24
Overview of Data Structures
LinkedLists (contd)
● Advantages (as compared to an array)
● Disadvantages
— Require extra space for storing pointers
25
Doubly-linked lists
26
Stacks
27
Stacks
● Disadvantages
— Slow access to other items
28
Queues
29
Queue
●A linear data structure in which data can be added to one end and
retrieved from the other.
●Just like the queue of the real world, the data that goes first into the queue
important
●Examples include queues at counters in any real-life application, where the
●Advantages
● Disadvantages
— Slow access to other items
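The first-in, first-out behaviour described above can be sketched with `java.util.ArrayDeque` used through the `Queue` interface. The `CounterQueue` class and the `serveInOrder` method are hypothetical names for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class CounterQueue {
    // Customers join at one end and are served from the other,
    // so the order served equals the order of arrival.
    static List<String> serveInOrder(List<String> arrivals) {
        Queue<String> line = new ArrayDeque<>();
        for (String customer : arrivals) {
            line.offer(customer);       // add at the tail
        }
        List<String> served = new ArrayList<>();
        while (!line.isEmpty()) {
            served.add(line.poll());    // remove from the head
        }
        return served;
    }
}
```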
30
Priority queues
31
Priority Queue
●To model applications whereby tasks are processed in a specific order (neither
LIFO nor FIFO).
●Useful in simulations, particularly for maintaining a set of future events ordered by
time so that we can quickly retrieve what the next thing to happen is.
— They are called "priority" queues because they let you retrieve items
not by insertion time (as in a stack or queue), but by which item has the
highest priority.
●Implemented using an array
●Advantages
●Disadvantages
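The simulation use case above can be sketched with `java.util.PriorityQueue`, which keeps a binary heap in an array. The `Event` class and the `runInTimeOrder` method are hypothetical; the smallest time is treated as the highest priority.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical future event in a simulation.
class Event {
    final double time;
    final String name;

    Event(double time, String name) {
        this.time = time;
        this.name = name;
    }
}

class EventSimulation {
    // Returns event names in the order they would happen,
    // regardless of the order in which they were inserted.
    static List<String> runInTimeOrder(List<Event> events) {
        PriorityQueue<Event> pending =
                new PriorityQueue<>(Comparator.comparingDouble((Event e) -> e.time));
        pending.addAll(events);

        List<String> order = new ArrayList<>();
        while (!pending.isEmpty()) {
            // poll() always removes the event with the smallest time.
            order.add(pending.poll().name);
        }
        return order;
    }
}
```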
32
Deques
33
Maps
34
Sets
● It is an abstract data type that stores values in no particular order
and with no repeated values
●It is a computer implementation of the mathematical concept of a finite set
●Unlike most other collection types, rather than retrieving a specific element
set T
● Implementations
— Arrays, Linked-lists, trees, or hash tables
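A short sketch of set behaviour, assuming the hash table implementation via `java.util.HashSet`. The `SetDemo` class and the `distinct` method are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;

class SetDemo {
    // Builds a set from the given values; repeated values are stored once.
    static Set<String> distinct(String[] values) {
        Set<String> result = new HashSet<>();
        for (String value : values) {
            result.add(value); // add() has no effect if the value is present
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> s = distinct(new String[] {"a", "b", "a"});
        // The typical question asked of a set is membership.
        System.out.println(s.contains("b"));
        System.out.println(s.contains("z"));
        System.out.println(s.size());
    }
}
```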
35
Hash Table
37
Hash Maps
38
Hash Maps (open addressing)
39
Binary Trees
40
Binary Search Trees
● Represented graphically in the form of a tree (upside-down) with the root
41
Binary Search Trees (contd)
● Dynamic data structure, which means that its size is only limited by
● Labels each node with a single key, such that for any node labeled x, all
nodes in the left subtree of x have value < x, and all nodes in the right
subtree of x have value >= x.
● Advantages
— Rapid search
● Disadvantages
— Deletion is complex
● Balanced variants of the BST include the top-down 2-3-4 tree and the red-black tree
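The ordering rule above (smaller keys to the left, greater-or-equal keys to the right) can be sketched as an unbalanced binary search tree. `BstNode` and `BinarySearchTree` are hypothetical names, and deletion is left out because, as noted, it is the complex case.

```java
class BstNode {
    final int key;
    BstNode left;   // subtree with keys < key
    BstNode right;  // subtree with keys >= key

    BstNode(int key) {
        this.key = key;
    }
}

class BinarySearchTree {
    private BstNode root;

    void insert(int key) {
        root = insert(root, key);
    }

    // Walk down until an empty spot is found, going left for
    // smaller keys and right for greater-or-equal keys.
    private BstNode insert(BstNode node, int key) {
        if (node == null) {
            return new BstNode(key);
        }
        if (key < node.key) {
            node.left = insert(node.left, key);
        } else {
            node.right = insert(node.right, key);
        }
        return node;
    }

    // Rapid search: each comparison discards one whole subtree.
    boolean contains(int key) {
        BstNode node = root;
        while (node != null) {
            if (key == node.key) {
                return true;
            }
            node = (key < node.key) ? node.left : node.right;
        }
        return false;
    }
}
```

Search is only rapid while the tree stays reasonably balanced; inserting keys in sorted order into this sketch produces a chain, which is what the balanced variants avoid.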
42
Heaps
43
Trees
44
Parse trees
45
Tries
46
Graphs
— Road network
— Internet
list
47
Graphs (contd)
● Advantages
● Disadvantages
48
Directed graphs (Digraphs)
49
Undirected graphs
50
Typical Graph Applications
distances.
● Modelling a water supply network. A cost might relate to current or a
function of capacity and length. As water flows in only one direction, from
higher to lower pressure connections or downhill, such a network is
inherently a directed acyclic graph.
● Modelling the recent contacts of someone who has become ill with a
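As a sketch of the water supply application, a directed graph can be stored as an adjacency list and queried for reachability. The `Digraph` class, its method names, and the node labels used in the test are all hypothetical; edge costs are omitted to keep the example short.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class Digraph {
    // Adjacency list: each node maps to the nodes it flows into.
    private final Map<String, List<String>> adjacency = new HashMap<>();

    // Adds a one-way connection, e.g. a pipe from higher to lower pressure.
    void addEdge(String from, String to) {
        adjacency.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        adjacency.computeIfAbsent(to, k -> new ArrayList<>());
    }

    // Depth-first search: can anything flowing out of `source` reach `target`?
    boolean reaches(String source, String target) {
        Set<String> visited = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        stack.push(source);
        while (!stack.isEmpty()) {
            String node = stack.pop();
            if (node.equals(target)) {
                return true;
            }
            if (!visited.add(node)) {
                continue; // already explored
            }
            List<String> next = adjacency.getOrDefault(node, Collections.emptyList());
            for (String neighbour : next) {
                stack.push(neighbour);
            }
        }
        return false;
    }
}
```

Because every edge is one-way, reachability is not symmetric: water can get from a source to a consumer but not back.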
51
Typical Graph Applications (contd)
52
Data structures as tools
Arrays (or lists) alone
A more complete toolset
53