Physical Data Organization: Department of Computer Science

UVA DEPARTMENT OF COMPUTER SCIENCE
Physical Data Organization
Database design using logical model of the database

- appropriate level for users to focus on
- user independence from implementation details
Performance
- other major factor in user satisfaction
- depends on
- efficient data structures for data representation
- efficiency of system operation on those structures
Disk access
- one of the most critical factors in performance
- main memory is in general not big enough for entire DB
- recovery problem with main memory DB
- disk contains data files and system files including
data dictionary and index files
Physical-1
Storage Media Hierarchy
Storage medium: primary storage and secondary storage

- database is stored physically on some some storage medium
- primary storage: can be operated directly by CPU
--- main memory & cache
- secondary storage: larger capacity, lower cost, slower access;
cannot be operated directly by CPU; must be copied to primary
Hierarchy
- access speed, cost per unit of data, reliability
- cache: fastest and most costly
- main memory
- flash memory: limited number of writes (also slow)
non-volatile: disk-substitute in embedded systems
- magnetic disk and optical disk (CD-ROM)
- tape storage: sequential access; for backup and archival
Physical-2
Disk Access and Buffer Management
Disk
- direct access storage device (not sequential)
- arm movement involves seek time and latency time
- goal is to reduce # of disk access and seek time
- a block need not to be transferred every time
- buffer blocks: closely related with concurrency control
and recovery strategy of the database system
Buffer management
- goal is to increase hit ratio
- similar to virtual memory management in OS
- differences: forced writing for recovery and
MRU (most recently used first) replacement algorithm
- priority-based replacement: data dictionary and
index blocks have high priority
Physical-3
RAID
Redundant arrays of independent disks

- motivation: large # of small disks might be cost
effective; higher reliability and higher performance
Higher reliability by redundancy
- mirroring/shadowing: a logical disk consists of two
physical disks --- write on both
Higher performance by parallelism
- data striping: splitting data across multiple disks
- bit-level or block-level striping
- with n disks, block i will go to disk (i mod n) + 1
RAID levels
- to provide redundancy at lower cost using disk striping
combined with error-correcting bits, instead of mirroring
Physical-4
File Organization
File
- a sequence of records mapped unto disk blocks
- block: unit of data transfer between disk and memory
- block size ranges from 512 bytes to few Kbytes
- fixed-length records vs variable-length records
Fixed-length records
- size of each field is declared
- when delete, mark it to be ignored: searching for
deleted free space may not be efficient
- use pointer for free space: danger of dangling pointer
which no longer points to the desired record
- problem of interblock records: needs 2 accesses
... block i) (block i+1 ...
----------
record j
Physical-5
Variable-length Records
When such situations occur?

- multiple record types in one file
- record type allows variable length fields
- repeating groups (multiple values)
Methods to deal with them
- byte string representation: special end-of-record
symbol ( ) at the end of each record
- each record is a string of consecutive bytes
- difficulty in reusing the space of deleted record
- fixed-length representation:
1) reserved space for expected maximum length
- useful only if most are close to max. length
2) a list of fixed-length records chained by pointers
3) anchor block (first record of the chain) and overflow
block (all the others) chained by pointers
Physical-6
Mapping Data to Files
Relational database
- straight-forward
- in most cases, each relation in a separate file
File organization
- how to organize a given set of records in files
- heap file: any record can be placed anywhere (no ordering)
- sequential file: records are stored in a sequential order
- hashing file: hash function computes the specific block
for the record based some attribute value
- clustering file: records of different relations stored on
the same file/block for efficient processing
- related records can be read by one block read
- may be inefficient for other operations
Physical-7
Efficient Searching
Additional structures help searching

- associated with files to make the search for
records based on certain field more efficient
- for direct data locating w/o sequential search
- two approaches: indexing and hashing
Sequential file
- records are chained together by pointers
for fast retrieval in search key order
- records are stored physically in search key order
to minimize the number of block accesses
- difficult to maintain the physical sequential
order as records are inserted and deleted
- binary search for files can be done on the blocks
rather than on the records, if block address are
available in the file header
Physical-8
Index Structures
Index file
- index is usually defined on a single field
of a record (index field)
- index file is for fast random access
Dense index
- one index record for every search-key value
- faster access but higher overhead
Sparse index
- index records for only some of the records
- less faster but less overhead
(Brighton) (record: Brighton, ..) (Brighton)

(Downhill) (record: Downhill, ..)
(Marinion) (record: Marinion, ..) (Marinion)
dense index sparse index
Physical-9
Index Structures
Hierarchy of index
- multi-level index for a large index file
- index tree (search tree)
Primary and secondary index
- primary index is the one whose search key specifies
the sequential order of the file
- secondary index: index other than primary one
- secondary index improves the performance of queries
that use keys other than the primary search key
- modifying DB imposes a serious overhead on secondary
index (compared to the primary index)
- dense index is desirable than sparse index for
secondary index, since the file is not ordered
physically according to the secondary index
Physical-10
Clustering Index
Clustering field
- a non-key field that does not have a distinct value
for each record, on which records of a file are
physically ordered
Clustering index
- clustering index is to speed up retrieval of records
that have the same value for the clustering field
- differs from primary index which requires that ordering
field should have a distinct value for each record
Physical-11
Index File
Index file size

- index file for a primary index need substantially
fewer blocks than the data file
- why?
- fewer index entries: an entry exists for each
block of data file rather than for each record
- index entry is smaller in size than a data record:
only two fields (key value and block pointer)
Blocking factor (bfr)
- savings in disk block accesses
- bfr = block size (B) / record length (R)
Physical-12
Index File: Example
An ordered file with 30,000 records, B = 1 Kb, R = 100 bytes

- bfr = 10; data file needs 3000 blocks
- binary search would require (log2 Blocks) = 12 accesses
- with ordering key field of 9 bytes and block pointer

of 6 bytes, size of primary index entry = 15 bytes
- bfr = block size (B) / record length (R) = 68
- total # of index entries: 3000
- # of blocks needed for the index = (3000/68) = 45
- binary search on index file would require
(log2 Bi ) = (log2 45) = 6 accesses
- search for a record using the primary index
6 (for index) + 1 (for data) = 7 accesses
Physical-13
Search Tree
Disadvantage of indexed sequential file organization

- performance degradation as file grows
- file reorganization can avoid this performance
degradation with its own overhead
Search tree
- a special type of tree used to guide the search
for a record given the value of one of its fields
- in a search tree of order p, each node contains
at most p 1 search values and p pointers in the order
<P1 , K1 , ..., Pq 1 , Kq 1 , Pq >, where q p
Pi : pointer to a child node or null pointer
Ki : search key value from some ordered set of
values (all search key values are assumed to be unique)
for all values X in the subtree pointed by Pi , we have
Ki 1 <X<Ki for 1<i<q, X<Ki for i=1, and Ki 1 <X for i=q
Physical-14
B-tree Index Files
B-tree (balanced tree)

- a search tree with some additional constraints
for efficient insertion and deletion
- number of access is fixed
Formal definition
A B-tree of order n is a search tree that satisfies
1) the root has at least two children
2) all nodes other than root have at least n/2 children
3) all leaf nodes are at the same level (balanced)
Insertion and deletion
- insertion may need split when a node becomes too large
(more than n children)
- deletion may need combining if a node becomes too small
(less than n/2 pointers)
- balance property must be maintained
Physical-15
B-tree and B+-tree
Node structure of B-tree

<P1 , (K1 , Pr1 ), P2 , ..., (Kq 1 , Prq 1 ), Pq >
Pi : tree pointer to point another node
Ki : search key value
Pri : data pointer to point record whose search key
field value is Ki (or the data block containing it)
- within each node, K1 < K2 < .. <Kq 1
- for all values X in the subtree pointed by Pi , we have

Ki 1 <X<Ki for 1<i<q, X<Ki for i=1, and Ki 1 <X for i=q
- a node with q tree pointers, q p, has q 1 search
key field values, and hence q 1 data pointers
B+ -tree: a variation of B-tree data structure
- most widely used multi-level index implementation
<P1 , K1 , ..., Pq 1 , Kq 1 , Pq >, where q p
- at leaf node, it is <K1 , Pr1 , ..., Kq 1 , Prq 1 , Pnext >
where Pnext points to the next leaf node of the tree
Physical-16
B+-tree
Requirements for maintaining B+ -tree

- every node must contain at least n/2 pointers
except for the root (which should have at least 2)
- balanced: for ensuring good performance
Searching for key field value K
1) visit the root node, looking for the smallest key
value greater than K. Suppose the value is Ki .
2) follow pointer Pi to another node
- if K < K1 , then follow P1
- if K > Kmax , then follow Pmax
3) repeat step 2 until reaching a leaf node
Physical-17
Differences of B+-tree from B-tree
1. In B+ -tree, data pointers are stored only at the leaf nodes

- more entires can be packed into internal (non-leaf) nodes
of a B+ -tree than for a similar B-tree
- for the same block (node) size, the order p will be larger
for the B+ -tree than for the B-tree --- improved search time
- B-tree eliminates redundant storage of search key values
- faster search in some cases to find desired search key
values before reading a leaf node in B-tree
2. Leaf and non-leaf nodes are of the same size in B+ -tree,
while in B-tree, non-leaf nodes are larger
- complication in storage management for index structures
3. Deletion in B-tree is more complicated
- in B+ -tree, deleted entry always appears in a leaf
- in B-tree, it can be a non-leaf node, requiring
replacement by the proper value from the subtree
of the node containing the deleted entry
Physical-18

Physical Data Organization: Department of Computer Science

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Physical Data Organization: Department of Computer Science

Uploaded by

Copyright:

Available Formats

UVA DEPARTMENT OF COMPUTER SCIENCE

Physical Data Organization

Database design using logical model of the database

Storage Media Hierarchy

Storage medium: primary storage and secondary storage

Disk Access and Buffer Management

Redundant arrays of independent disks

When such situations occur?

Mapping Data to Files

Additional structures help searching

(Brighton) (record: Brighton, ..) (Brighton)

Index file size

Index File: Example

An ordered file with 30,000 records, B = 1 Kb, R = 100 bytes

- with ordering key field of 9 bytes and block pointer

Disadvantage of indexed sequential file organization

B-tree Index Files

B-tree (balanced tree)

B-tree and B+-tree

Node structure of B-tree

- for all values X in the subtree pointed by Pi , we have

Requirements for maintaining B+ -tree

Differences of B+-tree from B-tree

1. In B+ -tree, data pointers are stored only at the leaf nodes

You might also like