
UNIT III

DATA STORAGE AND QUERY PROCESSING


Overview of Physical Storage Media – Magnetic
Disks and Flash Storage – RAID – Tertiary
Storage – File Organization – Organization of
Records in Files – Indexing and Hashing: Basic
Concepts – Ordered Indices – B+ tree Index Files
– B+ tree Extensions – Static Hashing – Dynamic
Hashing – Query Processing Overview –
Selection Operation – Sorting – Join Operation.
Overview of Physical Storage Media

Storage media are classified into different types based on the following:
 Accessing Speed
 Cost per unit of data
 Reliability
Based on storage volatility they can be classified
into 2 types:
 Volatile storage: Loses the contents when the
power to the device is removed.
E.g.: Main memory and cache.
 Non-Volatile storage: Contents persist even when
the power is switched off.
E.g.: Secondary & Tertiary storage devices.
Storage Device Hierarchy
 Primary Storage
 Secondary Storage
 Tertiary Storage
Primary Storage:
 This category usually provides fast access to data, but has
limited storage capacity.
 It is volatile in nature.
 Example: Cache and main memory

Secondary Storage:
 These devices usually have a large capacity.
 Less cost and slower access to data
 It is non-volatile.
 E.g.: Magnetic disks
Tertiary Storage:
 This is in the lowest level of hierarchy.
 Non-volatile,
 Slow access time
 Example: Magnetic tape, optical storage
Cache Memory
 It is the fastest and most costly form of storage.
 It is volatile in nature.
 It is managed by computer system hardware.
 Cache memory lies between the CPU and the main memory.

Main Memory
 Fast access, but generally too small (and too expensive) to store the entire database.
 Capacities of up to a few gigabytes are widely used currently.
 Ex: RAM
Flash Memory
 It is present between primary storage and secondary storage in the storage hierarchy.
 It is non-volatile memory.
 Accessing speed is as fast as reading data from main memory.
 Widely used in embedded devices such as digital cameras, video games, etc.
Magnetic Disk
 Primary medium for long-term storage of data; typically stores the entire database.
 Data must be moved from disk to main memory for access and written back for storage.
 Much slower access than main memory.
 Capacities currently range up to 400 gigabytes.
 Much larger capacity than main and flash memory.
 Disk storage survives power failures and system crashes.
 Disk failure can destroy data but is very rare.
Schematic diagram of a magnetic disk
Mechanism

 Physically, disks are relatively simple. Each platter has a flat circular shape.
 Its two surfaces are covered with a magnetic material, and information is recorded on the surfaces.
 Platters are made from rigid metal or glass.
 When the disk is in use, a drive motor spins it at a constant high speed.
 There is a read-write head positioned just above the surface of the platter.
 The disk surface is logically divided into tracks, with 50,000 to 100,000 tracks per platter and 1 to 5 platters per disk.
 Each track is divided into sectors. A sector is the smallest unit of information that can be read from or written to the disk. Sector size is typically 512 bytes.
 Inner tracks are of smaller length (about 500 sectors per track), and outer tracks contain more sectors than inner tracks (about 1,000 sectors per track).

 The read-write head stores information on a sector magnetically.
 The read-write heads of all the platters are mounted on a single assembly called a disk arm.
 The disk platters mounted on a spindle and the heads mounted on a disk arm are together known as a head-disk assembly.
 The corresponding tracks of all the platters together form a cylinder.
 A disk controller interfaces between the computer system and the disk drive hardware.
 Disk controllers attach checksums to each sector to verify that data is read back correctly. The checksum is computed from the data written to the sector.
 Another task of the disk controller is remapping of bad sectors.
 There are a number of common interfaces for connecting disks to personal computers: (a) the ATA interface and (b) the SCSI (Small Computer System Interface) interface.
Performance Measures of Disks

 Access time: The time it takes from when a read or write request is issued to when data transfer begins.
 Seek time: Time it takes to reposition the arm over the correct track. The seek time ranges from 2 to 30 milliseconds. The average seek time is typically one-third to one-half of the worst-case seek time. Average seek time currently ranges between 4 to 10 milliseconds.
 Rotational latency: Time it takes for the sector to be accessed to appear under the head. It ranges from 4 to 11 milliseconds.
 Data-transfer rate: The rate at which data can be retrieved from or stored to the disk. It ranges from 25 to 100 megabytes per second.
 Mean time to failure (MTTF): The average time the disk is expected to run continuously without any failure.
 It is the measure of the reliability of the disk.
 Typically 3 to 5 years.
 Probability of failure of new disks is quite low.
 MTTF decreases as the disk ages.


Optimization of Disk-Block Access

 Techniques used for accessing data from disk:

 Scheduling
 Disk-arm scheduling algorithms order pending accesses to tracks so that disk arm movement is minimized.
 A commonly used algorithm is the elevator algorithm:
 Move the disk arm in one direction (from outer to inner tracks or vice versa), processing the next request in that direction, till no more requests remain in that direction; then reverse direction and repeat.
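The elevator algorithm above can be sketched in a few lines of Python; this is a minimal illustration (the track numbers and starting position are invented):

```python
def elevator_schedule(start, requests, direction=1):
    """Order pending track requests with the elevator (SCAN) algorithm.

    Serve requests in the current direction of arm travel; when none
    remain in that direction, reverse and sweep back.
    """
    pending = sorted(requests)
    up = [t for t in pending if t >= start]         # tracks ahead of the arm
    down = [t for t in pending if t < start][::-1]  # tracks behind, nearest first
    return up + down if direction == 1 else down + up

# Arm at track 50, moving toward higher-numbered tracks:
print(elevator_schedule(50, [95, 180, 34, 119, 11, 123, 62, 64]))
# -> [62, 64, 95, 119, 123, 180, 34, 11]
```

Note how the arm finishes its upward sweep (62...180) before reversing to serve 34 and 11, instead of zig-zagging in request-arrival order.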
 File organization:
 Optimize block access time by organizing the blocks to correspond to how data will be accessed.
 E.g.: store related information on the same or nearby cylinders.
 A sequential file may become fragmented, that is, its blocks become scattered all over the disk. Sequential access to a fragmented file results in increased disk arm movement.
 Non-volatile write buffers (NV-RAM):
 Speed up disk writes by writing blocks to a non-volatile RAM buffer immediately; the contents of NV-RAM are not lost on power failure.

 Log disk:
 A disk devoted to writing a sequential log of block updates; used exactly like non-volatile RAM. Writes to the log disk are very fast since no seeks are required.
 Journaling file systems write data in a safe order to NV-RAM or a log disk.
RAID

 Redundant Array of Independent Disks.
 Multiple secondary disks are connected together to increase performance, data redundancy, or both.
 Need:
 To increase performance
 Increased reliability
 To give greater throughput
 To allow data to be recovered after a disk failure
RAID – Level 0

 Data is striped across multiple drives.
 Data is broken down into blocks, and these blocks are stored across all the disks; thus a striped array of disks is implemented.
 There is no duplication of data in this level, so once a block is lost there is no way to recover it.
 It has good performance.
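The striping idea can be illustrated with a small sketch. Round-robin block-to-disk mapping is one common layout; the block and disk numbers here are hypothetical:

```python
def stripe_location(block_no, num_disks):
    """Map a logical block number to (disk, stripe) under RAID 0 striping.

    Blocks are distributed round-robin: block i lives on disk i mod n,
    at stripe i // n within that disk.
    """
    return block_no % num_disks, block_no // num_disks

# Four-disk array: logical blocks 0..7 alternate across disks 0..3,
# so consecutive blocks can be read from different disks in parallel.
for b in range(8):
    print(b, stripe_location(b, 4))
```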
RAID – Level 1

 Uses mirroring techniques: data in drive 1 is mirrored to drive 2.
 All data in the drive is duplicated to another drive.
 It offers 100% redundancy in case of a failure: the array will continue to work even if either disk fails.
 Advantage: Fault tolerance
RAID 10, also known as RAID 1+0, is a RAID configuration that
combines disk mirroring and disk striping to protect data. It
requires a minimum of four disks and stripes data across
mirrored pairs.
RAID – Level 2
 Uses mirroring as well as storing Error Correcting Codes (ECC) for its data, striped on different disks.
 Each data bit in a word is recorded on a separate disk, and the ECC codes of the data words are stored on a different set of disks.
 Due to its complex structure and high cost, RAID 2 is not commercially available.
RAID – Level 3

 One dedicated drive is used to store the parity information, and in case of any drive failure the data is restored using this extra drive.
 It consists of byte-level striping with dedicated parity. In this level, the parity information is stored for each disk section and written to the dedicated parity drive.
 Parity is a technique that checks whether data has been lost or written over when it is moved from one place in storage to another.
 In the case of disk failure, the parity disk is accessed and data is reconstructed from the remaining devices.
 Once the failed disk is replaced, the missing data can be restored on the new disk.
RAID – Level 4
This level is very much
similar to RAID 3 apart from
the feature where RAID 4
uses block level stripping
rather than byte level.
RAID – Level 4

 It consists of block level stripping with a


parity disk.
RAID – Level 5

 Parity information is written to a different disk in the array for each stripe. In case of a single disk failure, data can be recovered with the help of the distributed parity without affecting other read/write operations.
 RAID 5 writes whole data blocks onto different disks, but the parity blocks generated for each data-block stripe are distributed among all the disks rather than being stored on a dedicated parity disk.
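Parity-based recovery rests on XOR: the parity block is the XOR of the data blocks in a stripe, so any one lost block equals the XOR of the surviving blocks and the parity. A minimal sketch (the two-byte blocks are invented for illustration):

```python
def parity(blocks):
    """XOR the data blocks of one stripe to produce its parity block."""
    out = bytearray(len(blocks[0]))
    for blk in blocks:
        for i, byte in enumerate(blk):
            out[i] ^= byte
    return bytes(out)

def reconstruct(surviving_blocks, parity_block):
    """Rebuild the lost block of a stripe: XOR of parity and survivors."""
    return parity(surviving_blocks + [parity_block])

stripe = [b"\x0f\xaa", b"\xf0\x55", b"\x33\xcc"]
p = parity(stripe)
# If the second disk fails, its block is recoverable from the rest:
assert reconstruct([stripe[0], stripe[2]], p) == stripe[1]
```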
RAID – Level 6

 This level is an enhanced version of RAID 5, adding the extra benefit of dual parity. It uses block-level striping with dual distributed parity.
 RAID 6 is an extension of Level 5.
 In this level, two independent parities are generated and stored in a distributed fashion among multiple disks.
 Two parities provide additional fault tolerance.
 This level requires at least four disk drives to implement RAID.
 The factors to be taken into account in choosing a RAID level are:
 Performance requirements in terms of number of I/O operations.
 Performance when a disk has failed.
 Performance during rebuild.
File Organization
 A method of arranging records in a file when the file is stored on disk.
 A file is organized logically as a sequence of records.
 A record is a sequence of fields.
 Each file is also logically partitioned into fixed-length storage units called blocks, which are the units of both storage allocation and data transfer.
What is File Organization?

In simple terms, storing the files in a certain order is called file organization. File structure refers to the format of the label and data blocks and of any logical control record.

Types of File Organizations

• Sequential File Organization


• Heap File Organization
• Hash File Organization
• B+ Tree File Organization
• Clustered File Organization
Sequential File Organization –
• The easiest method of file organization is the sequential method. In this method the records are stored one after another in a sequential manner. There are two ways to implement this method:
1. Pile File Method
2. Sorted File Method
Pile File Method – This method is quite simple: we store the records in a sequence, i.e., one after the other, in the order in which they are inserted into the tables.
Insertion of a new record –
Let R1, R3, R5 and R4 be four records in the sequence. Here, records are nothing but rows in a table. Suppose a new record R2 has to be inserted in the sequence; then it is simply placed at the end of the file.
• Sorted File Method –
In this method, as the name suggests, whenever a new record has to be inserted, it is always inserted in a sorted (ascending or descending) manner. Sorting of records may be based on the primary key or any other key.

Insertion of a new record –
Let us assume that there is a pre-existing sorted sequence of four records R1, R3, R7 and R8. Suppose a new record R2 has to be inserted in the sequence; then it will be inserted at the end of the file and the sequence is then re-sorted.
Heap File Organization
 Heap file organization works with data blocks. In this method records are inserted at the end of the file, into the data blocks. No sorting or ordering is required in this method.
 If a data block is full, the new record is stored in some other block. Here the other data block need not be the very next data block; it can be any block in the memory. It is the responsibility of the DBMS to store and manage the new records.
• Insertion of a new record –
Suppose we have records R1, R5, R6 and R4 in the heap, and a new record R2 has to be inserted. Since the last data block, data block 3, is full, R2 will be inserted in any of the data blocks selected by the DBMS, say data block 1.

If we want to search, delete or update data in heap file organization, we traverse the data from the beginning of the file till we get the requested record. Thus, if the database is very huge, searching, deleting or updating a record will take a lot of time.
Hashing
In a database management system, when we want to retrieve particular data, it becomes very inefficient to search all the index values to reach the desired data. In this situation, the hashing technique comes into the picture.

• Hashing is an efficient technique to directly search the location of desired data on the disk without using an index structure. Data is stored at the data blocks whose address is generated by using a hash function. The memory location where these records are stored is called a data block or data bucket.
Hash File Organization:

• Data bucket – Data buckets are the memory locations where the records are stored. These buckets are also considered as units of storage.

• Hash function – A hash function is a mapping function that maps the set of search keys to actual record addresses. Generally, the hash function uses the primary key to generate the hash index – the address of the data block. A hash function can range from a simple mathematical function to a complex one.

• Hash index – The prefix of an entire hash value is taken as a hash index. Every hash index has a depth value to signify how many bits are used for computing the hash function.
Static Hashing:
• In static hashing, when a search-key value is provided, the hash function always computes the same address. For example, if we want to generate an address for STUDENT_ID = 104 using the mod(5) hash function, it always results in the same bucket address, 4. There will not be any changes to the bucket address here. Hence the number of data buckets in memory for static hashing remains constant throughout.

Insertion – When a new record is inserted into the table, the hash function h generates a bucket address for the new record based on its hash key K:

Bucket address = h(K)

Searching – When a record needs to be searched, the same hash function is used to retrieve the bucket address for the record. For example, if we want to retrieve the whole record for ID 104, and the hash function is mod(5) on that ID, the bucket address generated would be 4. Then we directly go to address 4 and retrieve the whole record for ID 104. Here ID acts as a hash key.

Deletion – If we want to delete a record, using the hash function we first fetch the record which is supposed to be deleted. Then we remove the record from that address in memory.

Updation – The data record that needs to be updated is first searched using the hash function, and then the data record is updated.
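These operations can be sketched with a toy static hash file, using h(K) = K mod 5 as in the example (the record values are invented):

```python
class StaticHashFile:
    """Fixed number of buckets; h(K) = K mod n always yields the same address."""

    def __init__(self, n_buckets=5):
        self.n = n_buckets
        self.buckets = [[] for _ in range(n_buckets)]

    def address(self, key):
        return key % self.n                 # the hash function h(K)

    def insert(self, key, record):
        self.buckets[self.address(key)].append((key, record))

    def search(self, key):
        for k, rec in self.buckets[self.address(key)]:
            if k == key:
                return rec
        return None

    def delete(self, key):
        addr = self.address(key)
        self.buckets[addr] = [(k, r) for k, r in self.buckets[addr] if k != key]

f = StaticHashFile()
f.insert(104, "student-104")   # h(104) = 104 mod 5 = 4 -> bucket 4
assert f.address(104) == 4
assert f.search(104) == "student-104"
```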
• Now, suppose we want to insert some new records into the file, but the data bucket address generated by the hash function is not empty, or data already exists at that address. This becomes a critical situation to handle. This situation in static hashing is called bucket overflow.

Some commonly used methods to overcome this situation are discussed below:
Open Hashing – In the open hashing method, the next available data block is used to enter the new record, instead of overwriting the older one. This method is also called linear probing. For example, suppose D3 is a new record that needs to be inserted and the hash function generates the address 105, but that bucket is already full. The system then searches for the next available data bucket, 123, and assigns D3 to it.
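A minimal sketch of linear probing; one record per bucket and the small bucket count are simplifications for illustration:

```python
def linear_probe_insert(buckets, addr, record):
    """Open hashing: if the home bucket is full, try the next bucket
    (wrapping around) until a free one is found."""
    n = len(buckets)
    for step in range(n):
        slot = (addr + step) % n
        if buckets[slot] is None:
            buckets[slot] = record
            return slot
    raise RuntimeError("hash file is full")

buckets = [None] * 8
linear_probe_insert(buckets, 5, "D1")   # home bucket 5 is free
linear_probe_insert(buckets, 5, "D3")   # bucket 5 full -> probes to bucket 6
assert buckets[5] == "D1" and buckets[6] == "D3"
```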
Closed Hashing – In the closed hashing method, a new data bucket is allocated with the same address and linked after the full data bucket. This method is also known as overflow chaining. For example, we have to insert a new record D3 into the tables. The static hash function generates the data bucket address 105, but this bucket is too full to store the new data. In this case a new data bucket is added at the end of data bucket 105 and linked to it. The new record D3 is then inserted into the new bucket.
• Quadratic probing – Quadratic probing is very similar to open hashing (linear probing), except that where linear probing steps to the next bucket by a linear difference, a quadratic function is used to determine the new bucket address.

• Double hashing – Double hashing is another method similar to linear probing. Here the difference is fixed, as in linear probing, but this fixed difference is calculated by using another hash function. That is why it is named double hashing.
Dynamic Hashing –
• The drawback of static hashing is that it does not expand or shrink dynamically as the size of the database grows or shrinks. In dynamic hashing, data buckets grow or shrink (are added or removed dynamically) as the number of records increases or decreases. Dynamic hashing is also known as extended hashing. In dynamic hashing, the hash function is made to produce a large number of values.
For example:
• Consider the following grouping of keys into buckets, depending on the
prefix of their hash address:
The last two bits of 2 and 4 are 00, so they go into bucket B0.
The last two bits of 5 and 6 are 01, so they go into bucket B1.
The last two bits of 1 and 3 are 10, so they go into bucket B2.
The last two bits of 7 are 11, so it goes into bucket B3.
Insert key 9 with hash address 10001 into the above structure:
• Since key 9 has hash address 10001 (last two bits 01), it must go into bucket B1. But bucket B1 is full, so it will get split.
• The splitting will separate 5 and 9 from 6: the last three bits of 5 and 9 are 001, so they go into bucket B1, and the last three bits of 6 are 101, so it goes into bucket B5.
• Keys 2 and 4 are still in B0. The records in B0 are pointed to by the 000 and 100 directory entries, because the last two bits of both entries are 00.
• Keys 1 and 3 are still in B2. The records in B2 are pointed to by the 010 and 110 directory entries, because the last two bits of both entries are 10.
• Key 7 is still in B3. The record in B3 is pointed to by the 111 and 011 directory entries, because the last two bits of both entries are 11.
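The bit-suffix bucketing used in this example can be sketched as follows. The binary hash addresses assumed below for keys 6 and 9 follow the text (9 hashes to 10001; 6 is assumed to hash to an address ending in 101):

```python
def bucket_of(hash_addr, depth):
    """Extendible hashing: use the last `depth` bits of the hash address
    as the bucket (directory) index."""
    return hash_addr & ((1 << depth) - 1)

# With depth 2, address 10001 (key 9) has last two bits 01 -> bucket B1.
assert bucket_of(0b10001, 2) == 0b01

# After B1 splits, three bits distinguish the keys that collided on 01:
assert bucket_of(0b10001, 3) == 0b001   # key 9 stays with key 5 in B1
assert bucket_of(0b00101, 3) == 0b101   # key 6 (address ...101) moves to B5
```

Doubling the number of bits examined is exactly what lets the bucket count grow without rehashing every record.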
Cluster File Organization
• In cluster file organization, two or more related tables/records are stored within the same file, known as a cluster. These files will have two or more tables in the same data block, and the key attributes which are used to map these tables together are stored only once.
• Thus it lowers the cost of searching and retrieving various records in different files, as they are now combined and kept in a single cluster.
For example, suppose we have two tables or relations, Employee and Department. These tables are related to each other; therefore they are allowed to be combined using a join operation and can be seen in a cluster file.
If we have to insert, update or delete any record, we can directly do so. Data is sorted based on the primary key or the key with which searching is done. The cluster key is the key with which joining of the tables is performed.
Types of Cluster File Organization – There are two ways to implement this method:
• Indexed Clusters –
In indexed clustering the records are grouped based on the cluster key and stored together. The above-mentioned example of the Employee and Department relationship is an example of an indexed cluster, where the records are grouped based on the Department ID.
• Hash Clusters –
This is very similar to an indexed cluster, with the only difference that instead of storing the records based on the cluster key, we generate a hash key value and store together the records with the same hash key value.
Indexing

 An index is a data structure that organizes data records on the disk to make the retrieval of data efficient.
Indexes are created using a few database columns.
• The first column is the Search key that contains a
copy of the primary key or candidate key of the
table. These values are stored in sorted order so that
the corresponding data can be accessed quickly.
Note: The data may or may not be stored in sorted
order.
• The second column is the Data
Reference or Pointer which contains a set of
pointers holding the address of the disk block where
that particular key value can be found.
Ordered Indices:

 Based on sorted ordering of values.
 The indices are usually sorted to make searching faster. The indices which are sorted are known as ordered indices.
Primary Index
• If the index is created on the basis of the primary
key of the table, then it is known as primary
indexing. These primary keys are unique to each
record and contain 1:1 relation between the
records.
• As primary keys are stored in sorted order, the
performance of the searching operation is quite
efficient.
The primary index can be classified into two types:
• Dense index and
• Sparse index.
Dense index

• The dense index contains an index record for every search-key value in the data file. It makes searching faster.
• In this, the number of records in the index table is the same as the number of records in the main table.
• It needs more space to store the index records themselves. The index records have the search key and a pointer to the actual record on the disk.
 Sparse Index:
 In the data file, an index record appears only for a few items. Each item points to a block.
 In this, instead of pointing to each record in the main table, the index points to records in the main table at a gap.
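A sparse index lookup can be sketched as follows, reusing the roll-number sample data from the single-level indexing example in this unit (the block layout is assumed for illustration):

```python
import bisect

def sparse_lookup(index, blocks, key):
    """Find a record via a sparse index: locate the largest index entry
    <= key, then scan sequentially within the block it points to."""
    keys = [k for k, _ in index]
    pos = bisect.bisect_right(keys, key) - 1
    if pos < 0:
        return None                      # key precedes every indexed block
    _, block_no = index[pos]
    for k, record in blocks[block_no]:   # sequential scan inside one block
        if k == key:
            return record
    return None

# One index entry per block (the block's first search key), not per record:
blocks = [[(101, "Aa"), (102, "bb")],
          [(130, "Xx"), (131, "Yy"), (132, "zz")]]
index = [(101, 0), (130, 1)]
assert sparse_lookup(index, blocks, 131) == "Yy"
```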
Clustering Index
• A clustered index can be defined as an ordered data file. Sometimes the index is created on non-primary-key columns, which may not be unique for each record.
• In this case, to identify the records faster, we group two or more columns to get a unique value and create an index out of them. This method is called a clustering index.
• The records which have similar characteristics are grouped, and indexes are created for these groups.
Secondary Index

• In sparse indexing, as the size of the table grows, the size of the mapping also grows.
• These mappings are usually kept in primary memory so that address fetches are faster. The secondary memory then searches the actual data based on the address obtained from the mapping. If the mapping size grows, fetching the address itself becomes slower.
• In this case, the sparse index will not be efficient. To overcome this problem, secondary indexing is introduced.
 In secondary indexing, to reduce the size of the mapping, another level of indexing is introduced.
 In this method, a huge range for the columns is selected initially so that the mapping size of the first level becomes small. Then each range is further divided into smaller ranges.
 The mapping of the first level is stored in primary memory, so that address fetches are faster.
 The mapping of the second level and the actual data are stored in secondary memory (hard disk).
 Single-level Indexing:
 The index is usually specified on one field of the file.
 Types of single-level indexing: primary indexing, clustering index, or secondary indexing.

 Each index entry is a (Search Key, Pointer to Record) pair. Example:

 Index (Search Key -> Address):    Data blocks (Roll No., Name, Age):
 101 -> block 1                    block 1: (101, Aa, 25), (102, bb, 20)
 120                               ...
 130 -> block 2                    block 2: (130, Xx, 32), (131, Yy, 28), (132, zz, 30)
Multi-level Indexing:
 A multilevel index is stored on the disk along with the actual database files.
 A multi-level index helps in breaking down the index into several smaller indices, in order to make the outermost level so small that it can be saved in a single disk block, which can easily be accommodated anywhere in main memory.

 Example of a two-level index: an outer index block (2, 35, 55, 85) points to inner index blocks, which in turn point to the data blocks containing the search-key values.
B+ Tree

• The B+ tree is a balanced search tree (not a binary tree: each node can have many children). It follows a multi-level index format.
• The B+ tree is a storage method with a tree-like structure.
• A B+ tree has one root, any number of intermediate nodes, and leaf nodes.
• In the B+ tree, leaf nodes hold the actual data pointers, and the leaf level stores the records in sorted order.
• An intermediate node has only pointers toward the leaf nodes; it has no data.
• The B+ tree ensures that all leaf nodes remain at the same height.
• In the B+ tree, the leaf nodes are linked using a linked list. Therefore, a B+ tree can support random access as well as sequential access.
Structure of B+ Tree
In the B+ tree, every leaf node is at equal distance from the root node.
The B+ tree is of order n, where n is fixed for every B+ tree.
It contains internal nodes and leaf nodes.

Internal node

• An internal node of the B+ tree (except the root node) contains at least n/2 record pointers.
• At most, an internal node of the tree contains n pointers.

Leaf node

• A leaf node of the B+ tree can contain at least n/2 record pointers and n/2 key values.
• At most, a leaf node contains n record pointers and n key values.
• Every leaf node of the B+ tree contains one block pointer P to point to the next leaf node.
Searching a record in a B+ Tree
• Suppose we have to search for 55 in the below B+ tree structure. First, we fetch the intermediary node, which will direct us to the leaf node that can contain a record for 55.
• So, in the intermediary node, we will find the branch between the 50 and 75 nodes. Then, at the end, we will be redirected to the third leaf node. Here the DBMS will perform a sequential search to find 55.
B+ Tree Insertion
• Suppose we want to insert a record 60 in the below structure. It will go to the 3rd leaf node, after 55. It is a balanced tree, and that leaf node is already full, so we cannot insert 60 there.
• In this case, we have to split the leaf node, so that 60 can be inserted into the tree without affecting the fill factor, balance and order.
• The 3rd leaf node would then have the values (50, 55, 60, 65, 70), and its key in the parent is 50. We split the leaf node in the middle so that the tree's balance is not altered. So we can group (50, 55) and (60, 65, 70) into two leaf nodes.
• If these two have to be leaf nodes, the intermediate node cannot branch from 50 alone. It should have 60 added to it, and then we can have a pointer to the new leaf node.
B+ Tree Deletion
• Suppose we want to delete 60 from the above example. In this case, we have to remove 60 from the intermediate node as well as from the 4th leaf node. If we remove it only from the intermediate node, the tree will not satisfy the rules of the B+ tree, so we need to modify it to keep a balanced tree.
• After deleting node 60 from the above B+ tree and re-arranging the nodes, it will appear as follows:
Query Processing

 Parsing and Translation:
 The system must translate the query into a usable form.
 A more useful internal representation is one based on the extended relational algebra.
 The parser checks the syntax of the user's query and verifies that the relation names appearing in the query are names of relations in the database.
Query Processing

 Parsing and Translation – Example:

 select salary from instructor where salary < 75000;

 This query can be translated into either of the following relational-algebra expressions:

 σ_salary<75000 (Π_salary (instructor))
 Π_salary (σ_salary<75000 (instructor))
Query Processing
 Optimization:
 During this process, the query evaluation plan is prepared from all the relational-algebra expressions.
 The query cost for all the evaluation plans is calculated.
 Amongst all equivalent evaluation plans, the one with the lowest cost is chosen.
 Cost is estimated using statistical information from the database catalog, such as the size of tuples, etc.
Query Processing

 Evaluation:
 The query execution engine takes a query evaluation plan, executes that plan, and returns the answers to the query.
Measures of Query Cost
 Many factors contribute to time cost:
 disk accesses, CPU, or even network communication.
 Typically disk access is the predominant cost, and it is also relatively easy to estimate. It is measured by taking into account:
 Number of seeks * average seek cost
 Number of blocks read * average block-read cost
 Number of blocks written * average block-write cost

Measures of Query Cost
 Cost to write a block is greater than cost to read a block:
 data is read back after being written, to ensure that the write was successful.
 For simplicity we just use the number of block transfers from disk and the number of seeks as the cost measures:
 tT – time to transfer one block
 tS – time for one seek
 Cost for b block transfers plus S seeks: b * tT + S * tS
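The cost formula can be evaluated directly; the tT and tS values below are illustrative, not measured:

```python
def query_cost(block_transfers, seeks, tT=0.1e-3, tS=4e-3):
    """Estimated query cost in seconds: b * tT + S * tS.

    Assumed illustrative constants: 0.1 ms per block transfer
    and 4 ms per seek.
    """
    return block_transfers * tT + seeks * tS

# 100 block transfers with a single seek:
print(query_cost(100, 1))   # 0.01 s of transfer + 0.004 s of seek = 0.014 s
```

Note that a single seek here costs as much as 40 block transfers, which is why the algorithms that follow try to minimize seeks, not just transfers.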


Algorithm for Selection Operation

 File scan – search algorithms that locate and retrieve records that fulfill a selection condition.
 Algorithm A1 (linear search): Scan each file block and test all records to see whether they satisfy the selection condition.
Algorithm for Selection Operation

 Algorithm A1 (linear search):
 Cost estimate = br block transfers + 1 seek
 br denotes the number of blocks containing records from relation r.
 If the selection is on a key attribute, the scan can stop on finding the record:
 average cost = (br / 2) block transfers + 1 seek
 Linear search can be applied regardless of
 the selection condition, the ordering of records in the file, or the availability of indices.
Algorithm for Selection Operation

 Algorithm A2 (binary search):
 Applicable if the selection is an equality comparison on the attribute on which the file is ordered.
 Assume that the blocks of a relation are stored contiguously.
 Cost estimate (number of disk blocks to be scanned):
 cost of locating the first tuple by a binary search on the blocks:
 ⌈log2(br)⌉ * (tT + tS)

Algorithm for Selection Operation

 Algorithm A2 (binary search):
 If there are multiple records satisfying the selection:
 add the transfer cost of the number of blocks containing records that satisfy the selection condition.
Algorithm for JOIN Operation

 Nested Loop Join – To compute the theta join r ⋈θ s:

 for each tuple tr in r do begin
     for each tuple ts in s do begin
         test pair (tr, ts) to see if they satisfy the join condition θ
         if they do, add tr · ts to the result
     end
 end
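The nested-loop pseudocode above maps directly to runnable Python; the relation contents and the join condition here are invented for illustration:

```python
def nested_loop_join(r, s, theta):
    """Compute the theta join of r and s by testing every tuple pair."""
    result = []
    for tr in r:               # r: the outer relation
        for ts in s:           # s: the inner relation
            if theta(tr, ts):  # the join condition θ
                result.append(tr + ts)
    return result

depositor = [("Jones", "A-101"), ("Smith", "A-215")]
account = [("A-101", 500), ("A-215", 700), ("A-305", 350)]

# Equijoin on the account number:
out = nested_loop_join(depositor, account, lambda tr, ts: tr[1] == ts[0])
print(out)
# -> [('Jones', 'A-101', 'A-101', 500), ('Smith', 'A-215', 'A-215', 700)]
```

Because `theta` is an arbitrary predicate, this works for any join condition, exactly as the slides note, at the price of examining every pair.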
Algorithm for JOIN Operation

 Nested Loop Join:
 r is called the outer relation and s the inner relation of the join.
 Requires no indices and can be used with any kind of join condition.
 Expensive, since it examines every pair of tuples in the two relations.
Algorithm for JOIN Operation

 Nested Loop Join:
 In the worst case, if there is enough memory only to hold one block of each relation, the estimated cost is nr * bs + br block transfers, plus nr + br seeks.
 If the smaller relation fits entirely in memory, use that as the inner relation.
 This reduces the cost to br + bs block transfers and 2 seeks.
Algorithm for JOIN Operation

 Nested Loop Join:
 Assuming worst-case memory availability, the cost estimate is:
 with depositor as the outer relation:
 5000 * 400 + 100 = 2,000,100 block transfers,
 5000 + 100 = 5100 seeks
 with customer as the outer relation:
 10000 * 100 + 400 = 1,000,400 block transfers and 10,400 seeks
Algorithm for JOIN Operation

 Block Nested Loop Join:
 A variant of nested loop join in which every block of the inner relation is paired with every block of the outer relation.
Algorithm for JOIN Operation

 Merge Join:
 Sort both relations on their join attribute (if not already sorted on the join attributes).
 Merge the sorted relations to join them.
 Can be used only for equijoins and natural joins.
 The cost of merge join is: br + bs block transfers + ⌈br / bb⌉ + ⌈bs / bb⌉ seeks,
 + the cost of sorting if the relations are unsorted.
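The merge phase can be sketched for an equijoin on two relations already sorted on the join attribute (the sample tuples are invented; duplicate-key handling is simplified):

```python
def merge_join(r, s, key=lambda t: t[0]):
    """Equijoin of two relations already sorted on the join attribute."""
    result, i, j = [], 0, 0
    while i < len(r) and j < len(s):
        kr, ks = key(r[i]), key(s[j])
        if kr < ks:
            i += 1                 # advance the relation with the smaller key
        elif kr > ks:
            j += 1
        else:
            # Join the current r-tuple with every s-tuple sharing this key.
            jj = j
            while jj < len(s) and key(s[jj]) == kr:
                result.append(r[i] + s[jj])
                jj += 1
            i += 1
    return result

r = [("A-101", "Jones"), ("A-215", "Smith")]
s = [("A-101", 500), ("A-215", 700), ("A-305", 350)]
print(merge_join(r, s))
# -> [('A-101', 'Jones', 'A-101', 500), ('A-215', 'Smith', 'A-215', 700)]
```

Each relation is scanned once, which is why the block-transfer cost is just br + bs when both inputs arrive sorted.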
Algorithm for JOIN Operation

 Hash Join:
 A hash function h is used to partition the tuples of both relations.
 Cost: 3(br + bs) + 4 * nh block transfers + 2(⌈br / bb⌉ + ⌈bs / bb⌉) seeks.
Query Optimization

 Heuristic Estimation:
 A heuristic is a rule that leads to the least cost in most cases.
 Systems may use heuristics to reduce the number of choices that must be made in a cost-based fashion.
 Heuristic optimization transforms the query tree by using a set of rules that typically improve execution performance.
Query Optimization

 Query Tree:
 SELECT schedule, room FROM Student NATURAL
JOIN Enroll NATURAL JOIN Class WHERE
Major='Math'
Query Optimization

 Heuristic Estimation – Rules:
 Perform selection early (reduces the number of tuples).
 Perform projection early (reduces the number of attributes).
 Perform the most restrictive selection and join operations before other similar operations.
Query Optimization

 Heuristic Estimation – Steps:


 Scanner and parser generate initial query representation
 Representation is optimized according to heuristic rules
 Query execution plan developed.
Query Optimization
 Cost based Estimation:
 Look at all of the possible ways or scenarios in which a
query can be executed
 Each scenario will be assigned a ‘cost’, which indicates how
efficiently that query can be run
 Pick the scenario that has the least cost and execute the
query using that scenario, because that is most efficient way
to run the query.
Query Optimization

 Cost-based Estimation:
 The scope of query optimization is a query block. Global query optimization involves multiple query blocks.
 Cost components: access cost to secondary storage, disk storage cost, computation cost, memory usage cost and communication cost.
Query Optimization
 Cost based Estimation:
 Information stored in DBMS catalog and used by optimizer:

 File Size

 Organization

 Number of levels of each multilevel index

 Number of distinct values of an attribute

 Attribute selectivity.
