Query Optimization
Unary Relational Operations
SELECT (symbol: σ (sigma))
PROJECT (symbol: π (pi))
RENAME (symbol: ρ (rho))
Relational Algebra Operations From Set Theory
UNION ( ∪ ), INTERSECTION ( ∩ ), DIFFERENCE (or MINUS, – )
CARTESIAN PRODUCT ( × )
Binary Relational Operations
JOIN (several variations of JOIN exist)
Additional Relational Operations
OUTER JOINS
AGGREGATE FUNCTIONS (symbol: ℱ; these compute summaries of information,
for example SUM, COUNT, AVG, MIN, MAX)
Example: To list each employee's first and last name and salary, the following is
used:
π_{LNAME, FNAME, SALARY}(EMPLOYEE)
DEPT_MGR ← DEPARTMENT ⋈_{MGRSSN=SSN} EMPLOYEE
MGRSSN=SSN is the join condition
Combines each department record with the employee who
manages the department
The join condition can also be specified as
DEPARTMENT.MGRSSN = EMPLOYEE.SSN
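The PROJECT and JOIN examples above can be sketched in a few lines of Python. This is only an illustration: the sample rows and the helper names `project` and `theta_join` are hypothetical, and only the relation and attribute names come from the slides.

```python
# Hypothetical sample rows; only the attribute names come from the slides.
EMPLOYEE = [
    {"SSN": "111", "FNAME": "Ana", "LNAME": "Ruiz", "SALARY": 40000},
    {"SSN": "222", "FNAME": "Ben", "LNAME": "Okoye", "SALARY": 55000},
]
DEPARTMENT = [
    {"DNAME": "Research", "DNUMBER": 5, "MGRSSN": "222"},
]


def project(relation, attributes):
    """PROJECT: keep only the listed attributes and drop duplicate tuples."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


def theta_join(left, right, condition):
    """JOIN: pair each left tuple with every right tuple satisfying the condition."""
    return [{**l, **r} for l in left for r in right if condition(l, r)]


# pi_{LNAME, FNAME, SALARY}(EMPLOYEE)
names_and_salaries = project(EMPLOYEE, ["LNAME", "FNAME", "SALARY"])

# DEPT_MGR <- DEPARTMENT join_{MGRSSN=SSN} EMPLOYEE
DEPT_MGR = theta_join(DEPARTMENT, EMPLOYEE, lambda d, e: d["MGRSSN"] == e["SSN"])
```

Each tuple of `DEPT_MGR` carries the department attributes together with the attributes of the employee who manages it.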
[Figure: database system environment. Labels: Programmers/users; Database System; DBMS Software; Software: Query Processing; Database; Database Definition]
Query
-> Scanning, Parsing, Validating
-> Intermediate form of Query (Query Tree)
-> Query Optimizer (with Catalog)
-> Execution Plan
-> Query Code Generator
-> Compiled Query Executable Code
-> Code Execution in Runtime Processor
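One way to look at the "Execution Plan" stage is to ask a real DBMS what plan its optimizer chose. The sketch below uses SQLite through Python's standard `sqlite3` module purely as a convenient example; it is not the system these slides describe, and the column types in the two-table schema are assumptions.

```python
import sqlite3

# In-memory database with a small two-table schema (column types are assumed).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE R (A TEXT, B INTEGER, C INTEGER);
    CREATE TABLE S (C INTEGER, D TEXT, E INTEGER);
    """
)

# EXPLAIN QUERY PLAN returns the optimizer's chosen plan instead of running the query.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT B, D FROM R, S WHERE R.C = S.C AND R.A = 'c' AND S.E = 2"
).fetchall()

for row in plan_rows:
    print(row[-1])  # the last column is a human-readable description of the step

conn.close()
```

The exact wording of the plan depends on the SQLite version, so no particular output is shown here.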
R(A,B,C)
S(C,D,E)
SELECT B, D
FROM R, S
WHERE R.C=S.C AND
R.A = "c" AND
S.E = 2
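Plan I
π_{B,D} [ σ_{R.A="c" ∧ S.E=2 ∧ R.C=S.C} (R × S) ]
(σ corresponds to the WHERE clause; × corresponds to FROM R, S)
Plan II
π_{B,D} [ σ_{R.A="c"}(R) ⋈ σ_{S.E=2}(S) ]
(⋈ is the natural join)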
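Plan II on sample data (π_{B,D} is applied to the natural join of the selected tuples):

R              σ_{R.A="c"}(R)    σ_{S.E=2}(S)     S
A  B  C        A  B  C           C   D  E         C   D  E
a  1  10       c  2  10          10  x  2         10  x  2
b  1  20                         20  y  2         20  y  2
c  2  10                         30  z  2         30  z  2
d  2  35                                          40  x  1
e  3  45                                          50  y  3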
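To make the difference between the two plans concrete, here is a minimal sketch that runs both on the R and S tuples from the slide and counts how many tuple pairs each plan has to form. The function names `plan_one` and `plan_two` are hypothetical, and pair counts are only a rough stand-in for real cost.

```python
# Tuples from the slide: R(A, B, C) and S(C, D, E).
R = [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)]
S = [(10, "x", 2), (20, "y", 2), (30, "z", 2), (40, "x", 1), (50, "y", 3)]


def plan_one(R, S):
    """Plan I: Cartesian product first, then one big selection, then projection."""
    product = [(r, s) for r in R for s in S]
    kept = [
        (r, s)
        for r, s in product
        if r[0] == "c" and s[2] == 2 and r[2] == s[0]
    ]
    return {(r[1], s[1]) for r, s in kept}, len(product)


def plan_two(R, S):
    """Plan II: select on each relation first, then join, then projection."""
    r_selected = [r for r in R if r[0] == "c"]
    s_selected = [s for s in S if s[2] == 2]
    pairs = [(r, s) for r in r_selected for s in s_selected]
    return {(r[1], s[1]) for r, s in pairs if r[2] == s[0]}, len(pairs)


result_one, pairs_one = plan_one(R, S)
result_two, pairs_two = plan_two(R, S)
print(result_one == result_two, pairs_one, pairs_two)
```

Both plans return the same (B, D) result, but Plan I forms all 5 × 5 pairs while Plan II only pairs the tuples that survive the early selections.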
Copyright © 2016 Ramez Elmasri and Shamkant B. Navathe
Questions for Query Optimization
Which relational algebra expression, equivalent to the given
query, will lead to the most efficient solution plan?
SQL query
-> parse -> parse tree
-> convert -> logical query plan (l.q.p.)
-> apply laws -> "improved" l.q.p.
-> estimate result sizes (using statistics) -> l.q.p. + sizes
-> consider physical plans -> {P1, P2, ...}
-> estimate costs -> {(P1,C1), (P2,C2), ...}
-> pick best -> Pi
-> execute -> answer
Processing Steps
50 tuples in Branch; about 5 London branches.
Reading Staff and Branch again requires (1000 + 50) disk accesses.
Joining Staff and Branch on branchNo produces 1000 tuples
(1 employee : 1 branch).
Reading the joined relation to check the predicate requires 1000 disk accesses.
What if the Staff and Branch relations were 10x larger? 100x?
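The scaling question can be explored with a little arithmetic. The sketch below adds up only the disk accesses listed on this slide, assuming one access per tuple and a join result with one tuple per Staff tuple (1 employee : 1 branch). The function name is hypothetical, and any cost of writing the intermediate join result is not counted because the slide does not list it.

```python
def listed_disk_accesses(staff_tuples, branch_tuples):
    """Sum the accesses named on the slide, assuming one disk access per tuple."""
    read_inputs = staff_tuples + branch_tuples  # read Staff and Branch
    joined_tuples = staff_tuples                # 1 employee : 1 branch
    read_joined = joined_tuples                 # re-read the join to check the predicate
    return read_inputs + read_joined


# Slide sizes (1000 Staff, 50 Branch), then 10x and 100x larger relations.
for scale in (1, 10, 100):
    print(scale, listed_disk_accesses(1000 * scale, 50 * scale))
```

Under these assumptions the count grows linearly with relation size, so a plan that filters before joining saves proportionally more as the relations grow.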
Heuristic Optimization
GOAL:
Use relational algebra equivalence rules to improve the
expected performance of a given query tree.
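(5) σ_p(R × S) = R ⋈_p S
[Figure: query-tree visuals of the rules, labelled "Visual of 4", with predicates p, q and q ∧ p; for rule 5, σ_p above × over R and S equals ⋈_p over R and S]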
Note: The above is an incomplete list! For a complete list, see the text.
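Rule (5), a selection over a Cartesian product equals a join with the same predicate, can be checked on small data. The sketch below reuses the R(A, B, C) and S(C, D, E) tuples from the earlier slide; the helper names are hypothetical, and the predicate R.C = S.C is chosen only as an example of p.

```python
from itertools import product

# R(A, B, C) and S(C, D, E) tuples from the earlier slide.
R = [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)]
S = [(10, "x", 2), (20, "y", 2), (30, "z", 2), (40, "x", 1), (50, "y", 3)]


def cartesian(left, right):
    """Materialize every concatenated pair of tuples."""
    return [l + r for l, r in product(left, right)]


def select(pred, tuples):
    """Keep only the tuples that satisfy the predicate."""
    return [t for t in tuples if pred(t)]


def join(pred, left, right):
    """Test the predicate while pairing, without storing the full product."""
    return [l + r for l in left for r in right if pred(l + r)]


# p is R.C = S.C on the concatenated tuple (A, B, C, C, D, E).
def p(t):
    return t[2] == t[3]


lhs = select(p, cartesian(R, S))  # sigma_p(R x S)
rhs = join(p, R, S)               # R join_p S
equivalent = lhs == rhs
```

The two sides produce the same tuples; the join form simply avoids building the 25-tuple intermediate product first.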
Move selections down the query tree for the earliest possible execution
(reduce the number of tuples processed).
Break apart lists of projection attributes and move them as far down the
tree as possible, creating new projections where possible
(reduce tuple widths early).
SELECT p.ticketno
FROM Flight f, Passenger p, Crew c
WHERE f.flightNo = p.flightNo AND
f.flightNo = c.flightNo AND
f.date = '01-01-06' AND
f.to = 'FRA' AND
p.name = c.name AND
c.job = 'Pilot'
Canonical Relational Algebra Expression
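Assuming the usual meaning of canonical form (a projection over a single selection over the Cartesian product of the FROM relations), the query above would translate to the expression below. The use of ρ for the aliases f, p and c is a notational choice made here, not something taken from the slide.

```latex
\pi_{p.ticketno}\Big(
  \sigma_{\, f.flightNo = p.flightNo \;\wedge\; f.flightNo = c.flightNo
          \;\wedge\; f.date = \text{'01-01-06'} \;\wedge\; f.to = \text{'FRA'}
          \;\wedge\; p.name = c.name \;\wedge\; c.job = \text{'Pilot'}}
  \big( \rho_{f}(Flight) \times \rho_{p}(Passenger) \times \rho_{c}(Crew) \big)
\Big)
```

This is the starting point that the heuristic rules then improve, for example by pushing the selections on f.date, f.to and c.job down to their relations.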