
QUERY PROCESSING AND QUERY OPTIMIZATION

What is Query Processing?

• It is a three-step process (parsing and translation, optimization, evaluation) that transforms a high-level query (SQL) into an equivalent and more efficient lower-level query (in relational algebra).

Query

• A query is the statement written by the user in a high-level language, using SQL.

Parser & Translator

• Parser: Checks the syntax of the query and verifies that the relations it names exist.
• Translator: Translates the query into an equivalent relational algebra expression.

Example:
SQL> select name from customer;

RA: Π_name(customer)

Relational Algebra

• It is the query after the translator has converted it from SQL into algebraic form.

Example:
SQL> SELECT ENAME FROM EMP, ASG
     WHERE EMP.ENO = ASG.ENO AND
     DUR > 37;

RA:
1) Π_ENAME(σ_{DUR>37 ∧ EMP.ENO=ASG.ENO}(EMP × ASG))
2) Π_ENAME(EMP ⋈_ENO (σ_{DUR>37}(ASG)))

Optimizer

• It selects, among equivalent expressions, the one with the lowest estimated cost.

Example:
1) Π_ENAME(σ_{DUR>37 ∧ EMP.ENO=ASG.ENO}(EMP × ASG))
2) Π_ENAME(EMP ⋈_ENO (σ_{DUR>37}(ASG)))

The optimizer will select expression 2 because it avoids the large and expensive intermediate Cartesian product, and is therefore typically cheaper to evaluate.
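
To make the difference concrete, here is a minimal Python sketch that evaluates both expressions over small in-memory relations and reports the largest intermediate result each one materializes. The sample tuples and the function names `expression_1` and `expression_2` are assumptions made for illustration; they do not come from the slides.

```python
from itertools import product

# Hypothetical sample data: EMP(ENO, ENAME) and ASG(ENO, DUR).
EMP = [(1, "Ann"), (2, "Bob"), (3, "Cy"), (4, "Dee")]
ASG = [(1, 12), (2, 40), (3, 48), (4, 24), (2, 6)]

def expression_1(emp, asg):
    """Cartesian product first, then selection, then projection on ENAME."""
    cross = list(product(emp, asg))                          # EMP x ASG
    selected = [(e, a) for e, a in cross
                if a[1] > 37 and e[0] == a[0]]               # selection
    names = {e[1] for e, _ in selected}                      # projection
    return names, len(cross)                                 # largest intermediate

def expression_2(emp, asg):
    """Selection on ASG first, then join on ENO, then projection on ENAME."""
    long_asg = [a for a in asg if a[1] > 37]                 # selection on ASG
    joined = [(e, a) for e in emp for a in long_asg
              if e[0] == a[0]]                               # join on ENO
    names = {e[1] for e, _ in joined}                        # projection
    return names, max(len(long_asg), len(joined))            # largest intermediate

names_1, size_1 = expression_1(EMP, ASG)
names_2, size_2 = expression_2(EMP, ASG)
print(names_1 == names_2, size_1, size_2)  # prints: True 20 2
```

Both expressions return the same names, but the first one builds all 20 pairs of the product before filtering, while the second never holds more than 2 tuples at a time.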
Comparison of two relational queries

Expression 1: Π_ENAME(σ_{DUR>37 ∧ EMP.ENO=ASG.ENO}(EMP × ASG))

    Π_ENAME
       |
    σ_{DUR>37 ∧ EMP.ENO=ASG.ENO}
       |
       ×
      / \
   EMP   ASG

Expression 2: Π_ENAME(EMP ⋈_ENO (σ_{DUR>37}(ASG)))

    Π_ENAME
       |
     ⋈_ENO
      / \
   EMP   σ_{DUR>37}
            |
           ASG

Statistical Data

• A statistical database is a database used for statistical analysis purposes.
• It is an OLAP (Online Analytical Processing) system rather than an OLTP (Online Transaction Processing) system.
• In the query-processing diagram, statistical data feeds the optimizer, which uses statistics about the stored data to estimate the cost of candidate plans.

Evaluation Plan

• A relational algebra operation annotated with instructions on how to evaluate it is called an evaluation primitive.
• A sequence of primitive operations that can be used to evaluate a query is a query evaluation plan.
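
As a rough illustration of "algebra plus annotations", the following Python sketch represents an evaluation plan as a list of labelled primitives for Π_ENO(σ_{DUR>37}(ASG)). The ASG tuples, the function names, and the two access methods shown are hypothetical choices for this sketch, not something prescribed by the slides.

```python
from bisect import bisect_right

# Hypothetical ASG(ENO, DUR) tuples, kept sorted by DUR so that a
# search-based access path is available for illustration.
ASG_BY_DUR = sorted([(1, 12), (2, 40), (3, 48), (4, 24), (2, 6)],
                    key=lambda t: t[1])

def select_linear_scan(rows, min_dur):
    """Primitive: selection DUR > min_dur, evaluated by scanning every tuple."""
    return [r for r in rows if r[1] > min_dur]

def select_sorted_search(rows, min_dur):
    """Primitive: the same selection, using binary search on the DUR order."""
    durs = [r[1] for r in rows]
    return rows[bisect_right(durs, min_dur):]

def project_eno(rows):
    """Primitive: projection on ENO with set-based duplicate elimination."""
    return sorted({r[0] for r in rows})

# Two evaluation plans for the same algebra expression; only the
# annotation on the selection step differs.
plan_a = [("selection DUR>37 via linear scan",
           lambda rows: select_linear_scan(rows, 37)),
          ("projection on ENO", project_eno)]
plan_b = [("selection DUR>37 via binary search",
           lambda rows: select_sorted_search(rows, 37)),
          ("projection on ENO", project_eno)]

def run(plan, rows):
    """Apply each primitive in order, feeding its output to the next one."""
    for _label, primitive in plan:
        rows = primitive(rows)
    return rows

print(run(plan_a, ASG_BY_DUR), run(plan_b, ASG_BY_DUR))  # prints: [2, 3] [2, 3]
```

The two plans compute the same answer; choosing between them is the optimizer's job.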
EVALUATION & DATA

• The evaluation engine takes the evaluation plan as its instructions and applies it to the data.
• The information on which the query has to be performed is called data.

OUTPUT

• After the plan has been evaluated on the data, the processed information is shown as the output.

Diagram of Query Processing

[Figure: Query → Parser & Translator → Relational Algebra → Optimizer → Evaluation Plan → Evaluation → Output. Statistical Data feeds the Optimizer; Data feeds the Evaluation step.]

Measures of Query Cost

The cost of query evaluation can be measured in terms of different resources, including:
• disk accesses
• CPU time to execute a query
• in a distributed or parallel database system, the cost of communication

Query Optimization

• It is the process of selecting the most efficient query-evaluation plan from among the many strategies usually possible for processing a given query, especially if the query is complex.
• There are many different ways to compute the answer to a given query. The result is the same in all of them.
• A DBMS strives to process the query in the most efficient way (in terms of time) to produce the answer.
• Cost = time needed to get all answers
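
The idea of picking the cheapest plan by estimated time can be sketched in a few lines of Python. Everything numeric here is an assumption for illustration: the per-operation unit costs, the `PlanEstimate` fields, and the two candidate estimates are made up and do not describe any real system.

```python
from dataclasses import dataclass

@dataclass
class PlanEstimate:
    name: str
    disk_accesses: int
    cpu_seconds: float
    messages: int = 0  # communication; only relevant in distributed/parallel systems

# Hypothetical unit costs, in seconds.
SECONDS_PER_DISK_ACCESS = 0.01
SECONDS_PER_MESSAGE = 0.05

def estimated_time(plan: PlanEstimate) -> float:
    """Cost = estimated time: disk time + CPU time + communication time."""
    return (plan.disk_accesses * SECONDS_PER_DISK_ACCESS
            + plan.cpu_seconds
            + plan.messages * SECONDS_PER_MESSAGE)

# Hypothetical estimates for two equivalent plans.
candidates = [
    PlanEstimate("product, then select", disk_accesses=2000, cpu_seconds=1.5),
    PlanEstimate("select, then join", disk_accesses=150, cpu_seconds=0.2),
]

# The optimizer keeps the candidate with the lowest estimated time.
best = min(candidates, key=estimated_time)
print(best.name)  # prints: select, then join
```

A real optimizer derives such estimates from statistics about the data rather than from fixed numbers.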
Query optimization (continued)
• Query Optimization: A single query can be executed through different algorithms or rewritten in different forms and structures. Hence the question of query optimization: which of these forms or pathways is the most efficient? The query optimizer attempts to determine the most efficient way to execute a given query by considering the possible query plans.
• Importance: The goal of query optimization is to reduce the system resources required to fulfill a query, and ultimately to provide the user with the correct result set faster.
◦ It provides the user with faster results, which makes the application seem faster to the user.
◦ It allows the system to service more queries in the same amount of time, because each request takes less time than an unoptimized query.
◦ It reduces the amount of wear on the hardware (e.g. disk drives), and allows the server to run more efficiently (e.g. lower power consumption, less memory usage).
Steps for Query Optimization

• Query optimization involves three steps, namely:
◦ query tree generation
◦ plan generation
◦ query plan code generation
Relational Algebra
Relational Query Languages
Query languages: Allow manipulation and retrieval of data from a database.
Relational model supports simple, powerful QLs:
◦ Strong formal foundation based on logic.
◦ Allows for much optimization.

Query Languages != programming languages!


◦ QLs not expected to be “Turing complete”.
◦ QLs not intended to be used for complex calculations.
◦ QLs support easy, efficient access to large data sets.
Formal Relational Query Languages
Two mathematical query languages form the basis for “real” languages (e.g. SQL), and for implementation:
◦ Relational Algebra: More operational, very useful for representing execution plans.
◦ Relational Calculus: Lets users describe what they want, rather than how to compute it. (Non-operational, declarative.)

Understanding Algebra & Calculus is key to understanding SQL and query processing!
Preliminaries
A query is applied to relation instances, and the result of a query is also a
relation instance.
◦ Schemas of input relations for a query are fixed (but query will run regardless of
instance!)
◦ The schema for the result of a given query is also fixed! Determined by definition of
query language constructs.

Positional vs. named-field notation:


◦ Positional notation easier for formal definitions, named-field notation more
readable.
◦ Both used in Relational Algebra and SQL
Example Instances
“Sailors” and “Reserves” relations for our examples.
We’ll use positional or named-field notation, and assume that names of fields in query results are “inherited” from names of fields in query input relations.

R1
sid  bid  day
22   101  10/10/96
58   103  11/12/96

S1
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0

S2
sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0
Relational Algebra
Basic operations:
◦ Selection (σ): Selects a subset of rows from a relation.
◦ Projection (Π): Deletes unwanted columns from a relation.
◦ Cross-product (×): Allows us to combine two relations.
◦ Set-difference (−): Tuples in reln. 1, but not in reln. 2.
◦ Union (∪): Tuples in reln. 1 or in reln. 2.

Additional operations:
◦ Intersection, join, division, renaming: Not essential, but (very!) useful.

Since each operation returns a relation, operations can be composed! (Algebra is “closed”.)
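
One way to see why intersection and join are "not essential" is to build them out of the basic operations. The Python sketch below models relations as sets of tuples in positional notation, using the S1, S2 and R1 instances from the example slide; the function names are assumptions chosen for this illustration.

```python
from itertools import product

# Instances from the example slide (positional notation).
# S1, S2: (sid, sname, rating, age); R1: (sid, bid, day)
S1 = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
S2 = {(28, "yuppy", 9, 35.0), (31, "lubber", 8, 55.5),
      (44, "guppy", 5, 35.0), (58, "rusty", 10, 35.0)}
R1 = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}

# The five basic operations; each takes relations and returns a relation.
def select(pred, rel):
    return {t for t in rel if pred(t)}

def project(cols, rel):
    return {tuple(t[c] for c in cols) for t in rel}

def cross(r, s):
    return {a + b for a, b in product(r, s)}

def difference(r, s):
    return r - s

def union(r, s):
    return r | s

# Derived: intersection from set-difference, R ∩ S = R − (R − S).
def intersection(r, s):
    return difference(r, difference(r, s))

# Derived: equijoin on sid as a selection over the cross-product.
# After S1 x R1, S1.sid is at position 0 and R1.sid at position 4.
def join_on_sid(s, r):
    return select(lambda t: t[0] == t[4], cross(s, r))

# Composition: names of sailors in S1 who have a reservation in R1.
names_with_reservation = project([1], join_on_sid(S1, R1))
print(sorted(names_with_reservation))  # prints: [('dustin',), ('rusty',)]
```

Because every function returns a set of tuples, the output of one can always be passed to another, which is what "closed" means here.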
Projection
Deletes attributes that are not in the projection list.
Schema of result contains exactly the fields in the projection list, with the same names that they had in the (only) input relation.
Projection operator has to eliminate duplicates! (Why??)
◦ Note: real systems typically don’t do duplicate elimination unless the user explicitly asks for it. (Why not?)

Π_{sname,rating}(S2)
sname   rating
yuppy   9
lubber  8
guppy   5
rusty   10

Π_age(S2)
age
35.0
55.5
Selection
Selects rows that satisfy the selection condition.
No duplicates in result! (Why?)
Schema of result identical to schema of (only) input relation.
Result relation can be the input for another relational algebra operation! (Operator composition.)

σ_{rating>8}(S2)
sid  sname  rating  age
28   yuppy  9       35.0
58   rusty  10      35.0

Π_{sname,rating}(σ_{rating>8}(S2))
sname  rating
yuppy  9
rusty  10
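
The projection and selection results on S2 can be reproduced with a short Python sketch. It also contrasts set-style projection (duplicates removed) with the bag-style result a real system returns when duplicate elimination is not requested. The function names are assumptions for illustration; the S2 tuples are the ones from the example slide.

```python
# S2 from the example slide: (sid, sname, rating, age)
S2 = [(28, "yuppy", 9, 35.0), (31, "lubber", 8, 55.5),
      (44, "guppy", 5, 35.0), (58, "rusty", 10, 35.0)]

def project_bag(cols, rows):
    """Projection without duplicate elimination (what real systems do by default)."""
    return [tuple(r[c] for c in cols) for r in rows]

def project_set(cols, rows):
    """Projection as defined in the algebra: duplicates removed, order kept."""
    return list(dict.fromkeys(project_bag(cols, rows)))

def select(pred, rows):
    """Selection: keep the rows that satisfy the condition."""
    return [r for r in rows if pred(r)]

# Projecting on age leaves three tuples with age 35.0 before elimination.
print(project_bag([3], S2))   # prints: [(35.0,), (55.5,), (35.0,), (35.0,)]
print(project_set([3], S2))   # prints: [(35.0,), (55.5,)]

# Selection rating > 8, then projection on (sname, rating): operator composition.
high_rated = select(lambda r: r[2] > 8, S2)
print(project_set([1, 2], high_rated))  # prints: [('yuppy', 9), ('rusty', 10)]
```

Selection cannot introduce duplicates because it only drops whole rows from an input that has none, whereas projection can make distinct rows look identical by discarding the columns that told them apart.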
Union, Intersection, Set-Difference
All of these operations take two input relations, which must be union-compatible:
◦ Same number of fields.
◦ “Corresponding” fields have the same type.
What is the schema of result?

S1 ∪ S2
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0
44   guppy   5       35.0
28   yuppy   9       35.0

S1 − S2
sid  sname   rating  age
22   dustin  7       45.0

S1 ∩ S2
sid  sname   rating  age
31   lubber  8       55.5
58   rusty   10      35.0
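
A small Python sketch can check union-compatibility before applying a set operation. The `union_compatible` and `set_op` helpers are hypothetical names for this illustration, and the check inspects only one sample tuple per relation, which assumes every tuple in a relation has the same shape and that neither relation is empty.

```python
import operator

# Instances from the example slide.
S1 = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
S2 = {(28, "yuppy", 9, 35.0), (31, "lubber", 8, 55.5),
      (44, "guppy", 5, 35.0), (58, "rusty", 10, 35.0)}
R1 = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}

def union_compatible(r, s):
    """Same number of fields, and corresponding fields have the same type.

    Assumption: both relations are non-empty and internally homogeneous,
    so one sample tuple from each is representative.
    """
    a, b = next(iter(r)), next(iter(s))
    return len(a) == len(b) and all(type(x) is type(y) for x, y in zip(a, b))

def set_op(op, r, s):
    """Apply a set operator only if the inputs are union-compatible."""
    if not union_compatible(r, s):
        raise ValueError("relations are not union-compatible")
    return op(r, s)

s1_union_s2 = set_op(operator.or_, S1, S2)      # S1 ∪ S2
s1_inter_s2 = set_op(operator.and_, S1, S2)     # S1 ∩ S2
s1_minus_s2 = set_op(operator.sub, S1, S2)      # S1 − S2

print(len(s1_union_s2), len(s1_inter_s2), len(s1_minus_s2))  # prints: 5 2 1
print(union_compatible(S1, R1))                               # prints: False
```

The result of each operation has the same schema as its inputs, which answers the question on the slide.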
Figure 14-10 Select operation
Write the relational algebra expression to achieve the following:
Figure 14-11 Project operation
Write the relational algebra expression to achieve the following:
