Download as pptx, pdf, or txt
Download as pptx, pdf, or txt
You are on page 1of 27

Distributed DBMS

M. T. zsu & P. Valduriez


Ch.7/1
Outline
Introduction
Background
Distributed Database Design
Database Integration
Semantic Data Control
Distributed Query Processing
Overview
Query decomposition and localization
Distributed query optimization
Multidatabase query processing
Distributed Transaction Management
Data Replication
Parallel Database Systems
Distributed Object DBMS
Peer-to-Peer Data Management
Web Data Management
Current Issues
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/2
Step 1 Query Decomposition
Input : Calculus query on global relations
Normalization
manipulate query quantifiers and qualification
Analysis
detect and reject incorrect queries
possible for only a subset of relational calculus
Simplification
eliminate redundant predicates
Restructuring
calculus query algebraic query
more than one translation is possible
use transformation rules
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/3
Normalization
Lexical and syntactic analysis
check validity (similar to compilers)
check for attributes and relations
type checking on the qualification
Put into normal form
Conjunctive normal form
(p
11
v p
12
v v p
1n
) . . (p
m1
v p
m2
v v p
mn
)
Disjunctive normal form
(p
11
. p
12
. . p
1n
) v v (p
m1
. p
m2
. . p
mn
)
OR's mapped into union
AND's mapped into join or selection
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/4
Analysis
Refute incorrect queries
Type incorrect
If any of its attribute or relation names are not defined in the global schema
If operations are applied to attributes of the wrong type
Semantically incorrect
Components do not contribute in any way to the generation of the result
Only a subset of relational calculus queries can be tested for correctness
Those that do not contain disjunction and negation
To detect
connection graph (query graph)
join graph
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/5
Analysis Example
SELECT ENAME,RESP
FROM EMP, ASG, PROJ
WHERE EMP.ENO = ASG.ENO
AND ASG.PNO = PROJ.PNO
AND PNAME = "CAD/CAM"
AND DUR 36
AND TITLE = "Programmer"
Query graph
Join graph
DUR36
PNAME=CAD/CAM
ENAME
EMP.ENO=ASG.ENO ASG.PNO=PROJ.PNO
RESULT
TITLE =
Programmer
RESP
ASG.PNO=PROJ.PNO EMP.ENO=ASG.ENO
ASG
PROJ
EMP
EMP PROJ
ASG
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/6
Analysis
If the query graph is not connected, the query may be wrong or
use Cartesian product
SELECT ENAME,RESP
FROM EMP, ASG, PROJ
WHERE EMP.ENO = ASG.ENO
AND PNAME = "CAD/CAM"
AND DUR > 36
AND TITLE = "Programmer"
PNAME=CAD/CAM
ENAME
RESULT
RESP
ASG
PROJ EMP
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/7
Simplification
Why simplify?
Remember the example
How? Use transformation rules
Elimination of redundancy
idempotency rules
p
1
. ( p
1
) false
p
1
. (p
1
( p
2
) p
1
p
1
. false p
1

Application of transitivity
Use of integrity rules
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/8
Simplification Example
SELECT TITLE
FROM EMP
WHERE EMP.ENAME = "J. Doe"
OR (NOT(EMP.TITLE = "Programmer")
AND (EMP.TITLE = "Programmer"
OR EMP.TITLE = "Elect. Eng.")
AND NOT(EMP.TITLE = "Elect. Eng."))

SELECT TITLE
FROM EMP
WHERE EMP.ENAME = "J. Doe"
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/9
Restructuring
Convert relational calculus to relational
algebra
Make use of query trees
Example
Find the names of employees other than
J. Doe who worked on the CAD/CAM
project for either 1 or 2 years.
SELECT ENAME
FROM EMP, ASG, PROJ
WHERE EMP.ENO = ASG.ENO
AND ASG.PNO = PROJ.PNO
AND ENAME "J. Doe"
AND PNAME = "CAD/CAM"
AND (DUR = 12 OR DUR = 24)
H
ENAME

DUR=12 OR DUR=24

PNAME=CAD/CAM

ENAMEJ. DOE
PROJ ASG EMP
Project
Select
Join

PNO

ENO
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/10
Restructuring Transformation
Rules
Commutativity of binary operations
R S S R
R S S R
R S S R
Associativity of binary operations
( R S) T R (S T)
(R S) T R (S T)
Idempotence of unary operations
H
A
(H
A
(R)) H
A
(R)
o
p
1
(A
1
)
(o
p
2
(A
2
)
(R)) o
p
1
(A
1
).p
2
(A
2
)
(R)
where R[A] and A' _ A, A" _ A and A' _ A"
Commuting selection with projection
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/11
Restructuring Transformation
Rules
Commuting selection with binary operations
o
p(A)
(R S) (o
p(A)
(R)) S
o
p(A
i
)
(R
(A
j
,B
k
)
S) (o
p(A
i
)
(R))
(A
j
,B
k
)
S
o
p(A
i
)
(R T) o
p(A
i
)
(R) o
p(A
i
)
(T)
where A
i
belongs to R and T
Commuting projection with binary operations
H
C
(R S) H
A
(R) H
B
(S)
H
C
(R
(A
j
,B
k
)
S) H
A
(R)
(A
j
,B
k
)
H
B
(S)
H
C
(R S) H
C
(R) H
C
(S)
where R[A] and S[B]; C = A' B' where A' _ A, B' _ B
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/12
Example
Recall the previous example:
Find the names of employees other
than J. Doe who worked on the
CAD/CAM project for either one or
two years.

SELECT ENAME
FROM PROJ, ASG, EMP
WHERE ASG.ENO=EMP.ENO
AND ASG.PNO=PROJ.PNO
AND ENAME "J. Doe"
AND PROJ.PNAME="CAD/CAM"
AND (DUR=12 OR DUR=24)
H
ENAME
o
DUR=12 . DUR=24
o
PNAME=CAD/CAM
o
ENAMEJ. DOE
PROJ ASG EMP
Project
Select
Join

PNO

ENO
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/13
Equivalent Query
H
ENAME
o
PNAME=CAD/CAM . (DUR=12 . DUR=24) .ENAMEJ. Doe

PROJ
ASG
EMP

PNO,ENO
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/14
EMP
H
ENAME
o
ENAME

"J. Doe"
ASG PROJ
H
PNO,ENAME
o
PNAME
=
"CAD/CAM"
H
PNO
o
DUR
=12.DUR=24
H
PNO,ENO
H
PNO,ENAME
Restructuring

PNO

ENO
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/15
Step 2 Data Localization
Input: Algebraic query on distributed relations
Determine which fragments are involved
Localization program
substitute for each global query its materialization program
optimize
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/16
Example
Assume
EMP is fragmented into EMP
1
, EMP
2
,
EMP
3
as follows:
EMP
1
= o
ENOE3
(EMP)
EMP
2
= o
E3<ENOE6
(EMP)
EMP
3
= o
ENOE6

(EMP)
ASG fragmented into ASG
1
and ASG
2

as follows:
ASG
1
= o
ENOE3
(ASG)
ASG
2
= o
ENO>E3
(ASG)
Replace EMP by (EMP
1
EMP
2
EMP
3
)
and ASG by (ASG
1
ASG
2
) in any query
H
ENAME
o
DUR=12 .DUR=24
o
PNAME=CAD/CAM
o
ENAMEJ. DOE
PROJ


EMP
1
EMP
2
EMP
3
ASG
1
ASG
2

PNO

ENO
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/17
Provides Parallellism
EMP
3
ASG
1
EMP
2
ASG
2
EMP
1
ASG
1

EMP
3
ASG
2

ENO

ENO

ENO

ENO
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/18
Eliminates Unnecessary Work
EMP
2
ASG
2
EMP
1
ASG
1
EMP
3
ASG
2

ENO

ENO

ENO
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/19
Reduction for PHF
Reduction with selection
Relation R and F
R
={R
1
, R
2
, , R
w
} where R
j
=o
p
j
(R)
o
p
i
(R
j
)=C if x in R: (p
i
(x). p
j
(x))
Example
SELECT *
FROM EMP
WHERE ENO="E5"
o
ENO=E5
EMP
1
EMP
2
EMP
3
EMP
2
o
ENO=E5

Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/20
Reduction for PHF
Reduction with join
Possible if fragmentation is done on join attribute
Distribute join over union
(R
1
R
2
)S (R
1
S) (R
2
S)
Given R
i
=o
p
i
(R) and R
j
= o
p
j
(R)
R
i
R
j
=C if x in R
i
, y in R
j
: (p
i
(x). p
j
(y))
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/21
Reduction for PHF
Assume EMP is fragmented as
before and
ASG
1
: o
ENO "E3"
(ASG)
ASG
2
: o
ENO > "E3"
(ASG)
Consider the query
SELECT *
FROM EMP,ASG
WHERE EMP.ENO=ASG.ENO
Distribute join over unions
Apply the reduction rule


EMP
1
EMP
2
EMP
3
ASG
1
ASG
2

ENO

EMP
1
ASG
1
EMP
2
ASG
2
EMP
3
ASG
2

ENO

ENO

ENO
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/22
Reduction for VF
Find useless (not empty) intermediate relations
Relation R defined over attributes A = {A
1
, ..., A
n
} vertically fragmented
as R
i
=H
A'
(R) where A'_ A:
H
D,K
(R
i
) is useless if the set of projection attributes D is not in A'
Example: EMP
1
=H
ENO,ENAME
(EMP); EMP
2
=H
ENO,TITLE
(EMP)
SELECT ENAME
FROM EMP
EMP
1
EMP
1
EMP
2
H
ENAME

ENO
H
ENAME
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/23
Reduction for DHF
Rule :
Distribute joins over unions
Apply the join reduction for horizontal fragmentation
Example
ASG
1
: ASG
ENO
EMP
1
ASG
2
: ASG
ENO
EMP
2
EMP
1
: o
TITLE=Programmer
(EMP)
EMP
2
: o
TITLE=Programmer
(EMP)
Query
SELECT *
FROM EMP, ASG
WHERE ASG.ENO = EMP.ENO
AND EMP.TITLE = "Mech. Eng."
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/24
Generic query
Selections first
Reduction for DHF

ASG
1
o
TITLE=Mech. Eng.
ASG
2
EMP
1
EMP
2

ASG
1
ASG
2
EMP
2
o
TITLE=Mech. Eng.

ENO

ENO
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/25
Joins over unions
Reduction for DHF
Elimination of the empty intermediate relations
(left sub-tree)

ASG
1
EMP
2
EMP
2
o
TITLE=Mech. Eng.
ASG
2
o
TITLE=Mech. Eng.
ASG
2
EMP
2
o
TITLE=Mech. Eng.

ENO

ENO

ENO
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/26
Reduction for Hybrid
Fragmentation
Combine the rules already specified:
Remove empty relations generated by contradicting selections on horizontal
fragments;
Remove useless relations generated by projections on vertical fragments;
Distribute joins over unions in order to isolate and remove useless joins.
Distributed DBMS
M. T. zsu & P. Valduriez
Ch.7/27
Reduction for HF
Example
Consider the following hybrid
fragmentation:
EMP
1
= o
ENO"E4"
(H
ENO,ENAME
(EMP))
EMP
2
= o
ENO>"E4"
(H
ENO,ENAME
(EMP))
EMP
3
= o
ENO,TITLE
(EMP)
and the query
SELECT ENAME
FROM EMP
WHERE ENO="E5" EMP
1
EMP
2

EMP
3
o
ENO=E5
H
ENAME
EMP
2
o
ENO=E5
H
ENAME

ENO

You might also like