The document discusses ER diagrams, converting ER diagrams to relational tables, functional dependencies, normalization, and lossless join decomposition. It provides examples of converting 1:1, 1:n, and weak entity relationships to tables. It also covers determining functional dependencies, computing closures of attribute sets, finding minimal covers, and ensuring decompositions do not lose information through lossless join properties.
Copyright:
Attribution Non-Commercial (BY-NC)
Zachary G. Ives University of Pennsylvania CIS 550 Database & Information Systems
October 6, 2005
Some slide content courtesy of Susan Davidson & Raghu Ramakrishnan

Examples of ER Diagrams
Please interpret these ER diagrams:
[Figure: three ER diagrams, each relating STUDENTS and COURSES through Takes]

Converting ER Relationship Sets to Tables: 1:n Relationships
[Figure: ER diagram with PROFESSORS and COURSES joined by Teaches]
"1" entity = key of relationship set:
    CREATE TABLE Teaches(
      fid INTEGER,
      serno CHAR(15),
      semester CHAR(4),
      PRIMARY KEY (serno),
      FOREIGN KEY (fid) REFERENCES PROFESSORS,
      FOREIGN KEY (serno) REFERENCES COURSES)
Or embed relationship in "many" entity set:
    CREATE TABLE Teaches_Course(
      serno INTEGER,
      subj VARCHAR(30),
      cid CHAR(15),
      fid CHAR(15),
      "when" CHAR(4),
      PRIMARY KEY (serno),
      FOREIGN KEY (fid) REFERENCES PROFESSORS)

1:1 Relationships
If you borrow money or have credit, you might get:
[Figure: ER diagram relating CreditReport and Borrower through Describes; attributes shown: rid, delinquent?, debt, ssn, name]
What are the table options?

ISA Relationships: Subclassing (Structurally)
Inheritance states that one entity is a "special kind" of another entity: subclass should be member of base class
[Figure: Employees ISA People; attributes shown: id, name, salary]

But How Does this Translate into the Relational Model?
Compare these options:
Two tables, disjoint tuples
Two tables, disjoint attributes
One table with NULLs
Object-relational databases (allow subclassing of tables)

Weak Entities
A weak entity can only be identified uniquely using the primary key of another (owner) entity.
Owner and weak entity sets are in a one-to-many relationship set, 1 owner : many weak entities
Weak entity set must have total participation
[Figure: People and Pets joined by Feeds; attributes shown: ssn, name, weeklyCost, name, species]

Translating Weak Entity Sets
Weak entity set and identifying relationship set are translated into a single table; when the owner entity is deleted, all owned weak entities must also be deleted
    CREATE TABLE Feed_Pets (
      pname VARCHAR(20),
      species INTEGER,
      weeklyCost REAL,
      ssn CHAR(11) NOT NULL,
      PRIMARY KEY (pname, ssn),
      FOREIGN KEY (ssn) REFERENCES People
        ON DELETE CASCADE)

N-ary Relationships
Relationship sets can relate an arbitrary number of entity sets:
[Figure: Indep Study relationship set relating Student, Project, and Advisor]

Summary of ER Diagrams
One of the primary ways of designing logical schemas
CASE tools exist built around ER (e.g. ERWin, PowerBuilder, etc.)
    Translate the design automatically into DDL, XML, UML, etc.
    Use a slightly different notation that is better suited to graphical displays
    Some tools support constraints beyond what ER diagrams can capture
Can you get different ER diagrams from the same data?
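The weak-entity translation above can be tried out directly. The following is a minimal sketch using Python's built-in sqlite3 module; the People table definition, the sample rows, and the integer species codes are assumptions made for illustration and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE People (
    ssn  CHAR(11) PRIMARY KEY,
    name VARCHAR(20)
);
CREATE TABLE Feed_Pets (
    pname      VARCHAR(20),
    species    INTEGER,
    weeklyCost REAL,
    ssn        CHAR(11) NOT NULL,
    PRIMARY KEY (pname, ssn),
    FOREIGN KEY (ssn) REFERENCES People(ssn) ON DELETE CASCADE
);
""")

# Hypothetical sample data: two owners, one of them with two pets.
conn.execute("INSERT INTO People VALUES ('111-11-1111', 'Sam')")
conn.execute("INSERT INTO People VALUES ('222-22-2222', 'Jill')")
conn.executemany(
    "INSERT INTO Feed_Pets VALUES (?, ?, ?, ?)",
    [
        ("Rex", 1, 12.5, "111-11-1111"),
        ("Tom", 2, 8.0, "111-11-1111"),
        ("Rex", 1, 10.0, "222-22-2222"),  # same pet name, different owner
    ],
)

# Deleting the owner removes the owned weak entities as well.
conn.execute("DELETE FROM People WHERE ssn = '111-11-1111'")
remaining = conn.execute("SELECT pname, ssn FROM Feed_Pets").fetchall()
print(remaining)  # [('Rex', '222-22-2222')]
```

The composite key (pname, ssn) is what lets two owners each have a pet called Rex: the pet name alone identifies nothing without the owner's key.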
Schema Refinement & Design Theory
ER Diagrams give us a start in logical schema design
Sometimes need to refine our designs further
There's a system and theory for this
Focus is on redundancy of data
    Causes update, insertion, deletion anomalies

Not All Designs are Equally Good
Why is this a poor schema design?
    Stuff(sid, name, serno, subj, cid, exp-grade)
And why is this one better?
    Student(sid, name)
    Course(serno, cid)
    Subject(cid, subj)
    Takes(sid, serno, exp-grade)

Focus on the Bad Design
Certain items (e.g., name) get repeated
Some information requires that a student be enrolled (e.g., courses) due to the key

    sid   name    serno    subj   cid   exp-grade
    1     Sam     570103   AI     520   B
    23    Nitin   550103   DB     550   A
    45    Jill    505103   OS     505   A
    1     Sam     505103   OS     505   C

Functional Dependencies Describe "Key-Like" Relationships
A key is a set of attributes where: If keys match, then the tuples match
A functional dependency (FD) is a generalization: If an attribute set determines another, written X → Y, then if two tuples agree on attribute set X, they must agree on Y:
    sid → name
What other FDs are there in this data?
FDs are independent of our schema design choice

Formal Definition of FDs
Def. Given a relation schema R and subsets X, Y of R: An instance r of R satisfies FD X → Y if, for any two tuples t1, t2 ∈ r, t1[X] = t2[X] implies t1[Y] = t2[Y]
For an FD to hold for schema R, it must hold for every possible instance r of R
(Can a DBMS verify this? Can we determine this by looking at an instance?)

General Thoughts on Good Schemas
We want all attributes in every tuple to be determined by the tuple's key attributes, i.e. part of a superkey (for key X → Y, a superkey is a "non-minimal" X)
What does this say about redundancy?
But: What about tuples that don't have keys (other than the entire value)? What about the fact that every attribute determines itself?
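The formal definition of an FD can be checked mechanically against one instance. Here is a minimal sketch; the function name `satisfies` and the dictionary encoding of tuples are illustrative choices, and the rows are the four tuples of the Stuff instance shown earlier.

```python
def satisfies(rows, lhs, rhs):
    """Return True if no two rows agree on lhs while differing on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value
        if seen.setdefault(key, val) != val:
            return False
    return True


columns = ["sid", "name", "serno", "subj", "cid", "exp-grade"]
stuff = [
    dict(zip(columns, t))
    for t in [
        (1, "Sam", 570103, "AI", 520, "B"),
        (23, "Nitin", 550103, "DB", 550, "A"),
        (45, "Jill", 505103, "OS", 505, "A"),
        (1, "Sam", 505103, "OS", 505, "C"),
    ]
]

print(satisfies(stuff, ["sid"], ["name"]))    # True
print(satisfies(stuff, ["sid"], ["serno"]))   # False: sid 1 appears with two sernos
print(satisfies(stuff, ["name"], ["sid"]))    # True, but only for this instance
```

The last call illustrates the limit of instance-based checking: this instance happens to satisfy name → sid, yet nothing prevents a later instance from containing two students with the same name. An instance can refute an FD for the schema, but it cannot establish one.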
Armstrong's Axioms: Inferring FDs
Some FDs exist due to others; can compute using Armstrong's axioms:
Reflexivity: If Y ⊆ X then X → Y (trivial dependencies)
    name, sid → name
Augmentation: If X → Y then XW → YW
    serno → subj so serno, exp-grade → subj, exp-grade
Transitivity: If X → Y and Y → Z then X → Z
    serno → cid and cid → subj so serno → subj

Armstrong's Axioms Lead to...
Union: If X → Y and X → Z then X → YZ
Pseudotransitivity: If X → Y and WY → Z then XW → Z
Decomposition: If X → Y and Z ⊆ Y then X → Z
Let's prove these from Armstrong's Axioms

Closure of a Set of FDs
Defn. Let F be a set of FDs. Its closure, F⁺, is the set of all FDs:
    {X → Y | X → Y is derivable from F by Armstrong's Axioms}
Which of the following are in the closure of our Student-Course FDs?
    name → name
    cid → subj
    serno → subj
    cid, sid → subj
    cid → sid

Attribute Closures: Is Something Dependent on X?
Defn. The closure of an attribute set X, X⁺, is:
    X⁺ = {Y | X → Y ∈ F⁺}
This answers the question "is Y determined (transitively) by X?"; compute X⁺ by:
    closure := X;
    repeat until no change {
      if there is an FD U → V in F such that U is in closure (U ⊆ closure)
      then add V to closure }
Does sid, serno → subj, exp-grade?

Equivalence of FD sets
Defn. Two sets of FDs, F and G, are equivalent if their closures are equivalent, F⁺ = G⁺
e.g., these two sets are equivalent: {XY → Z, X → Y} and {X → Z, X → Y}
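The attribute-closure loop and the equivalence test can be sketched in a few lines of Python. The helper names (`fd`, `closure`, `implies`, `equivalent`) are illustrative, and the Student-Course FD set used here (sid → name, serno → cid, cid → subj, and sid, serno → exp-grade) is an assumption pieced together from the examples on the slides rather than a set the slides state in one place.

```python
def fd(lhs, rhs):
    """Build an FD from two space-separated attribute lists."""
    return (frozenset(lhs.split()), frozenset(rhs.split()))


def closure(attrs, fds):
    """Attribute closure X+ of attrs under fds (a list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs is in the closure F+ of fds."""
    return set(rhs) <= closure(lhs, fds)


def equivalent(f, g):
    """F and G are equivalent iff each implies every FD of the other."""
    return all(implies(g, l, r) for l, r in f) and all(implies(f, l, r) for l, r in g)


# Assumed Student-Course FD set.
F = [fd("sid", "name"), fd("serno", "cid"), fd("cid", "subj"),
     fd("sid serno", "exp-grade")]

print(sorted(closure({"sid", "serno"}, F)))
print(implies(F, {"serno"}, {"subj"}))   # True, via serno -> cid -> subj
print(implies(F, {"cid"}, {"sid"}))      # False

F1 = [fd("X Y", "Z"), fd("X", "Y")]
F2 = [fd("X", "Z"), fd("X", "Y")]
print(equivalent(F1, F2))                # True
```

Testing membership in F⁺ through X⁺ avoids ever materializing F⁺ itself, which matters because F⁺ can be exponentially large.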
F⁺ contains a huge number of FDs (exponential in the size of the schema)
Would like to have smallest "representative" FD set

Minimal Cover
Defn. A FD set F is minimal if:
1. Every FD in F is of the form X → A, where A is a single attribute
2. For no X → A in F is: F − {X → A} equivalent to F
3. For no X → A in F and Z ⊂ X is: (F − {X → A}) ∪ {Z → A} equivalent to F
Defn. F is a minimal cover for G if F is minimal and is equivalent to G.
e.g., {X → Z, X → Y} is a minimal cover for {XY → Z, X → Z, X → Y}
    in a sense, each FD is "essential" to the cover
    we express each FD in simplest form

More on Closures
If F is a set of FDs and X → Y ∉ F⁺, then for some attribute A ∈ Y, X → A ∉ F⁺
Proof by contradiction. Assume otherwise and let Y = {A1, ..., An}.
Since we assume X → A1, ..., X → An are in F⁺, then X → A1 ... An is in F⁺ by the union rule; hence X → Y is in F⁺, which is a contradiction.
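Returning to the minimal cover definition, its three conditions suggest a three-step procedure: split right-hand sides, shrink left-hand sides, then drop redundant FDs. The sketch below is one common way to do this and is not taken from the slides; `minimal_cover` and `closure` are illustrative names. Minimal covers are generally not unique, so the result can depend on the order in which FDs and attributes are visited.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (a list of (lhs, rhs) frozenset pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def minimal_cover(fds):
    # Step 1: one attribute on every right-hand side (condition 1).
    work = [(frozenset(lhs), frozenset({a})) for lhs, rhs in fds for a in sorted(rhs)]

    # Step 2: remove left-hand-side attributes that are not needed (condition 3).
    for i, (lhs, rhs) in enumerate(work):
        for b in sorted(lhs):
            reduced = lhs - {b}
            if rhs <= closure(reduced, work):
                lhs = reduced
                work[i] = (lhs, rhs)

    # Step 3: remove FDs implied by the remaining ones (condition 2).
    result = list(dict.fromkeys(work))  # drop exact duplicates, keep order
    for item in list(result):
        rest = [f for f in result if f != item]
        if item[1] <= closure(item[0], rest):
            result = rest
    return result


G = [
    (frozenset({"X", "Y"}), frozenset({"Z"})),
    (frozenset({"X"}), frozenset({"Z"})),
    (frozenset({"X"}), frozenset({"Y"})),
]
cover = minimal_cover(G)
for lhs, rhs in cover:
    print(",".join(sorted(lhs)), "->", ",".join(sorted(rhs)))
```

On the example set {XY → Z, X → Z, X → Y}, step 2 shrinks XY → Z to X → Z (because X alone already determines Y), and the duplicate is then dropped, leaving {X → Z, X → Y}.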

Why Armstrong's Axioms?
Why are Armstrong's axioms (or an equivalent rule set) appropriate for FDs? They are:
Consistent: any relation satisfying FDs in F will satisfy those in F⁺
Complete: if an FD X → Y cannot be derived by Armstrong's axioms from F, then there exists some relational instance satisfying F but not X → Y
In other words, Armstrong's axioms derive all the FDs that should hold

Proving Consistency
We prove that the axioms' definitions must be true for any instance, e.g.:
For augmentation (if X → Y then XW → YW):
If an instance satisfies X → Y, then: For any tuples t1, t2 ∈ r, if t1[X] = t2[X] then t1[Y] = t2[Y] by defn.
If, additionally, it is given that t1[W] = t2[W], then t1[YW] = t2[YW]

Proving Completeness
Suppose X → Y ∉ F⁺ and define a relational instance r that satisfies F⁺ but not X → Y:
Then for some attribute A ∈ Y, X → A ∉ F⁺
Let some pair of tuples in r agree on X⁺ but disagree everywhere else:

    X                 A         X⁺ − X            R − X⁺ − {A}
    x1 x2 ... xn      a_{1,1}   v1 v2 ... vm      w_{1,1} w_{2,1} ...
    x1 x2 ... xn      a_{1,2}   v1 v2 ... vm      w_{1,2} w_{2,2} ...

Proof of Completeness cont'd
Clearly this relation fails to satisfy X → A and X → Y. We also have to check that it satisfies any FD in F⁺.
The tuples agree on only X⁺. Thus the only FDs that might be violated are of the form X' → Y' where X' ⊆ X⁺ and Y' contains attributes in R − X⁺ − {A}.
But if X' → Y' ∈ F⁺ and X' ⊆ X⁺ then Y' ⊆ X⁺ (reflexivity and augmentation). Therefore X' → Y' is satisfied.

Decomposition
Consider our original "bad" attribute set
    Stuff(sid, name, serno, subj, cid, exp-grade)
We could decompose it into
    Student(sid, name)
    Course(serno, cid)
    Subject(cid, subj)
But this decomposition loses information about the relationship between students and courses. Why?

Lossless Join Decomposition
R1, ..., Rk is a lossless join decomposition of R w.r.t. an FD set F if for every instance r of R that satisfies F,
    π_R1(r) ⋈ ... ⋈ π_Rk(r) = r
Consider:

    sid   name    serno    subj   cid   exp-grade
    1     Sam     570103   AI     570   B
    23    Nitin   550103   DB     550   A

What if we decompose on (sid, name) and (serno, subj, cid, exp-grade)?

Testing for Lossless Join
R1, R2 is a lossless join decomposition of R with respect to F iff at least one of the following dependencies is in F⁺:
    (R1 ∩ R2) → R1 − R2
    (R1 ∩ R2) → R2 − R1
So for the FD set:
    sid → name
    serno → cid, exp-grade
    cid → subj
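The two-relation test above reduces to a single attribute-closure computation on R1 ∩ R2. The following is a minimal sketch with illustrative names (`closure`, `lossless_binary`), using the FD set just listed; it also joins the projections of the two-tuple instance to show what is lost when the test fails.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (a list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def lossless_binary(r1, r2, fds):
    """Lossless iff (R1 ∩ R2) determines R1 − R2 or R2 − R1."""
    reach = closure(r1 & r2, fds)
    return (r1 - r2) <= reach or (r2 - r1) <= reach


F = [
    ({"sid"}, {"name"}),
    ({"serno"}, {"cid", "exp-grade"}),
    ({"cid"}, {"subj"}),
]

student = {"sid", "name"}
rest = {"serno", "subj", "cid", "exp-grade"}
print(lossless_binary(student, rest, F))            # False: nothing in common
print(lossless_binary(student, rest | {"sid"}, F))  # True: sid -> name

# The same failure seen on data: with no shared attribute, the natural
# join of the two projections is a cross product.
rows = [
    (1, "Sam", 570103, "AI", 570, "B"),
    (23, "Nitin", 550103, "DB", 550, "A"),
]
left = {r[:2] for r in rows}
right = {r[2:] for r in rows}
joined = {l + r for l in left for r in right}
print(len(joined), joined == set(rows))             # 4 False
```

Keeping sid in the second relation is what repairs the decomposition: the shared attribute then determines everything on the Student side, so the join recombines exactly the original tuples.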
Is (sid, name) and (serno, subj, cid, exp-grade) a lossless decomposition?

Dependency Preservation
Ensures we can "easily" check whether a FD X → Y is violated during an update to a database:
The projection of an FD set F onto a set of attributes Z, F_Z, is
    {X → Y | X → Y ∈ F⁺, X ∪ Y ⊆ Z}
i.e., it is those FDs local to Z's attributes
A decomposition R1, ..., Rk is dependency preserving if
    F⁺ = (F_R1 ∪ ... ∪ F_Rk)⁺
The decomposition hasn't "lost" any essential FDs, so we can check without doing a join

Example of Lossless and Dependency-Preserving Decompositions
Given relation scheme R(name, street, city, st, zip, item, price)
And FD set:
    name → street, city
    street, city → st
    street, city → zip
    name, item → price
Consider the decomposition R1(name, street, city, st, zip) and R2(name, item, price)
Is it lossless? Is it dependency preserving?
What if we replaced the first FD by name, street → city?

Another Example
Given scheme: R(sid, fid, subj) and FD set:
    fid → subj
    sid, subj → fid
Consider the decomposition R1(sid, fid) and R2(fid, subj)
Is it lossless? Is it dependency preserving?

FDs and Keys
Ideally, we want a design s.t. for each nontrivial dependency X → Y, X is a superkey for some relation schema in R
We just saw that this isn't always possible
Hence we have two kinds of normal forms

Two Important Normal Forms
Boyce-Codd Normal Form (BCNF). For every relation scheme R and for every X → A that holds over R, either A ∈ X (it is trivial), or X is a superkey for R
Third Normal Form (3NF). For every relation scheme R and for every X → A that holds over R, either A ∈ X (it is trivial), or X is a superkey for R, or A is a member of some key for R

Normal Forms Compared
BCNF is preferable, but sometimes in conflict with the goal of dependency preservation
It's strictly stronger than 3NF
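The conflict between BCNF and dependency preservation can be seen on the R(sid, fid, subj) example. The sketch below tests preservation without building the projections F_Ri explicitly: it repeatedly grows the left-hand side using only closures taken inside one sub-schema at a time. This is a standard textbook technique rather than something shown on the slides, and the names `closure` and `preserves` are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (a list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def preserves(dep, schemas, fds):
    """True if dep = (lhs, rhs) follows from the FDs projected onto schemas."""
    lhs, rhs = dep
    z = set(lhs)
    changed = True
    while changed:
        changed = False
        for s in schemas:
            # Attributes of s derivable from the part of z that lies in s.
            gained = closure(z & s, fds) & s
            if not gained <= z:
                z |= gained
                changed = True
    return rhs <= z


F = [({"fid"}, {"subj"}), ({"sid", "subj"}, {"fid"})]
decomposition = [{"sid", "fid"}, {"fid", "subj"}]
for dep in F:
    print(sorted(dep[0]), "->", sorted(dep[1]), preserves(dep, decomposition, F))

# The earlier customer example keeps every FD inside one sub-schema.
F2 = [
    ({"name"}, {"street", "city"}),
    ({"street", "city"}, {"st"}),
    ({"street", "city"}, {"zip"}),
    ({"name", "item"}, {"price"}),
]
D2 = [{"name", "street", "city", "st", "zip"}, {"name", "item", "price"}]
print(all(preserves(dep, D2, F2) for dep in F2))
```

For R(sid, fid, subj), fid → subj survives in R2 but sid, subj → fid spans both relations and is lost, even though the decomposition is lossless (the shared attribute fid determines subj). Checking that FD after an update would require a join.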
Let's see algorithms to obtain:
    A BCNF lossless join decomposition
    A 3NF lossless join, dependency preserving decomposition

BCNF Decomposition Algorithm (from Korth et al.; our book gives recursive version)
    result := {R}
    compute F⁺
    while there is a schema Ri in result that is not in BCNF {
      let A → B be a nontrivial FD on Ri
        s.t. A → Ri is not in F⁺ and A and B are disjoint
      result := (result − Ri) ∪ {(Ri − B), (A, B)}
    }

3NF Decomposition Algorithm (by Phil Bernstein, now @ MS Research)
    Let F be a minimal cover
    i := 0
    for each FD A → B in F {                  (build dep.-preserving decomp.)
      if none of the schemas Rj, 1 ≤ j ≤ i, contains AB {
        increment i
        Ri := (A, B)
      }
    }
    if no schema Rj, 1 ≤ j ≤ i, contains a candidate key for R {    (ensure lossless decomp.)
      increment i
      Ri := any candidate key for R
    }
    return (R1, ..., Ri)

Summary
We can always decompose into 3NF and get:
    Lossless join
    Dependency preservation
But with BCNF we are only guaranteed lossless joins
BCNF is stronger than 3NF: every BCNF schema is also in 3NF
The BCNF algorithm is nondeterministic, so there is not a unique decomposition for a given schema R
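The BCNF decomposition loop can be sketched as follows. Instead of computing F⁺, this version searches each schema for an attribute set X whose closure, restricted to the schema, adds something new without covering the whole schema. That brute-force search is exponential in the schema size and is meant only for small examples. The function names are illustrative, and the FD set for Stuff (sid → name, serno → cid, cid → subj, and sid, serno → exp-grade) is an assumption based on the earlier examples.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds (a list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def find_violation(schema, fds):
    """Return (X, Y) where X -> Y is nontrivial on schema and X is not a superkey."""
    attrs = sorted(schema)
    for size in range(1, len(attrs)):
        for combo in combinations(attrs, size):
            x = frozenset(combo)
            determined = closure(x, fds) & schema
            y = determined - x
            if y and determined != schema:
                return x, y
    return None


def bcnf_decompose(schema, fds):
    result = [frozenset(schema)]
    while True:
        for ri in result:
            violation = find_violation(ri, fds)
            if violation:
                x, y = violation
                result.remove(ri)
                result += [ri - y, x | y]   # (Ri − B) and (A, B)
                break
        else:
            return result   # no schema violates BCNF


stuff = {"sid", "name", "serno", "subj", "cid", "exp-grade"}
F = [
    ({"sid"}, {"name"}),
    ({"serno"}, {"cid"}),
    ({"cid"}, {"subj"}),
    ({"sid", "serno"}, {"exp-grade"}),
]
for r in bcnf_decompose(stuff, F):
    print(sorted(r))
```

Under these assumptions the result consists of the four schemas of the "better" design shown earlier: (sid, name), (serno, cid), (cid, subj), and (sid, serno, exp-grade). A different choice of violating FD at each step can lead to a different decomposition, which is the nondeterminism noted in the summary.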