The Relational Model: Text: Chapter 3

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 45

The Relational Model

„ Text: Chapter 3

© G. Tsiknis; Based on Ramakrishnan & Gehrke, DB Management Systems


Learning Goals

„ given an ER model, design a minimum number of correct


tables that captures the information in it
„ given an ER model with inheritance relations, weak
entities and aggregations, design the right tables for it
„ given a table design, create correct tables for this design
in SQL, including primary and foreign key constraints
„ compare different table designs for the same problem,
identify errors and provide corrections

Unit 3 2
Historical Perspective
„ Introduced by Edgar Codd (IBM) in 1970
„ Most widely used model today.
¾ Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc.
„ “Legacy systems” are usually hierarchical or network
models (i.e., not relational)
¾ e.g., IMS, IDMS, …
„ Competitor: object-oriented model
¾ ObjectStore, Versant, Ontos
¾ A synthesis emerging: object-relational model
o Informix Universal Server, UniSQL, O2, Oracle, DB2
„ Recent competitor: XML data model

Unit 3 3
Main Characteristics of the Relational Model

„ Exceedingly simple to understand


¾ main abstraction is represented as a table
„ Simple query language separate from application
language
„ Lots of bells and whistles to do complicated things

Unit 3 4
Structure of Relational Databases

„ Relational database: a set of relations


„ Relation: made up of 2 parts:
¾ Schema : specifies name of relation, plus name and domain
(type) of each field (or column or attribute).
o e.g., Student (sid: string, name: string, address: string,
phone: string, major: string).
¾ Instance : a table, with rows and columns.
#Rows = cardinality,
¾ #fields = dimension / arity / degree
„ Can think of a relation as a set of rows or tuples (i.e., all
rows are distinct)
„ Relational Database Schema: collection of schemas in
the database
„ Database Instance: a collection of instances of its
relations
Unit 3 5
attribute,
Example of a Relation Instance column name,
field name
Student
relation
sid name address phone major
name 1234 W. 12th
99111120 G. Jones 889-4444 CPSC
Ave., Van.
2020 E. 18th St.,
92001200 G. Smith 409-2222 MATH
tuple, Van
row, 2020 E. 18th St.,
94001020 A. Smith 222-2222 CPSC
record Van
Campeau
94001150 null null null
J.
domain
value attribute
column,
‰ Cardinality = 4, degree = 5, all rows distinct field
‰ Order of rows isn’t important
‰ Order of fields may be important
Unit 3 6
Formal Structure
„ Formally, an n-ary relation r is a set of n-tuples
(a1, a2,…,an)
where ai is in Di , the domain (set of allowed values) of the
i-th attribute.
„ Attribute values are usually atomic
¾ i.e. integers, floats, strings.
„ A domain contains a special value null indicating that the
value is not known.
„ If A1,,…, An are attributes with domains D1,. ..Dn, then
(A1:D1, …, An :Dn)
is a relation schema that defines a relation type
„ r(R) is a relation over the schema R ( or, of type R).
Unit 3 7
Relational Query Languages

„ A major strength of the relational model: supports simple,


powerful querying of data.
„ Queries can be written intuitively, and the DBMS is
responsible for efficient evaluation.
¾ Precise semantics for relational queries.
¾ Allows the optimizer to extensively re-order operations, and still
ensure that the answer does not change.

Unit 3 8
The SQL Query Language

„ Developed by IBM (System R) in the 1970s


„ Standards:
¾ SQL-86
¾ SQL-89 (minor revision)
¾ SQL-92 (major revision, current standard)
¾ SQL-99 (major extensions)

Unit 3 9
The SQL Query Language
Student
sid name address phone major
1234 W. 12 th
99111120 G. Jones 889-4444 CPSC
Ave., Van.
2020 E. 18 th St.,
92001200 G. Smith 409-2222 MATH
Van
2020 E. 18 th St.,
94001020 A. Smith 222-2222 CPSC
Van
„ To find the id’s, names and phones of all CPSC students, we can write:

SELECT sid, name, phone sid name phone


FROM Student 99111120 G. Jones 889-4444
WHERE major=“CPSC” 94001020 A. Smith 222-2222

„To select whole rows , replace “SELECT sid, name, phone ”


with “SELECT * ”
Unit 3 10
Querying Multiple Tables
Student Grade
sid name address phone major sid dept course# mark
99111120 G. Jones … … CPSC 99111120 CPSC 122 80

… … …. … … … … …. …

„ To select id and names of the students who have taken


some CPSC course, we write:

SELECT sid, name


FROM Student, Grade
WHERE Student.sid = Grade.sid AND dept = ‘CPSC’

Unit 3 11
Creating Relations in SQL/DDL

„ The statement on right CREATE TABLE Student


¾ creates Students relation (sid CHAR(8),
¾ observe that the type (domain) of
each field is specified, and is
name CHAR(20),
enforced by the DBMS whenever address CHAR(30),
tuples are added or modified phone CHAR(13),
major CHAR(4))

CREATE TABLE Grade


„ The statement on right
¾ creates Grade relation
(sid CHAR(8),
¾ information about courses that a dept CHAR(4),
student takes course# CHAR(3),
mark INTEGER)
Unit 3 12
Destroying and Altering Relations

DROP TABLE Student

„ Destroys the relation Student. The schema


information and the tuples are deleted.

ALTER TABLE Student


ADD COLUMN gpa REAL;

„ The schema of Students is altered by adding a new


field; every tuple in the current instance is extended
with a null value in the new field.
Unit 3 13
Adding and Deleting Tuples

„ Can insert a single tuple using:


INSERT
INTO Student (sid, name, address, phone, major)
VALUES (‘52033688’, ‘G. Chan’, ‘1235 W. 33, Van’,
‘882-4444’, ‘PHYS’)
„ Can delete all tuples satisfying some condition (e.g.,
name = ‘Smith’):
DELETE
FROM Student
WHERE name = ‘Smith’

Powerful variants of these commands are available; more later


Unit 3 14
Integrity Constraints (ICs)

„ IC: condition that must be true for any instance of the


database; e.g., domain constraints
¾ ICs are specified when schema is defined
¾ ICs are checked when relations are modified
„ A legal instance of a relation is one that satisfies all
specified ICs
¾ DBMS should not allow illegal instances
¾ Avoids data entry errors, too!
„ The types of IC’s depend of the data model.
¾ Next we discuss constraints for relational databases

Unit 3 15
Keys Constraints (for Relations)

„ Similar to those for entity sets in the ER model


„ A set S={S1, S2, …, Sm} of fields in an n-ary relation (1 ≤
m ≤ n) is a key (or candidate key) for a relation if :
1. No two distinct tuples can have the same values in all
key fields, and
2. No subset of S is itself a key (according to (1)).
(If such a subset exists, then S is a superkey and not a key.)
„ One of the possible keys is chosen (by the DBA) to be the
primary key.
„ e.g.
¾ {sid, gpa} is a superkey
¾ sid is a key and the primary key for Students
Unit 3 16
Keys Constraints in SQL

„ Candidate keys are specified using the UNIQUE


constraint
¾ values for a group of fields must be unique (if they are not null)
¾ these fields can be null
„ A PRIMARY KEY constraint specifies a table’s primary
key
¾ values for primary key must be unique
¾ a primary key field cannot be null
„ Key constraints are checked when
¾ new values are inserted
¾ values are modified

Unit 3 17
Keys Constraints in SQL (cont’)

™ (Ex.1- Normal) “For a CREATE TABLE Grade


given student and (sid CHAR(8)
course, there is a single dept CHAR(4), course# CHAR(3),
grade.” mark INTEGER,
PRIMARY KEY (sid,dept,course#) )
vs.
™ (Ex.2 - Silly) “Students CREATE TABLE Grade2
can take a course once, (sid CHAR(8)
and receive a single dept CHAR(4), course# CHAR(3),
grade for that course; mark CHAR(2),
further, no two students PRIMARY KEY (sid,dept,course#),
in a course receive the UNIQUE (dept,course#,mark) )
same grade.”
Unit 3 18
Foreign Keys Constraints

„ Foreign key : Set of fields in one relation that is used to


‘reference’ a tuple in another relation.
¾ Must correspond to the primary key of the other relation.
¾ Like a ‘logical pointer’.
„ E.g. consider
Grade(sid string, dept string, course# string, mark integer)
¾ sid is a foreign key referring to Student:
¾ (dept, course#) is a foreign key referring to Course

„ Referential integrity: All foreign keys reference existing


entities.
¾ i.e. there are no dangling references
¾ all foreign key constraints are enforced,.
¾ Example of a data model without RI:
o Links in HTML
Unit 3 19
Foreign Keys in SQL

„ Only students listed in the Students relation should be


allowed to have grades for courses.
CREATE TABLE Grade
(sid CHAR(8), dept CHAR(4), course# CHAR(3), mark INTEGER,
PRIMARY KEY (sid,dept,course#),
FOREIGN KEY (sid) REFERENCES Student,
FOREIGN KEY (dept, course#) REFERENCES Course )
Student Grade
sid name address Phone major sid dept course# mark
53666 G. Jones …. … … 53666 CPSC 101 80
53688 J. Smith …. … … 53666 RELG 100 45
53650 G. Smith …. … … 53650 MATH 200 null
53666 HIST 201 60
Unit 3 20
Enforcing Referential Integrity

„ Consider Students and Grade; sid in Grade is a foreign key


that references Student.
„ What should be done if an Grade tuple with a non-existent
student id is inserted? (Reject it!)
„ What should be done if a Student tuple is deleted?
¾ Also delete all Grade tuples that refer to it?
¾ Disallow deletion of this particular Student tuple?
¾ Set sid in Grade tuples that refer to it, to a default sid.
¾ Set sid in Grade tuples that refer to it, to null, (the special value denoting
`unknown’ or `inapplicable’.)
o problem if sid is part of the primary key
„ Similar if primary key of a Student tuple is updated

Unit 3 21
Referential Integrity in SQL/92

„ SQL/92 supports all 4 options


CREATE TABLE Grade
on deletes and updates. (sid CHAR(8), dept CHAR(4),
¾ Default is NO ACTION course# CHAR(3), mark INTEGER,
(delete/update is rejected) PRIMARY KEY (sid,dept,course#),
¾ CASCADE (also FOREIGN KEY (sid)
updates/deletes all tuples REFERENCES Student
that refer to the ON DELETE CASCADE
updated/deleted tuple)
ON UPDATE CASCADE
¾ SET NULL / SET DEFAULT
(referencing tuple value is FOREIGN KEY (dept, course#)
set to the default foreign REFERENCES Course
key value ) ON DELETE SET DEFAULT
ON UPDATE CASCADE );

Unit 3 22
More IC’s in SQL
„ SQL supports two types of IC’s
¾ table constraints
¾ assertions
„ Table constraints
¾ are part of the table definition
¾ domain, key, referential constraints, etc
¾ Checked when a table is created or modified
„ Assertions
¾ Involve more than one table
¾ Checked when any of the tables is modified
„ SQL constraints can be named
¾ CONSTRAINT studentKey PRIMARY KEY (sid)
„ Constraints can be set to be
¾ DEFERRED (is applied at the end of the transaction
¾ IMMEDIATE (is applied at the end of an SQL statement)
„ More on constraints when we discuss SQL
Unit 3 23
Where do ICs Come From?

„ ICs are based upon the semantics of the real-world enterprise


being described (in the database relations).
„ We can check a database instance to verify an IC, but we cannot
tell what the IC’s are just by looking at the instance.
¾ For example, even if all student names differ, we cannot
assume that name is a key.
¾ An IC is a statement about all possible instances.
„ All constraints must be identified during the conceptual design.
„ Some constraints can be explicitly specified in the conceptual
model
¾ Key and foreign key ICs are shown on ER diagrams.
„ Others are written in a more general language.

Unit 3 24
Logical DB Design: ER to Relational
Entity sets to tables.
„ Each strong entity set is mapped to a table.
¾ entity attributes become table attributes
¾ entity keys become table keys

name CREATE TABLE Employee


ssn salary (ssn CHAR(11),
name CHAR(20),
Employee salary FLOAT,
PRIMARY KEY (ssn))

Unit 3 25
Relationship Sets to Tables: Simple Case
since
name dname ssn and did
ssn salary did budget CANNOT
be null

Employee Works_In Department

„ Relationship set has no constraints


CREATE TABLE Works_In(
(i.e. is many-to-many)
„ In this case, it is mapped to a new
ssn CHAR(11),
table. did INTEGER,
„ Attributes of the table must include: since DATE,
¾ Keys for each participating PRIMARY KEY (ssn, did),
entity set as foreign keys.
FOREIGN KEY (ssn)
o This set of attributes forms
a key for the relation. REFERENCES Employee,
¾ All descriptive attributes. FOREIGN KEY (did)
Unit 3
REFERENCES Department)26
Relationship Sets to Tables (cont’)

„ In some cases, we need to use the


course# roles:
dept title
CREATE TABLE Prerequisite(
course_dept CHAR(4),
Course course_no CHAR(3),
prereq_dept CHAR(4),
course prereq
prereq_no CHAR(3),
Prerequisite PRIMARY KEY (course_dept, course_no,
prereq_dept, prereq_no),
FOREIGN KEY (course_dept, course_no)
REFERENCES Course(dept, course#),
FOREIGN KEY (prereq_dept, prereq_no)
REFERENCES Course(dept, course#))
Unit 3 27
Relationship Sets with Key Constraints

since
name dname
ssn salary did budget

Employee Manages Department

„ Each dept has at most one manager, according to the key


constraint on Manages.
„ How should we represent Manages

Unit 3 28
Translating ER Diagrams with Key Constraints

„ WRONG WAY:
CREATE TABLE Manages(
¾ Create a separate table
ssn CHAR(11),
for Manages:
did INTEGER,
¾ Note that did is the key
since DATE,
now!
PRIMARY KEY (did),
¾ Create separate tables
FOREIGN KEY (ssn) REFERENCES Employee,
for Employee and
FOREIGN KEY (did) REFERENCES Department)
Department.

„ RIGHT WAY
CREATE TABLE Dept_Mgr(
¾ Since each department did INTEGER,
has a unique manager, dname CHAR(20),
we can combine budgetREAL,
Manages and ssn CHAR(11),
Department into one since DATE,
table. PRIMARY KEY (did),
¾ Create another table for FOREIGN KEY (ssn) REFERENCES Employee)
Employee
Unit 3 29
Translating Participation Constraints

name since dname


ssn salary did budget

Employee Manages Department

„ Every department must have a manager.


¾ Every did value in Department table must appear in a row of the
Manages table (with a non-null ssn value)
„ How can we express that in SQL?

Unit 3 30
Participation Constraints in SQL
„ Using the right method for Manages (add Manages relation in the Department
table), we can capture participation constraints by
¾ ensuring that each did is associated with a ssn that is not null
¾ not allowing to delete a manager before he/she is replaced

CREATE TABLE Dept_Mgr(


did INTEGER,
dname CHAR(20),
budget REAL,
ssn CHAR(11) NOT NULL,
since DATE,
PRIMARY KEY (did),
FOREIGN KEY (ssn) REFERENCES Employee,
ON DELETE NO ACTION
ON UPDATE CASCADE)
„ Note: We cannot express this if the wrong method was used for Manages.
Unit 3 31
Participation Constraints in SQL (cont’)

name since dname


ssn salary did budget

Employee Works-In Department

„ How can we express that


“every employee works is a department and every department has
some employee in it”?
„ Neither foreign-key nor not-null constraints in Works_In can do that.
„ We need assertions (later)

Unit 3 32
Translating Weak Entity Sets

name
cost pname birthday
ssn salary

Employee Policy Dependent

„ A weak entity is identified by considering the primary key of the owner


(strong) entity.
¾ Owner entity set and weak entity set participates in a one-to-many
identifying relationship set.
¾ Weak entity set has total participation.
„ What is the best way to translate it?

Unit 3 33
Translating Weak Entity Sets(cont’)
„ Weak entity set and its identifying relationship set are translated into a
single table.
¾ Primary key would consist of the owner’s primary key and week entity’s
partial key
¾ When the owner entity is deleted, all owned weak entities must also be
deleted.

CREATE TABLE Dep_Policy (


pname CHAR(20),
age INTEGER,
cost REAL,
ssn CHAR(11) NOT NULL,
PRIMARY KEY (ssn, pname),
FOREIGN KEY (ssn) REFERENCES Employees,
ON DELETE CASCADE,
ON UPDATE CASCADE)
Unit 3 34
Translating ISA Hierarchies to Tables
name
ssn salary

Employee

hourly_rate hours

contractid

Hourly_Emp Contract_Emp

„ What is the best way to translate this into tables?

Unit 3 35
First Attempt : Totally unsatisfactory
name
ssn salary

Employee „ Safe but with


lots of
hourly_rate hours duplication
contractid
„ Not a good
answer!
Hourly_Emp Contract_Emp

One table per entity. Each has all attributes:


Employee(sin, name, salary)
Hourly_Emp(sin, name, salary, hourly_rate, hours)
Contract_Emp(sin, name, salary, contractid)

Unit 3 36
Method 1 : One table
name
ssn
„ Wastes a lot of
salary space for null
values
Employee „ Worse when there
are many
hourly_rate
subclasses with
hours
many attributes
contractid „ Excellent method
if subclasses do
not have any new
Hourly_Emp Contract_Emp attributes and
relationships
„ One table which has all attributes and a type field which can have values
to identify the type of the employee:

Employee(sin, name, salary, hourly_rate, hours, contractid, type)

Unit 3 37
Method 2: Tables for superclass and subclasses
name
ssn salary
• Works well for queries
Employee involving only
superclass attributes or
hourly_rate hours only the extra subclass
attributes
contractid
• Have to combine two
tables to get all
Hourly_Emp Contract_Emp
attributes for a
subclass
„ Superclass table contains al superclass attributes
„ Each subclass table contains primary key of superclass and the
subclass attributes
Employee(sin, name, salary)
Hourly_Emp(sin, hourly_rate, hours)
Contract_Emp(sin, contractid)
Unit 3 38
Method 3: Only subclass tables
name
ssn salary
‰ If ISA-relation is partial, it
Employee cannot be applied (may
loose entities)
hourly_rate hours ‰ Works poorly if the
superclass participates in
contractid
any relationship ( should
not applied in that case)
Hourly_Emp Contract_Emp
‰ If ISA-relation is not
„ No table for superclass disjoint, it duplicates info
„ One table per subclass containing
¾ all superclass and subclass attributes

Hourly_Emp(sin, name, salary, hourly_rate, hours)


Contract_Emp(sin, name, salary, contractid)
Unit 3 39
Summary for Translating ISA to Tables

a) If subclasses have no new attributes and relationships or if they


all have the same attributes and relationships
¾ One table with all attributes and an extra category attribute for
the subclasses

b) If ISA is partial or overlapping and subclasses have new and


different attributes
¾ A table for the superclass and one for each subclass

c) If ISA is total and disjoint and subclasses have different


attributes or different relationships
¾ No table for the superclass
¾ One table for each subclass

Unit 3 40
Translating Aggregation
ssn name
bname type

Person Visits Bar

„ Use standard mapping of


relationship sets date
Drinks
„ Tables for our example :
¾ Visits(ssn, bname)
Drink dname
¾ Drinks(ssn, bname, dname, date)
„ Special Case:
¾ If Drinks is total on Visits and
Visits has no descriptive attributes
we could keep only the Drinks table
(discard Visits).
Unit 3 41
Views
„ A view is just a virtual table: We only store the view’s
definition, rather than the rows of the data.

CREATE VIEW CourseWithFails(dept, course#, title, mark) AS


SELECT C.dept, C.course#, title, mark
FROM Course C, Enrolled E
WHERE C.dept = E.dept AND
C.course# = E.course# AND mark<50

„ Views can be used as any other table.


„ System usually evaluates them on the fly.
„ Views can be dropped using the DROP VIEW command.
Unit 3 42
Views and Security

„ Views can be used to present necessary information


(or a summary), while hiding details in underlying
relation(s).
¾ Given CourseWithFails, but not Course or Enrolled, we can
find the course in which some students failed, but we can’t find
the students who failed.

Unit 3 43
View Updates
„ View updates must occur at the base tables.
¾ Ambiguous
¾ Difficult
„ DBMS’s restrict view updates only to some simple views
on single tables (called updatable views)
„ How to handle DROP TABLE if there’s a view on the
table?
„ DROP TABLE command has options to prevent a table
from being dropped if views are defined on it:
¾ DROP TABLE Student RESTRICT
o drops the table, unless there is a view on it
¾ DROP TABLE Student CASCADE
o drops the table, and recursively drops any view referencing it

Unit 3 44
Relational Model: Summary

„ A tabular representation of data.


„ Simple and intuitive, currently the most widely used.
„ Integrity constraints can be specified, based on
application semantics. DBMS checks for violations.
¾ Important ICs: primary and foreign keys
¾ Additional constraints can be defines with assertions
(but are expensive to check)
„ Powerful and natural query languages exist.
„ Rules to translate ER to relational model

Unit 3 45

You might also like