Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 67

The Relational Data Model

(Based on Chapter 5)

1
1. Relational Model Concepts

BASIS OF THE MODEL


 
 
• The relational Model of Data is based on the concept of a
Relation.
 
• A Relation is a mathematical concept based on the ideas of sets.
 
• The strength of the relational approach to data management comes
from the formal foundation provided by the theory of relations.
 
 
2
INFORMAL DEFINITIONS

• RELATION: A table of values


 
• A relation may be thought of as a set of rows.
 
• A relation may alternately be though of as a set of
columns.
 
• Each row of the relation may be given an identifier.
 
• Each column typically is called by its column name or
column header or attribute name.
3
FORMAL DEFINITIONS

• A Relation may be defined in multiple ways.


 
• The Schema of a Relation: R (A1, A2, .....An)
Relation R is defined over attributes A1, A2, .....An
 
For Example -
CUSTOMER (Cust-id, Cust-name, Address, Phone#)
 
Here, CUSTOMER is a relation defined over the four attributes
Cust-id, Cust-name, Address, Phone#, each of which has a
domain or a set of valid values.
4
For example, the domain of Cust-id is 6 digit numbers.
• A tuple is an ordered set of values
 
• Each value is derived from an appropriate domain.
 
• Each row in the CUSTOMER table may be called as a tuple in the
table and would consist of four values.
 
<632895, "John Smith", "101 Main St. Atlanta, GA 30332",
"(404) 894-2000"> is a triple belonging to the CUSTOMER
relation.
 
• A relation may be regarded as a set of tuples (rows).
• Columns in a table are also called as attributes of the relation.
5
FORMAL DEFINITIONS (contd..)
• The relation is formed over the cartesian product of the sets; each
set has values from a domain; that domain is used in a specific
role which is conveyed by the attribute name.
• For example, attribute Cust-name is defined over the domain of
strings of 25 characters. The role these strings play in the
CUSTOMER relation is that of the name of customers.
• Formally, Given R(A1, A2, .........., An)
r(R) subset-of dom (A1) X dom (A2) X ....X dom(An)
 
R: schema of the relation
r of R: a specific "value" or population of R.
 

6
R is also called the intension of a relation
r is also called the extension of a relation

Let S1 = {0,1}
Let S2 = {a,b,c}
  Let R be a subset-of S1 X S2
 
for example: r(R) = {<0.a> , <0,b> , <1,c> }

7
DEFINITION SUMMARY

Informal Terms Formal Terms

Table Relation
Column Attribute/Domain
Row Tuple
Values in a column Domain
Table Definition Schema of Relation
Populated Table Extension

8
Figure 7.1 The attributes and tuples of a relation STUDENT.

9
2 Characteristics of Relations

Ordering of tuples in a relation r(R): The tuples are not


considered to be ordered, even though they appear to be
in the tabular form.
 
Ordering of attributes in a relation schema R (and of
values within each tuple): We will consider the attributes
in R(A1, A2, ..., An) and the values in t=<v1, v2, ..., vn>
to be ordered .

 
 

10
Values in a tuple: All values are considered atomic
(indivisible). A special null value is used to represent
values that are unknown or inapplicable to certain
tuples.

Notation:
- We refer to component values of a tuple t by
t[Ai] = vi (the value of attribute Ai for tuple t).
- Similarly, t[Au, Av, ..., Aw] refers to the subtuple of t
containing the values of attributes Au, Av, ..., Aw,
respectively.

11
Figure 7.2 The relation STUDENT from Figure 7.1, with a different
order of tuples

12
3 Relational Integrity Constraints

Constraints are conditions that must hold on all


valid relation instances. There are three main
types of constraints:

• Key constraints
• Entity integrity constraints,
• Referential integrity constraints

13
3.1 Key Constraints
Superkey of R: A set of attributes SK of R such
that no two tuples in any valid relation instance
r(R) will have the same value for SK. That is,
for any distinct tuples t1 and t2 in r(R), t1[SK]
<> t2[SK].

Key of R: A "minimal" superkey; that is, a superkey K


such that removal of any attribute from K results in a
set of attributes that is not a superkey.

14
Example: The CAR relation schema:
CAR(State, Reg#, SerialNo, Make, Model, Year)
has two keys Key1 = {State, Reg#}, Key2 = {SerialNo},
which are also superkeys. {SerialNo, Make} is a superkey
but not a key.

If a relation has several candidate keys, one is chosen


arbitrarily to be the primary key.
The primary key attributes are underlined.

15
Figure 7.4 The CAR relation with two candidate keys: LicenseNumber
and EngineSerialNumber.

16
Figure 7.5 Schema diagram for the COMPANY relational database
schema; the primary keys are underlined.

17
Figure 7.5 Schema diagram for the COMPANY relational database
schema; the primary keys are underlined.

18
Figure 7.6 (continued)

19
3.2 Entity Integrity
Relational Database Schema: A set S of relation schemas
that belong to the same database. S is the name of the
database.
S = {R1, R2, ..., Rn}
 
Entity Integrity: The primary key attributes PK of each relation
schema R in S cannot have null values in any tuple of r(R). This is
because primary key values are used to identify the individual
tuples.
t[PK] <> null for any tuple t in r(R)
 
Note: Other attributes of R may be similarly constrained to disallow
null values, even though they are not members of the primary key.
20
3.3 Referential Integrity
A constraint involving two relations (the previous constraints
involve a single relation).
 
Used to specify a relationship among tuples in two relations: the
referencing relation and the referenced relation.
 
Tuples in the referencing relation R1 have attributes FK (called
foreign key attributes) that reference the primary key attributes
PK of the referenced relation
R2. A tuple t1 in R1 is said to reference a tuple t2 in R2 if t1[FK] =
t2[PK].
 
A referential integrity constraint can be displayed in a relational
database schema as a directed arc from R1.FK to R2.
21
Figure 7.7 Referential integrity constraints displayed on the COMPANY relational
database schema diagram.

22
Figure 7.6 One possible relational database state corresponding to the
company schema.

23
4 Update Operations on Relations
- INSERT a tuple.
- DELETE a tuple.
- MODIFY a tuple.
 
- Integrity constraints should not be violated by the update
operations.
 
- Several update operations may have to be grouped together.
 
- Updates may propagate to cause other updates
automatically. This may be necessary to maintain integrity
constraints.
 
-
24
In case of integrity violation, several actions can be
taken:
- cancel the operation that causes the violation
(REJECT optiom)
- perform the operation but inform the user of the
violation
- trigger additional updates so the violation is
corrected (CASCADE option, SET NULL option)
- execute a user-specified error-correction routine

25
5 The Relational Algebra

- Operations to manipulate relations.


- Used to specify retrieval requests (queries).
- Query result is in the form of a relation.
 
Relational Operations:
5.1 SELECT  and PROJECT  operations.
5.2 Set operations: These include UNION U,
INTERSECTION | |,
DIFFERENCE -, CARTESIAN PRODUCT X.
5.3 JOIN operations X.
5.4 Other relational operations: DIVISION, OUTER JOIN,
AGGREGATE FUNCTIONS.
26
5.1 SELECT  and PROJECT 

SELECT operation (denoted bys ):


 
- Selects the tuples (rows) from a relation R that satisfy a
certain selection condition c
- Form of the operation:  c(R)
 
- The condition c is an arbitrary Boolean expression on the
attributes of R
 
- Resulting relation has the same attributes as R
 
- Resulting relation includes each tuple in r(R) whose attribute
values satisfy the condition ‘c’
 
27
Examples:
 DNO=4 (EMPLOYEE)
  SALARY>30000 (EMPLOYEE)
(DNO=4 AND SALARY>25000) OR DNO=5(EMPLOYEE)

28
PROJECT operation (denoted by ):
 
- Keeps only certain attributes (columns) from a relation R
specified in an attribute list L
 
- Form of operation:  L(R)
 
- Resulting relation has only those attributes of R specified
in L
 
Example:  FNAME,LNAME,SALARY(EMPLOYEE)
 
- The PROJECT operation eliminates duplicate tuples in
the resulting relation so that it remains a mathematical set (no
duplicate elements)
29
Example: SEX,SALARY(EMPLOYEE)

If several male employees have salary 30000, only a


single tuple <M, 30000> is
kept in the resulting relation.

Duplicate tuples are eliminated by the  operation.

30
Sequences of operations:
 
- Several operations can be combined to form a relational
algebra expression (query)

Example: Retrieve the names and salaries of employees who work in


department 4:

 FNAME,LNAME,SALARY ( DNO=4(EMPLOYEE) )
 
- Alternatively, we specify explicit intermediate relations for
each step:
DEPT4_EMPS <- DNO=4(EMPLOYEE)
R <- FNAME,LNAME,SALARY(DEPT4_EMPS)
 
-
31
Attributes can optionally be renamed in the
resulting left-hand-side relation (this may
be required for some operations that will be
presented later):
DEPT4_EMPS <- DNO=4(EMPLOYEE)

R(FIRSTNAME,LASTNAME,SALARY)
<-
 FNAME,LNAME,SALARY(DEPT4_EMPS)

32
Figure 7.8 Results of SELECT and PROJECT operations.
(a) (DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000)(EMPLOYEE).
(b) LNAME, FNAME, SALARY(EMPLOYEE). (c) SEX, SALARY(EMPLOYEE).

33
Class Number – CS 304

Class Name -
DBMS

Instructor –
Sanjay Madria

Lesson Title – Relational Algebra –


3rd July
34
5.2 Set Operations
- Binary operations from mathematical set theory:
UNION: R1 U R2,
INTERSECTION: R1 | | R2,
SET DIFFERENCE: R1 - R2,
CARTESIAN PRODUCT: R1 X R2.
 
-ForU, | |, -, the operand relations R1(A1, A2, ..., An) and R2(B1,
B2, ..., Bn) must have the same number of attributes, and the
domains of corresponding attributes must be compatible; that is,
dom(Ai)=dom(Bi) for i=1, 2, ..., n. This condition is called union
compatibility.
 
The resulting relation for U, | |, or - has the same attribute names
as the first operand relation R1 (by convention). 35
Figure 7.10 Query result after the UNION operation:
RESULT  RESULT1  RESULT2

36
Figure 7.11 Illustrating the set operations union, intersection and difference.
(a) Two union compatible relations.
(b) STUDENT  INSTRUCTOR. (c) STUDENT  INSTRUCTOR
(d) STUDENT - INSTRUCTOR (e) INSTRUCTOR - STUDENT

37
CARTESIAN PRODUCT

R(A1, A2, ..., Am, B1, B2, ..., Bn) <-


R1(A1, A2, ..., Am) X R2 (B1, B2, ..., Bn)
 
- A tuple t exists in R for each combination of tuples t1
from R1 and t2 from R2 such that:
t[A1, A2, ..., Am]=t1 and t[B1, B2, ..., Bn]=t2
 
- If R1 has n1 tuples and R2 has n2 tuples, then R will have
n1*n2 tuples.
 
- CARTESIAN PRODUCT can combine related tuples
from two relations if followed by the appropriate SELECT
operation .
  38
Example: Combine each DEPARTMENT tuple with
the EMPLOYEE tuple of the manager.
DEP_EMP <-DEPARTMENT X EMPLOYEE
DEPT_MANAGER <-MGRSSN=SSN(DEP_EMP)

39
5.3 JOIN Operations
THETA JOIN: Similar to a CARTESIAN PRODUCT followed by a
SELECT.
The condition c is called a join condition.
R(A1, A2, ..., Am, B1, B2, ..., Bn) <-R1(A1, A2, ..., Am) X c R2 (B1,
B2, ..., Bn)
Here c can be <, >, =, <=, >=
 
EQUIJOIN: The join condition c includes one or more equality
comparisons involving attributes from R1 and R2. That is, c is of
the form:
(Ai=Bj) AND ... AND (Ah=Bk); 1<i,h<m, 1<j,k<n
 
In the above EQUIJOIN operation:
Ai, ..., Ah are called the join attributes of R1
Bj, ..., Bk are called the join attributes of R2
  40
Example of using EQUIJOIN:

Retrieve each DEPARTMENT's name and its manager's


name:
T <-DEPARTMENT X MGRSSN=SSN EMPLOYEE
RESULT <-DNAME,FNAME,LNAME(T)

41
NATURAL JOIN (*):
In an EQUIJOIN R <- R1 X c R2, the join attribute of R2 appear
redundantly in the result relation R. In a NATURAL JOIN, the
redundant join attributes of R2 are eliminated from R. The
equality condition is implied and need not be specified.
R <- R1 *(join attributes of R1),(join attributes of R2) R2

Example: Retrieve each EMPLOYEE's name and the name of the


DEPARTMENT he/she
works for:
T<- EMPLOYEE *(DNO),(DNUMBER) DEPARTMENT
RESULT <-FNAME,LNAME,DNAME(T)
 

42
If the join attributes have the same names in both
relations, they need not be specified and we can
write R <- R1 * R2.

Example: Retrieve each EMPLOYEE's name and the


name of his/her SUPERVISOR:
SUPERVISOR(SUPERSSN,SFN,SLN)<-
SSN,FNAME,LNAME(EMPLOYEE)
T<-EMPLOYEE * SUPERVISOR
RESULT <-FNAME,LNAME,SFN,SLN(T)

43
Figure 7.13 Illustrating the JOIN operation.

44
Figure 7.14 An illustration of the NATURAL JOIN operation.
(a) PROJ_DEPT  PROJECT * DEPT.
(b) DEPT_LOCS  DEPARTMENT * DEPT_LOCATIONS.

45
Note: In the original definition of NATURAL JOIN, the
join attributes were required to have the same names in
both relations.
 
There can be a more than one set of join attributes with a
different meaning between the same two relations. For
example:
 
JOIN ATTRIBUTES RELATIONSHIP
EMPLOYEE.SSN= EMPLOYEE manages
DEPARTMENT.MGRSSN the DEPARTMENT
EMPLOYEE.DNO= EMPLOYEE works for
DEPARTMENT.DNUMBERthe DEPARTMENT
  46
A relation can have a set of join attributes to join it with
itself :
 
JOIN ATTRIBUTES RELATIONSHIP
EMPLOYEE(1).SUPERSSN= EMPLOYEE(2) supervises
EMPLOYEE(2).SSN EMPLOYEE(1)
 
- One can think of this as joining two distinct copies of the
relation, although only one relation actually exists
 
- In this case, renaming can be useful
 

47
Figure 7.15 Illustrating the division operation.
(a) Dividing SSN_PNOS by SMITH_PNOS. (b) T  R  S.

48
Complete Set of Relational Algebra Operations:
 
- All the operations discussed so far can be described as a
sequence of only the operations SELECT, PROJECT, UNION,
SET DIFFERENCE, and CARTESIAN PRODUCT.
 
- Hence, the set { , , U, - , X } is called a complete set
of relational algebra operations. Any query language equivalent to
these operations is called relationally complete.
 
- For database applications, additional operations are needed
that were not part of the original relational algebra. These
include:
1. Aggregate functions and grouping.
2. OUTER JOIN.
49
5.4 Additional Relational Operations

AGGREGATE FUNCTIONS
- Functions such as SUM, COUNT, AVERAGE, MIN, MAX
are often applied to sets of values or sets of tuples in database
applications
<grouping attributes> F<function list> (R)
 
- The grouping attributes are optional
 
Example 1: Retrieve the average salary of all employees (no grouping
needed):
R(AVGSAL) <- F AVERAGE SALARY (EMPLOYEE)
 
50
Example 2: For each department, retrieve the department
number, the number of employees, and the average salary
(in the department):
R(DNO,NUMEMPS,AVGSAL) <-
DNO F COUNT SSN, AVERAGE SALARY (EMPLOYEE)

 
DNO is called the grouping attribute in the above example

51
Figure 7.16 An illustration of the AGGREGATE FUNCTION operation.
(a) R(DNO, NO_OF_EMPLOYEES, AVERAGE_SAL)  DNO  COUNT SSN, AVERAGE
SALARY (EMPLOYEE). (b) DNO  COUNT SSN, AVERAGE SALARY(EMPLOYEE).
(C)  COUNT SSN, AVERAGE SALARY(EMPLOYEE).

52
OUTER JOIN
 
- In a regular EQUIJOIN or NATURAL JOIN operation,
tuples in R1 or R2 that do not have matching tuples in the other
relation do not appear in the result
 
- Some queries require all tuples in R1 (or R2 or both) to
appear in the result
 
- When no matching tuples are found, nulls are placed for
the missing attributes
 
-

53
LEFT OUTER JOIN: R1 X R2 lets every tuple in R1
appear in the result
 
- RIGHT OUTER JOIN: R1 X R2 lets every tuple in
R2 appear in the result
 
- FULL OUTER JOIN: R1 X R2 lets every tuple in R1
or R2 appear in the result

54
Figure 7.18 The LEFT OUTER JOIN operation.

55
RENAME Operator

S(B1, B2,…Bn) (R) – Renaming relation R as S and renaming


attributes of R as Bi’s.
 

S(R) – Renaming R as S
 
(B1, B2,…Bn) (R) – Renaming attributes of R as Bi’s.
 

56
Some Queries
 
Q. Retrieve the SSNs of all the employees who either work in
dept. 5 or supervise an employee who works in dept 5.
 
Dept-Emps DNO = 5 (Employee)
Result1  SSN (Dept-Emps)
Result 2(SSN)  SuperSSN (Dept-Emps)
 
Result = Result 1  Result 2
57
Q. Find for each female employee, a list of names of her dependents.
 
Female-Emp SEX = F (Employee)
Empname  FNAME, LNAME, SSN (Female-Emp)
Emp-dep  Empname  Dependent
Actual-dep SSN = ESSN (Emp-dep)
Result  FNAME, LNAME, Dependent-name (Actual-dep)
 
Q. Find the name of the manager of each department.
 
Dept-mgr  Department MGRSSN = SSN (Employee)
Result  DNAME, LNAME, FNAME (Dept-mgr)

58
• List names of managers who have atleast
one dependent
• Find names of employees who have no
dependents
• List names of all employees with two or
more dependents

59
Figure 7.12 An illustration of the CARTESIAN PRODUCT operation.

60
Figure 7.12 (continued)

61
The Insert Operation
1. Insert <‘Cecilia’, ‘F”, “Kolonsky’, null, ‘1960-04-
05’, ‘6357 Windy Lane, Katy, TX’, F, 28000, null,
4> into EMPLOYEE
This insertion violates the entity integrity constraint
(null for the primary key SSN), so it is rejected.
2. Insert <‘Alicia’, ‘J’, ‘Zelaya’, 999887777, 1960-
04-05’, ‘6357 Windy Lane, Katy, TX’, F, 28000,
987654321’, 4> into EMPLOYEE
This insertion violates the key constraint because
another tuple with the same SSN Value already
exists in the EMPLOYEE relation, and so it is
rejected.

62
The Insert Operation (contd.)
3. Insert <‘Cecilia’, ‘F’, ‘Kolonsky’, ‘67768989’, 1960-
04-05’, ‘6357 Windswept, katy, TX’, F, 28000,
‘987654321’, 7> into EMPLOYEE
This insertion violates the referential integrity
constraint specified on DNO because no
DEPARTMENT tuple exists with DNUMBER = 7
4. Insert <‘Cecilia’, ‘F’, ‘Kolonsky’, ‘67768989’, 1960-
04-05’, ‘6357 Windy Lane, Katy, TX’, F, 28000, null,
4> into EMPLOYEE.
This insertion satisfies all constraints, so it is
acceptable.
63
The Delete Operation
1. Delete the WORKS_ON tuple with ESSN =
‘999887777’ and PNO = 10.
This deletion is acceptable

2. Delete the EMPLOYEE tuple with SSN =


‘999887777’
This deletion is not acceptable, because tuples in
WORKS_ON refer to this tuple. Hence, if the
tuple is deleted, referential integrity violations will
result.

64
The Delete Operation (contd.)
3. Delete the EMPLOYEE tuple with SSN= ‘333445555’
This deletion will result in even worse referential integrity
violations, because the tuple involved is referenced by tuples
from the EMPLOYEE, DEPARTMENT, WORKS_ON and
DEPENDENT relations.
• Options
• Reject deletion
• Propagate deletion by deleting other tuples
• Modify the referencing attributes that cause the violations
• Or combination of three above

65
The Update Operation
1. Update the SALARY of the EMPLOYEE tuple
with SSN = ‘999887777; to 28000
Acceptable
2. Update the DNO of the EMPLOYEE tuple with
SSN = ‘999887777’ to 1
Acceptable
3. Update the DNO of the EMPLOYEE tuple
with SSN = ‘999887777’ to 7
Unacceptable, because it violates referential
integrity.

66
The Update Operation (contd.)
4. Update the SSN of the EMPLOYEE tuple with
SSN = ‘999887777’ to ‘987654321’
Unacceptable, because it violates primary key
and referential integrity constraints.

Updating an attribute which is neither primary key or


foreign key usually causes no problems, new
value should be of correct data type and domain.

67

You might also like