
A Relational Model of Data for Large Shared Data Banks


E. F. Codd
IBM Research Laboratory,
San Jose, California

T.M.Hennayake
I.Willarachchige
G.M.Ekanayake
Agenda
Abstract
This Paper is concerned with…
Introduction
Data Dependencies in Present Systems
A Relational View of Data
Normal Form
Some Linguistic Ability
Expressible, Named And Stored Relations
Redundancy and Consistency

Abstract

Users of large data banks should not need to know the internal
representation of the data


Changes in data representation will often be needed as a result of:
Changes in query traffic
Changes in update traffic
Changes in report traffic
Natural growth in the types of stored information


When the internal representation of data changes, or even some aspects of
the external representation, the following should remain unaffected:
Activities of users at terminals
Most application programs

This Paper is concerned with…
Section 1:
• Inadequacies of the tree-structured and network models offered by existing systems are discussed
• A model based on n-ary relations, a normal form for
database relations, and the concept of a universal data
sublanguage are introduced

Section 2:
• Certain operations on relations are discussed
• These operations are applied to the problems of redundancy and
consistency

Introduction
This paper is concerned with the application of elementary
relation theory to systems which provide shared access to
large banks of formatted data

Problems Identified
• Data independence: application programs and terminal activities should be
independent of growth in data types and changes in data representation
• Data inconsistency: certain kinds are expected to become troublesome even in
non-deductive systems
Introduction Cont’d...
Why the relational model is superior to the graph or network model:

Describes data with its natural structure only

Provides a basis for a high-level data language

Provides a sound basis for treating derivability,
redundancy, and consistency of relations

Permits a clearer evaluation of the scope and logical
limitations of present formatted data systems

Data Dependencies in Present Systems
Three main types:
Ordering Dependence
Indexing Dependence
Access Path Dependence

Ordering Dependence
Data in a data bank can be stored in many different
ways: with no particular order, in one order, or in
several sorted orders at once

No single ordering suits every application

Ex:- Records of a file concerning parts might be stored in
ascending order by part serial number

Application programs that take advantage of this stored ordering are likely
to fail if the ordering is later replaced by a different one
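
A minimal Python sketch of ordering dependence. The part records and both helper functions are hypothetical; the point is only that a program which relies on the stored order breaks when that order changes, while one that makes no such assumption keeps working.

```python
from bisect import bisect_left

def find_part_sorted(parts, serial):
    """Binary search: only correct if `parts` is sorted by serial number."""
    serials = [p[0] for p in parts]
    i = bisect_left(serials, serial)
    if i < len(serials) and serials[i] == serial:
        return parts[i]
    return None

def find_part_any_order(parts, serial):
    """Linear scan: makes no assumption about the stored ordering."""
    for p in parts:
        if p[0] == serial:
            return p
    return None

# Hypothetical part records: (serial number, name)
by_serial = [(101, "bolt"), (205, "nut"), (309, "gear"), (412, "axle")]
by_name = sorted(by_serial, key=lambda p: p[1])  # same file, re-sorted by name

found_before = find_part_sorted(by_serial, 205)   # ordering assumption holds
found_after = find_part_sorted(by_name, 205)      # ordering assumption broken
```

The order-dependent lookup silently misses a record that is still in the file, which is the failure mode the slide describes.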

Indexing Dependence
An index is usually a purely performance-oriented component of the data representation

From an informational standpoint, it is a redundant component

Effect on performance:
• Improves response to queries & updates
• Slows down response to insertions & deletions

Indexing Dependence Cont’d…
Problem :-
“Can application programs and terminal activities remain
invariant as indices come and go?”

Approaches in present systems:-
Different systems address the problem in different ways
• TDMS :- Indexing on all attributes
• IMS :- Gives the user a choice for each file (no indexing, or indexing on the primary key only)
• IDS :- Permits the file designers to select attributes to be indexed
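
A small Python sketch of what index independence could look like. The `PartFile` class and its methods are hypothetical and not taken from TDMS, IMS, or IDS; the idea is that the caller uses the same `lookup` call whether or not an index currently exists.

```python
class PartFile:
    """Hypothetical file of parts whose indices may be added or dropped."""

    def __init__(self, rows):
        self.rows = list(rows)   # each row is a dict
        self.indices = {}        # attribute -> {value: [rows]}

    def create_index(self, attr):
        index = {}
        for row in self.rows:
            index.setdefault(row[attr], []).append(row)
        self.indices[attr] = index

    def drop_index(self, attr):
        self.indices.pop(attr, None)

    def lookup(self, attr, value):
        """Same call and same answer whether or not an index exists."""
        if attr in self.indices:
            return list(self.indices[attr].get(value, []))
        return [row for row in self.rows if row[attr] == value]

parts = PartFile([
    {"part": 1, "color": "red"},
    {"part": 2, "color": "blue"},
    {"part": 3, "color": "red"},
])
without_index = parts.lookup("color", "red")   # answered by a scan
parts.create_index("color")
with_index = parts.lookup("color", "red")      # answered through the index
```

Only the cost of the lookup changes when an index comes or goes; the program text and its result do not.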
Access Path Dependence
• An access path is the route a program must follow through the stored structure
(for example a tree or network) to reach the data

• There can be more than one path to the same data

• Access path selection can have a large impact on the overall
performance of the system

• Problem:-
Regardless of the stored representation, terminal activities & programs become
dependent on the continued existence of the user access paths

• Proposed policy:-
Once defined, a user access path is not made obsolete until all programs which use it
become obsolete. This is not practical, because the number of access paths would
eventually become excessively large
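
A hedged Python sketch of access path dependence, using hypothetical parts-and-projects data. The function `parts_in_project_v1` is written against one tree shape and breaks when the tree is restructured, while the same question asked of a flat relation does not depend on any path.

```python
# Structure 1: projects subordinate to parts (hypothetical data)
parts_tree = {
    "p1": {"name": "bolt", "projects": {"alpha": 40, "beta": 10}},
    "p2": {"name": "nut", "projects": {"beta": 25}},
}

# Structure 2: the same facts with parts subordinate to projects
projects_tree = {
    "alpha": {"parts": {"p1": {"name": "bolt", "qty": 40}}},
    "beta": {"parts": {"p1": {"name": "bolt", "qty": 10},
                       "p2": {"name": "nut", "qty": 25}}},
}

def parts_in_project_v1(tree, project):
    """Written against structure 1: walks part -> projects."""
    return sorted(
        (part, rec["name"], rec["projects"][project])
        for part, rec in tree.items()
        if project in rec["projects"]
    )

# The same facts as a flat relation: no path is built into the data
committed = {
    ("p1", "bolt", "alpha", 40),
    ("p1", "bolt", "beta", 10),
    ("p2", "nut", "beta", 25),
}

def parts_in_project_rel(relation, project):
    """Selects rows by value, with no assumption about nesting."""
    return sorted((p, n, q) for (p, n, j, q) in relation if j == project)

try:
    parts_in_project_v1(projects_tree, "alpha")
    v1_survives_restructuring = True
except KeyError:
    v1_survives_restructuring = False
```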

A Relational View of Data
R(S1, S2, S3, ..., Sn)

Given sets S1, S2, ..., Sn, R is a relation on these n sets if it is a set of
n-tuples, each with its j-th element drawn from Sj

Relations are classified by their degree:
Degree 1 :: Unary
Degree 2 :: Binary
Degree 3 :: Ternary
Degree n :: N-ary

A Relational View of Data Cont’d…

N-ary Relation (viewed as an array)
Each row represents an n-tuple of R

The ordering of rows is immaterial

All rows are distinct

The ordering of columns is significant: it corresponds to the ordering
S1, S2, ..., Sn of the domains on which R is defined

The significance of each column is partially conveyed by labeling it
with the name of the corresponding domain
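
These properties map naturally onto a set of tuples. A minimal Python sketch, with a hypothetical supply relation (supplier, part, project, quantity) chosen only for illustration:

```python
# A relation modeled as a set of tuples: duplicate rows collapse and
# row order carries no meaning.
supply_a = {(1, 2, 5, 17), (1, 3, 5, 23), (2, 3, 7, 9)}
supply_b = {(2, 3, 7, 9), (1, 2, 5, 17), (1, 3, 5, 23), (1, 2, 5, 17)}
same_relation = supply_a == supply_b

# Column order is significant: swapping the first two columns
# produces a different relation.
swapped = {(p, s, j, q) for (s, p, j, q) in supply_a}
columns_matter = swapped != supply_a
```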
A Relational View of Data Cont’d…

Active Domain
Primary Key
Foreign Key
Simple Domain
Non simple Domain

A Relational View of Data Cont’d…
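PK:- Normally one domain (or combination of domains) of a given relation has values which
uniquely identify each element (n-tuple) of that relation.

Component
part  part  quantity
1     5     9
2     5     7
3     5     2

FK:- A domain (or domain combination) of relation R that is not the primary key of R,
but whose values are primary key values of some relation S (S may be R itself).
Foreign keys meet the common requirement for elements of a
relation to cross-reference other elements of the same
relation or elements of a different relation.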

A Relational View of Data Cont’d…
Active Domain:- The set of values of a domain represented in the data bank at some
instant is called the active domain at that instant.
Simple Domain:- A domain whose elements are
nondecomposable (atomic).
Nonsimple Domain:- A domain whose elements are
decomposable, i.e. are themselves relations.

Normal Form

Tree of relations
Employee
  Jobhistory
    Salaryhistory
  Children

Unnormalized set
employee (man#, name, birthday, jobhistory, children)
jobhistory (jobdate, title, salaryhistory)
salaryhistory (salarydate, salary)
children (childname, birthyear)

Normalized set
employee (man#, name, birthday)
jobhistory (man#, jobdate, title)
salaryhistory (man#, jobdate, salarydate, salary)
children (man#, childname, birthyear)
Normalization Steps
Starting with the relation at the top of the tree, take its
primary key and expand each of the immediately
subordinate relations by inserting this primary key
domain or domain combination.
The primary key of each expanded relation consists of
the primary key before expansion augmented by the
primary key copied down from the parent relation.
Strike out from the parent relation all nonsimple
domains, remove the top node of the tree, and
repeat the same sequence of operations on each
remaining subtree.
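
The steps above can be sketched in Python for the employee example. This is an illustrative sketch, not the paper's procedure verbatim: the nested record layout, the sample values, and the function name `normalize` are assumptions.

```python
def normalize(employees):
    """Flatten nested employee records into four relations of simple domains."""
    employee, jobhistory, salaryhistory, children = set(), set(), set(), set()
    for e in employees:
        man = e["man#"]
        # Parent relation keeps only its simple domains
        employee.add((man, e["name"], e["birthday"]))
        for job in e["jobhistory"]:
            # The parent key man# is copied down into the subordinate relation
            jobhistory.add((man, job["jobdate"], job["title"]))
            for sal in job["salaryhistory"]:
                # Key is (man#, jobdate) from above plus salarydate
                salaryhistory.add(
                    (man, job["jobdate"], sal["salarydate"], sal["salary"])
                )
        for child in e["children"]:
            children.add((man, child["childname"], child["birthyear"]))
    return employee, jobhistory, salaryhistory, children

# Hypothetical unnormalized record
nested = [{
    "man#": 7, "name": "Ann", "birthday": "1940-02-01",
    "jobhistory": [{
        "jobdate": "1965-01-01", "title": "clerk",
        "salaryhistory": [
            {"salarydate": "1965-01-01", "salary": 400},
            {"salarydate": "1966-01-01", "salary": 450},
        ],
    }],
    "children": [{"childname": "Bo", "birthyear": 1968}],
}]

employee, jobhistory, salaryhistory, children = normalize(nested)
```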

Some Linguistic Ability
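A normalized relational schema supports a universal data sublanguage
based on an applied predicate calculus
Such a sublanguage has very high linguistic power
Its universality lies in its descriptive ability, not its computing ability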

Expressible, named, and stored relations
The named set
• Relations that the community of users can
identify by means of a simple name (or identifier)

The expressible set
• The total collection of
relations that can be designated by expressions in
the data language

Expressible, named, and stored relations….
Problem
Determining the class of stored representations to
be supported.
Too great a variety of permitted data representations leads to unnecessary
storage overhead and continual
reinterpretation of descriptions for the structures
currently in effect.

Redundancy and Consistency

Redundancy
Data redundancy is a data organization issue in which
data is duplicated unnecessarily within the database

Consistency
Consistency means that only valid data is written to the
database

Operations on Relations
Since relations are sets, all of the usual set
operations are applicable to them

Operations specific to relations can be used to address problems of redundancy
and consistency:
Permutation
Projection
Join
Composition
Restriction

Redundancy and Consistency
Operations on Relations…..
Permutations
A permutation of a relation is obtained by rearranging
its columns into a particular order
Example :
A relation with 3 columns has 3! = 6 permutations (counting the original),
one for each ordering of its columns:
(1,2,3), (1,3,2), (2,1,3), (2,3,1), (3,1,2) and (3,2,1)
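
In the relational sense, a permutation reorders the columns of a relation. A minimal Python sketch, assuming a relation is modeled as a set of tuples; the helper name `permute` and the 0-based column positions are choices made for this example.

```python
from itertools import permutations

def permute(relation, order):
    """Reorder the columns of every row; `order` lists 0-based column positions."""
    return {tuple(row[i] for i in order) for row in relation}

# Hypothetical ternary relation
component = {(1, 5, 9), (2, 5, 7), (3, 5, 2)}

all_orders = list(permutations(range(3)))          # the 3! column orderings
all_permutations = [permute(component, order) for order in all_orders]
rotated = permute(component, (2, 0, 1))            # third column moved first
```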

Operations on Relations Cont’d...

Projection
Select certain columns of a relation, strike out the others, and then remove
any duplicate rows from the resulting array
The final array represents a relation which is said to be a
projection of the given relation

Person
Name    Age  Weight
Harry   34   80
Sally   28   64
George  29   70
Helena  54   54
Peter   34   80

πAge,Weight(Person)
Age  Weight
34   80
28   64
29   70
54   54
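
The Person example can be reproduced with a short Python sketch. Modeling the relation as a set of tuples and naming the helper `project` are assumptions of this example; the data is taken from the slide.

```python
def project(relation, columns):
    """Keep only the given 0-based columns; the set removes duplicate rows."""
    return {tuple(row[i] for i in columns) for row in relation}

person = {
    ("Harry", 34, 80),
    ("Sally", 28, 64),
    ("George", 29, 70),
    ("Helena", 54, 54),
    ("Peter", 34, 80),
}

# Projection on Age and Weight: Harry and Peter collapse into one row
age_weight = project(person, (1, 2))
```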
Operations on Relations Cont’d…

Join

Combines two relations that share a common domain
(the paper introduces it for binary relations)

The result of the natural join is the set of all
combinations of tuples in R and S that are equal
on their common attribute names

Operations on Relations Cont’d…
Example : Natural Join

Employee
Name     EmpId  DeptName
Harry    3415   Finance
Sally    2241   Sales
George   3401   Finance
Harriet  2202   Sales

Dept
DeptName    Manager
Finance     George
Sales       Harriet
Production  Charles

Employee join Dept
Name     EmpId  DeptName  Manager
Harry    3415   Finance   George
Sally    2241   Sales     Harriet
George   3401   Finance   George
Harriet  2202   Sales     Harriet
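
The Employee and Dept example can be checked with a small Python sketch of a natural join. The function `natural_join` and the choice to pass attribute names as lists are assumptions of this example, not part of the paper.

```python
def natural_join(r, r_attrs, s, s_attrs):
    """Combine rows of r and s that agree on every attribute name they share."""
    common = [a for a in r_attrs if a in s_attrs]
    extra = [a for a in s_attrs if a not in common]
    out_attrs = list(r_attrs) + extra
    result = set()
    for row_r in r:
        rec_r = dict(zip(r_attrs, row_r))
        for row_s in s:
            rec_s = dict(zip(s_attrs, row_s))
            if all(rec_r[a] == rec_s[a] for a in common):
                result.add(row_r + tuple(rec_s[a] for a in extra))
    return out_attrs, result

employee = {
    ("Harry", 3415, "Finance"),
    ("Sally", 2241, "Sales"),
    ("George", 3401, "Finance"),
    ("Harriet", 2202, "Sales"),
}
dept = {
    ("Finance", "George"),
    ("Sales", "Harriet"),
    ("Production", "Charles"),
}

attrs, joined = natural_join(
    employee, ["Name", "EmpId", "DeptName"],
    dept, ["DeptName", "Manager"],
)
```

The Production row of Dept has no matching employee, so it does not appear in the result.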

Operations on Relations cont’d…
Composition
• The composition of two binary relations R and S forms a new binary
relation R·S by joining R with S and then projecting out the
common domain

• Thus, two relations are composable if and only if they
are joinable
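
A minimal Python sketch of composition as "join, then project out the common domain", restricted to the natural join. The relations `works_in` and `managed_by` are hypothetical examples.

```python
def natural_join(r, s):
    """Join binary relations R(a, b) and S(b, c) on their common domain b."""
    return {(a, b, c) for (a, b) in r for (b2, c) in s if b == b2}

def compose(r, s):
    """Natural composition: join, then keep only the first and third columns."""
    return {(a, c) for (a, _, c) in natural_join(r, s)}

works_in = {("Harry", "Finance"), ("Sally", "Sales")}
managed_by = {
    ("Finance", "George"),
    ("Sales", "Harriet"),
    ("Production", "Charles"),
}

reports_to = compose(works_in, managed_by)
```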

Operations on Relations Cont’d…
Restriction
A subset of a relation is a relation.
Restriction is one way a relation S can act on a relation R to generate such a subset of R.

Redundancy
Data redundancy is a data organization issue
in which data is duplicated unnecessarily

Two kinds:
Strong Redundancy
Weak Redundancy
Strong Redundancy
A set of relations is strongly redundant if it contains
at least one relation that possesses a
projection which is derivable from other
projections of relations in the set
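
A hedged Python sketch of the simplest case of strong redundancy, where a projection of one stored relation is exactly equal to a projection of another. The relations `employee` and `directory` are hypothetical; derivability in general is broader than this equality test.

```python
def project(relation, columns):
    """Keep only the given 0-based columns; the set removes duplicate rows."""
    return {tuple(row[i] for i in columns) for row in relation}

# Hypothetical stored relations:
# employee(serial#, name, dept#) and directory(name, serial#)
employee = {(1, "Ann", 10), (2, "Bob", 10), (3, "Cy", 20)}
directory = {("Ann", 1), ("Bob", 2), ("Cy", 3)}

# directory can be rebuilt from employee by projecting on (name, serial#),
# so storing both makes the set strongly redundant.
derived = project(employee, (1, 0))
is_strongly_redundant = project(directory, (0, 1)) == derived
```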

Weak Redundancy
A collection of relations is weakly redundant if it
contains a relation that has a projection
which is not derivable from other members but is
at all times a projection of some join of other
projections of relations in the collection

Consistency

Consistency means that only valid data is
written to the database

Consistency
Ways to detect and respond to
inconsistencies

• Check for possible inconsistency whenever an insertion,
deletion, or key update occurs
Problem:- Naturally, such checking will slow these
operations down
If an inconsistency has been generated, details are logged
internally; if it is not remedied within a reasonable time, either the user or
someone responsible for the security and integrity of the data is notified
• Run consistency checking as a batch operation once a day or less frequently
Inputs which cause inconsistencies can be tracked down if the system keeps
a journal of all state-changing transactions
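
Both approaches can be sketched in Python under a hypothetical constraint: every employee row must name a department that exists in Dept. The functions, the log format, and the sample rows are assumptions made for illustration only.

```python
inconsistency_log = []   # stands in for the system's internal log

def known_departments(dept):
    return {name for (name, _manager) in dept}

def insert_employee(employee, dept, row):
    """Check at insertion time and log any inconsistency that was generated."""
    employee.add(row)
    if row[2] not in known_departments(dept):
        inconsistency_log.append(("insert", row))

def batch_check(employee, dept):
    """Periodic check: employee rows whose department has no Dept row."""
    known = known_departments(dept)
    return sorted(row for row in employee if row[2] not in known)

dept = {("Finance", "George"), ("Sales", "Harriet")}
employee = set()
insert_employee(employee, dept, ("Harry", 3415, "Finance"))
insert_employee(employee, dept, ("Omar", 9001, "Research"))  # no such department
```

The per-insert check pays its cost on every update, while `batch_check` defers the same test to a later pass over the whole relation.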
Thank You!
