Professional Documents
Culture Documents
Chapter 8 Database CS 9618
Chapter 8 Database CS 9618
Originally all data were held in files. A typical file would consist of a large
number of records each of which would consist of a number of fields. This led
to very large files that were difficult to process.
This means that a match is very difficult because the data is inconsistent.
Another problem is that each time a new product is added to the database both
the Name and address of the supplier must be entered.
Fig. Shows how data can be proliferated when each department keeps its own files.
File containing
Programs to Customer name and
Accounts record accounts address, amount
Department of customers owing, dates of orders,
Fig.1 etc.
This method of keeping data uses flat files. Flat files have the following
limitations.
Duplication of data
Data dependence
A flat file database is a database that stores data in a plain text file. Each
line of the text file holds one record, with fields separated by delimiters,
such as commas or tabs. While it uses a simple structure, a flat file
database cannot contain multiple tables like a relational database can.
Or
Advantage Notes
Control of data redundancy Flat files have a great deal of data redundancy that
is removed using a RDB.
Consistency of data There is only one copy of the data so there is less
chance of data inconsistencies occurring.
Data sharing The data belongs to the whole organisation, not to
individual departments.
More information Data sharing by departments means that
departments can use other department's data to find
information.
Improved data integrity Data is consistent and valid.
2. Relational Database
In the relational database model, each item of data is stored in a relation
which is a special type of table. The strange choice of name comes from a
mathematical theory. A relational database is a collection of relational
tables. When a table is created in a relational database it is first given a
name and then the attributes are named. In a database design, a table
would be given a name with the attribute names listed in brackets aft er
the table name. For example
Entity
Database entity is a thing, person, place, unit, object or any item about which
the data should be captured or stored in the form of properties or attribute.
Properties or attributes are required (because entity without properties is not
an entity).
Table
A database table is a structure that organises data into rows and columns
– forming a grid
Primary key:
Candidate key:
Secondary key:
A candidate key that has not been chosen as the primary key
Foreign key:
An attribute in one table that refers to the primary key in another table
Primary key
The key used to uniquely identify a record or tuple is called the primary
key.
In some cases, more than one attribute, or group of attributes, could act as
the primary key. Suppose we have the relation
Example
Clearly, EmpID could act as the primary key. However, NINumber could
also act as the primary key as it is unique for each employee. In this case
we say that EmpID and NINumber are candidate keys.
Secondary key
A field which indexed other than primary key. A Primary is always indexed by
default.
USES
Example
We see that MID occurs in CINEMA and is the primary key in MANAGER.
In CINEMA we say that MID is the foreign key.
Foreign key
Referential integrity:
the use of a foreign key to ensure that a value can only be entered in one table
when the same value already exists in the referenced table
Example
CUSTOMER(Num, CustName, City)
CITY_COUNTRY(City, Country)
Let’s discuss the use of a foreign key using the database design shown in
Figure above. When the database is being created, the CITY_COUNTRY is
created first. City is chosen as the primary key because unique City name can
be guaranteed. Then the CUSTOMER table is created. Num is defined as the
primary key and the attribute City is identified as a foreign key referencing the
primary key in the CITY_COUNTRY. Once this relationship between primary
and foreign keys has been established, the DBMS will prevent any entry for
City in the CUSTOMER table being made if the corresponding value does not
exist in the CITY_COUNTRY table. This provides referential integrity which
is another reason why the relational database model helps to ensure data
integrity
Data integrity:
Indexing
An index -
Pointer
A special variable holds only the address of a memory location.
3 Types of relationships
The statements show two types of relationship. There are in fact four
altogether. These are
one-to-one represented by
one-to-many represented by
many-to-one represented by
many-to-many represented by
In relational database tables are linked to each other using common fields
(Primary and foreign keys)
Consider the following paper-based delivery note from Easy Fasteners Ltd.
1 Table
2 Desk
3 Chair
Fig 2
In a relational databases.
ProdID and Description are put inside parentheses because they form a
repeating group. In tabular form the data may be represented by Fig. below
Repeating Groups
This again shows the repeating group. We say that this is in un-normalised form
(UNF). To put it into 1st normal form (1NF) we complete the table and identify a key
that will make each tuple unique. This is shown in Fig. Fig. below
To make each row unique we need to choose Num together with ProdID as the key.
To indicate the key, we simply underline the attributes that make up the key.
Because we have identified a key that uniquely identifies each tuple, we have
removed the repeating group.
Definition of 1NF
Let us now see how to move from 1NF to 2NF and on to 3NF.
Definition of 2NF
Note the keys (underlined) for each relation. DEL_PROD needs a compound key
Definition of 3NF
Let us now use the data above and see what happens to it as the relations are
normalised.
1NF
CUSTOMER
Num CustName City Country ProdID Description
005 Bill Jones London England 1 Table
005 Bill Jones London England 2 Desk
005 Bill Jones London England 3 Chair
008 Mary Hill Paris France 2 Desk
008 Mary Hill Paris France 7 Cupboard
014 Anne Smith New York USA 5 Cabinet
002 Tom Allen London England 7 Cupboard
002 Tom Allen London England 1 Table
002 Tom Allen London England 2 Desk
Convert to
2NF
CUSTOMER PRODUCT
Num CustName City Country ProdID Description
005 Bill Jones London England 1 Table
008 Mary Hill Paris France 2 Desk
014 Anne Smith New York USA 3 Chair
002 Tom Allen London England 7 Cupboard
5 Cabinet
CUS_PROD
Num ProdID
005 1
005 2
005 3
008 2
008 7
014 5
002 7
002 1
002 2
Convert to
3NF
CUSTOMER CUS_PROD
Num CustName City Num ProdID
005 Bill Jones London 005 1
008 Mary Hill Paris 005 2
014 Anne Smith New York 005 3
002 Tom Allen London 008 2
008 7
014 5
002 7
002 1
002 2
PRODUCT CITY_COUNTRY
ProdID Description City Country
1 Table London England
2 Desk Paris France
3 Chair New York USA
7 Cupboard
5 Cabinet
UNF
1NF
2NF
3NF
Example 2
Films are shown at many cinemas, each of which has a manager. A manager may
manage more than one cinema. The takings for each film are recorded for each
cinema at which the film was shown.
Converting this to 1NF can be achieved by 'filling in the blanks' to give the relation
Therefore 2NF is
FILM(FID, Title)
CINEMA(CID, Cname, Loc, MID, MName)
TAKINGS(FID, CID, Takings)
In Cinema, the non-key attribute MName is dependent on MID. This means that it is
transitively dependent on the primary key. So we must move this out to get the 3NF
relations
FILM(FID, Title)
CINEMA(CID, Cname, Loc, MID)
TAKINGS(FID, CID, Takings)
MANAGER(MID, MName)
The statements show two types of relationship. There are in fact four altogether.
These are
one-to-one represented by
one-to-many represented by
many-to-one represented by
many-to-many represented by
Fig. below is the E-R diagram showing the relationships between CUSTOMER,
CITY_COUNTRY, PRODUCT and DEL_PROD.
CUSTOMER
CITY_COUNTRY CUS_PROD
If the relations are in 3NF, the E-R diagram will not contain any many-to-many
relationships. If there are any one-to-one relationships, one of the entities can be
removed and its attributes added to the entity that is left.
Let us now look at our solution to the cinema problem which contained the relations
FILM(FID, Title)
CINEMA(CID, Cname, Loc, MID)
TAKINGS(FID, CID, Takings)
MANAGER(MID, MName)
in 3NF.
takes
FILM TAKINGS
is for
connected by FID
takes
CINEMA is for TAKINGS
connected by CID
manages
MANAGER CINEMA
managed
by
connected by MID
CINEMA
MANAGER TAKINGS
FILM
CINEMA FILM
If you now look at Fig. above, you will see that the link entity is TAKINGS.
EXTERNAL
LEVEL User 1 User 2 User 3
(Individual
users)
CONCEPTUAL
LEVEL Company Level
(Integration of
all user views)
INTERNAL DISK/FILE
LEVEL organisation
(Storage view)
This is the individual view of the database. In a multi user database there
will generally be several different external schema representing each
user view according to their needs and access right.
At the external level there are many different views of the data.
Each view consists of an abstract representation of part of the total database.
Application programs will use a data manipulation language (DML) to create
these views.
The overall view of the entire database, including entities, attributes and
relationships as designed by the database designer.
This describes how the data will be stored and is concerned with file
organisation and access methods. It is generally transparent to the user.
A person who uses the DBMS to customise the database to suit user
and programmer requirements
Developer interface:
Query processor:
Query:
The query is the mechanism for extracting and manipulating data from
the database.
or
Report generator
The other feature likely to be provided by the DBMS is the capability for
creating a report to present formatted output.
Data dictionary
It contains metadata about the data. This includes details of all the
definitions of tables, attributes and so on but also of how the physical
storage is organised.
Metadata is defined as the data providing information about one or more aspects of
the data; it is used to summarize basic information about data which can make
tracking and working with specific data easier. Some examples include: Means of
creation of the data. Purpose of the data. Time and date of creation.
Index:
A small secondary table used for rapid searching which contains one
attribute from the table being searched and pointers to the tuples in that
table
Backup facilities
The integrity of the data in the database is a key concern. One potential
cause of problems occurs when a transaction is started but a system
problem prevents its completion. The result would be a database in an
undefined state.
The DBMS should have a built-in feature that prevents this from
happening. As with all systems, regular backup is a requirement. The
DBA will be responsible for backup of the stored data.
Access Rights
Sometimes it must not be possible for a user to access the data in a database.
In this case the one user will be able to change the contents of the
database while the other will only be allowed to query the database.
Example 1
In a banking system, accounts must be updated with the day's transactions. While this
is taking place users must not be able to access the database. Thus, at certain times of
the day, users will not be able to use a cash point.
Levelled Passwords
Hardware restrictions
A manager can access the record from any terminal while workers can
access from their terminals only.
This is a hardware method of preventing access. All terminals have a unique address
on their network cards. This means that the DBMS can decide which data can be
supplied to a terminal.
Data Definition Language (DDL) is the part of SQL provided for creating
or altering tables.
These commands only create the structure. They do not put any data
into the database.
The following are some examples of DDL that could be used in creating
the database
There are three categories of use for Data Manipulation Language (DML)
• The insertion of data into the tables when the database is created
• The modification or removal of data in the database
• The reading of data stored in the database
The following illustrate the two possible ways that SQL can be written to populate a
table with data:
The main use of DML is to obtain data from a database using a query. A query always
starts with the SELECT command. Two examples are:
The simplest form for a query has the attributes for which values are to be listed as
output identified aft er SELECT and the table name identified aft er FROM. For
example:
which uses * to indicate all attributes. Note that in the first example the attributes are
separated by commas but no parentheses are needed.
examples:
which produces a single output for each band that has headlined. Note how a query
can have several component parts which are best presented on separate lines.
This is a special case because there is no need to specify the attribute. An example
using a specific attribute would be:
A query can be based on a ‘join condition’ between data in two tables. The most
frequently used is an inner join which is illustrated by:
The SQL uses the full definitive name for each attribute with the table name and
attribute name separated by a dot. The query contains two conditions. The way that
the query works is as follows.
• Then there is a search of the Booking table to find the examples of tuples having this
value for BookingID.
• For each one found the VenueName and Date are presented in the output.
Some versions of SQL require the explicit use of INNER JOIN. The following is a
possible generic syntax:
The other use of DML is to modify the data stored in the database. The UPDATE
command is used to change the data. If the band ComputerKidz recruited an extra
member the following SQL would make the change needed.
Note the use of the WHERE clause. If you forgot to include this the UPDATE
command would change the number of band members to 6 for all of the bands.
The DELETE command is used to remove data from the database. This has to be done
with care. If the ITWizz band decided to disband the following SQL would remove
the name from
the database.
Note that if an attempt was made to carry out the deletion from Band first there would
be an error. This is because BandName is a foreign key in Band-Booking. Any entry
for BandName in Band-Booking must have a corresponding value in Band.
A company employs engineers who service many products. A customer may own
many products but a customer's products are all serviced by the same engineer. When
engineers service products they complete a repair form, one form for each product
repaired. The form contains details of the customer and the product that has been
repaired as well as the Engineer's ID. Each form has a unique reference number.
When a repair is complete, the customer is given a copy of the repair form.
A company employs engineers who service many products. A customer may own
many products but a customer's products are all serviced by the same engineer. When
engineers service products they complete a repair form, one form for each product
repaired. The form contains details of the customer and the product that has been
repaired as well as the engineer's ID. Each form has a unique reference number.
When an engineer has repaired a product, the customer is given a copy of the repair
form.
engineer
product
customer
repair form
A company employs engineers who service many products. A customer may own
many products but a customer's products are all serviced by the same engineer. When
engineers service products they complete a repair form, one form for each product
repaired. The form contains details of the customer and the product that has been
repaired as well as the Engineer's ID. Each form has a unique reference number.
When a repair is complete, the customer is given a copy of the repair form.
services completes
ENGINEER
serviced completed
by by
PRODUCT FORM
owned by given to
CUSTOMER
owns is given
services completes
ENGINEER
ENG_PROD
serviced completed
by by
PRODUCT FORM
owned by given to
CUST_PROD
CUSTOMER
owns is given
This suggests the following relations (not all the attributes are given).
ENGINEER(EngineerID, Name, … )
ENG_PROD(EngineerID, ProductID)
PRODUCT(ProductID, Description, … )
CUST_PROD(CustomerID, ProductID)
CUSTOMER(CustomerID, Name, … )
FORM(FormID, CustomerID, EngineerID, … )
These are in 3NF, but you should always check that they are.
Example Questions
3. Every student in a school belongs to a form. Every form has a form tutor and all
the form tutors are members of the teaching body.
Draw an entity relationship diagram to show the relationships between the four
entities STUDENT, FORM, TUTOR, TEACHERS (6)
A.
STUDENT FORM
TEACHERS TUTOR