Chapter 8 Database CS 9618

Computer Science 9618/12
8.1 Database Concepts

1. Flat file
Originally all data were held in files. A typical file would consist of a large
number of records each of which would consist of a number of fields. This led
to very large files that were difficult to process.
Field Name Data Type

Description String
Cost Price Currency
Selling Price Currency
Number in Stock Integer
Reorder Level Integer
Supplier Name String
Supplier Address String
Further ad hoc enquiries are virtually impossible.
This means that a match is very difficult because the data is inconsistent.
Another problem is that each time a new product is added to the database both
the Name and address of the supplier must be entered.
This leads to redundant data or data duplication.
Fig. Shows how data can be proliferated when each department keeps its own files.
File containing Stock

Programs to
Code, Description,
Purchasing place orders
Re-order level, Cost
Department when stocks are
Price, Sale Price
low
Supplier name and
address, etc
File containing Stock

Programs to Code, Description,
Sales record orders Number sold, Sale
Department from customers Price, Customer name
and address, etc.
File containing
Programs to Customer name and
Accounts record accounts address, amount
Department of customers owing, dates of orders,
Fig.1 etc.
SAHIR MASROOR KHAN 03009543074

This method of keeping data uses flat files. Flat files have the following
limitations.
 Separation and isolation of data
As data is held in separate files so difficult find related data.
 Duplication of data
Same data is held in more than two one files
 Duplication is wasteful as it costs time and money. Data has to be

entered more than once; therefore, it takes up time and more space.
 Duplication leads to loss of data integrity.
 Data dependence
Data formats are defined in the application programs. If there is a

need to change any of these formats, whole programs have to be
changed.
Different applications may hold the data in different forms, again

causing a problem. Incompatibility of files
 Fixed queries and the proliferation of application programs
File processing was a huge advance on manual processing of

queries. This led to end-users wanting more and more information.
To try to overcome the search problems of sequential files, two types of
database were introduced.
Data redundancy: the same data stored more than once

Flat file
A flat file database is a database that stores data in a plain text file. Each
line of the text file holds one record, with fields separated by delimiters,
such as commas or tabs. While it uses a simple structure, a flat file
database cannot contain multiple tables like a relational database can.
Or

A flat file database is a type of database that stores data in a single

table. This is unlike a relational database, which makes use of
multiple tables and relations. Flat file databases are generally in
plain-text form, where each line holds only one record. The fields in
the record are separated using delimiters such as tabs and commas.
Advantages of Using a Relational Database (RDB)
Advantage Notes
Control of data redundancy Flat files have a great deal of data redundancy that
is removed using a RDB.
Consistency of data There is only one copy of the data so there is less
chance of data inconsistencies occurring.
Data sharing The data belongs to the whole organisation, not to
individual departments.
More information Data sharing by departments means that
departments can use other department's data to find
information.
Improved data integrity Data is consistent and valid.
Improved security The database administrator (DBA) can define data

security – who has access to what. This is enforced
by the Database Management System (DBMS).
Enforcement of standards The DBA can set and enforce standards. These
may be departmental, organisational, national or
international.
Economy of scale Centralisation means that it is possible to
economise on size. One very large computer can
be used with dumb terminals or a network of
computers can be used.
Improved data accessibility This is because data is shared.
Increased productivity The DBMS provides file handling processes instead

of each application having to have its own
procedures.
Improved maintenance Changes to the database do not cause applications
to be re-written.
Improved back-up and recovery DBMSs automatically handle back-up and
recovery. There is no need for somebody to
remember to back-up the database each day, week
or month.
Disadvantages are not in the Specification for this Module.

2. Relational Database
In the relational database model, each item of data is stored in a relation
which is a special type of table. The strange choice of name comes from a
mathematical theory. A relational database is a collection of relational
tables. When a table is created in a relational database it is first given a
name and then the attributes are named. In a database design, a table
would be given a name with the attribute names listed in brackets aft er
the table name. For example
A fundamental principle of a relational database is that a tuple is a set of

‘atomic’ values; each attribute has one value or no value.
Entity
Database entity is a thing, person, place, unit, object or any item about which
the data should be captured or stored in the form of properties or attribute.
Properties or attributes are required (because entity without properties is not
an entity).
Table
A database table is a structure that organises data into rows and columns
– forming a grid
Relation: the special type of table which is used in a relational

database
Attribute: a column in a relation that contains values
Tuple: a row in a relation storing data for one instance of the
relation

Primary key:
An attribute or a combination of attributes for which there is a value in

each tuple and that value is unique
Candidate key:
A key that could be chosen as the primary key
Secondary key:
A candidate key that has not been chosen as the primary key
Foreign key:
An attribute in one table that refers to the primary key in another table
Primary key
The key used to uniquely identify a record or tuple is called the primary
key.
In some cases, more than one attribute, or group of attributes, could act as
the primary key. Suppose we have the relation
Composite / Compound key
A key may consist of two attributes, in which case it is called a composite

/ compound key.
Example
EMP(EmpID, NINumber, Name, Address)
Clearly, EmpID could act as the primary key. However, NINumber could
also act as the primary key as it is unique for each employee. In this case
we say that EmpID and NINumber are candidate keys.

The Purpose of Keys

A field which is candidate for primary key other than primary field is called
secondary key.
Secondary key
A field which indexed other than primary key. A Primary is always indexed by
default.
If we choose EmpID as the primary key, then NINumber is called a secondary

key.
USES
To make searches fast
Example
Now look at these two relations that we saw
CINEMA(CID, Cname, Loc, MID)

MANAGER(MID, MName)
We see that MID occurs in CINEMA and is the primary key in MANAGER.
In CINEMA we say that MID is the foreign key.
Foreign key
An attribute is a foreign key in a relation if it is the primary key in

another relation. Foreign key is used to link relations or to implement
one to many relationship between two tables.
Referential integrity:
the use of a foreign key to ensure that a value can only be entered in one table
when the same value already exists in the referenced table
Example
CUSTOMER(Num, CustName, City)
CITY_COUNTRY(City, Country)

Let’s discuss the use of a foreign key using the database design shown in
Figure above. When the database is being created, the CITY_COUNTRY is
created first. City is chosen as the primary key because unique City name can
be guaranteed. Then the CUSTOMER table is created. Num is defined as the
primary key and the attribute City is identified as a foreign key referencing the
primary key in the CITY_COUNTRY. Once this relationship between primary
and foreign keys has been established, the DBMS will prevent any entry for
City in the CUSTOMER table being made if the corresponding value does not
exist in the CITY_COUNTRY table. This provides referential integrity which
is another reason why the relational database model helps to ensure data
integrity
Data integrity:
A requirement for data to be accurate and up to date
Indexing
Indexing is a data structure technique which allows you to quickly retrieve

records from a database file. An Index is a small table having only two
columns. The first column comprises a copy of the primary or candidate key of
a table. Its second column contains a set of pointers for holding the address of
the disk block where that specific key value stored.
An index -
Takes a search key as input

Efficiently returns a collection of matching records.

Pointer
A special variable holds only the address of a memory location.
Index Table Student Table
3 Types of relationships
We now consider the relationships between the entities.
The statements show two types of relationship. There are in fact four
altogether. These are
one-to-one represented by
one-to-many represented by
many-to-one represented by
many-to-many represented by

In relational database tables are linked to each other using common fields
(Primary and foreign keys)

4 Relational Databases and Normalisation
Consider the following paper-based delivery note from Easy Fasteners Ltd.
Easy Fasteners Ltd

Old Park, The Square, Berrington, Midshire BN2 5RG
To: Bill Jones No.: 005

London Date: 14/08/01
England
Product No. Description
1 Table
2 Desk
3 Chair
Fig 2
In a relational databases.
 Instead of fields we call the columns attributes and
 The rows are called tuples.
 The files are called relations (or tables).
We write the details of our CUSTOMER in standard notation as
CUSTOMER(Num, CustName, City, Country, (ProdID, Description))
ProdID and Description are put inside parentheses because they form a
repeating group. In tabular form the data may be represented by Fig. below
Repeating Groups
Num CustName City Country ProdID Description

005 Bill Jones London England 1,2,3 Table,
Desk, Chair
This again shows the repeating group. We say that this is in un-normalised form
(UNF). To put it into 1st normal form (1NF) we complete the table and identify a key
that will make each tuple unique. This is shown in Fig. Fig. below


005 Bill Jones London England 1 Table
2 Desk
3 Chair

005 Bill Jones London England 2 Desk
005 Bill Jones London England 3 Chair
To make each row unique we need to choose Num together with ProdID as the key.
CUSTOMER(Num, CustName, City, Country, ProdID, Description)
To indicate the key, we simply underline the attributes that make up the key.
Because we have identified a key that uniquely identifies each tuple, we have
removed the repeating group.
Definition of 1NF
A relation with repeating groups removed is said to be in First

Normal Form (1NF). That is, a relation in which the
intersection of each tuple and attribute (row and column)
contains one and only one value.
Let us now see how to move from 1NF to 2NF and on to 3NF.
Definition of 2NF
A relation that is in 1NF and every non-primary key attribute is

fully dependent on the primary key is in Second Normal Form
(2NF). That is, all the incomplete dependencies have been
removed.
We now have three relations.
CUSTOMER(Num, CustName, City, Country)

PRODUCT(ProdID, Description)
CUS_PROD(Num, ProdID)
Note the keys (underlined) for each relation. DEL_PROD needs a compound key
Country depends on City not directly on Num. We need to move on to

3NF.

Definition of 3NF
A relation that is in 1NF and 2NF, and in which no non-primary key

attribute is transitively dependent on the primary key is in 3NF. That is,
all non-key elements are fully dependent on the primary key.
Removing this transitive functional determinacy, we have

Let us now use the data above and see what happens to it as the relations are
normalised.
1NF
CUSTOMER
005 Bill Jones London England 2 Desk
005 Bill Jones London England 3 Chair
008 Mary Hill Paris France 2 Desk
008 Mary Hill Paris France 7 Cupboard
014 Anne Smith New York USA 5 Cabinet
002 Tom Allen London England 7 Cupboard
002 Tom Allen London England 1 Table
002 Tom Allen London England 2 Desk
Convert to
2NF

CUSTOMER PRODUCT
008 Mary Hill Paris France 2 Desk
014 Anne Smith New York USA 3 Chair
002 Tom Allen London England 7 Cupboard
5 Cabinet
CUS_PROD
Num ProdID
005 1
005 2
005 3
008 2
008 7
014 5
002 7
002 1
002 2
Convert to
3NF
CUSTOMER CUS_PROD
Num CustName City Num ProdID
005 Bill Jones London 005 1
008 Mary Hill Paris 005 2
014 Anne Smith New York 005 3
002 Tom Allen London 008 2
008 7
014 5
002 7
002 1
002 2
PRODUCT CITY_COUNTRY
ProdID Description City Country
1 Table London England
2 Desk Paris France
3 Chair New York USA
7 Cupboard
5 Cabinet

Now we can see that redundancy of data has been removed.
In tabular form we have
UNF
CUSTOMER(Num, CustName, City, Country, (ProdID, Description))
1NF
CUSTOMER(Num, CustName, City, Country, ProdID, Description)
2NF
CUSTOMER(Num, CustName, City, Country)

3NF
CUSTPMER(Num, CustName, City)

Example 2
PRODUCT to find ProdID for Desk

CUS_PROD to find Num for this ProdID
DELNOTE to find City corresponding to Num
CITY_COUNTRY to find Country from City
Here is another example of normalisation.
Films are shown at many cinemas, each of which has a manager. A manager may
manage more than one cinema. The takings for each film are recorded for each
cinema at which the film was shown.
The following table is in UNF and uses the attribute names
FID Unique number identifying a film

Title Film title
CID Unique string identifying a cinema
Cname Name of cinema
Loc Location of cinema
MID Unique 2-digit string identifying a manager
MName Manager's name
Takings Takings for a film

FID Title CID Cname Loc MID MName Takings

15 Jaws TF Odeon Croyden 01 Smith £350
GH Embassy Osney 01 Smith £180
JK Palace Lye 02 Jones £220
23 Tomb Raider TF Odeon Croyden 01 Smith £430
GH Embassy Osney 01 Smith £200
JK Palace Lye 02 Jones £250
FB Classic Sutton 03 Allen £300
NM Roxy Longden 03 Allen £290
45 Cats & Dogs TF Odeon Croyden 01 Smith £390
LM Odeon Sutton 03 Allen £310
56 Colditz TF Odeon Croyden 01 Smith £310
NM Roxy Longden 03 Allen £250
Converting this to 1NF can be achieved by 'filling in the blanks' to give the relation
FID Title CID Cname Loc MID MName Takings

15 Jaws TF Odeon Croyden 01 Smith £350
15 Jaws GH Embassy Osney 01 Smith £180
15 Jaws JK Palace Lye 02 Jones £220
23 Tomb Raider TF Odeon Croyden 01 Smith £430
23 Tomb Raider GH Embassy Osney 01 Smith £200
23 Tomb Raider JK Palace Lye 02 Jones £250
23 Tomb Raider FB Classic Sutton 03 Allen £300
23 Tomb Raider NM Roxy Longden 03 Allen £290
45 Cats & Dogs TF Odeon Croyden 01 Smith £390
45 Cats & Dogs LM Odeon Sutton 03 Allen £310
56 Colditz TF Odeon Croyden 01 Smith £310
56 Colditz NM Roxy Longden 03 Allen £250
This is the relation
R(FID, Title, CID, Cname, Loc, MID, MName, Takings)
Title is only dependent on FID

Cname, Loc, MID, MName are only dependent on CID
Takings is dependent on both FID and CID
Therefore 2NF is
FILM(FID, Title)
CINEMA(CID, Cname, Loc, MID, MName)
TAKINGS(FID, CID, Takings)
In Cinema, the non-key attribute MName is dependent on MID. This means that it is
transitively dependent on the primary key. So we must move this out to get the 3NF
relations

FILM(FID, Title)
MANAGER(MID, MName)
5 Entity-Relationship (E-R) Diagrams
Entity-Relationship (E-R) diagrams can be used to illustrate the

relationships between entities. In the earlier example we had the four
relations

In an E-R diagram DELNOTE, CITY_COUNTRY, PRODUCT and DEL_PROD are

called entities. Entities have the same names as relations but we do not usually show
the attributes in E-R diagrams.
We now consider the relationships between the entities.
The statements show two types of relationship. There are in fact four altogether.
These are
one-to-one represented by
one-to-many represented by
many-to-one represented by
many-to-many represented by
Fig. below is the E-R diagram showing the relationships between CUSTOMER,
CITY_COUNTRY, PRODUCT and DEL_PROD.
CUSTOMER
CITY_COUNTRY CUS_PROD

PRODUCT
If the relations are in 3NF, the E-R diagram will not contain any many-to-many
relationships. If there are any one-to-one relationships, one of the entities can be
removed and its attributes added to the entity that is left.
Let us now look at our solution to the cinema problem which contained the relations
FILM(FID, Title)
MANAGER(MID, MName)
in 3NF.
We have the following relationships.
takes
FILM TAKINGS
is for
connected by FID
takes
CINEMA is for TAKINGS
connected by CID
manages
MANAGER CINEMA
managed
by
connected by MID
These produce the ERD shown in Fig. below
CINEMA
MANAGER TAKINGS
FILM
In this problem we actually have the relationship
CINEMA shows many FILMs
FILM is shown at many CINEMAs

That is

CINEMA FILM
But this cannot be normalised to 3NF because it is a many-to-many relationship.

Many-to-many relationships are removed by using a link entity as shown here.
CINEMA LINK_ENTITY FILM
If you now look at Fig. above, you will see that the link entity is TAKINGS.
8.2 Database management system
The database approach It is vital to understand that a database is not

just a collection of data. A database is an implementation according to
the rules of a theoretical model. The basic concept was proposed by
ANSI (American National Standards Institute) in its three-level model.
The three levels are:

• the external level
• the conceptual level
• the internal level.
1 Database Management System (DBMS)
Let us first look at the architecture of a DBMS as shown in Fig. below
EXTERNAL
LEVEL User 1 User 2 User 3
(Individual
users)
CONCEPTUAL
LEVEL Company Level
(Integration of
all user views)
INTERNAL DISK/FILE
LEVEL organisation
(Storage view)

External level / view/ schema
This is the individual view of the database. In a multi user database there
will generally be several different external schema representing each
user view according to their needs and access right.
 At the external level there are many different views of the data.
 Each view consists of an abstract representation of part of the total database.
Application programs will use a data manipulation language (DML) to create
these views.
Conceptual level /view /schema
The overall view of the entire database, including entities, attributes and
relationships as designed by the database designer.
 At the conceptual level there is one view of the data.

 This view is an abstract representation of the whole database.
Internal level/view/ schema
This describes how the data will be stored and is concerned with file
organisation and access methods. It is generally transparent to the user.
 The internal view of the data occurs at the internal level.

 This view represents the total database as actually stored.
 It is at this level that the data is organised into random access, indexed and
fully indexed files.
 This is hidden from the user by the DBMS.
Data management system (DBMS):
software that controls access to data in a database
Database administrator (DBA):
A person who uses the DBMS to customise the database to suit user
and programmer requirements

The facilities provided by a DBMS
Developer interface:
Gives access to software tools provided by a DBMS for creating tables

and attributes to be defined together with their data types. In addition,
the DBMS provides facilities for a programmer to develop a user
interface called forms to make data entry and viewing fast.
Query processor:
Software tools provided by a DBMS to allow creation and execution of a

query
Query:
The query is the mechanism for extracting and manipulating data from
the database.
or
Used to select data from a database subject to defined conditions
Report generator
The other feature likely to be provided by the DBMS is the capability for
creating a report to present formatted output.
DBMS functions likely to be used by a DBA
There are a number of features that can improve performance.
Data dictionary
An important feature of the DBMS is the data dictionary which is part of

the database that is hidden from view from everyone except the DBA.
It contains metadata about the data. This includes details of all the
definitions of tables, attributes and so on but also of how the physical
storage is organised.
Metadata (Data about database)
Metadata is defined as the data providing information about one or more aspects of
the data; it is used to summarize basic information about data which can make
tracking and working with specific data easier. Some examples include: Means of
creation of the data. Purpose of the data. Time and date of creation.

Index:
A small secondary table used for rapid searching which contains one
attribute from the table being searched and pointers to the tuples in that
table
The index can be on the primary key or on a secondary key. Searching

an index table is much quicker than searching the full table.
Backup facilities
The integrity of the data in the database is a key concern. One potential
cause of problems occurs when a transaction is started but a system
problem prevents its completion. The result would be a database in an
undefined state.
The DBMS should have a built-in feature that prevents this from
happening. As with all systems, regular backup is a requirement. The
DBA will be responsible for backup of the stored data.
Access Rights
Sometimes it must not be possible for a user to access the data in a database.
Different views for different user / Record Locks
In this case the one user will be able to change the contents of the
database while the other will only be allowed to query the database.
Example 1
In a banking system, accounts must be updated with the day's transactions. While this
is taking place users must not be able to access the database. Thus, at certain times of
the day, users will not be able to use a cash point.
Levelled Passwords
Different user is given different password to access the database.

Receptionist password allows her to access certain data while manager
password allows him to access his related data.

Hardware restrictions
A manager can access the record from any terminal while workers can
access from their terminals only.
This is a hardware method of preventing access. All terminals have a unique address
on their network cards. This means that the DBMS can decide which data can be
supplied to a terminal.
8.3 Structured Query Language (SQL)
SQL is the programming language provided by a DBMS to support all of

the operations associated with a relational database. Even when a
database package offers high-level software tools for user interaction,
they create an implementation using SQL
Data Definition Language (DDL)
Data Definition Language (DDL) is the part of SQL provided for creating
or altering tables.
These commands only create the structure. They do not put any data
into the database.
The following are some examples of DDL that could be used in creating
the database
CREATE DATABASE BandBooking;
CREATE TABLE Band ( BandName varchar(25), MembersNum

integer);
ALTER TABLE Band ADD PRIMARY KEY (BandName);
ALTER TABLE Band-Booking ADD FOREIGN KEY (BandName

REFERENCES Band(BandName);

Data Manipulation Language (DML)
There are three categories of use for Data Manipulation Language (DML)
• The insertion of data into the tables when the database is created
• The modification or removal of data in the database
• The reading of data stored in the database
The following illustrate the two possible ways that SQL can be written to populate a
table with data:
INSERT INTO Band (‘ComputerKidz’, 5);
INSERT INTO Band-Booking (BandName, BookingID) VALUES

(‘ComputerKidz’, ‘2016/023’);
The following are some points to note.
• Parentheses are used in both versions.

• A separate INSERT command has to be used for each tuple in the table.
• There is an order defined for the attributes.
• Although the SQL will have a list of INSERT commands the subsequent use of the
table has no concept of the tuples being ordered.
The main use of DML is to obtain data from a database using a query. A query always
starts with the SELECT command. Two examples are:
The simplest form for a query has the attributes for which values are to be listed as
output identified aft er SELECT and the table name identified aft er FROM. For
example:
SELECT BandName FROM Band;
Note that the components of the query are separated by spaces.

The Band table only has two attributes. To list the values for both there are two
options:
SELECT BandName, MembersNum FROM Band;
or
SELECT * FROM Band;
which uses * to indicate all attributes. Note that in the first example the attributes are
separated by commas but no parentheses are needed.

It is possible to include instructions in the SQL to control the presentation of the

output. The following uses ORDER BY to ensure that the output is sorted to show the
data with the band names in alphabetical order.
SELECT BandName, Members FROM Band ORDER BY BandName;
In this query there is no question of duplicate entries because BandName is the

primary key of the BandName table. However, in the Band-Booking table an
individual value for BandName will occur many times. If a query were being used to
find which bands already had a booking there would be repeated names in the output.
This can be prevented by the use of GROUP BY as shown here:
SELECT BandName FROM Band-Booking GROUP BY BandName;
An extension of the control of the output from a query is to include a condition to

limit the selected data. This is provided by a WHERE clause. The following are
examples:
SELECT BandName FROM Band-Booking WHERE Headlining = ‘Y’

GROUP BY BandName;
which produces a single output for each band that has headlined. Note how a query
can have several component parts which are best presented on separate lines.
SELECT BandName, MembersNum FROM Band WHERE MembersNum > 2

ORDER BY BandName;
which excludes any two bands.
It is possible to qualify the SELECT statement by using a function. SUM, COUNT

and AVG are examples of functions that work on data held in several tuples for a
particular attribute and return one value. For this reason, these functions are called
aggregate functions. As an example, the following code displays the number of
members in a band:
SELECT Count(*) FROM Band;
This is a special case because there is no need to specify the attribute. An example
using a specific attribute would be:
SELECT AVG(MembersNum) FROM Band;
another example is:
SELECT SUM(MembersNum) FROM Band;

A query can be based on a ‘join condition’ between data in two tables. The most
frequently used is an inner join which is illustrated by:
SELECT VenueName, Date FROM Booking

WHERE Band-Booking.BookingID = Booking.BookingID
AND Band-Booking.BandName = ‘ComputerKidz’;
The SQL uses the full definitive name for each attribute with the table name and
attribute name separated by a dot. The query contains two conditions. The way that
the query works is as follows.
• The Band-Booking table is searched for instances where the BandName is

ComputerKidz.
• For each instance the BookingID is noted.
• Then there is a search of the Booking table to find the examples of tuples having this
value for BookingID.
• For each one found the VenueName and Date are presented in the output.
Some versions of SQL require the explicit use of INNER JOIN. The following is a
possible generic syntax:
SELECT table1.column1, table2.column2... FROM table1 INNER JOIN table2

ON table1.common _ field = table2.common _ field;
The other use of DML is to modify the data stored in the database. The UPDATE
command is used to change the data. If the band ComputerKidz recruited an extra
member the following SQL would make the change needed.
UPDATE Band SET MembersNum = 6 WHERE BandName =

‘ComputerKidz’;
Note the use of the WHERE clause. If you forgot to include this the UPDATE
command would change the number of band members to 6 for all of the bands.
The DELETE command is used to remove data from the database. This has to be done
with care. If the ITWizz band decided to disband the following SQL would remove
the name from
the database.
DELETE FROM Band-Booking WHERE BandName = ‘ITWizz’; DELETE

FROM Band WHERE BandName = ‘ITWizz’;
Note that if an attempt was made to carry out the deletion from Band first there would
be an error. This is because BandName is a foreign key in Band-Booking. Any entry
for BandName in Band-Booking must have a corresponding value in Band.

Appendix: Designing Databases
Consider the following problem.
A company employs engineers who service many products. A customer may own
many products but a customer's products are all serviced by the same engineer. When
engineers service products they complete a repair form, one form for each product
repaired. The form contains details of the customer and the product that has been
repaired as well as the Engineer's ID. Each form has a unique reference number.
When a repair is complete, the customer is given a copy of the repair form.
Step 1: highlight all the fields
repaired as well as the engineer's ID. Each form has a unique reference number.
When an engineer has repaired a product, the customer is given a copy of the repair
form.
This suggests the following entities.
engineer
product
customer
repair form
Step 2: highlight all the verbs between fields
repaired as well as the Engineer's ID. Each form has a unique reference number.
When a repair is complete, the customer is given a copy of the repair form.
This suggests the following relationships.

Relationship Type Notes

engineer services product many-to-many An engineer services many products
and a product can be serviced by
many engineers. For example,
many engineers service washing
machines.
customer owns product many-to-many A customer may own many products
and a product can be owned by
many customers. For example,
many customers own washing
machines.
engineer completes form one-to-many An engineer completes many forms
but a form is completed by only one
engineer.
customer is given form one-to-many A customer may receive many
forms but a form is given to only
one customer.
product is on form one-to-many Only one product can be on a form
but a product may be on many
different forms.
Which leads to the entity relationship diagram (ERD).
services completes
ENGINEER
serviced completed
by by
PRODUCT FORM
owned by given to
CUSTOMER
owns is given
There are two many-to-many relationships that must be changed to one-to-many

relationships by using link entities. This is shown below.

services completes
ENGINEER
ENG_PROD
serviced completed
by by
PRODUCT FORM
owned by given to
CUST_PROD
CUSTOMER
owns is given
This suggests the following relations (not all the attributes are given).
ENGINEER(EngineerID, Name, … )
ENG_PROD(EngineerID, ProductID)
PRODUCT(ProductID, Description, … )
CUST_PROD(CustomerID, ProductID)
CUSTOMER(CustomerID, Name, … )
FORM(FormID, CustomerID, EngineerID, … )
These are in 3NF, but you should always check that they are.

Example Questions
1. A large organisation stores a significant amount of data. Describe the advantages

of storing this data in a database rather than in a series of flat files. (4)
A. -All the data is stored on the same structure and can therefore be accessed
through it.
-The data is not duplicated across different files, consequently…
-there is less danger of data integrity being compromised.
-Data manipulation/input can be achieved more quickly as there is only one
copy of each piece of data.
-Files need to be compatible, this problem does not arise with a database.
2. Explain what is meant by a table being in second normal form.

A. A table is in second normal form if each tuple contains no repeating attributes
(first normal form) and every attribute that is a non primary key is fully
dependent on the primary key.
Notes: Very dependent on the formal definitions, and certainly not expected in
these words in an exam. This is a very difficult idea for a student to convey, any
reasonable attempt that makes sense in plain English must be credited.
3. Every student in a school belongs to a form. Every form has a form tutor and all
the form tutors are members of the teaching body.
Draw an entity relationship diagram to show the relationships between the four
entities STUDENT, FORM, TUTOR, TEACHERS (6)
A.
STUDENT FORM
TEACHERS TUTOR
4. Explain what is meant by a foreign key. (3)

A. -A foreign key is an attribute in one table that is
-a primary key in another.
-It acts as a link between the two tables.

Chapter 8 Database CS 9618

Uploaded by

Document Information

Original Description:

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Chapter 8 Database CS 9618

Uploaded by

Copyright:

Available Formats

Computer Science 9618/12

8.1 Database Concepts

Field Name Data Type

Further ad hoc enquiries are virtually impossible.

This leads to redundant data or data duplication.

File containing Stock

File containing Stock

SAHIR MASROOR KHAN 03009543074

 Separation and isolation of data

As data is held in separate files so difficult find related data.

Same data is held in more than two one files

 Duplication is wasteful as it costs time and money. Data has to be

 Duplication leads to loss of data integrity.

Data formats are defined in the application programs. If there is a

Different applications may hold the data in different forms, again

 Fixed queries and the proliferation of application programs

File processing was a huge advance on manual processing of

Data redundancy: the same data stored more than once

SAHIR MASROOR KHAN 03009543074

A flat file database is a type of database that stores data in a single

Advantages of Using a Relational Database (RDB)

Improved security The database administrator (DBA) can define data

Increased productivity The DBMS provides file handling processes instead

Disadvantages are not in the Specification for this Module.

SAHIR MASROOR KHAN 03009543074

A fundamental principle of a relational database is that a tuple is a set of

Relation: the special type of table which is used in a relational

SAHIR MASROOR KHAN 03009543074

An attribute or a combination of attributes for which there is a value in

A key that could be chosen as the primary key

Composite / Compound key

A key may consist of two attributes, in which case it is called a composite

EMP(EmpID, NINumber, Name, Address)

SAHIR MASROOR KHAN 03009543074

The Purpose of Keys

If we choose EmpID as the primary key, then NINumber is called a secondary

To make searches fast

Now look at these two relations that we saw

CINEMA(CID, Cname, Loc, MID)

An attribute is a foreign key in a relation if it is the primary key in

SAHIR MASROOR KHAN 03009543074

A requirement for data to be accurate and up to date

Indexing is a data structure technique which allows you to quickly retrieve

Takes a search key as input

SAHIR MASROOR KHAN 03009543074

Index Table Student Table

We now consider the relationships between the entities.

SAHIR MASROOR KHAN 03009543074

SAHIR MASROOR KHAN 03009543074

4 Relational Databases and Normalisation

Easy Fasteners Ltd

To: Bill Jones No.: 005

Product No. Description

 Instead of fields we call the columns attributes and

 The rows are called tuples.

 The files are called relations (or tables).

We write the details of our CUSTOMER in standard notation as

CUSTOMER(Num, CustName, City, Country, (ProdID, Description))

Num CustName City Country ProdID Description

SAHIR MASROOR KHAN 03009543074

Num CustName City Country ProdID Description

Num CustName City Country ProdID Description

CUSTOMER(Num, CustName, City, Country, ProdID, Description)