IM201-Fundamentals of Database Systems

IM 201

Fundamentals of Database Systems


STUDENT
Name:
Student Number:
Program:
Section:
Home Address:
Email Address:
Contact Number:

PROFESSOR
Name:
Academic Department:
Consultation Schedule:
Email Address:
Contact Number:
M1:L1 Exercise
a. Define the following terms: data, database, DBMS, database system, database catalog, program-data
independence, user view, DBA, end user, canned transaction, deductive database system, persistent object,
meta-data, and transaction-processing application.
Data - recorded facts or figures that have implicit meaning, such as the values stored in or used by a computer.
Database - a collection of related data.
DBMS (Database Management System) - a computerized system that enables users to create and maintain a
database.
Database system - the database and the DBMS software together, along with the application programs that
interact with them.
Database catalog - the part of the DBMS that stores the description (meta-data) of each database it manages, such
as the structure of each file, the type and storage format of each data item, and the constraints on the data.
Program-data independence - the property that the structure of the data is stored in the DBMS catalog separately
from the access programs, so that changes to the definition and organization of data do not require changes to
users' applications.
User view - a view of part or all of the contents of a database, defined to support a particular purpose or user
activity.
DBA (Database administrator) - a specialized administrator who maintains a successful database environment by
authorizing access, monitoring use, and directing or performing the activities needed to keep the data secure.
End user - a person whose job requires access to the database for querying, updating, and generating reports.
Canned transaction - a standard, pre-programmed type of query or update that naive end users run repeatedly to
query and update the database.
Deductive database system - a database system that can define deductive rules for inferring new information
from the stored facts.
Persistent object - an object that is stored permanently in the database and so survives the termination of the
program that created it.
Meta-data - data that describes and gives information about other data.
Transaction-processing application - a collection of transaction programs designed to perform the functions
necessary to automate a given business activity.
b. What four main types of actions involve databases? Briefly discuss each.
Defining a database: Specifying the data types, structures, and constraints of the data to be stored in the
database. The DBMS also stores this descriptive information in a database catalog or dictionary; it is called
meta-data.
Constructing the database: Storing the data on a storage medium that is controlled by the DBMS.
Manipulating a database: Querying the database to retrieve specific data, updating the database to reflect
changes in the real world, and generating reports from the data.
Sharing a database: Allowing multiple users and programs to access the database simultaneously.

c. Discuss the main characteristics of the database approach and how it differs from traditional file systems.
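In traditional file processing, each application defines and maintains its own files, so the same data is often
stored redundantly and the file structure is embedded in the programs that use it. In the database approach, a
single repository of data is defined once and then shared by many users and programs. Its main characteristics are
the self-describing nature of the database system (the catalog stores the meta-data), insulation between programs
and data, support for multiple views of the data, and sharing of data with multiuser transaction processing. With
relational database approaches, relationships can also be established between data in different records, and new
results can be derived by combining tables.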

Learning Module on IM 201


d. Discuss the capabilities that should be provided by a DBMS.
Controlling Redundancy: Storing each logical data item in only one place where possible, and keeping consistent
any redundant copies that are introduced deliberately to speed up database access.
Restricting Unauthorized Access: Providing a security and authorization subsystem that limits what each user or
group can retrieve and update.
Multiple User Interfaces: Providing different interfaces for different kinds of users, such as query languages,
programming language interfaces, forms, and menus.
Representing Complex Relationships among Data
Enforcing Integrity Constraints
Providing Persistent Storage for Program Objects
Providing Storage Structures and Search Techniques for Efficient Query Processing
Backup and Recovery: Backing up the data and restoring the database to a consistent state after a hardware or
software failure.

e. Discuss the differences between database systems and information retrieval systems.
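A database management system (DBMS) can form the back end of an information retrieval system. Data retrieval
is just one component of a DBMS; data input, storage, and maintenance are the other major components.
Database systems typically manage structured, formatted data, whereas information retrieval systems typically
search collections of unstructured text, such as documents, by keyword.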

M1:L1 Application
a. Cite some examples of integrity constraints that you think can apply to the database shown in Figure 1.2.

Student_number of STUDENT is an integer (domain constraint) and is unique for each student (key constraint).
Grade in GRADE_REPORT is a single character (domain constraint).
Name of STUDENT is a string of alphabetic characters (domain constraint).
The Student_number value in a GRADE_REPORT record must appear in some STUDENT record (referential
integrity constraint).
The Course_number value in a SECTION record must appear in some COURSE record (referential integrity
constraint).
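
Key, domain, and referential integrity constraints can be tried out directly. The following is a minimal sketch, not part of the module's answer key. It assumes Python's built-in sqlite3 module with an in-memory SQLite database (chosen only so the example runs without a server), a reduced two-table version of the STUDENT and GRADE_REPORT tables, and illustrative rows. The helper is_rejected is a hypothetical name introduced for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE STUDENT (
    Student_number INTEGER PRIMARY KEY,   -- key constraint: no two students share a number
    Name           TEXT NOT NULL
);
CREATE TABLE GRADE_REPORT (
    -- referential integrity: the number must exist in STUDENT
    Student_number INTEGER NOT NULL REFERENCES STUDENT(Student_number),
    -- domain constraint: a grade is a single character
    Grade          TEXT CHECK (length(Grade) = 1)
);
INSERT INTO STUDENT VALUES (17, 'Smith');
INSERT INTO GRADE_REPORT VALUES (17, 'B');
""")

def is_rejected(sql):
    """Return True if the DBMS refuses the statement because of a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

duplicate_key = is_rejected("INSERT INTO STUDENT VALUES (17, 'Brown')")
bad_grade = is_rejected("INSERT INTO GRADE_REPORT VALUES (17, 'AB')")
unknown_student = is_rejected("INSERT INTO GRADE_REPORT VALUES (99, 'A')")
print(duplicate_key, bad_grade, unknown_student)
```

All three statements are refused, one for each kind of constraint. Other database systems report such violations with their own error messages.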
b. Give examples of systems in which it may make sense to use traditional
file processing instead of a database approach.


Summary of the Lesson:
1. In this chapter we defined a database as a collection of related data, where data means recorded facts. A
typical database represents some aspect of the real world and is used for specific purposes by one or more
groups of users. A DBMS is a generalized software package for implementing and maintaining a computerized
database. The database and software together form a database system.
2. We identified several characteristics that distinguish the database approach from traditional file-processing
applications, and we discussed the main categories of database users, or the actors on the scene. We noted that
in addition to database users, there are several categories of support personnel, or workers behind the scene, in a
database environment.
3. We presented a list of capabilities that should be provided by the DBMS software to the DBA, database
designers, and end users to help them design, administer, and use a database. Then we gave a brief historical
perspective on the evolution of database applications. We pointed out the recent rapid growth of the amounts
and types of data that must be stored in databases, and we discussed the emergence of new systems for handling
“big data” applications. Finally, we discussed the overhead costs of using a DBMS and discussed some
situations in which it may not be advantageous to use one.


M1:L1 Enrichment Activity
a. If the name of the ‘CS’ (Computer Science) Department changes to ‘CSSE’ (Computer Science and Software
Engineering) Department and the corresponding prefix for the course number also changes, identify the
columns in the database that would need to be updated.

b. Give a one sentence description of each of the tasks you listed for question number 1.


M1:L2 EXERCISE.
a. Define the following terms: data model, database schema, database state, internal schema, conceptual schema,
external schema, data independence, DDL, DML, SDL, VDL, query language, host language, data sublanguage,
database utility, catalog, client/server architecture, three-tier architecture, and n-tier architecture.

Data model - a collection of concepts used to describe the structure of a database: its data types, relationships, and
constraints.
Database schema - the description of a database; the skeleton structure that represents the logical view of the entire
database.
Database state - the data in the database at a particular moment in time; also called a snapshot.
Internal schema - describes the physical storage structure of the database.
Conceptual schema - describes the structure of the whole database for the community of users.
External schema - describes the part of the database that a specific user group is interested in. The three-schema
architecture allows changes at this level without affecting the other two levels.
Data independence - the capacity to change the schema at one level of a database system without having to change the
schema at the next higher level.
DDL (Data Definition Language) - the language used to create and modify the structure of database objects in a
database.
DML (Data Manipulation Language) - the language used for retrieving, inserting, deleting, and updating data in a
database.
SDL (Storage Definition Language) - the language used to specify the internal schema.
VDL (View Definition Language) - the language used to specify user views and their mappings to the conceptual schema.
Query language - a high-level language used primarily for retrieving and modifying data in a DBMS.
Host language - the general-purpose programming language in which DML commands are embedded.
Data sublanguage - the DML when its commands are embedded in a host language.
Database utility - a program that helps the DBA manage the database system, for example by loading data, backing up
the database, reorganizing storage, or monitoring performance.
Catalog - the part of the DBMS that stores the database descriptions (schemas and other meta-data).
Client/server architecture - a computing model in which the server hosts, delivers, and manages most of the resources and
services to be consumed by the client.
Three-tier architecture - a client/server architecture in which the user interface, the functional process logic, and the data
access and storage are developed and maintained as independent modules, often on separate platforms.
N-tier architecture - an architecture that is distributed among three or more separate tiers, typically on separate computers
in a network.

b. What is the difference between logical data independence and physical data independence? Which one is
harder to achieve? Why?

c. What is the difference between two-tier and three-tier client/server architecture?


M1:L2 Application
If you were designing a Web-based system to make airline reservations and to sell airline tickets, which
DBMS architecture would you choose? Why? Why would the other architectures not be a good choice?


Summary of the Lesson:
1. In this chapter we introduced the main concepts used in database systems. We defined a data model and we
distinguished three main categories:
■ High-level or conceptual data models (based on entities and relationships)
■ Low-level or physical data models
■ Representational or implementation data models (record-based, object oriented)
2. We distinguished the schema, or description of a database, from the database itself. The schema does not
change very often, whereas the database state changes every time data is inserted, deleted, or modified. Then we
described the three-schema DBMS architecture, which allows three schema levels:
■ An internal schema describes the physical storage structure of the database.
■ A conceptual schema is a high-level description of the whole database.
■ External schemas describe the views of different user groups.
3. A DBMS that cleanly separates the three levels must have mappings among the schemas to transform
requests and query results from one level to the next. Most DBMSs do not separate the three levels completely.
We used the three-schema architecture to define the concepts of logical and physical data independence.
4. Then we discussed the main types of languages and interfaces that DBMSs support.


M1:L2 Enrichment Activity:
a. In addition to constraints relating the values of columns in one table to columns in another table, there are
also constraints that impose restrictions on values in a column or a combination of columns within a table. One
such constraint forces that a column or a group of columns must be unique across all rows in the table. For
example, in the STUDENT table, the StudentNumber column must be unique (to prevent two different students
from having the same StudentNumber). Identify the column or the group of columns in the other tables that
must be unique across all rows in the table.


M1:L3 EXERCISE.
a. Write the correct SQL statement to create a new database called ccc.

b. Write the SQL statement that creates a full backup of the existing database "ccc" to file path “D:\backups\” with
file name “cccdb”.

c. SQL statement to use database ccc.

d. Write the correct SQL statement to create a new table called student with the following columns and
data types:
studentID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255)

e. SQL statement to add "course" column with datatype and size varchar(150) to the "student" table:

f. SQL statement to drop "course" column in "student" table:

g. SQL statement to drop the existing database "ccc".

h. SQL statement to show all databases.

M1:L3 Application
Solve the following problems for Company ABC. Your answers should use the information
below:
Database name: abcDB
Table name: worker

1. SQL statement to create database for company ABC.

2. SQL statement to show all databases;

3. SQL statement to use your newly created database for company ABC.

4. SQL statement to create the database table of Company ABC based on the given table, with the
following data types and sizes. Do not change the column names given in the table above.

Worker id - int(10),
First name - char(25),
Last name - char(25),
Salary - int(15),
Joining date - datetime,
Department - char(25)
Note: datetime does not take a size.


Summary of the Lesson:
1. In this chapter we introduced the capabilities of SQL.
2. We discussed the installation of the XAMPP localhost server used to run SQL commands.
3. We discussed the operators and data types that can be used, along with the database SQL statements.


M1:L3 Enrichment Activity
Solve the following problems for Company DEF. Your answers should use the information
below:
Database name: defDB
Table name: title

1. SQL statement to create database for company DEF.

2. SQL statement to show all databases;

3. SQL statement to use your newly created database for company DEF.

4. SQL statement to create the database table of Company DEF based on the given table, with the following
data types and sizes. Do not change the column names given in the table above.

Worker reference id - int(10),
Worker title - char(25),
Affected from - datetime
Note: datetime does not take a size.

M1 Assessment:
I. Identify whether each responsibility below belongs to the database designer or the database administrator.
Write DBA for database administrator or DBD for database designer in the space provided before each
number.
__________________1. Conducting data backups
__________________2. Assessing database performance
__________________3. Modifying database structure if needed
__________________4. Restoring lost data
__________________5. communicate with all prospective database users
__________________6. Debugging programs or installing patches
__________________7. authorizing access to the database
__________________8. acquiring software
__________________9. purchasing hardware resources as needed
__________________10. responsible for identifying the data to be stored in the database
__________________11. choosing appropriate structures to represent and store this data
__________________12. implement the database design
__________________13. tune database performance
__________________14. install the database server software
__________________15. plan how the logical storage structure of the database will affect system performance

II. Create the schema from the below two views derived from the database. Write your answer on the
space provided.

III. Write the SQL statement for the below problems on the lines provided.
1. Write the correct SQL statement to create a new database called teashop;
__________________________________________________________________________________________
__________________________________________________________________
2. Write the SQL statement that creates a full backup of the existing database "teashop" to file path “D:\backup\”
with file name “teashop_db”:
__________________________________________________________________________________________
__________________________________________________________________
3. SQL statement to use database teashop.

__________________________________________________________________________________________
__________________________________________________________________
4. Write the correct SQL statement to create a new table called cust with the following columns and data types:
custID int(10),
custLastName varchar(255),
custFirstName varchar(255),
custAddress varchar(255)
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
____________________________________________________________________________________
5. SQL statement to add "order_type" column with datatype and size varchar(250) to the "cust" table:
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________
6. SQL statement to drop "order_type" column in "cust" table:
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________
7. SQL statement to drop the existing database "teashop".
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________
8. SQL statement to show all databases.
__________________________________________________________________________________________
__________________________________________________________________________________________

I. Course Code: IM 201
II. Course Title: Fundamentals of Database Systems
III. Module Number: 2
IV. Module Title: SQL Commands and Relational Model
V. Overview of the Module: This module discusses SQL, the standard language for storing, manipulating, and
retrieving data in databases. It also covers the relational database. The relational data model was first
introduced by Ted Codd of IBM Research in 1970 in a classic paper (Codd, 1970), and it attracted immediate
attention due to its simplicity and mathematical foundation. The model uses the concept of a mathematical
relation, which looks somewhat like a table of values, as its basic building block, and has its theoretical basis
in set theory and first-order predicate logic. This module discusses the basic characteristics of the model.
VI. Module Outcomes: Once you have mastered the material, you will be able to identify, execute, and apply
basic commands of SQL; identify the different database models; understand the concept of the relational
model; and identify the different relational keys that are used in databases.

Lesson1. SQL Commands

SQL is the standard language for storing, manipulating, and retrieving data in databases.
Lesson Objectives:
Once you have mastered the material in this chapter you will be able to:

1. Identify basic commands of SQL.
2. Execute basic commands of SQL.
3. Apply basic commands of SQL.

Discussion:
What is SQL?
SQL stands for Structured Query Language
SQL lets you access and manipulate databases
SQL became a standard of the American National Standards Institute (ANSI) in 1986, and of the International
Organization for Standardization (ISO) in 1987
What Can SQL do?

SQL can execute queries against a database

SQL can retrieve data from a database

SQL can insert records in a database

SQL can update records in a database

SQL can delete records from a database

SQL can create new databases

SQL can create new tables in a database

SQL can create stored procedures in a database

SQL can create views in a database

SQL can set permissions on tables, procedures, and views

SQL is a Standard - BUT....


Although SQL is an ANSI/ISO standard, there are different versions of the SQL language.
However, to be compliant with the ANSI standard, they all support at least the major commands (such as
SELECT, UPDATE, DELETE, INSERT, WHERE) in a similar manner.

Note: Most of the SQL database programs also have their own proprietary extensions in addition to the SQL
standard!
Using SQL in Your Web Site
To build a web site that shows data from a database, you will need:

An RDBMS database program (e.g. MS Access, SQL Server, MySQL)

A server-side scripting language, like PHP or ASP

SQL to get the data you want

HTML / CSS to style the page
RDBMS
RDBMS stands for Relational Database Management System.
The relational model is the basis for SQL and for relational database systems such as MS SQL Server, IBM DB2,
Oracle, MySQL, and Microsoft Access.
The data in RDBMS is stored in database objects called tables. A table is a collection of related data entries and
it consists of columns and rows.
Look at the "Customers" table:
Example

SELECT * FROM Customers;

Every table is broken up into smaller entities called fields. The fields in the Customers table consist of
CustomerID, CustomerName, ContactName, Address, City, PostalCode and Country. A field is a column in a
table that is designed to maintain specific information about every record in the table.
A record, also called a row, is each individual entry that exists in a table. For example, there are 91 records in
the above Customers table. A record is a horizontal entity in a table.
A column is a vertical entity in a table that contains all information associated with a specific field in a table.
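
The terms field, record, and column can be seen on a small table. The sketch below assumes Python's built-in sqlite3 module and an in-memory SQLite database rather than the MySQL server used in this module, and it uses a shortened, illustrative Customers table with four fields and two records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
CREATE TABLE Customers (
    CustomerID   INTEGER PRIMARY KEY,
    CustomerName TEXT,
    City         TEXT,
    Country      TEXT
);
INSERT INTO Customers VALUES (1, 'Alfreds Futterkiste', 'Berlin', 'Germany');
INSERT INTO Customers VALUES (2, 'Around the Horn', 'London', 'UK');
""")

cursor = conn.execute("SELECT * FROM Customers")
fields = [col[0] for col in cursor.description]   # field (column) names
records = cursor.fetchall()                       # one tuple per record (row)
city_column = [record[2] for record in records]   # every value of the City field

print(fields)
print(records)
print(city_column)
```

Each tuple in records is one horizontal entry of the table, while city_column collects one vertical slice of it.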
SQL Syntax
Database Tables
A database most often contains one or more tables. Each table is identified by a name (e.g. "Customers" or
"Orders"). Tables contain records (rows) with data.
Below is a selection from the "Customers" table:

The table above contains five records (one for each customer) and seven columns (CustomerID, CustomerName,
ContactName, Address, City, PostalCode, and Country).

SQL Statements
Most of the actions you need to perform on a database are done with SQL statements.
The following SQL statement selects all the records in the "Customers" table:
Example

SELECT * FROM Customers;

Keep in Mind That...


SQL keywords are NOT case sensitive: select is the same as SELECT

Semicolon after SQL Statements?


Some database systems require a semicolon at the end of each SQL statement.
Semicolon is the standard way to separate each SQL statement in database systems that allow more than one
SQL statement to be executed in the same call to the server.
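
Both points can be checked with a short sketch. It assumes Python's built-in sqlite3 module with an in-memory SQLite database and two illustrative rows. The executescript call sends several statements at once, separated by semicolons, and the two queries differ only in the case of their keywords.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# executescript() accepts several statements in one call; the semicolons
# mark where one statement ends and the next begins.
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
INSERT INTO Customers VALUES (1, 'Alfreds Futterkiste');
INSERT INTO Customers VALUES (2, 'Around the Horn');
""")

upper = conn.execute("SELECT * FROM Customers ORDER BY CustomerID").fetchall()
lower = conn.execute("select * from Customers order by CustomerID").fetchall()
same_result = (upper == lower)
print(same_result)
```

Only the keywords are varied here. Whether table and column names are case sensitive depends on the database system and its configuration.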

Some of The Most Important SQL Commands

SELECT - extracts data from a database

UPDATE - updates data in a database

DELETE - deletes data from a database

INSERT INTO - inserts new data into a database

CREATE DATABASE - creates a new database

ALTER DATABASE - modifies a database

CREATE TABLE - creates a new table

ALTER TABLE - modifies a table

DROP TABLE - deletes a table

CREATE INDEX - creates an index (search key)

DROP INDEX - deletes an index

The SQL INSERT INTO Statement


The INSERT INTO statement is used to insert new records in a table.
INSERT INTO Syntax
It is possible to write the INSERT INTO statement in two ways.
The first way specifies both the column names and the values to be inserted:

INSERT INTO table_name (column1, column2, column3, ...)
VALUES (value1, value2, value3, ...);

If you are adding values for all the columns of the table, you do not need to specify the column names in the
SQL query. However, make sure the order of the values is in the same order as the columns in the table. The
INSERT INTO syntax would be as follows:

INSERT INTO table_name
VALUES (value1, value2, value3, ...);
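
The two forms can be compared side by side. This is a minimal sketch that assumes Python's built-in sqlite3 module, an in-memory SQLite database, and a three-column illustrative Customers table. With column names listed, the values follow the listed order. Without them, the values must follow the order in which the table's columns were defined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER, CustomerName TEXT, Country TEXT)"
)

# Form 1: column names are listed, so the values follow the listed order.
conn.execute("""
INSERT INTO Customers (CustomerName, Country, CustomerID)
VALUES ('Cardinal', 'Norway', 1)
""")

# Form 2: no column names, so the values must follow the table's column order.
conn.execute("""
INSERT INTO Customers
VALUES (2, 'Around the Horn', 'UK')
""")

rows = conn.execute(
    "SELECT CustomerID, CustomerName, Country FROM Customers ORDER BY CustomerID"
).fetchall()
print(rows)
```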

Demo Database
Below is a selection from the "Customers" table in the Northwind sample database:

INSERT INTO Example


The following SQL statement inserts a new record in the "Customers" table:
Example

INSERT INTO Customers (CustomerName, ContactName, Address, City, PostalCode, Country)
VALUES ('Cardinal', 'Tom B. Erichsen', 'Skagen 21', 'Stavanger', '4006', 'Norway');

Insert Data Only in Specified Columns


It is also possible to only insert data in specific columns.
The following SQL statement will insert a new record, but only insert data in the "CustomerName", "City", and
"Country" columns (CustomerID is generated automatically, assuming it is defined as an auto-increment column):
Example

INSERT INTO Customers (CustomerName, City, Country)
VALUES ('Cardinal', 'Stavanger', 'Norway');
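
What happens to the columns that are left out can be checked with a short sketch. It assumes Python's built-in sqlite3 module and an in-memory SQLite database, where an INTEGER PRIMARY KEY column is filled in automatically. In MySQL the comparable behavior comes from declaring the column AUTO_INCREMENT. The table is an illustrative, shortened Customers table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Customers (
    CustomerID   INTEGER PRIMARY KEY,  -- SQLite assigns a value when it is omitted
    CustomerName TEXT,
    ContactName  TEXT,
    City         TEXT,
    Country      TEXT
)
""")
conn.execute("""
INSERT INTO Customers (CustomerName, City, Country)
VALUES ('Cardinal', 'Stavanger', 'Norway')
""")

row = conn.execute(
    "SELECT CustomerID, CustomerName, ContactName, City, Country FROM Customers"
).fetchone()
print(row)
```

The omitted ContactName column holds NULL, which Python shows as None, while CustomerID receives a generated value.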

The SQL SELECT Statement


The SELECT statement is used to select data from a database.
The data returned is stored in a result table, called the result-set.
SELECT Syntax

SELECT column1, column2, ...
FROM table_name;

Here, column1, column2, ... are the field names of the table you want to select data from. If you want to select
all the fields available in the table, use the following syntax:

SELECT * FROM table_name;

SELECT Column Example


The following SQL statement selects the "CustomerName" and "City" columns from the "Customers" table:
Example

SELECT CustomerName, City FROM Customers;

SELECT * Example
The following SQL statement selects all the columns from the "Customers" table:
Example

SELECT * FROM Customers;
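
The difference between naming columns and using * shows up in the shape of the result-set. The sketch below assumes Python's built-in sqlite3 module, an in-memory SQLite database, and two illustrative rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (
    CustomerID   INTEGER PRIMARY KEY,
    CustomerName TEXT,
    City         TEXT,
    Country      TEXT
);
INSERT INTO Customers VALUES (1, 'Alfreds Futterkiste', 'Berlin', 'Germany');
INSERT INTO Customers VALUES (2, 'Around the Horn', 'London', 'UK');
""")

# Only the two named columns come back, in the order they were listed.
name_and_city = conn.execute("SELECT CustomerName, City FROM Customers").fetchall()
# The asterisk returns every column of the table.
all_columns = conn.execute("SELECT * FROM Customers").fetchall()

print(name_and_city)
print(all_columns)
```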

The SQL SELECT DISTINCT Statement


The SELECT DISTINCT statement is used to return only distinct (different) values.
Inside a table, a column often contains many duplicate values; and sometimes you only want to list the different
(distinct) values.

SELECT DISTINCT Syntax

SELECT DISTINCT column1, column2, ...
FROM table_name;

SELECT Example Without DISTINCT


The following SQL statement selects ALL (including the duplicates) values from the "Country" column in the
"Customers" table:
Example
SELECT Country FROM Customers;

SELECT DISTINCT Examples

The following SQL statement selects only the DISTINCT values from the "Country" column in the
"Customers" table:
Example

SELECT DISTINCT Country FROM Customers;

The following SQL statement lists the number of different (distinct) customer countries:
Example

SELECT COUNT(DISTINCT Country) FROM Customers;
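
The effect of DISTINCT is easiest to see on a table with repeated values. This sketch assumes Python's built-in sqlite3 module, an in-memory SQLite database, and four illustrative rows in which one country appears twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerName TEXT, Country TEXT);
INSERT INTO Customers VALUES ('Alfreds Futterkiste', 'Germany');
INSERT INTO Customers VALUES ('Ana Trujillo', 'Mexico');
INSERT INTO Customers VALUES ('Antonio Moreno', 'Mexico');
INSERT INTO Customers VALUES ('Around the Horn', 'UK');
""")

all_countries = [r[0] for r in conn.execute("SELECT Country FROM Customers")]
distinct_countries = [r[0] for r in conn.execute("SELECT DISTINCT Country FROM Customers")]
distinct_count = conn.execute(
    "SELECT COUNT(DISTINCT Country) FROM Customers"
).fetchone()[0]

print(all_countries)       # includes the duplicate
print(distinct_countries)  # each country once
print(distinct_count)
```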

The SQL WHERE Clause


The WHERE clause is used to filter records.
The WHERE clause is used to extract only those records that fulfill a specified condition.

WHERE Syntax

SELECT column1, column2, ...
FROM table_name
WHERE condition;

Note: The WHERE clause is used not only in SELECT statements but also in UPDATE, DELETE, and other
statements.

WHERE Clause Example


The following SQL statement selects all the customers from the country "Mexico", in the "Customers" table:
Example

SELECT * FROM Customers
WHERE Country='Mexico';

Text Fields vs. Numeric Fields


SQL requires single quotes around text values (some database systems, such as MySQL in its default mode, also
accept double quotes).
However, numeric fields should not be enclosed in quotes:
Example

SELECT * FROM Customers
WHERE CustomerID=1;
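
A text condition and a numeric condition can be run against the same table. The sketch assumes Python's built-in sqlite3 module, an in-memory SQLite database, and three illustrative rows. The text value is written in single quotes and the number is written without quotes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT, Country TEXT);
INSERT INTO Customers VALUES (1, 'Alfreds Futterkiste', 'Germany');
INSERT INTO Customers VALUES (2, 'Ana Trujillo', 'Mexico');
INSERT INTO Customers VALUES (3, 'Antonio Moreno', 'Mexico');
""")

# Text field: the value is enclosed in single quotes.
by_text = conn.execute(
    "SELECT CustomerName FROM Customers WHERE Country='Mexico'"
).fetchall()
# Numeric field: the value is written without quotes.
by_number = conn.execute(
    "SELECT CustomerName FROM Customers WHERE CustomerID=1"
).fetchall()

print(by_text)
print(by_number)
```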

Operators in The WHERE Clause


The following operators can be used in the WHERE clause:
Operator    Description
=           Equal
>           Greater than
<           Less than
>=          Greater than or equal
<=          Less than or equal
<>          Not equal. Note: In some versions of SQL this operator may be written as !=
BETWEEN     Between a certain range
LIKE        Search for a pattern
IN          To specify multiple possible values for a column
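
Several of these operators can be tried on one small table. This is a sketch under stated assumptions: Python's built-in sqlite3 module, an in-memory SQLite database, four illustrative rows, and a hypothetical helper named matching_ids that returns the CustomerID values selected by a condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT, Country TEXT);
INSERT INTO Customers VALUES (1, 'Alfreds Futterkiste', 'Germany');
INSERT INTO Customers VALUES (2, 'Ana Trujillo', 'Mexico');
INSERT INTO Customers VALUES (3, 'Around the Horn', 'UK');
INSERT INTO Customers VALUES (4, 'Blauer See Delikatessen', 'Germany');
""")

def matching_ids(condition):
    """Return the CustomerID values of the rows that satisfy the WHERE condition."""
    query = "SELECT CustomerID FROM Customers WHERE " + condition + " ORDER BY CustomerID"
    return [row[0] for row in conn.execute(query)]

at_least_three = matching_ids("CustomerID >= 3")
not_germany = matching_ids("Country <> 'Germany'")
in_range = matching_ids("CustomerID BETWEEN 2 AND 3")       # both ends are included
starts_with_a = matching_ids("CustomerName LIKE 'A%'")      # % matches any characters
listed = matching_ids("Country IN ('Germany', 'UK')")

print(at_least_three, not_germany, in_range, starts_with_a, listed)
```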

The SQL AND, OR and NOT Operators


The WHERE clause can be combined with AND, OR, and NOT operators.
The AND and OR operators are used to filter records based on more than one condition:
The AND operator displays a record if all the conditions separated by AND are TRUE.
The OR operator displays a record if any of the conditions separated by OR is TRUE.
The NOT operator displays a record if the condition is NOT TRUE.

AND Syntax

SELECT column1, column2, ...
FROM table_name
WHERE condition1 AND condition2 AND condition3 ...;

OR Syntax

SELECT column1, column2, ...
FROM table_name
WHERE condition1 OR condition2 OR condition3 ...;

NOT Syntax

SELECT column1, column2, ...
FROM table_name
WHERE NOT condition;

AND Example
The following SQL statement selects all fields from "Customers" where country is "Germany" AND city
is "Berlin":
Example

SELECT * FROM Customers
WHERE Country='Germany' AND City='Berlin';

OR Example
The following SQL statement selects all fields from "Customers" where city is "Berlin" OR "München":
Example

SELECT * FROM Customers
WHERE City='Berlin' OR City='München';

The following SQL statement selects all fields from "Customers" where country is "Germany" OR "Spain":
Example

SELECT * FROM Customers
WHERE Country='Germany' OR Country='Spain';

NOT Example
The following SQL statement selects all fields from "Customers" where country is NOT "Germany":
Example
SELECT * FROM Customers
WHERE NOT Country='Germany';

Combining AND, OR and NOT


You can also combine the AND, OR and NOT operators.
The following SQL statement selects all fields from "Customers" where country is "Germany" AND city must
be "Berlin" OR "München" (use parentheses to form complex expressions):
Example

SELECT * FROM Customers
WHERE Country='Germany' AND (City='Berlin' OR City='München');

The following SQL statement selects all fields from "Customers" where country is NOT "Germany" and NOT
"USA":
Example

SELECT * FROM Customers
WHERE NOT Country='Germany' AND NOT Country='USA';
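
Parentheses matter because AND is evaluated before OR. The sketch below assumes Python's built-in sqlite3 module, an in-memory SQLite database, and four hypothetical rows chosen so that the difference is visible. The helper matching_ids is a hypothetical name introduced for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, City TEXT, Country TEXT);
INSERT INTO Customers VALUES (1, 'Berlin',  'Germany');
INSERT INTO Customers VALUES (2, 'Hamburg', 'Germany');
INSERT INTO Customers VALUES (3, 'Hamburg', 'USA');
INSERT INTO Customers VALUES (4, 'Madrid',  'Spain');
""")

def matching_ids(condition):
    """Return the CustomerID values of the rows that satisfy the WHERE condition."""
    query = "SELECT CustomerID FROM Customers WHERE " + condition + " ORDER BY CustomerID"
    return [row[0] for row in conn.execute(query)]

# The country test applies to both cities.
with_parentheses = matching_ids("Country='Germany' AND (City='Berlin' OR City='Hamburg')")
# AND binds first, so this reads as (Germany AND Berlin) OR Hamburg.
without_parentheses = matching_ids("Country='Germany' AND City='Berlin' OR City='Hamburg'")
# Each NOT applies to the comparison that follows it.
neither = matching_ids("NOT Country='Germany' AND NOT Country='USA'")

print(with_parentheses, without_parentheses, neither)
```

Without the parentheses, the row for Hamburg in the USA is also returned, because the country test no longer applies to the second city.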

The SQL ORDER BY Keyword


The ORDER BY keyword is used to sort the result-set in ascending or descending order.
The ORDER BY keyword sorts the records in ascending order by default. To sort the records in descending
order, use the DESC keyword.
ORDER BY Syntax

SELECT column1, column2, ...
FROM table_name
ORDER BY column1, column2, ... ASC|DESC;

ORDER BY Example
The following SQL statement selects all customers from the "Customers" table, sorted by the "Country" column:

Example

SELECT * FROM Customers
ORDER BY Country;

ORDER BY DESC Example


The following SQL statement selects all customers from the "Customers" table, sorted DESCENDING by the
"Country" column:
Example
SELECT * FROM Customers
ORDER BY Country DESC;

ORDER BY Several Columns Example


The following SQL statement selects all customers from the "Customers" table, sorted by the "Country" and the
"CustomerName" column. This means that it orders by Country, but if some rows have the same Country, it
orders them by CustomerName:
Example

SELECT * FROM Customers
ORDER BY Country, CustomerName;

ORDER BY Several Columns Example


The following SQL statement selects all customers from the "Customers" table, sorted ascending by the "Country"
and descending by the "CustomerName" column:
Example

SELECT * FROM Customers
ORDER BY Country ASC, CustomerName DESC;
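
As a minimal sketch of how sorting on several columns behaves, the following Python snippet runs the last query against an in-memory SQLite table. The rows are hypothetical; the point is that CustomerName is only used to order rows that share the same Country.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, Country TEXT)")
# Hypothetical rows, inserted in no particular order.
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [("Beta", "Spain"), ("Alpha", "Germany"), ("Gamma", "Germany")],
)

# Country ascending first; ties on Country are broken by CustomerName descending.
rows = conn.execute(
    "SELECT Country, CustomerName FROM Customers "
    "ORDER BY Country ASC, CustomerName DESC"
).fetchall()
print(rows)
```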


The SQL UPDATE Statement


The UPDATE statement is used to modify the existing records in a table.
UPDATE Syntax

UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;

Note: Be careful when updating records in a table! Notice the WHERE clause in the UPDATE statement. The
WHERE clause specifies which record(s) should be updated. If you omit the WHERE clause, all records in
the table will be updated!

Demo Database
Below is a selection from the "Customers" table in the Northwind sample database:

UPDATE Table
The following SQL statement updates the first customer (CustomerID = 1) with a new contact person and a new
city.


Example

UPDATE Customers
SET ContactName = 'Alfred Schmidt', City= 'Frankfurt'
WHERE CustomerID = 1;

The selection from the "Customers" table will now look like this:

UPDATE Multiple Records


It is the WHERE clause that determines how many records will be updated.
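
One way to check how many records an UPDATE touches is to look at the row count the database reports. The sketch below assumes Python's sqlite3 module and a small hypothetical Customers table; it compares an UPDATE with a WHERE clause against one without.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, ContactName TEXT, Country TEXT)"
)
# Hypothetical rows: two customers in Mexico, one in Germany.
conn.executemany(
    "INSERT INTO Customers (ContactName, Country) VALUES (?, ?)",
    [("Ana", "Mexico"), ("Luis", "Mexico"), ("Maria", "Germany")],
)

# With WHERE: only the matching rows are changed.
cur = conn.execute("UPDATE Customers SET ContactName='Juan' WHERE Country='Mexico'")
updated_with_where = cur.rowcount

# Without WHERE: every row in the table is changed.
cur = conn.execute("UPDATE Customers SET ContactName='Juan'")
updated_without_where = cur.rowcount

print(updated_with_where, updated_without_where)
```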

The following SQL statement will update the contactname to "Juan" for all records where country is "Mexico":
Example

UPDATE Customers
SET ContactName='Juan'
WHERE Country='Mexico';


The selection from the "Customers" table will now look like this:

Update Warning!
Be careful when updating records. If you omit the WHERE clause, ALL records will be updated!

Example

UPDATE Customers
SET ContactName='Juan';

The selection from the "Customers" table will now look like this:


The SQL DELETE Statement


The DELETE statement is used to delete existing records in a table.
DELETE Syntax

DELETE FROM table_name WHERE condition;

Note: Be careful when deleting records in a table! Notice the WHERE clause in the DELETE statement. The
WHERE clause specifies which record(s) should be deleted. If you omit the WHERE clause, all records in the
table will be deleted!
Demo Database
Below is a selection from the "Customers" table in the Northwind sample database:


SQL DELETE Example


The following SQL statement deletes the customer "Alfreds Futterkiste" from the "Customers" table:
Example

DELETE FROM Customers WHERE CustomerName='Alfreds Futterkiste';

The "Customers" table will now look like this:

Delete All Records


It is possible to delete all rows in a table without deleting the table. This means that the table structure, attributes,
and indexes will be intact:

DELETE FROM table_name;

The following SQL statement deletes all rows in the "Customers" table, without deleting the table:

Example

DELETE FROM Customers;
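
The following sketch illustrates the claim that deleting all rows leaves the table itself in place. It assumes SQLite (through Python's sqlite3 module) and hypothetical rows; sqlite_master is SQLite's own catalog table, so other database systems would use a different catalog query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, Country TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [("Shop A", "Germany"), ("Shop B", "Mexico")],  # hypothetical rows
)

# Delete every row, but not the table.
conn.execute("DELETE FROM Customers")
remaining = conn.execute("SELECT COUNT(*) FROM Customers").fetchone()[0]

# The table definition is still listed in SQLite's catalog.
still_defined = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name='Customers'"
).fetchone()[0]

print(remaining, still_defined)
```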

The SQL MIN() and MAX() Functions


The MIN() function returns the smallest value of the selected column.
The MAX() function returns the largest value of the selected column.

MIN() Syntax

SELECT MIN(column_name)
FROM table_name
WHERE condition;

MAX() Syntax

SELECT MAX(column_name)
FROM table_name
WHERE condition;


Demo Database
Below is a selection from the "Products" table in the Northwind sample database:

MIN() Example
The following SQL statement finds the price of the cheapest product:
Example

SELECT MIN(Price) AS SmallestPrice
FROM Products;

MAX() Example
The following SQL statement finds the price of the most expensive product:
Example

SELECT MAX(Price) AS LargestPrice
FROM Products;
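
Both functions can also appear in the same SELECT list. The sketch below, assuming SQLite via Python and three made-up products, fetches the smallest and largest price in one query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, Price REAL)")
# Hypothetical products and prices.
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("Tea", 18.0), ("Syrup", 10.0), ("Jam", 25.0)],
)

smallest, largest = conn.execute(
    "SELECT MIN(Price) AS SmallestPrice, MAX(Price) AS LargestPrice FROM Products"
).fetchone()
print(smallest, largest)
```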

The SQL COUNT(), AVG() and SUM() Functions


The COUNT() function returns the number of rows that match a specified criterion.
The AVG() function returns the average value of a numeric column.
The SUM() function returns the total sum of a numeric column.


COUNT() Syntax

SELECT COUNT(column_name)
FROM table_name
WHERE condition;

AVG() Syntax

SELECT AVG(column_name)
FROM table_name
WHERE condition;

SUM() Syntax

SELECT SUM(column_name)
FROM table_name
WHERE condition;

Demo Database
Below is a selection from the "Products" table in the Northwind sample database:


COUNT() Example
The following SQL statement finds the number of products:
Example

SELECT COUNT(ProductID)
FROM Products;

Note: NULL values are not counted.

AVG() Example
The following SQL statement finds the average price of all products:
Example

SELECT AVG(Price)
FROM Products;

Note: NULL values are ignored.

Demo Database
Below is a selection from the "OrderDetails" table in the Northwind sample database:


SUM() Example
The following SQL statement finds the sum of the "Quantity" fields in the "OrderDetails" table:
Example

SELECT SUM(Quantity)
FROM OrderDetails;

Note: NULL values are ignored.
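
The notes about NULL values can be checked with a small experiment. This sketch assumes SQLite via Python and a hypothetical Products table in which one price is missing; COUNT(*) counts rows, while COUNT(Price), AVG(Price) and SUM(Price) skip the NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, Price REAL)")
# Hypothetical rows; None is stored as NULL.
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("Tea", 10.0), ("Mystery item", None), ("Jam", 20.0)],
)

count_rows, count_price, avg_price, sum_price = conn.execute(
    "SELECT COUNT(*), COUNT(Price), AVG(Price), SUM(Price) FROM Products"
).fetchone()
print(count_rows, count_price, avg_price, sum_price)
```

Note that the average is taken over the two non-NULL prices only, not over all three rows.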

The SQL LIKE Operator


The LIKE operator is used in a WHERE clause to search for a specified pattern in a column.
There are two wildcards often used in conjunction with the LIKE operator:

% - The percent sign represents zero, one, or multiple characters
_ - The underscore represents a single character
Note: MS Access uses an asterisk (*) instead of the percent sign (%), and a question mark (?) instead of the
underscore (_).
The percent sign and the underscore can also be used in combinations!

LIKE Syntax

SELECT column1, column2, ...
FROM table_name
WHERE column LIKE pattern;

Tip: You can also combine any number of conditions using AND or OR operators.
Here are some examples showing different LIKE operators with '%' and '_' wildcards:

Demo Database
The table below shows the complete "Customers" table from the Northwind sample database:


SQL LIKE Examples


The following SQL statement selects all customers with a CustomerName starting with "a":
Example

SELECT * FROM Customers
WHERE CustomerName LIKE 'a%';

The following SQL statement selects all customers with a CustomerName ending with "a":
Example

SELECT * FROM Customers
WHERE CustomerName LIKE '%a';

The following SQL statement selects all customers with a CustomerName that have "or" in any position:

Example

SELECT * FROM Customers
WHERE CustomerName LIKE '%or%';

The following SQL statement selects all customers with a CustomerName that have "r" in the second position:
Example

SELECT * FROM Customers
WHERE CustomerName LIKE '_r%';

The following SQL statement selects all customers with a CustomerName that starts with "a" and is at least 3
characters in length:
Example

SELECT * FROM Customers
WHERE CustomerName LIKE 'a__%';

The following SQL statement selects all customers with a ContactName that starts with "a" and ends with "o":
Example

SELECT * FROM Customers
WHERE ContactName LIKE 'a%o';

The following SQL statement selects all customers with a CustomerName that does NOT start with "a":
Example

SELECT * FROM Customers
WHERE CustomerName NOT LIKE 'a%';
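
The patterns above can be tried out on a few names. This is a sketch assuming SQLite via Python and four hypothetical customer names; the helper matching() is introduced here only for illustration. In SQLite, LIKE is case-insensitive for ASCII letters by default, which is why 'a%' matches names beginning with a capital "A"; other database systems may compare case-sensitively.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT)")
# Hypothetical names chosen to exercise each pattern.
conn.executemany(
    "INSERT INTO Customers VALUES (?)",
    [("Alfreds",), ("Ana",), ("Al",), ("Bruno",)],
)

def matching(where_clause):
    """Return the names selected by the given WHERE clause, sorted by name."""
    sql = (
        "SELECT CustomerName FROM Customers WHERE "
        + where_clause
        + " ORDER BY CustomerName"
    )
    return [row[0] for row in conn.execute(sql)]

starts_with_a = matching("CustomerName LIKE 'a%'")     # begins with "a"
r_second = matching("CustomerName LIKE '_r%'")         # "r" in second position
a_min_three = matching("CustomerName LIKE 'a__%'")     # begins with "a", length >= 3
not_a = matching("CustomerName NOT LIKE 'a%'")         # does not begin with "a"

print(starts_with_a, r_second, a_min_three, not_a)
```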


M2:L1 EXERCISE.
Write false if the statement is true otherwise true. Write your answer on the space provided before each number.
__________________a. The WHERE clause is used to extract only those records that fulfill a specified condition.
__________________b. SQL requires single quotes around text values (most database systems will also allow
double quotes).
__________________c. The AND operator displays a record if all the conditions separated by AND are TRUE.
__________________d. The OR operator displays a record if any of the conditions separated by OR is TRUE.
__________________e. The NOT operator displays a record if the condition(s) is NOT TRUE.
__________________f. The COUNT() function returns the number of rows that match a specified criterion.
__________________g. The AVG() function returns the average value of a numeric column.
__________________h. The SUM() function returns the total sum of a numeric column.
__________________i. If you are adding values for all the columns of the table, you do not need to specify the
column names in the SQL query.
__________________j. The WHERE clause specifies which record(s) should be deleted. If you omit the WHERE
clause, all records in the table will be deleted.

M2:L1 Application
Create the SQL command for the below problems; use the table below for the instance of the database.
DB Name: Company_ABC
DB Table: Employee
EMP_id  EMP_name  EMP_sex  EMP_add         EMP_byear  EMP_hdate  EMP_basicpay
1       John      M        San Pablo City  1990       2009       25000
2       Eve       F        San Pablo City  1994       2011       24000
3       Matt      M        Calamba City    1992       2005       30000
4       Mary      F        Calamba City    1987       2002       35000


1. SQL command to create database Company_ABC.


__________________________________________________________________________________________
________________________________________________________________________________
2. SQL command to create table Employee.
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
____________________________________________________________
3. SQL command to insert employee 1.
__________________________________________________________________________________________
________________________________________________________________________________
4. SQL command to insert employee 2.
__________________________________________________________________________________________
________________________________________________________________________________
5. SQL command to insert employee 3.
__________________________________________________________________________________________
________________________________________________________________________________
6. SQL command to insert employee 4.
__________________________________________________________________________________________
________________________________________________________________________________
7. SQL command to get the sum of basic pay of all employees.
__________________________________________________________________________________________
________________________________________________________________________________
8. SQL command to count all male employees residing at San Pablo City.
__________________________________________________________________________________________
________________________________________________________________________________


Summary of the Lesson:


1. SQL lets you access and manipulate databases
2. What Can SQL do?
o SQL can execute queries against a database
o SQL can retrieve data from a database
o SQL can insert records in a database
o SQL can update records in a database
o SQL can delete records from a database
o SQL can create new databases
o SQL can create new tables in a database
o SQL can create stored procedures in a database
o SQL can create views in a database
o SQL can set permissions on tables, procedures, and views

3. SQL is a standard, but there are different versions of the SQL language. However, to be compliant with the
ANSI standard, they all support at least the major commands (such as SELECT, UPDATE, DELETE, INSERT,
WHERE) in a similar manner. Most SQL database programs also have their own proprietary extensions in
addition to the SQL standard.


M2:L1 Enrichment Activity:


Use the table and information from the Application section above to solve the problems below.
a. SQL command to count all female employees residing at Calamba City.
__________________________________________________________________________________________
________________________________________________________________________________
b. SQL command to get the age of employee 2.
__________________________________________________________________________________________
________________________________________________________________________________
c. SQL command to count employees that were hired from 2000 to 2010 and currently residing at San Pablo City.
__________________________________________________________________________________________
________________________________________________________________________________
d. SQL command to show all databases.
__________________________________________________________________________________________
________________________________________________________________________________
e. SQL command to show all tables.
__________________________________________________________________________________________
________________________________________________________________________________


Lesson 2. Relational Model


The relational data model was first introduced by Ted Codd of IBM Research in 1970 in a classic paper (Codd,
1970), and it attracted immediate attention due to its simplicity and mathematical foundation. The model uses
the concept of a mathematical relation—which looks somewhat like a table of values—as its basic building
block, and has its theoretical basis in set theory and first-order predicate logic. In this chapter we discuss the
basic characteristics of the model.
Lesson Objectives:
Once you have mastered the material in this chapter you will be able to:

1. identify the different database models;
2. understand the concept of the relational model;
3. identify the different relational keys that are being used in databases.

Discussion:
DBMS Database Models
A Database model defines the logical design and structure of a database and defines how data will be stored,
accessed and updated in a database management system. While the Relational Model is the most widely used
database model, there are other models too:
• Hierarchical Model
• Network Model
• Entity-relationship Model
• Relational Model
Hierarchical Model
This database model organizes data into a tree-like structure with a single root, to which all the other data is
linked. The hierarchy starts from the root data and expands like a tree, adding child nodes to the parent nodes.
In this model, a child node has only a single parent node. This model efficiently describes many real-world
relationships, such as the index of a book or recipes.
In the hierarchical model, data is organized into a tree-like structure with a one-to-many relationship between two
different types of data; for example, one department can have many courses, many professors and many
students.


Network Model
This is an extension of the hierarchical model. In this model data is organized more like a graph, and a child
node is allowed to have more than one parent node. Because more relationships are established in this model,
the data is more interconnected, which can make accessing it easier and faster. This database model was used
to map many-to-many data relationships. It was the most widely used database model before the relational
model was introduced.

Entity-relationship Model
In this database model, relationships are created by dividing an object of interest into an entity and its
characteristics into attributes.
Different entities are related using relationships.
E-R models represent these relationships in pictorial form to make them easier for different stakeholders to
understand.
This model is good for designing a database, which can then be turned into tables in the relational model.


Let's take an example, if we have to design a School Database, then Student will be an entity with attributes
name, age, address etc. As Address is generally complex, it can be another entity with attributes street name,
pincode, city etc, and there will be a relationship between them.

Relational Model
In this model, data is organized in two-dimensional tables, and relationships are maintained by storing a
common field. This model was introduced by Codd in 1970, and since then it has been the most widely used
database model. The basic structure of data in the relational model is the table. All the information related to a
particular type is stored in rows of that table. Hence, tables are also known as relations in the relational model.
Relationships can also be of different types.

What is Relational Model?


RELATIONAL MODEL (RM) represents the database as a collection of relations. A relation is nothing but a
table of values. Every row in the table represents a collection of related data values. These rows in the table
denote a real-world entity or relationship.


The table name and column names are helpful to interpret the meaning of values in each row. The data are
represented as a set of relations. In the relational model, data are stored as tables. However, the physical storage
of the data is independent of the way the data are logically organized.
Some popular Relational Database management systems are:

• DB2 and Informix Dynamic Server – IBM
• Oracle and RDB – Oracle
• SQL Server and Access – Microsoft
Relational Model Concepts
Attribute – Each column in a table. Attributes are the properties which define a relation, e.g., Student_Rollno,
NAME, etc.
Tables – In the relational model, relations are saved in table format. It is stored along with its entities.
A table has two properties: rows and columns. Rows represent records and columns represent attributes.
Tuple – A single row of a table, which contains a single record.
Relation Schema – A relation schema represents the name of the relation with its attributes.
Degree – The total number of attributes in the relation is called the degree of the relation.
Cardinality – The total number of rows present in the table.
Column – The column represents the set of values for a specific attribute.
Relation instance – A relation instance is a finite set of tuples in the RDBMS. Relation instances never
have duplicate tuples.
Relation key – One or more attributes whose values uniquely identify each row are called the relation key.
Attribute domain – Every attribute has a pre-defined set of permitted values, which is known as its attribute domain.


Relational Integrity constraints


Relational integrity constraints are conditions which must hold for a relation to be valid. These
integrity constraints are derived from the rules in the mini-world that the database represents.
There are many types of integrity constraints. Constraints in a relational database management system are
mostly divided into three main categories:
Domain constraints
Key constraints
Referential integrity constraints

Domain Constraints
A domain constraint is violated if an attribute value does not appear in the corresponding domain or is not
of the appropriate data type.

Domain constraints specify that, within each tuple, the value of each attribute must be an atomic value from
that attribute's domain. Domains are specified as data types, which include standard data types such as integers,
real numbers, characters, Booleans, variable-length strings, etc.
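
A domain constraint can be sketched in SQL with a CHECK clause. The example below assumes SQLite via Python and a hypothetical Student table whose Age attribute must be an integer between 5 and 120; the range, the table and the helper try_insert() are illustrative choices, not part of the lesson. The explicit typeof() test is there because SQLite is lenient about column types; many other database systems reject a wrongly typed value on their own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical Student table: Age must be an integer between 5 and 120.
conn.execute(
    "CREATE TABLE Student ("
    " RollNo INTEGER,"
    " Age INTEGER CHECK (typeof(Age) = 'integer' AND Age BETWEEN 5 AND 120))"
)

def try_insert(roll_no, age):
    """Return True if the row is accepted, False if it violates the domain constraint."""
    try:
        conn.execute("INSERT INTO Student VALUES (?, ?)", (roll_no, age))
        return True
    except sqlite3.IntegrityError:
        return False

# A valid age, an out-of-range age, and a value of the wrong data type.
results = [try_insert(1, 18), try_insert(2, -3), try_insert(3, "abc")]
print(results)
```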
Key constraints
An attribute that can uniquely identify a tuple in a relation is called the key of the table. The value of the
attribute for different tuples in the relation has to be unique.
Example:


In the given table, CustomerID is a key attribute of the Customer table. Each customer has a single key value;
CustomerID = 1 belongs only to CustomerName = "Google".

Referential integrity constraints


Referential integrity constraints are based on the concept of foreign keys. A foreign key is an attribute of a
relation that refers to a key attribute of another relation or of the same relation. A referential integrity
constraint requires that the referenced key value must exist in the referenced table.

In the above example, we have 2 relations, Customer and Billing.


Tuple for CustomerID =1 is referenced twice in the relation Billing. So we know CustomerName=Google has
billing amount $300.
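
Key and referential integrity constraints can be sketched together in SQL. The example below assumes SQLite via Python; the InvoiceNo column, the sample rows and the helper violates() are hypothetical. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, whereas most other systems enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT)")
conn.execute(
    "CREATE TABLE Billing ("
    " InvoiceNo INTEGER PRIMARY KEY,"
    " CustomerID INTEGER REFERENCES Customer(CustomerID),"
    " Amount REAL)"
)
conn.execute("INSERT INTO Customer VALUES (1, 'Google')")
conn.execute("INSERT INTO Billing VALUES (1, 1, 300)")

def violates(sql):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Key constraint: a second row with CustomerID = 1 is rejected.
duplicate_key = violates("INSERT INTO Customer VALUES (1, 'Other')")
# Referential integrity: CustomerID 99 does not exist in Customer.
missing_parent = violates("INSERT INTO Billing VALUES (2, 99, 50)")
# Deleting a customer that Billing still references is also rejected.
delete_referenced = violates("DELETE FROM Customer WHERE CustomerID = 1")

print(duplicate_key, missing_parent, delete_referenced)
```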
Operations in Relational Model
The four basic operations performed on the relational database model are insert, update, delete and select.


• Insert is used to insert data into the relation
• Delete is used to delete tuples from the table.
• Update (also called modify) allows you to change the values of some attributes in existing tuples.
• Select allows you to choose a specific range of data.
Whenever one of these operations is applied, integrity constraints specified on the relational database schema
must never be violated.
Insert Operation
The insert operation gives values of the attribute for a new tuple which should be inserted into a relation.

Update Operation
You can see that in the below-given relation table CustomerName= 'Apple' is updated from Inactive to Active.

Delete Operation
To specify deletion, a condition on the attributes of the relation selects the tuple to be deleted.

In the above-given example, CustomerName= "Apple" is deleted from the table.

The Delete operation could violate referential integrity if the tuple which is deleted is referenced by foreign
keys from other tuples in the same database.


Select Operation

In the above-given example, CustomerName="Amazon" is selected.


Best Practices for creating a Relational Model
• Data need to be represented as a collection of relations
• Each relation should be depicted clearly in the table
• Rows should contain data about instances of an entity
• Columns must contain data about attributes of the entity
• Cells of the table should hold a single value
• Each column should be given a unique name
• No two rows can be identical
• The values of an attribute should be from the same domain
Advantages of using Relational model
• Simplicity: A relational data model is simpler than the hierarchical and network model.
• Structural Independence: The relational database is only concerned with data and not with a structure. This
can improve the performance of the model.
• Easy to use: The relational model is easy to use, as tables consisting of rows and columns are quite natural
and simple to understand.
• Query capability: It makes it possible for a high-level query language like SQL to avoid complex database
navigation.
• Data independence: The structure of a database can be changed without having to change any application.
• Scalable: A database can be enlarged, in both the number of records (rows) and the number of fields, to
enhance its usability.
Disadvantages of using Relational model
• A few relational databases have limits on field lengths which can't be exceeded.
• Relational databases can sometimes become complex as the amount of data grows, and the relations
between pieces of data become more complicated.
• Complex relational database systems may lead to isolated databases where the information cannot be shared
from one system to another.
What is Relational Algebra?
Every database management system must define a query language to allow users to access the data stored in the
database. Relational Algebra is a procedural query language used to query the database tables to access data in
different ways.


In relational algebra, input is a relation(table from which data has to be accessed) and output is also a relation(a
temporary table holding the data asked for by the user).

Relational Algebra works on the whole table at once, so we do not have to use loops etc to iterate over all the
rows(tuples) of data one by one. All we have to do is specify the table name from which we need the data, and
in a single line of command, relational algebra will traverse the entire given table to fetch data for you.
The primary operations that we can perform using relational algebra are:
• Select
• Project
• Union
• Set Difference
• Cartesian product
• Rename
Select Operation (σ)
This is used to fetch rows (tuples) from a table (relation) which satisfy a given condition.
Syntax: σp(r)
where σ denotes the select operation, r is the name of the relation (the table in which you want to look for
data), and p is the selection predicate, a propositional logic formula that specifies the conditions that must be
satisfied by the data. In the predicate, one can use comparison operators like =, <, > etc. to specify the conditions.
Let's take an example of the Student table we specified above in the Introduction of relational algebra, and fetch
data for students with age more than 17.
σage > 17 (Student)
This will fetch the tuples(rows) from table Student, for which age will be greater than 17.


You can also use, and, or etc operators, to specify two conditions, for example,
σage > 17 and gender = 'Male' (Student)
This will return tuples(rows) from table Student with information of male students, of age more than
17.(Consider the Student table has an attribute Gender too.)
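
The select operation can be sketched in a few lines of Python by treating a relation as a list of tuples and the predicate p as a function. The Student rows and the select() helper below are hypothetical and only illustrate the idea; a real DBMS evaluates σ very differently.

```python
def select(relation, predicate):
    """sigma_p(r): return the tuples of relation r that satisfy predicate p."""
    return [t for t in relation if predicate(t)]

# Hypothetical Student relation; each dict is one tuple.
student = [
    {"name": "Ann", "age": 17, "gender": "Female"},
    {"name": "Ben", "age": 19, "gender": "Male"},
    {"name": "Cara", "age": 20, "gender": "Female"},
]

# sigma_{age > 17}(Student)
older = select(student, lambda t: t["age"] > 17)
# sigma_{age > 17 and gender = 'Male'}(Student)
older_male = select(student, lambda t: t["age"] > 17 and t["gender"] == "Male")

print([t["name"] for t in older])
print([t["name"] for t in older_male])
```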
Project Operation (∏)
Project operation is used to project only a certain set of attributes of a relation. In simple words, if you want to
see only the names of all the students in the Student table, then you can use the Project operation.
It will only project or show the columns or attributes asked for, and will also remove duplicate data from the
columns.
Syntax: ∏A1, A2...(r)
where A1, A2 etc are attribute names(column names).
For example,
∏Name, Age(Student)
Above statement will show us only the Name and Age columns for all the rows of data in Student table.
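
A minimal Python sketch of the project operation follows, again treating a relation as a list of tuples. The rows and the project() helper are hypothetical. Note how the two students who share the same Name and Age collapse into a single result row, which is the duplicate removal described above.

```python
def project(relation, attributes):
    """pi_{A1, A2, ...}(r): keep only the named attributes and drop duplicate rows."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:  # relations are sets, so duplicates are removed
            seen.add(row)
            result.append(row)
    return result

# Hypothetical Student relation with two students sharing name and age.
student = [
    {"id": 1, "name": "Ann", "age": 17},
    {"id": 2, "name": "Ben", "age": 19},
    {"id": 3, "name": "Ann", "age": 17},
]

name_age = project(student, ["name", "age"])
print(name_age)
```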
Union Operation (∪)

This operation is used to fetch data from two relations(tables) or temporary relation(result of another operation).
For this operation to work, the relations(tables) specified should have same number of attributes(columns) and
same attribute domain. Also the duplicate tuples are automatically eliminated from the result.
Syntax: A ∪ B

where A and B are relations.


For example, if we have two tables RegularClass and ExtraClass, both have a column student to save name of
student, then,
∏Student(RegularClass) ∪ ∏Student(ExtraClass)

The above operation will give us the names of students who are attending regular classes, extra classes, or
both, eliminating repetition.
Set Difference (-)
This operation is used to find data present in one relation and not present in the second relation. This operation
is also applicable on two relations, just like Union operation.
Syntax: A - B


where A and B are relations.


For example, if we want to find name of students who attend the regular class but not the extra class, then, we
can use the below operation:
∏Student(RegularClass) - ∏Student(ExtraClass)
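
Because relations are sets of tuples, union and set difference map directly onto Python's built-in set type. The student names below are hypothetical stand-ins for the projected Student column of each table.

```python
# Hypothetical results of projecting the student column from each table.
regular_class = {"Ann", "Ben", "Cara"}
extra_class = {"Ben", "Dan"}

# Union: students in either class; "Ben" appears only once.
union = regular_class | extra_class

# Set difference: students in the regular class but not in the extra class.
difference = regular_class - extra_class

print(sorted(union))
print(sorted(difference))
```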
Cartesian Product (X)
This is used to combine data from two different relations(tables) into one and fetch data from the combined
relation.
Syntax: A X B
For example, if we want to find the information for Regular Class and Extra Class which are conducted during
morning, then, we can use the following operation:
σtime = 'morning' (RegularClass X ExtraClass)
For the above query to work, both RegularClass and ExtraClass should have the attribute time.
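
The Cartesian product pairs every tuple of one relation with every tuple of the other, which the sketch below reproduces with itertools.product. The rows are hypothetical. Since both relations have a time attribute, this sketch assumes the condition is meant to hold for both of them and checks each time column separately.

```python
from itertools import product

# Hypothetical relations; each tuple is (student, time).
regular_class = [("Ann", "morning"), ("Ben", "evening")]
extra_class = [("Cara", "morning"), ("Dan", "morning")]

# RegularClass X ExtraClass: every regular tuple combined with every extra tuple.
# Each combined tuple is (regular_student, regular_time, extra_student, extra_time).
pairs = [r + e for r, e in product(regular_class, extra_class)]

# Select the combinations where both classes are held in the morning.
morning = [p for p in pairs if p[1] == "morning" and p[3] == "morning"]

print(len(pairs))
print(morning)
```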
Rename Operation (ρ)
This operation is used to rename the output relation of any query operation which returns a result, like Select
or Project, or simply to rename a relation (table).
Syntax: ρ(RelationNew, RelationOld)
Apart from these common operations Relational Algebra is also used for Join operations like,
Natural Join
Outer Join
Theta join etc.
What are Keys in DBMS?
KEYS in DBMS are attributes or sets of attributes which help you to identify a row (tuple) in a relation (table).
They allow you to find the relation between two tables. Keys help you uniquely identify a row in a table by a
combination of one or more columns in that table.


In the above-given example, employee ID is a primary key because it uniquely identifies an employee record. In
this table, no other employee can have the same employee ID.
Why we need a Key?
Here are some reasons for using sql key in the DBMS system.
• Keys help you to identify any row of data in a table. In a real-world application, a table could contain
thousands of records. Moreover, the records could be duplicated. Keys ensure that you can uniquely identify
a table record despite these challenges.
• Allow you to establish and identify relationships between tables
• Help you to enforce identity and integrity in the relationship.
Types of Keys in Database Management System
There are mainly eight different types of keys in DBMS, and each key has its own functionality:
• Super Key - A super key is a set of one or more attributes which identifies rows in a table.
• Primary Key - is a column or group of columns in a table that uniquely identifies every row in that table.
• Candidate Key - is a set of attributes that uniquely identifies tuples in a table. A candidate key is a super key
with no redundant attributes.
• Alternate Key - is a candidate key that was not chosen as the primary key.
• Foreign Key - is a column that creates a relationship between two tables. The purpose of foreign keys is to
maintain data integrity and allow navigation between two different instances of an entity.
• Compound Key - has two or more attributes that allow you to uniquely recognize a specific record. It is
possible that each column may not be unique by itself within the database.
• Composite Key - is a combination of two or more columns that uniquely identifies rows in a table. The
combination of columns guarantees uniqueness, though the individual columns need not be unique.
• Surrogate Key - An artificial key which aims to uniquely identify each record is called a surrogate key.
This kind of key is unique because it is created when you don't have any natural primary key.
What is the Super key?
A superkey is a set of one or more attributes which identifies rows in a table. A super key may have
additional attributes that are not needed for unique identification.


In the above-given example, EmpSSN, EmpSSN+Empname, EmpNum, EmpNum+Empname, and
EmpNum+Empname+EmpSSN are superkeys.
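
The difference between a superkey and a minimal (candidate) key can be sketched in Python. The employee rows and both helper functions below are hypothetical. The check only tests uniqueness within this sample data; in a real design, a key is a rule about every possible row, so it cannot be proven from one table instance.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows share the same values for the given attributes."""
    projected = [tuple(r[a] for a in attrs) for r in rows]
    return len(set(projected)) == len(projected)

def candidate_keys(rows, all_attrs):
    """Return the minimal superkeys: superkeys that contain no smaller key."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            contains_smaller_key = any(set(k) <= set(attrs) for k in keys)
            if is_superkey(rows, attrs) and not contains_smaller_key:
                keys.append(attrs)
    return keys

# Hypothetical employee rows; two employees share the same name.
employees = [
    {"EmpSSN": "111", "EmpNum": 1, "Empname": "Ann"},
    {"EmpSSN": "222", "EmpNum": 2, "Empname": "Ann"},
    {"EmpSSN": "333", "EmpNum": 3, "Empname": "Ben"},
]

ssn_and_name = is_superkey(employees, ["EmpSSN", "Empname"])  # superkey, but not minimal
name_only = is_superkey(employees, ["Empname"])               # names repeat
minimal = candidate_keys(employees, ["EmpSSN", "EmpNum", "Empname"])
print(ssn_and_name, name_only, minimal)
```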
What is a Primary Key?
PRIMARY KEY is a column or group of columns in a table that uniquely identifies every row in that table. The
primary key can't contain duplicates, meaning the same value can't appear more than once in the table. A table
cannot have more than one primary key.
Rules for defining Primary key:
• Two rows can't have the same primary key value
• Every row must have a primary key value.
• The primary key field cannot be null.
• The value in a primary key column can never be modified or updated if any foreign key refers to that
primary key.
Example:
In the following example, StudID is a Primary Key.

What is the Alternate key?


ALTERNATE KEYS are columns or groups of columns in a table that uniquely identify every row in that table
but were not chosen as the primary key. A table can have multiple choices for a primary key but only one can
be set as the primary key. All the candidate keys which are not the primary key are called alternate keys.


Example:
In this table, StudID, Roll No, and Email all qualify to become the primary key. But since StudID is the primary
key, Roll No and Email become the alternate keys.
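
The choice between primary and alternate keys can be tried out with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database; the column types, the sample values, and the helper name try_insert are assumptions made for illustration only. StudID is declared as the primary key, while Roll No and Email stay unique through UNIQUE constraints.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# StudID is chosen as the primary key. RollNo and Email remain alternate keys,
# which SQL expresses as NOT NULL UNIQUE columns. Sample values are invented.
conn.execute("""
    CREATE TABLE Student (
        StudID INTEGER PRIMARY KEY,
        RollNo INTEGER NOT NULL UNIQUE,
        Name   TEXT    NOT NULL,
        Email  TEXT    NOT NULL UNIQUE
    )
""")
conn.execute("INSERT INTO Student VALUES (1, 11, 'Tom', 'tom@example.com')")


def try_insert(values):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute("INSERT INTO Student VALUES (?, ?, ?, ?)", values)
        return True
    except sqlite3.IntegrityError:
        return False


# Duplicate primary key value: rejected.
print(try_insert((1, 12, "Nick", "nick@example.com")))
# Duplicate alternate key value (RollNo): rejected.
print(try_insert((2, 11, "Nick", "nick@example.com")))
# All key values are new: accepted.
print(try_insert((2, 12, "Nick", "nick@example.com")))
```

Any of the three columns could have been declared as the primary key instead; the other two would then carry the UNIQUE constraints.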

What is a Candidate Key?


A CANDIDATE KEY is a set of attributes that uniquely identifies tuples in a table. A Candidate Key is a super key
with no redundant attributes: if any attribute is removed from it, it no longer identifies rows uniquely. The
Primary key should be selected from the candidate keys. Every table must have at least a single candidate key.
A table can have multiple candidate keys but only a single primary key.
Properties of Candidate key:
• It must contain unique values
• Candidate key may have multiple attributes
• Must not contain null values
• It should contain minimum fields to ensure uniqueness
• Uniquely identify each record in a table
Example: In the given table Stud ID, Roll No, and email are candidate keys which help us to uniquely identify
the student record in the table.
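
The difference between a super key and a candidate key can be checked on sample data. Below is a minimal sketch in Python; the rows and the helper names is_superkey and is_candidate_key are hypothetical and only illustrate the definitions. Keep in mind that a check on sample rows can show that a column set is not a key, but whether it is truly a key depends on the rules of the real-world data.

```python
from itertools import combinations

# Hypothetical sample rows. The column names follow the student example,
# but the values are invented for illustration.
rows = [
    {"StudID": 1, "RollNo": 11, "Email": "a@example.com", "Name": "Tom"},
    {"StudID": 2, "RollNo": 12, "Email": "b@example.com", "Name": "Nick"},
    {"StudID": 3, "RollNo": 13, "Email": "c@example.com", "Name": "Tom"},
]


def is_superkey(rows, columns):
    """True if no two rows share the same values for the given columns."""
    seen = {tuple(row[c] for c in columns) for row in rows}
    return len(seen) == len(rows)


def is_candidate_key(rows, columns):
    """True if columns form a super key and no proper subset does (minimal)."""
    if not is_superkey(rows, columns):
        return False
    for size in range(1, len(columns)):
        for subset in combinations(columns, size):
            if is_superkey(rows, subset):
                return False
    return True


# StudID + Name identifies every row, but Name is not needed for that.
print(is_superkey(rows, ["StudID", "Name"]))
print(is_candidate_key(rows, ["StudID", "Name"]))
# StudID alone is minimal.
print(is_candidate_key(rows, ["StudID"]))
# Name repeats, so it cannot identify rows.
print(is_superkey(rows, ["Name"]))
```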


What is the Foreign key?


A FOREIGN KEY is a column (or group of columns) that creates a relationship between two tables. The purpose of
Foreign keys is to maintain data integrity and allow navigation between two different instances of an entity.
It acts as a cross-reference between two tables as it references the primary key of another table.

In this example, we have two tables in a school, Teacher and Department. However, there is no way to see
which teacher works in which department.
By adding DeptCode as a foreign key in the Teacher table, we can create a relationship between the two
tables.


This concept is also known as Referential Integrity.
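
Referential integrity for the teacher and department example can be sketched as follows. The sketch uses Python's built-in sqlite3 module; the column names other than DeptCode, and all sample values, are assumptions for illustration. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON is issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute(
    "CREATE TABLE Department (DeptCode TEXT PRIMARY KEY NOT NULL, DeptName TEXT NOT NULL)"
)
# Teacher.DeptCode is the foreign key that references Department.DeptCode.
conn.execute("""
    CREATE TABLE Teacher (
        TeacherID INTEGER PRIMARY KEY,
        Name      TEXT NOT NULL,
        DeptCode  TEXT REFERENCES Department(DeptCode)
    )
""")
conn.execute("INSERT INTO Department VALUES ('001', 'Science')")
conn.execute("INSERT INTO Teacher VALUES (1, 'David', '001')")

# A teacher pointing at a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO Teacher VALUES (2, 'Jena', '999')")
    accepted = True
except sqlite3.IntegrityError:
    accepted = False
print(accepted)

# The foreign key lets us navigate from a teacher to the department.
rows = conn.execute("""
    SELECT t.Name, d.DeptName
    FROM Teacher t JOIN Department d ON d.DeptCode = t.DeptCode
""").fetchall()
print(rows)
```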


What is the Compound key?
A COMPOUND KEY has two or more attributes that together allow you to uniquely recognize a specific record. It is
possible that each column may not be unique by itself within the database. However, when combined with the
other column or columns, the combination becomes unique. The purpose of the compound key
in a database is to uniquely identify each record in the table.

In this example, neither OrderNo nor ProductID alone can be a primary key, as neither uniquely identifies a record.
However, a compound key of OrderNo and ProductID could be used, as together they uniquely identify each record.
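
A key made of OrderNo and ProductID can be declared in SQL with a table-level PRIMARY KEY clause. The sketch below runs it through Python's sqlite3 module; the table name OrderLine, the Quantity column, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE OrderLine (
        OrderNo   INTEGER NOT NULL,
        ProductID INTEGER NOT NULL,
        Quantity  INTEGER NOT NULL,
        PRIMARY KEY (OrderNo, ProductID)   -- unique only as a pair
    )
""")

# OrderNo 1001 repeats and ProductID 1 repeats, but each pair is unique.
conn.executemany(
    "INSERT INTO OrderLine VALUES (?, ?, ?)",
    [(1001, 1, 2), (1001, 2, 1), (1002, 1, 5)],
)

# Repeating an existing (OrderNo, ProductID) pair violates the key.
try:
    conn.execute("INSERT INTO OrderLine VALUES (1001, 1, 9)")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False
print(duplicate_accepted)
```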
What is the Composite key?
COMPOSITE KEY is a combination of two or more columns that uniquely identify rows in a table. The
combination of columns guarantees uniqueness, though individually uniqueness is not guaranteed. Hence, they
are combined to uniquely identify records in a table.
The difference between the compound and the composite key is that any part of the compound key can be a foreign
key, but the composite key may or may not be part of a foreign key.
What is a Surrogate key?
A SURROGATE KEY is an artificial key which aims to uniquely identify each record.
This kind of key is created when you don't have any natural primary key, and its values are generated so that
they are unique. Surrogate keys do not lend any meaning to the data in the table. A surrogate key is usually an
integer, generated right before the record is inserted into a table.


The example above shows the shift timings of different employees. In this example, a surrogate key is
needed to uniquely identify each employee.
Surrogate keys in SQL are used when:
• No attribute qualifies as a natural primary key.
• The natural primary key is too big or complicated.
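
A surrogate key is usually left to the DBMS to generate. The following sketch uses SQLite's INTEGER PRIMARY KEY AUTOINCREMENT through Python's sqlite3 module; the table name Shift, its columns, and the sample timings are hypothetical and stand in for the shift-timing example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Shift (
        ShiftID   INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key, no business meaning
        Name      TEXT NOT NULL,
        StartTime TEXT NOT NULL,
        EndTime   TEXT NOT NULL
    )
""")

# Two employees share the name Anne, so Name cannot identify a row.
shifts = [
    ("Anne", "09:00", "18:00"),
    ("Jack", "08:00", "17:00"),
    ("Anne", "11:00", "20:00"),
]
# ShiftID is omitted from the INSERT; the database generates it.
conn.executemany(
    "INSERT INTO Shift (Name, StartTime, EndTime) VALUES (?, ?, ?)", shifts
)

ids = [row[0] for row in conn.execute("SELECT ShiftID FROM Shift ORDER BY ShiftID")]
print(ids)
```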
Difference Between Primary key & Foreign key


M2:L2 EXERCISE.
Identify the relational keys asked for in each problem. Use the tables below to create your answer, and
write your answer on the lines provided.

a. Superkeys in table students:


__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________
b. Superkeys in table subject:
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________
______________________________________________________________________________
c. Candidate keys in all tables:
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
______________________________

Application


Create a relationship between the three (3) tables given below by providing correct foreign keys to the blank
columns; use the below facts to create your answer. Encircle all primary keys on each table.
• Below is the teaching load of the Professors:
ProfID SubjectCode
101 CS101
102 IS101
103 IT101

• All BSIT students are currently enrolled in IT101.


• All BSCS students are currently enrolled in CS101.
• All BSIS students are currently enrolled in IS101.

table: student
studentID studentName Address Course
2015123 Jenny Brgy. 1 BSIT
2015124 Albert Brgy. 2 BSIT
2015125 Simon Brgy. 1 BSCS
2015126 Jean Brgy. 1 BSCS
2015127 Veronica Brgy. 1 BSIT
2015128 Sean Brgy. 2 BSCS
2015129 Dave Brgy. 3 BSIS
2015130 Mae Brgy. 4 BSIS

table: subject
SubjectCode SubjectName
IT101 Intro to IT
CS101 Intro to CS
IS101 Intro to IS

table: prof
ProfID ProfName Department
101 John CS
102 Smith IS
103 Robert IT


Summary of the Lesson:


1. In this chapter, we introduced the SQL database language.
2. A database most often contains one or more tables. Each table is identified by a name. Most of the actions
you need to perform on a database are done with SQL statements.
3. A key in SQL is an attribute or set of attributes which helps you to identify a row (tuple) in a relation (table).
DBMS keys also allow you to establish and identify relationships between tables.
The eight types of DBMS keys are Super, Primary, Candidate, Alternate, Foreign, Compound, Composite, and
Surrogate Key.
• A super key is a set of one or more attributes which identifies rows in a table.
• A column or group of columns in a table which uniquely identifies every row in that table is
called a primary key.
• All the candidate keys which are not the primary key are called alternate keys.
• A super key with no redundant attribute is called a candidate key.
• A compound key is a key which has two or more fields which together allow you to uniquely recognize a specific
record.
• A key which has multiple attributes to uniquely identify rows in a table is called a composite key.
• An artificial key which aims to uniquely identify each record is called a surrogate key.
• A primary key never accepts null values, while a foreign key may accept multiple null values.


M2:L2 Enrichment Activity:


Answer the below problems using the three (3) tables on the previous exercise.
a. Primary keys in student and subject tables:
__________________________________________________________________________________________
__________________________________________________________________
b. Composite key in table enroll.
__________________________________________________________________________________________
__________________________________________________________________
c. Secondary keys in table student and subject.
__________________________________________________________________________________________
__________________________________________________________________
d. Foreign key in table student.
__________________________________________________________________________________________
________________________________________________________________________________


M2 Assessment:
I. Multiple choice. Write your answer on the space provided before each number.

________1. A column which is added to create a relationship with another table.


a. Foreign Key b. Primary Key c. Super Key d. Candidate Key e. Alternate Key f. Surrogate Key
________2. Helps to maintain data integrity and also allows navigation between two different instances of an
entity.
a. Foreign Key b. Primary Key c. Super Key d. Candidate Key e. Alternate Key f. Surrogate Key
________3. A group of single or multiple keys which identifies rows in a table.
a. Foreign Key b. Primary Key c. Super Key d. Candidate Key e. Alternate Key f. Surrogate Key
________4. A key that may have additional attributes that are not needed for unique identification.
a. Foreign Key b. Primary Key c. Super Key d. Candidate Key e. Alternate Key f. Surrogate Key
________5. An attribute whose values match primary key values in the related table.
a. Foreign Key b. Primary Key c. Super Key d. Candidate Key e. Alternate Key f. Surrogate Key
________6. An artificial key which aims to uniquely identify each record.
a. Foreign Key b. Primary Key c. Super Key d. Candidate Key e. Alternate Key f. Surrogate Key
________7. A minimal super key.
a. Foreign Key b. Primary Key c. Super Key d. Candidate Key e. Alternate Key f. Surrogate Key
________8. With SQL, how do you select all the records from a table named "Persons" where the "LastName" is
alphabetically between (and including) "Hansen" and "Pettersen"?
a. SELECT LastName>'Hansen' AND LastName<'Pettersen' FROM Persons
b. SELECT * FROM Persons WHERE LastName BETWEEN 'Hansen' AND 'Pettersen'
c. SELECT * FROM Persons WHERE LastName>'Hansen' AND LastName<'Pettersen'
________9. Which SQL statement is used to extract data from a database?
d. EXTRACT e. GET f. OPEN g. SELECT
________10. Which SQL statement is used to update data in a database?
h. MODIFY i. UPDATE j. SAVE AS k. SAVE


________11. Which SQL statement is used to delete data from a database?


l. COLLAPSE m. REMOVE n. DELETE
________12. Which SQL statement is used to insert new data in a database?
o. INSERT NEW s. ADD NEW q. INSERT INTO r. ADD RECORD
________13. With SQL, how do you select a column named "FirstName" from a table named "Persons"?
e. EXTRACT FirstName FROM Persons
d. SELECT Persons.FirstName
z. SELECT FirstName FROM Persons
________14. With SQL, how do you select all the columns from a table named "Persons"?
a. SELECT [all] FROM Persons
o. SELECT *.Persons
y. SELECT Persons
e. SELECT * FROM Persons
________15. With SQL, how do you select all the records from a table named "Persons" where the value of the
column "FirstName" is "Peter"?
r. SELECT [all] FROM Persons WHERE FirstName LIKE 'Peter'
p. SELECT * FROM Persons WHERE FirstName<>'Peter'
h. SELECT * FROM Persons WHERE FirstName='Peter'
s. SELECT [all] FROM Persons WHERE FirstName='Peter'
________16. With SQL, how do you select all the records from a table named "Persons" where the value of the
column "FirstName" starts with an "a"?
p. SELECT * FROM Persons WHERE FirstName='%a%'
o. SELECT * FROM Persons WHERE FirstName LIKE 'a%'
u. SELECT * FROM Persons WHERE FirstName='a'
y. SELECT * FROM Persons WHERE FirstName LIKE '%a'
________17. With SQL, how do you select all the records from a table named "Persons" where the "FirstName"
is "Peter" and the "LastName" is "Jackson"?
c. SELECT * FROM Persons WHERE FirstName='Peter' AND LastName='Jackson'
v. SELECT FirstName='Peter', LastName='Jackson' FROM Persons
f. SELECT * FROM Persons WHERE FirstName<>'Peter' AND LastName<>'Jackson'
________18. Which SQL statement is used to return only different values?


d. SELECT UNIQUE o. SELECT DIFFERENT g. SELECT DISTINCT


________19. With SQL, how can you insert a new record into the "Persons" table?
r. INSERT ('Jimmy', 'Jackson') INTO Persons
s. INSERT VALUES ('Jimmy', 'Jackson') INTO Persons
t. INSERT INTO Persons VALUES ('Jimmy', 'Jackson')
________20. With SQL, how can you insert "Olsen" as the "LastName" in the "Persons" table?
p. INSERT INTO Persons ('Olsen') INTO LastName
q. INSERT ('Olsen') INTO Persons (LastName)
e. INSERT INTO Persons (LastName) VALUES ('Olsen')
________21. How can you change "Hansen" into "Nilsen" in the "LastName" column in the Persons table?
v. UPDATE Persons SET LastName='Nilsen' WHERE LastName='Hansen'
b. MODIFY Persons SET LastName='Nilsen' WHERE LastName='Hansen'
f. UPDATE Persons SET LastName='Hansen' INTO LastName='Nilsen'
p. MODIFY Persons SET LastName='Hansen' INTO LastName='Nilsen'
________22. With SQL, how can you return the number of records in the "Persons" table?
o. SELECT COLUMNS(*) FROM Persons
w. SELECT NO(*) FROM Persons
p. SELECT COUNT(*) FROM Persons
q. SELECT LEN(*) FROM Persons

II. Write the SQL statement for the below problems; write your answer on the lines provided.
Table name: employee


1. SQL statement to select all employees with an Fname starting with "j":
__________________________________________________________________________________________
__________________________________________________________________
2. SQL statement to select all employees with an Lname ending with "g":
__________________________________________________________________________________________
__________________________________________________________________
3. SQL statement to select all employees with an Address that have "as" in any position:
__________________________________________________________________________________________
__________________________________________________________________
4. SQL statement to select all employees with an Address that have "e" in the second position:
__________________________________________________________________________________________
__________________________________________________________________
5. SQL statement to select all employees with an Fname that starts with "b" and is at least 3 characters in length:
__________________________________________________________________________________________
__________________________________________________________________
6. SQL statement to select all employees with an Fname that starts with "b" and ends with "e":
__________________________________________________________________________________________
__________________________________________________________________
7. SQL statement to select all employees with an Address that does NOT start with "f":
__________________________________________________________________________________________
__________________________________________________________________
8. SQL statement to select all employees with an Fname starting with "e":
__________________________________________________________________________________________
__________________________________________________________________
9. SQL statement to select all employees with an Lname ending with "c":
__________________________________________________________________________________________
__________________________________________________________________
10. SQL statement to select all employees with an Address that have "if" in any position:
__________________________________________________________________________________________
__________________________________________________________________
11. SQL statement to select all employees with an Address that have "t" in the second position:

__________________________________________________________________________________________
__________________________________________________________________
12. SQL statement to select all employees with an Fname that starts with "c" and are at least 3 characters in length:
__________________________________________________________________________________________
__________________________________________________________________
13. SQL statement to select all employees with an Fname that starts with "w" and ends with "e":
__________________________________________________________________________________________
__________________________________________________________________
14. SQL statement to select all employees with an Address that does NOT start with "k":
__________________________________________________________________________________________
__________________________________________________________________
15. SQL statement to display distinct values of SSN of all male employees.
__________________________________________________________________________________________
__________________________________________________________________
16. SQL statement to display distinct values of SSN of all female employees.
__________________________________________________________________________________________
__________________________________________________________________
17. SQL statement to display distinct values of SSN of all employees.
__________________________________________________________________________________________
__________________________________________________________________
18. SQL statement to display the total salary of all employees.
__________________________________________________________________________________________
__________________________________________________________________
19. SQL statement to count all employees.
__________________________________________________________________________________________
__________________________________________________________________
20. SQL statement to count all female and male employees.
__________________________________________________________________________________________
__________________________________________________________________


I. Course Code: IM 201
II. Course Title: Fundamentals of Database Systems
III. Module Number: 3
IV. Module Title: Database Normalization and SQL Join and ER Model
V. Overview of the Module: Database Normalization is a technique of organizing the data in the
database. Normalization is a systematic approach of decomposing tables to eliminate data redundancy
(repetition) and undesirable characteristics like Insertion, Update and Deletion Anomalies. It is a
multi-step process that puts data into tabular form, removing duplicated data from the relation tables.
This module will also discuss joins. A JOIN clause is used to combine rows from two or more tables, based
on a related column between them. It will also tackle ER modeling and how to create an ER diagram.
VI. Module Outcomes: What you will learn in this chapter: how normalization protects databases
from anomalies; rules on how to create 1NF, 2NF, 3NF and BCNF. You will also be able to understand and
create: inner join; left join; right join; and full join; the main characteristics of entity relationship
components; how relationships between entities are defined and refined; how ERD components affect
database design and implementation.


Lesson 1. Database Normalization and SQL Join

Database Normalization is a technique of organizing the data in the database. Normalization is a systematic
approach of decomposing tables to eliminate data redundancy (repetition) and undesirable characteristics like
Insertion, Update and Deletion Anomalies. It is a multi-step process that puts data into tabular form, removing
duplicated data from the relation tables.
This module will also discuss joins. A JOIN clause is used to combine rows from two or more tables, based on a
related column between them.
Lesson Objectives:
What you will learn in this chapter:

1. How normalization protects databases from anomalies.


2. Rules on how to create 1st normal form.
3. Rules on how to create 2nd normal form.
4. Rules on how to create 3rd normal form.
5. Rules on how to create BCNF.
You will also be able to understand and create:
6. inner join;
7. left join;
8. right join; and
9. full join.

Discussion:
Normalization is used mainly for two purposes:
• Eliminating redundant (repeated) data.
• Ensuring data dependencies make sense, i.e., data is logically stored.

Problems Without Normalization


If a table is not properly normalized and has data redundancy, then it will not only eat up extra memory space but
will also make it difficult to handle and update the database without facing data loss. Insertion, Update and
Deletion Anomalies are very frequent if a database is not normalized. To understand these anomalies, let us take
an example of a Student table.

In the table above, we have data of 4 Computer Sci. students. As we can see, data for the fields branch,
hod (Head of Department) and office_tel is repeated for the students who are in the same branch in the college;
this is Data Redundancy.
Insertion Anomaly
Suppose a new student is admitted. Until the student opts for a branch, the student's data cannot be inserted,
or else we will have to set the branch information to NULL.
Also, if we have to insert data of 100 students of the same branch, then the branch information will be repeated
for all those 100 students.
These scenarios are nothing but Insertion anomalies.
Update Anomaly
What if Mr. X leaves the college, or is no longer the HOD of the computer science department? In that case all the
student records will have to be updated, and if by mistake we miss any record, it will lead to data inconsistency.
This is the Update anomaly.
Deletion Anomaly
In our Student table, two different kinds of information are kept together: Student information and Branch
information. Hence, at the end of the academic year, if student records are deleted, we will also lose the
branch information.
This is the Deletion anomaly.
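
The update and deletion anomalies can be reproduced with a few statements. This is only a sketch using Python's built-in sqlite3 module; the columns rollno and name, and all sample values other than the HOD name Mr. X, are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One table mixes student data with branch data (branch, hod, office_tel).
conn.execute(
    "CREATE TABLE Student (rollno INTEGER PRIMARY KEY, name TEXT, "
    "branch TEXT, hod TEXT, office_tel TEXT)"
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?, ?)",
    [
        (401, "Akon", "CSE", "Mr. X", "53337"),
        (402, "Bkon", "CSE", "Mr. X", "53337"),
        (403, "Ckon", "CSE", "Mr. X", "53337"),
        (404, "Dkon", "CSE", "Mr. X", "53337"),
    ],
)

# Update anomaly: the HOD changes, but the update misses one student row.
conn.execute("UPDATE Student SET hod = 'Mr. Y' WHERE rollno < 404")
hods = sorted(
    row[0] for row in conn.execute("SELECT DISTINCT hod FROM Student WHERE branch = 'CSE'")
)
print(hods)  # the same branch now shows two different HODs

# Deletion anomaly: removing the student records also removes the branch data.
conn.execute("DELETE FROM Student")
branches_left = conn.execute("SELECT COUNT(DISTINCT branch) FROM Student").fetchone()[0]
print(branches_left)
```

Keeping the branch details in a table of their own, with the student rows only referring to the branch, avoids both effects.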
Normalization Rule
Normalization rules are divided into the following normal forms:
• First Normal Form
• Second Normal Form
• Third Normal Form
• BCNF
First Normal Form (1NF)
For a table to be in the First Normal Form, it should follow the following 4 rules:
1. It should only have single(atomic) valued attributes/columns.
2. Values stored in a column should be of the same domain
3. All the columns in a table should have unique names.
4. And the order in which data is stored, does not matter.

Second Normal Form (2NF)


For a table to be in the Second Normal Form,
1. It should be in the First Normal form.
2. And, it should not have Partial Dependency.

Third Normal Form (3NF)


A table is said to be in the Third Normal Form when,

1. It is in the Second Normal form.


2. And, it doesn't have Transitive Dependency.

Boyce and Codd Normal Form (BCNF)


Boyce and Codd Normal Form is a higher version of the Third Normal Form. This form deals with a certain type
of anomaly that is not handled by 3NF. A 3NF table which does not have multiple overlapping candidate keys is
said to be in BCNF. For a table to be in BCNF, the following conditions must be satisfied:

1. It must be in 3rd Normal Form


2. and, for each functional dependency ( X → Y ), X should be a super Key.

What is First Normal Form (1NF)?


The 1st Normal form expects you to design your table in such a way that it can easily be extended and it is
easier for you to retrieve data from it whenever required.


Rules for First Normal Form


The first normal form expects you to follow a few simple rules while designing your database, and they are:
Rule 1: Single Valued Attributes
Each column of your table should be single valued which means they should not contain multiple values. We
will explain this with help of an example later, let's see the other rules for now.
Rule 2: Attribute Domain should not change
This is more of a "Common Sense" rule. In each column the values stored must be of the same kind or type.
For example: If you have a column dob to save the dates of birth of a set of people, then you must not
save the 'names' of some of them in that column along with the 'date of birth' of others. It should hold
only 'date of birth' for all the records/rows.
Rule 3: Unique name for Attributes/Columns
This rule expects that each column in a table should have a unique name. This is to avoid confusion at the time
of retrieving data or performing any other operation on the stored data.
If two or more columns have the same name, the DBMS cannot tell them apart.
Rule 4: Order doesn't matter
This rule says that the order in which you store the data in your table doesn't matter.
Here is our table, with some sample data added to it.

Our table already satisfies 3 of the 4 rules: all our column names are unique, we have stored data in
the order we wanted to, and we have not intermixed different types of data in columns.
But out of the 3 different students in our table, 2 have opted for more than 1 subject, and we have stored the
subject names in a single column. As per the 1st Normal Form, each column must contain atomic values.
How to solve this Problem?
It's very simple: all we have to do is break the values into atomic values.


Here is our updated table and it now satisfies the First Normal Form.

By doing so, although a few values are repeated, the values in the subject column are now atomic for
each record/row.
Under the First Normal Form, data redundancy increases, as the same data appears in several columns across
multiple rows, but each row as a whole is unique.
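
Breaking a multi-valued column into atomic values can be sketched in a few lines of Python. The column layout (roll_no, name, subject), the sample rows, and the function name to_first_normal_form are assumptions for illustration, as is the comma used to separate the subjects.

```python
# Hypothetical unnormalized rows: the subject column holds several values.
unnormalized = [
    (101, "Akon", "OS, CN"),
    (103, "Ckon", "Java"),
    (102, "Bkon", "C, C++"),
]


def to_first_normal_form(rows):
    """Emit one row per (roll_no, name, subject) so every value is atomic."""
    result = []
    for roll_no, name, subjects in rows:
        for subject in subjects.split(","):
            result.append((roll_no, name, subject.strip()))
    return result


normalized = to_first_normal_form(unnormalized)
for row in normalized:
    print(row)
```

The roll_no and name values now repeat for students with more than one subject, which is the increase in redundancy described above.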
What is Second Normal Form?
For a table to be in the Second Normal Form, it must satisfy two conditions:
The table should be in the First Normal Form.
There should be no Partial Dependency.
What is Dependency?
Let's take an example of a Student table with columns student_id, name, reg_no(registration number), branch
and address(student's home address).

In this table, student_id is the primary key and will be unique for every row, hence we can use student_id to
fetch any row of data from this table.
Even where student names are the same, if we know the student_id we can easily fetch the correct record.


Hence we can say a Primary Key for a table is the column or group of columns (composite key) which can
uniquely identify each record in the table.
I can ask for the branch name of the student with student_id 10, and I can get it. Similarly, if I ask for the name
of the student with student_id 10 or 11, I will get it. So all I need is student_id, and every other column depends
on it, or can be fetched using it.
This is Dependency, and we also call it Functional Dependency.
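
Whether a functional dependency holds in a set of rows can be tested directly: rows that agree on the left-hand columns must also agree on the right-hand columns. The sketch below is an illustration only; the function name holds and the sample student rows are hypothetical.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # The first value seen for a key is remembered; any later row with
        # the same key but a different value breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Invented sample data: two students share a name but differ in branch.
students = [
    {"student_id": 10, "name": "Akon", "branch": "CSE"},
    {"student_id": 11, "name": "Akon", "branch": "IT"},
    {"student_id": 12, "name": "Bkon", "branch": "CSE"},
]

# student_id determines every other column.
print(holds(students, ["student_id"], ["name", "branch"]))
# name does not determine branch, because the two Akon rows disagree.
print(holds(students, ["name"], ["branch"]))
```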
What is Partial Dependency?
Now that we know what dependency is, we are in a better state to understand what partial dependency is.
For a simple table like Student, a single column like student_id can uniquely identify all the records in a table.
But this is not true all the time. So now let's extend our example to see if more than 1 column together can act as
a primary key.
Let's create another table for Subject, which will have subject_id and subject_name fields and subject_id will be
the primary key.

Now we have a Student table with student information and another table Subject for storing subject information.
Let's create another table Score, to store the marks obtained by students in the respective subjects. We will also
be saving name of the teacher who teaches that subject along with marks.


In the Score table we are saving the student_id to know which student's marks these are, and the subject_id to
know which subject the marks are for.
Together, student_id + subject_id forms a Candidate Key for this table, which can be the Primary key.
Confused about how this combination can be a primary key?
See, if I ask you to get me the marks of the student with student_id 10, can you get it from this table? No,
because you don't know for which subject. And if I give you subject_id, you would not know for which student.
Hence we need student_id + subject_id to uniquely identify any row.
But where is Partial Dependency?
Now if you look at the Score table, we have a column named teacher which is only dependent on the subject: for
Java it's Java Teacher, for C++ it's C++ Teacher, and so on.
As we just discussed, the primary key for this table is a composition of two columns, student_id and
subject_id, but the teacher's name depends only on the subject, hence on subject_id, and has nothing to do with
student_id.
This is Partial Dependency, where an attribute in a table depends on only a part of the primary key and not on
the whole key.
How to remove Partial Dependency?
There can be many different solutions for this, but our objective is to remove the teacher's name from the Score
table. The simplest solution is to remove the teacher column from the Score table and add it to the Subject table.
Hence, the Subject table will become:


And our Score table is now in the second normal form, with no partial dependency.

Recap
For a table to be in the Second Normal form, it should be in the First Normal form and it should not have Partial
Dependency.
Partial Dependency exists, when for a composite primary key, any attribute in the table depends only on a part
of the primary key and not on the complete primary key.
To remove Partial dependency, we can divide the table, remove the attribute which is causing partial
dependency, and move it to some other table where it fits in well.
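
The decomposition described in this recap can be sketched with Python's sqlite3 module. The marks column, the numeric types, and the sample rows are assumptions for illustration; teacher is stored in Subject because it depends on subject_id only, and a join brings it back whenever it is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Subject (
        subject_id   INTEGER PRIMARY KEY,
        subject_name TEXT NOT NULL,
        teacher      TEXT NOT NULL      -- depends on subject_id only
    );
    CREATE TABLE Score (
        student_id INTEGER NOT NULL,
        subject_id INTEGER NOT NULL REFERENCES Subject(subject_id),
        marks      INTEGER NOT NULL,    -- depends on the whole key
        PRIMARY KEY (student_id, subject_id)
    );
    INSERT INTO Subject VALUES (1, 'Java', 'Java Teacher'), (2, 'C++', 'C++ Teacher');
    INSERT INTO Score VALUES (10, 1, 70), (10, 2, 75), (11, 1, 80);
""")

# The teacher name is stored once per subject and recovered with a join.
rows = conn.execute("""
    SELECT sc.student_id, su.subject_name, sc.marks, su.teacher
    FROM Score sc JOIN Subject su ON su.subject_id = sc.subject_id
    ORDER BY sc.student_id, sc.subject_id
""").fetchall()
for row in rows:
    print(row)
```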

Third Normal Form (3NF)


Third Normal Form is an upgrade to Second Normal Form. When a table is in the Second Normal Form and has
no transitive dependency, then it is in the Third Normal Form.
So let's use the same example, where we have 3 tables, Student, Subject and Score.


In the Score table, we need to store some more information, which is the exam name and total marks, so let's
add 2 more columns to the Score table.

Requirements for Third Normal Form


For a table to be in the third normal form,
1. It should be in the Second Normal form.
2. And it should not have Transitive Dependency.

What is Transitive Dependency?


With exam_name and total_marks added to our Score table, it saves more data now. The primary key for our Score
table is a composite key, which means it's made up of two attributes or columns → student_id + subject_id.
Our new column exam_name depends on both student and subject. For example, a mechanical engineering student
will have a Workshop exam but a computer science student won't. And for some subjects you have Practical exams
and for some you don't. So we can say that exam_name is dependent on both student_id and subject_id.
And what about our second new column total_marks? Does it depend on our Score table's primary key?
Well, the column total_marks depends on exam_name, as the total score changes with the exam type. For example,
practicals carry fewer marks while theory exams carry more marks.
But exam_name is just another column in the Score table. It is not a primary key or even a part of the primary
key, and total_marks depends on it.
This is Transitive Dependency: a non-prime attribute depends on other non-prime attributes rather than
depending upon the prime attributes or primary key.
How to remove Transitive Dependency?
Again the solution is very simple. Take out the columns exam_name and total_marks from Score table and put
them in an Exam table and use the exam_id wherever required.
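
A possible shape for the resulting tables is sketched below with Python's sqlite3 module. Only exam_id, exam_name and total_marks come from the text; the marks column, the types, and the sample values are assumptions. Because total_marks is stored once per exam, changing it is a single-row update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Exam (
        exam_id     INTEGER PRIMARY KEY,
        exam_name   TEXT NOT NULL,
        total_marks INTEGER NOT NULL    -- depends on the exam, not on the score row
    );
    CREATE TABLE Score (
        student_id INTEGER NOT NULL,
        subject_id INTEGER NOT NULL,
        marks      INTEGER NOT NULL,
        exam_id    INTEGER NOT NULL REFERENCES Exam(exam_id),
        PRIMARY KEY (student_id, subject_id)
    );
    INSERT INTO Exam VALUES (1, 'Theory', 100), (2, 'Practical', 50);
    INSERT INTO Score VALUES (10, 1, 70, 1), (10, 2, 40, 2), (11, 1, 80, 1);
""")

# Changing the total for practicals touches one row in Exam, not every score.
conn.execute("UPDATE Exam SET total_marks = 60 WHERE exam_name = 'Practical'")

totals = conn.execute("""
    SELECT sc.student_id, sc.subject_id, e.exam_name, e.total_marks
    FROM Score sc JOIN Exam e ON e.exam_id = sc.exam_id
    ORDER BY sc.student_id, sc.subject_id
""").fetchall()
for row in totals:
    print(row)
```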

Advantage of removing Transitive Dependency


The advantages of removing transitive dependency are:
• The amount of data duplication is reduced.
• Data integrity is achieved.

Boyce-Codd Normal Form (BCNF)


Boyce-Codd Normal Form or BCNF is a stricter version of the Third Normal Form, and is also known as 3.5 Normal
Form.
Rules for BCNF
For a table to satisfy the Boyce-Codd Normal Form, it should satisfy the following two conditions:
1. It should be in the Third Normal Form.
2. And, for every non-trivial dependency A → B, A should be a super key.
The second point sounds a bit tricky, right? In simple words, a table that is already in the Third Normal Form can
break this rule only when B is a prime attribute and A is not a super key, for example when a non-prime
attribute A determines a prime attribute B.
Example
Below we have a college enrolment table with columns student_id, subject and professor.

As you can see, we have also added some sample data to the table.
In the table above:
• One student can enroll for multiple subjects. For example, the student with student_id 101 has opted for the
subjects Java and C++.
• For each subject, a professor is assigned to the student.
• There can be multiple professors teaching one subject, as we have for Java.
What do you think should be the Primary Key?
Well, in the table above student_id, subject together form the primary key, because using student_id and subject,
we can find all the columns of the table.

One more important point to note here is, one professor teaches only one subject, but one subject may have two
different professors.
Hence, there is a dependency between subject and professor here, where subject depends on the professor name.
This table satisfies the 1st Normal form because all the values are atomic, column names are unique and all the
values stored in a particular column are of same domain.
This table also satisfies the 2nd Normal Form as there is no Partial Dependency.
And, there is no Transitive Dependency, hence the table also satisfies the 3rd Normal Form.
But this table is not in Boyce-Codd Normal Form.
Why is this table not in BCNF?
In the table above, student_id and subject together form the primary key, which means the subject column is a
prime attribute.
But there is one more dependency, professor → subject.
And while subject is a prime attribute, professor is a non-prime attribute and is not a super key, so this
dependency is not allowed by BCNF.
How to satisfy BCNF?
To make this relation (table) satisfy BCNF, we will decompose it into two tables, a student table and a
professor table.
Below we have the structure for both tables.

And now, these relations satisfy Boyce-Codd Normal Form.
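The reasoning above can be checked with a short Python sketch. The helper fd_holds and the sample rows are hypothetical, chosen only to match the description (student 101 takes Java and C++, and Java has more than one professor). The column layout of the two decomposed tables is also an assumption.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if each combination of lhs values maps to one rhs value."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        val = tuple(row[c] for c in rhs)
        # setdefault stores the first value seen for this key.
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical enrolment data consistent with the text.
enrolment = [
    {"student_id": 101, "subject": "Java", "professor": "P.Java"},
    {"student_id": 101, "subject": "C++", "professor": "P.Cpp"},
    {"student_id": 102, "subject": "Java", "professor": "P.Java2"},
    {"student_id": 103, "subject": "C#", "professor": "P.Chash"},
    {"student_id": 104, "subject": "Java", "professor": "P.Java"},
]

# professor -> subject holds, but professor does not determine the whole
# row, so professor is not a super key: this is the BCNF violation.
print(fd_holds(enrolment, ["professor"], ["subject"]))
print(fd_holds(enrolment, ["professor"], ["student_id"]))

# Assumed decomposition: (student_id, professor) and (professor, subject).
student_table = sorted({(r["student_id"], r["professor"]) for r in enrolment})
professor_table = sorted({(r["professor"], r["subject"]) for r in enrolment})

# In professor_table every professor appears once, so professor is its key.
professors = [p for p, _ in professor_table]
print(len(professors) == len(set(professors)))
```

In the decomposed professor table the determinant of professor → subject is the key of that table, which is what BCNF asks for.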


A more Generic Explanation
In the picture below, we have tried to explain BCNF in terms of relations.

SQL JOIN
A JOIN clause is used to combine rows from two or more tables, based on a related column between them.
Let's look at a selection from the "Orders" table:

Then, look at a selection from the "Customers" table:

Notice that the "CustomerID" column in the "Orders" table refers to the "CustomerID" in the "Customers" table.
The relationship between the two tables above is the "CustomerID" column.

Then, we can create the following SQL statement (that contains an INNER JOIN), that selects records that have
matching values in both tables:
Example

SELECT Orders.OrderID, Customers.CustomerName, Orders.OrderDate
FROM Orders
INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID;

and it will produce something like this:

Different Types of SQL JOINs


Here are the different types of JOINs in SQL:
• (INNER) JOIN: Returns records that have matching values in both tables.
• LEFT (OUTER) JOIN: Returns all records from the left table, and the matched records from the right table.
• RIGHT (OUTER) JOIN: Returns all records from the right table, and the matched records from the left table.
• FULL (OUTER) JOIN: Returns all records from both tables, combining them where there is a match and filling the
missing side with NULL where there is not.

SQL INNER JOIN Keyword


The INNER JOIN keyword selects records that have matching values in both tables.
INNER JOIN Syntax

SELECT column_name(s)
FROM table1
INNER JOIN table2
ON table1.column_name = table2.column_name;

Demo Database
We will use the well-known Northwind sample database.
Below is a selection from the "Orders" table:

And a selection from the "Customers" table:

SQL INNER JOIN Example


The following SQL statement selects all orders with customer information:
Example

SELECT Orders.OrderID, Customers.CustomerName
FROM Orders
INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID;

OrderID    CustomerName
10308      Ana Trujillo Emparedados y helados
10309      Hungry Owl All-Night Grocers
10310      The Big Cheese

Note: The INNER JOIN keyword selects all rows from both tables as long as there is a match between the
columns. If there are records in the "Orders" table that do not have matches in "Customers", these orders
will not be shown!
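The note above can be tried out with Python's built-in sqlite3 module. This is a small sketch with made-up rows rather than the Northwind data: order 102 deliberately points at a CustomerID that does not exist in Customers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
-- Assumed sample rows; order 102 has no matching customer.
INSERT INTO Customers VALUES (1, 'Alpha Traders'), (2, 'Beta Foods');
INSERT INTO Orders VALUES (100, 1), (101, 2), (102, 99);
""")

rows = conn.execute("""
    SELECT Orders.OrderID, Customers.CustomerName
    FROM Orders
    INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID
    ORDER BY Orders.OrderID
""").fetchall()
for row in rows:
    print(row)
```

Only orders 100 and 101 come back; order 102 is dropped because the INNER JOIN finds no customer for it.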
JOIN Three Tables
The following SQL statement selects all orders with customer and shipper information:
Example

SELECT Orders.OrderID, Customers.CustomerName, Shippers.ShipperName
FROM ((Orders
INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID)
INNER JOIN Shippers ON Orders.ShipperID = Shippers.ShipperID);

Note: To run this statement, first create a Shippers table that has a ShipperID column.
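A runnable sketch of the three-table join, using Python's sqlite3 module. The table contents are assumptions made for the example, and the Orders table here is given a ShipperID column so that it can be linked to Shippers. The joins are written without the optional parentheses, which reads the same way for inner joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
CREATE TABLE Shippers (ShipperID INTEGER PRIMARY KEY, ShipperName TEXT);
CREATE TABLE Orders (
    OrderID    INTEGER PRIMARY KEY,
    CustomerID INTEGER REFERENCES Customers(CustomerID),
    ShipperID  INTEGER REFERENCES Shippers(ShipperID)
);
-- Assumed sample rows.
INSERT INTO Customers VALUES (1, 'Alpha Traders'), (2, 'Beta Foods');
INSERT INTO Shippers VALUES (1, 'Speedy Express'), (2, 'United Package');
INSERT INTO Orders VALUES (100, 1, 2), (101, 2, 1);
""")

# Each INNER JOIN adds one more table to the rows built so far.
rows = conn.execute("""
    SELECT Orders.OrderID, Customers.CustomerName, Shippers.ShipperName
    FROM Orders
    INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID
    INNER JOIN Shippers ON Orders.ShipperID = Shippers.ShipperID
    ORDER BY Orders.OrderID
""").fetchall()
for row in rows:
    print(row)
```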

SQL LEFT JOIN Keyword


The LEFT JOIN keyword returns all records from the left table (table1), and the matched records from the right
table (table2). The result is NULL from the right side, if there is no match.
LEFT JOIN Syntax

SELECT column_name(s)
FROM table1
LEFT JOIN table2
ON table1.column_name = table2.column_name;

Note: In some databases LEFT JOIN is called LEFT OUTER JOIN.

Demo Database
We will use the well-known Northwind sample database.
Below is a selection from the "Customers" table:

And a selection from the "Orders" table:

SQL LEFT JOIN Example


The following SQL statement will select all customers, and any orders they might have:
Example

SELECT Customers.CustomerName, Orders.OrderID
FROM Customers
LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID
ORDER BY Customers.CustomerName;

Note: The LEFT JOIN keyword returns all records from the left table (Customers), even if there are no matches
in the right table (Orders). Run the SQL statement to see the result; create the tables first before running the join
statement.
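One way to follow the note is to create small tables and run the join with Python's sqlite3 module. The rows below are assumed sample data: two of the three customers have no orders, so their OrderID comes back as NULL (shown as None in Python).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
-- Assumed sample rows; only customer 1 has placed orders.
INSERT INTO Customers VALUES
    (1, 'Alpha Traders'), (2, 'Beta Foods'), (3, 'Gamma Goods');
INSERT INTO Orders VALUES (100, 1), (101, 1);
""")

rows = conn.execute("""
    SELECT Customers.CustomerName, Orders.OrderID
    FROM Customers
    LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID
    ORDER BY Customers.CustomerName, Orders.OrderID
""").fetchall()
for row in rows:
    print(row)
```

Every customer appears at least once in the result, which is the difference from the INNER JOIN.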

SQL RIGHT JOIN Keyword


The RIGHT JOIN keyword returns all records from the right table (table2), and the matched records from the
left table (table1). The result is NULL from the left side, when there is no match.
RIGHT JOIN Syntax

SELECT column_name(s)
FROM table1
RIGHT JOIN table2
ON table1.column_name = table2.column_name;

Note: In some databases RIGHT JOIN is called RIGHT OUTER JOIN.

Demo Database
We will use the well-known Northwind sample database.

SQL RIGHT JOIN Example


The following SQL statement will return all employees, and any orders they might have placed:
Example

SELECT Orders.OrderID, Employees.LastName, Employees.FirstName
FROM Orders
RIGHT JOIN Employees ON Orders.EmployeeID = Employees.EmployeeID
ORDER BY Orders.OrderID;

Note: The RIGHT JOIN keyword returns all records from the right table (Employees), even if there are no
matches in the left table (Orders). Run the SQL statement to see the result; create the tables first before running
the join statement.
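Not every database engine accepts RIGHT JOIN (older SQLite releases, for instance, do not). A RIGHT JOIN can always be rewritten as a LEFT JOIN with the two tables swapped, which is what the sketch below does using Python's sqlite3 module. The sample rows are assumptions, and the result is ordered by last name here to keep the output order easy to predict.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (
    EmployeeID INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT
);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, EmployeeID INTEGER);
-- Assumed sample rows; employee 2 has no orders.
INSERT INTO Employees VALUES (1, 'Davolio', 'Nancy'), (2, 'Fuller', 'Andrew');
INSERT INTO Orders VALUES (100, 1);
""")

# Same result as "Orders RIGHT JOIN Employees": Employees is the table
# whose rows are all kept, so it goes on the left of a LEFT JOIN.
rows = conn.execute("""
    SELECT Orders.OrderID, Employees.LastName, Employees.FirstName
    FROM Employees
    LEFT JOIN Orders ON Orders.EmployeeID = Employees.EmployeeID
    ORDER BY Employees.LastName
""").fetchall()
for row in rows:
    print(row)
```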

SQL FULL OUTER JOIN Keyword

The FULL OUTER JOIN keyword returns all records from both the left (table1) and the right (table2) table. Where a
record has no match in the other table, the columns from that table are NULL.
Note: FULL OUTER JOIN can potentially return very large result-sets!
Tip: FULL OUTER JOIN and FULL JOIN are the same.
FULL OUTER JOIN Syntax

SELECT column_name(s)
FROM table1
FULL OUTER JOIN table2
ON table1.column_name = table2.column_name
WHERE condition;

Demo Database
We will use the well-known Northwind sample database.

SQL FULL OUTER JOIN Example


The following SQL statement selects all customers, and all orders:

SELECT Customers.CustomerName, Orders.OrderID
FROM Customers
FULL OUTER JOIN Orders ON Customers.CustomerID = Orders.CustomerID
ORDER BY Customers.CustomerName;

A selection from the result set may look like this:

Note: The FULL OUTER JOIN keyword returns all records from both tables, whether or not the other table has a
match. So, if there are rows in "Customers" that do not have matches in "Orders", or if there are rows in
"Orders" that do not have matches in "Customers", those rows will be listed as well.

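Some engines do not provide FULL OUTER JOIN (MySQL is one example). A common workaround, sketched below with Python's sqlite3 module, is to take a LEFT JOIN and add the rows of the right table that found no match. The sample rows are assumptions: customer 2 has no orders, and order 101 has no matching customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
-- Assumed sample rows.
INSERT INTO Customers VALUES (1, 'Alpha Traders'), (2, 'Beta Foods');
INSERT INTO Orders VALUES (100, 1), (101, 99);
""")

rows = conn.execute("""
    -- Part 1: every customer, with orders where they exist.
    SELECT Customers.CustomerName, Orders.OrderID
    FROM Customers
    LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID
    UNION ALL
    -- Part 2: orders that matched no customer at all.
    SELECT NULL, Orders.OrderID
    FROM Orders
    LEFT JOIN Customers ON Customers.CustomerID = Orders.CustomerID
    WHERE Customers.CustomerID IS NULL
""").fetchall()
for row in rows:
    print(row)
```

The combined result contains the matched pair, the customer without orders, and the order without a customer, which is what a FULL OUTER JOIN on the same tables would list.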
M3:L1 EXERCISE.
a. If a table includes a ZIP code with every address, what 3NF rule does the table break? Why?
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
______________________________
b. What data anomalies can result from including postal codes in address data? How bad are they? How can you
mitigate the problems?
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________

c. Suppose you’re writing an application to record times for dragon boat races and consider the table below.
Assume the table’s key is Heat. What 1NF, 2NF, and 3NF rules does this design violate?

__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________

__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
____________________________________
M3:L1 Application

Create the 1st, 2nd, and 3rd normal forms of the table given below. Write your answer in the space provided.

a. 1NF

b. 2NF

c. 3NF

Summary of the Lesson:


1. Normalization protects a database from data anomalies.
2. 1NF rules:
• Each column must have a unique name.
• The order of the rows and columns doesn’t matter.
• Each column must have a single data type.
• No two rows can contain identical values.
• Each column must contain a single value.
• Columns cannot contain repeating groups.
3. 2NF rules:
• It is in 1NF.
• All non-key fields depend on all key fields.
4. 3NF rules:
• It is in 2NF.
• It contains no transitive dependencies. (No non-key fields depend on other non-key fields.)
5. The SQL JOIN types (INNER, LEFT, RIGHT, and FULL OUTER) are used to combine rows from two or more tables.

M3:L1 Enrichment Activity


Write the SQL command for each problem below, using the tables below as the database instance.

a. SQL command to inner join table DRUG and table PRESCRIPTION.


__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________
b. SQL command to display columns “drug_name” and “pres_dosage”.
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________
c. SQL command to display all columns of tables DOCTOR, PATIENT and PRESCRIPTION.
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________
d. SQL command to display all patient names and their assigned doctors.

__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________
e. SQL command to join tables DRUG, PATIENT and PRESCRIPTION.
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________

Lesson 2. Entity-Relationship Model

We present the modeling concepts of the entity–relationship (ER) model, which is a popular high-level
conceptual data model. This model and its variations are frequently used for the conceptual design of database
applications, and many database design tools employ its concepts. We describe the basic data-structuring
concepts and constraints of the ER model and discuss their use in the design of conceptual schemas for database
applications. We also present the diagrammatic notation associated with the ER model, known as ER diagrams.
Lesson Objectives:
In this lesson, you will learn:
1. The main characteristics of entity relationship components.
2. How relationships between entities are defined and refined.
3. How ERD components affect database design and implementation.
Discussion:
What is the ER Model?
The ENTITY-RELATIONSHIP (ER) MODEL is a high-level conceptual data model. It represents real-world entities
and the relationships between them. ER modeling helps you to analyze data requirements systematically to
produce a well-designed database, so it is considered a best practice to complete ER modeling before
implementing your database.
What is an ER Diagram?
An ENTITY-RELATIONSHIP DIAGRAM (ERD) displays the relationships between the entity sets stored in a database.
In other words, ER diagrams help you to explain the logical structure of databases. At first look, an ER diagram
looks very similar to a flowchart. However, an ER diagram includes many specialized symbols, and their meanings
make this model unique. The purpose of an ER diagram is to represent the entity framework infrastructure.
Facts about the ER Diagram Model:
• The ER model allows you to draw a database design
• It is an easy-to-use graphical tool for modeling data
• It is widely used in database design
• It is a graphical representation of the logical structure of a database
• It helps you to identify the entities which exist in a system and the relationships between those entities
Why use ER Diagrams?
Here are the prime reasons for using ER diagrams:

• Helps you to define terms related to entity-relationship modeling
• Provides a preview of how all your tables should connect and what fields are going to be on each table
• Helps to describe entities, attributes, and relationships
• ER diagrams are translatable into relational tables, which allows you to build databases quickly
• ER diagrams can be used by database designers as a blueprint for implementing data in specific software
applications
• The database designer gains a better understanding of the information to be contained in the database with
the help of the ER diagram
• An ERD allows you to communicate the logical structure of the database to users
Components of the ER Diagram
This model is based on three basic concepts:
1. Entities
2. Attributes
3. Relationships

Example
For example, in a University database, we might have entities for Students, Courses, and Lecturers. The Student
entity can have attributes like Rollno, Name, and DeptID. Students might have relationships with Courses and
Lecturers.

WHAT IS ENTITY?
An entity is a real-world thing, living or non-living, whether or not it is easily recognizable. It is anything in
the enterprise that is to be represented in our database. It may be a physical thing or simply a fact about the
enterprise or an event that happens in the real world.
An entity can be a place, person, object, event, or concept about which data is stored in the database. Every
entity must have attributes and a unique key. The attributes are what describe (represent) that entity.
Examples of entities:
Person: Employee, Student, Patient
Place: Store, Building
Object: Machine, Product, Car
Event: Sale, Registration, Renewal
Concept: Account, Course

Relationship
A relationship is an association among two or more entities. E.g., Tom works in the Chemistry
department.

Relationships are associations between entities (sometimes they are referred to as data associations). The
figure above is an entity-relationship (E-R) diagram that shows various types of relationships.
The first type of relationship is a one-to-one relationship (designated as 1:1). The diagram shows that there is
only one PRODUCT PACKAGE for each PRODUCT. The second one-to-one relationship shows that each
EMPLOYEE has a unique OFFICE. Notice that all these entities can be described further (a product price
would not be an entity, nor would a phone extension).
Another type of relationship is a one-to-many (1:M) or a many-to-one association. As shown in the figure, a
PHYSICIAN in a health maintenance organization is assigned many PATIENTS, but a PATIENT is assigned
only one PHYSICIAN. Another example shows that an EMPLOYEE is a member of only one DEPARTMENT,
but each DEPARTMENT has many EMPLOYEES.
Finally, a many-to-many relationship (designated as M:N) describes the possibility that entities may have
many associations in either direction. For example, a STUDENT can have many COURSE(s), and at the same
time a COURSE may have many STUDENT(s) enrolled in it. The second example shows that a
SALESPERSON can call on many CITY(s) and a CITY can be a sales area for many SALESPERSON(s).
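When such relationships are later turned into tables, a 1:M relationship is usually carried by a foreign key on the "many" side, and an M:N relationship by a third table that pairs the two keys. The sketch below illustrates this with Python's sqlite3 module. All table names, column names, and rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- 1:M  each PATIENT row points to exactly one PHYSICIAN.
CREATE TABLE Physician (physician_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Patient (
    patient_id   INTEGER PRIMARY KEY,
    name         TEXT,
    physician_id INTEGER NOT NULL REFERENCES Physician(physician_id)
);
-- M:N  STUDENT and COURSE are linked through a third table.
CREATE TABLE Student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Course (course_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE Enrollment (
    student_id INTEGER REFERENCES Student(student_id),
    course_id  INTEGER REFERENCES Course(course_id),
    PRIMARY KEY (student_id, course_id)
);
-- Assumed sample rows.
INSERT INTO Physician VALUES (1, 'Dr. Cruz');
INSERT INTO Patient VALUES (1, 'Ana', 1), (2, 'Ben', 1);
INSERT INTO Student VALUES (1, 'Tom'), (2, 'Liza');
INSERT INTO Course VALUES (10, 'Databases'), (20, 'Networks');
INSERT INTO Enrollment VALUES (1, 10), (1, 20), (2, 10);
""")

# One physician, many patients.
patient_count = conn.execute(
    "SELECT COUNT(*) FROM Patient WHERE physician_id = 1"
).fetchone()[0]
print(patient_count)

# Many students, many courses, resolved through Enrollment.
enrolled = conn.execute("""
    SELECT Student.name, Course.title
    FROM Enrollment
    INNER JOIN Student ON Enrollment.student_id = Student.student_id
    INNER JOIN Course ON Enrollment.course_id = Course.course_id
    ORDER BY Student.name, Course.title
""").fetchall()
for row in enrolled:
    print(row)
```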
The standard symbols for crow’s foot notation, the official explanation of the symbols, and what they actually
mean, are all given in Figure 13.3. Notice that the symbol for an entity is a rectangle.

An entity is defined as a class of a person, place, or thing. A rectangle with a diamond inside stands for an
associative entity, which is used to join two entities. A rectangle with an oval in it stands for an attributive entity,
which is used for repeating groups.
When a straight line connects two plain entities and the ends of the line are both marked with two short marks
(||), a one-to-one relationship exists. Following that you will notice a crow’s foot with a short mark (|); when
this notation links entities, it indicates a relationship of one-to-one or one-to-many (to one or more).
Entities linked with a straight line plus a short mark (|) and a zero (which looks more like a circle, O) are
depicting a relationship of one-to-zero or one-to-one (only zero or one). A fourth type of link for relating
entities is drawn with a straight line marked on the end with a zero (O) followed by a crow’s foot. This type
shows a zero-to-zero, zero-to-one, or zero-to-many relationship.
Finally, a straight line linking entities with a crow’s foot at the end depicts a relationship to more than one.
Entities take part in relationships. We can often identify relationships with verbs or verb phrases.
For example:
• You are attending this lecture.
• I am giving the lecture.
Just like entities, we can classify relationships according to relationship-types:
• A student attends a lecture.
• A lecturer gives a lecture.

Attributes
An attribute is a single-valued property of either an entity-type or a relationship-type. For example, a lecture
might have the attributes time, date, duration, place, etc.
Steps to Create an ERD
Following are the steps to create an ERD.
Let's study them with an example:
In a university, a Student enrolls in Courses. A student must be assigned to at least one Course. Each
course is taught by a single Professor. To maintain instruction quality, a Professor can deliver only one course.
Step 1) Entity Identification
We have three entities:
• Student
• Course
• Professor

Step 2) Relationship Identification


We have the following two relationships:
• A Student is assigned a Course
• A Professor delivers a Course

Step 3) Cardinality Identification


From the problem statement we know that:
• A Student can be assigned multiple Courses
• A Professor can deliver only one Course

Step 4) Identify Attributes


You need to study the files, forms, reports, data currently maintained by the organization to identify attributes.
You can also conduct interviews with various stakeholders to identify entities. Initially, it's important to identify
the attributes without mapping them to a particular entity.
Once you have a list of attributes, you need to map them to the identified entities. Ensure that each attribute is
paired with exactly one entity. If you think an attribute should belong to more than one entity, use a modifier to
make it unique.
Once the mapping is done, identify the primary keys. If a unique key is not readily available, create one.

Step 5) Create the ERD


A more modern representation of the ER diagram

NOTE: The ER diagrams in the exercise, application, and activity below do not need attributes.
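To see how the university ERD from the steps above could carry over into tables, here is a minimal sketch using Python's sqlite3 module. The table and column names, the StudentCourse linking table, and the sample rows are all assumptions. The rule "a Professor can deliver only one course" is expressed here with a UNIQUE constraint on Course.professor_id, which is one possible design choice among several.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Professor (professor_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Each course has exactly one professor (NOT NULL), and a professor
-- can appear on at most one course (UNIQUE).
CREATE TABLE Course (
    course_id    INTEGER PRIMARY KEY,
    title        TEXT NOT NULL,
    professor_id INTEGER NOT NULL UNIQUE REFERENCES Professor(professor_id)
);
CREATE TABLE Student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- A student can be assigned multiple courses, so a linking table is used.
CREATE TABLE StudentCourse (
    student_id INTEGER NOT NULL REFERENCES Student(student_id),
    course_id  INTEGER NOT NULL REFERENCES Course(course_id),
    PRIMARY KEY (student_id, course_id)
);
-- Assumed sample rows.
INSERT INTO Professor VALUES (1, 'Prof. Reyes');
INSERT INTO Course VALUES (10, 'Databases', 1);
""")

# Giving the same professor a second course violates the UNIQUE constraint.
try:
    conn.execute("INSERT INTO Course VALUES (20, 'Networks', 1)")
    second_course_allowed = True
except sqlite3.IntegrityError:
    second_course_allowed = False
print(second_course_allowed)
```

The rule that every student must have at least one course cannot be expressed by a simple column constraint in this layout, so it would have to be checked by the application.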

M3:L2 EXERCISE.
Create the ER Diagram for each problem; use the space provided to draw your answer.
a. One EMPLOYEE is assigned to one PHONE EXTENSION.

b. Many EMPLOYEES are members of a DEPARTMENT.

c. An EMPLOYEE is assigned to an OFFICE.

d. One CARGO AIRCRAFT will serve one or more DISTRIBUTION CENTERs.

e. SYSTEMS ANALYST may be assigned to MANY PROJECTS.

M3:L2 Application
a. The physician treats the illness a PATIENT has. Many PATIENT(s) experience many TREATMENT(s).
TREATMENT(s) can include the taking of PRESCRIPTION(s). Create the ERD of the above facts. Draw your
answer on the space provided.
Tip: You should be creating a single ERD with interconnected entities.

b. A PATRON makes one or more RESERVATION(s). RESERVATION(s) is/are for a
CONCERT/SHOW. Create the ERD of the above facts. Draw your answer in the space provided.

Summary of the Lesson:


1. In this chapter we presented the modeling concepts of a high-level conceptual data model, the entity–
relationship (ER) model.
2. Then we discussed the ER model concepts at the schema or “intension” level:
• Entity types and their corresponding entity sets
• Key attributes of entity types
• Value sets (domains) of attributes
3. We presented the cardinality ratios (1:1, 1:N, M:N) as a way of specifying the structural constraints on
relationship types.

M3:L2 Enrichment Activity


Create the ER Diagram for each problem; use the space provided to draw your answer.
a. A MACHINE may or may not be undergoing SCHEDULED MAINTENANCE.

b. One or many SALESPERSONs are assigned to one or more CUSTOMERs.

c. One or more EMPLOYEEs may or may not be assigned to the HOME OFFICE.

d. Many PASSENGERs are flying to many DESTINATIONs.

M3 Assessment:
I. Identify the connectivity of each pair of entities below as 1:1, 1:M, or M:N based on the given
scenario. Write your answer in the Connectivity column.
ABC College is divided into several schools. Each school is composed of several departments. Each department
may offer courses. Each department may have many professors assigned to it. Each professor may teach up to
four classes; each class is a section of a course. A student may enroll in several classes.
Each department has several students. Each student has only a single major and is associated with a single
department. Each student has an advisor in his or her department. Each advisor counsels several students. Each
class is taught in a room, and each room is in a building. A professor can teach in only one school.
Item No.   Entity        Connectivity   Entity
1.         School        ________       Department
2.         Department    ________       Student
3.         Department    ________       Professor
4.         Department    ________       Course
5.         Course        ________       Class
6.         Professor     ________       School
7.         Professor     ________       Department
8.         Professor     ________       Class
9.         Professor     ________       Student
10.        Student       ________       Class
11.        Building      ________       Room
12.        Room          ________       Class

II. Create the 1st, 2nd and 3rd normal forms of the table below. Use the space provided below for your
answer.

1. 1NF

2. 2NF

3. 3NF

III. Create the INNER, LEFT and RIGHT JOIN of the tables below.

Table name: Customerx

Table name: Ordery

1. INNER JOIN
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
______________________________________________________________________

2. LEFT JOIN
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
______________________________________________________________________
3. RIGHT JOIN
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
______________________________________________________________________

References/Attributions:

Kenneth E. Kendall & Julie E. Kendall, 2018, "Systems Analysis and Design, 10th Edition"

Rod Stephens, 2015, "Beginning Software Engineering"

Ramez Elmasri & Shamkant B. Navathe, 2015, "Fundamentals of Database Systems"

https://www.studytonight.com/dbms
https://www.guru99.com/
https://www.w3schools.com/sql
https://www.tutorialspoint.com/
