DBMS
1.1 Introduction
The word data refers to information or facts usually collected as the result of experience,
observation or experiment, or processes within a computer system, or premises. Data may consist
Data are often viewed as the lowest level of abstraction from which information and knowledge are
derived.
A database is an organized collection of data. The term originated within the computer industry,
but its meaning has been broadened by popular use, to the extent that the some database
sophisticated software package called a database management system. It has programs to set up
storage structures, load the data, and accept data requests from programs and users.
Databases play an important role in every area where computers are used, such as libraries,
education, medicine and science. Even when you buy goods from a market, you are relying on
the concept of a database. Databases have also played a crucial role in the growth of the computer industry.
Before we take up the concept of databases, let us first cover the basics of data and information.
1.2 Objective
This chapter discusses the basic concepts of a database management system and explains
data, information and knowledge, including the differences between
these three basic terms. It covers the file processing system and the database system
along with their advantages and disadvantages. It defines the basic DBMS terminology and
explains the database components along with the brief role of people who design and manage
1.3.1 Data
Data is a simple term to grasp. Data is defined as a collection of raw facts
which can be stored and processed by a human or a computer. In other words, data is the material
on which a computer program works. The word raw indicates that the facts have not yet been
processed to reveal their meaning. Data can be a number, a letter, a word, a special
symbol and so on. Data can exist in any form, usable or not. It does not have meaning by itself. In
Example 1: The sequence of digits 230504 is meaningless by itself, since it could refer
to the part number of an automobile, a date of birth, the number of rupees spent on a project, the population
of a town and so on. This sequence of digits would therefore be considered data.
Example 2: A set of words like "Aryan, mathematics, highest mark, annual examination" would
1.3.2 Information
When data is processed and converted into a meaningful and useful form, it is known as
information. Information is generated by arranging data into a suitable and meaningful
form. For a business to be successful, fast access to information is vital, as important decisions
are based on the information available at any point of time. Such information can then be used as
the foundation for decision making. Traditionally, the data was stored in voluminous repository
However, storing data and retrieving information from these repositories was a time-consuming
task. With the development of computers, the problem of information storage and retrieval was
resolved. Computers replaced tons of paper, file folders, and ledgers as the principal media for
Example 3: In Example 1, if we know what the sequence of digits refers to, then it becomes meaningful
and can be called information. When we write the sequence as 23-05-04, it may mean a date of birth.
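The step from data to information in Examples 1 and 3 can be sketched in a few lines of Python. This is only an illustration: the day-month-year reading of the digits and the variable names are assumptions, not something the digits themselves tell us.

```python
from datetime import datetime

raw = "230504"  # the digit sequence from Example 1; meaningless on its own

# Assumption: the digits encode a date of birth as day, month, two-digit year.
as_date = datetime.strptime(raw, "%d%m%y").date()

# Alternative assumption: the same digits are an amount spent, in rupees.
as_amount = int(raw)

print("Read as a date of birth:", as_date.isoformat())
print("Read as an amount in rupees:", as_amount)
```

The same raw value yields two different pieces of information; which one is correct depends entirely on the context supplied from outside the data.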
Example 4: If the data mentioned in Example 2 is processed as "Aryan secured highest marks in
1.3.3 Knowledge
Knowledge is the appropriate collection of information, such that its intent is to be useful.
Knowledge is a deterministic process. Knowledge is derived from information in the same way
information is derived from data. When someone "memorizes" information, then they have
amassed knowledge. This knowledge has useful meaning to them. It can be considered as the
integration of human perceptive processes that helps them to infer further knowledge.
Example 5: Elementary school children memorize, or amass knowledge of, the "number table".
They can tell you that "2 x 2 = 4" because they have amassed that knowledge (it being included
in their number table). So, knowledge is a continually gaining process from the information they
Knowledge adds understanding and retention to information. It is the next natural progression
after information. Therefore, to have appropriate knowledge, we must have the right
information, and to have the right information, we must have complete and correct data. Therefore,
while maintaining the data in the database, we must ensure that there should not be any missing
Data | Information | Knowledge
and unprocessed facts | and considered as an aggregation | information in the same way data.
business and of itself. | business and of itself | business for decision making process.
Data is a prerequisite to information | Processed form of data | Knowledge is usually based
human or machine, to derive | amongst people, about things, | of information results into
Data can be a number, letter, | Information provides answers | Knowledge answers "how"
Example: In healthcare industry data includes vital signs, weight and relevant | Example: The processed data leads to certain information which helps in providing the | Example: The trend of vital signs over time provides a pattern that may lead to
In early systems, data was handled manually by the different users. Human beings, as the
users, managed the whole database without the support of computers. This approach had many problems
1. Address Dictionary: In an address book, a number of pages are pre-allotted for writing
the addresses that start with a specific letter, say "A". If you keep writing
addresses for names beginning with "A", the pages allotted to the letter
"A" eventually run out, and this becomes a problem. One solution is to buy a new,
larger address book and transfer all the previous addresses into it.
This solution is tiring and time-consuming. The second solution is to use
some blank pages at the end of the same address book. This is again cumbersome,
because to find the address of a specific person you have to go through the
pages allotted to that letter and also search the last pages of the address book. So
the search has to be performed twice, at two different places in the same address book.
2. Repeated Transaction: Many transactions occur repeatedly on a day-to-day,
week-to-week or month-to-month basis. For example, to calculate salaries,
all payroll transactions are recorded manually in the ledger for one month, and the same
transactions are recorded again manually for the next month, and so on. It is just a
calculation task and does not require any logic or intelligence. Therefore it is not a wise
decision to waste human skill and intelligence on such repetitive calculations.
3. Searching Process: Searching for a single entry in a large number of manual records is
very difficult. For example, Mr. Arpit is a subscriber of a publishing company and
wants to renew his subscription. For this purpose he sends a cheque to the publishing
company. In this case the publisher has to search the whole long list of subscribers to find out
4. Updating the Manual Records: It is difficult to update the records of a manual
database. The first issue is identifying the appropriate record to be updated, and the
second issue is the problem of overwriting. For security reasons we generally avoid
overwriting records, because it may give a wrong impression to the reader of that
record.
Hence, when the database is large in size and difficult to manage then it is better to use
File processing systems were an early attempt to computerize the manual filing system that we are
all familiar with. A file system is a method for storing and organizing computer files and the data
they contain to make it easy to find and access them. The manual filing system works well when
the number of items to be stored is small. It even works quite adequately when there are large
numbers of items and we have only to store and retrieve them. However, the manual filing
system breaks down when we have to cross-reference or process the information in the files.
The following problems are associated with the file-based approach:
1. Duplication in data: In this system the data stored in the files are independent of each other.
Therefore, the same data may be stored in multiple files. This leads to
duplication of data.
Example 6: Student roll number may be stored twice in two different files.
2. Inconsistency of data: In a file processing system, the files being maintained are
independent of each other, which means that there is no relationship among them. Therefore,
if any data item is to be changed, then all files containing related data need to be updated.
The problem arises if some of the files are not updated, which causes inconsistency.
improvement in qualification; the data item may be updated only in one file and rest of
3. Lack of Data integrity: Data integrity is the problem of ensuring that the data in the database is
accurate. In any application there are certain data integrity rules, in the form of
conditions and constraints, that need to be maintained. In the file system it is not possible
to change the application program to apply such rules because these programs are
Example 8: The integrity constraint that the phone number of a student should be exactly 10
digits has to be implemented in every application program that uses the student file. For one
application, it is quite easy to incorporate this integrity rule, but for a number of
4. Lack of Security: In a file system security constraints are not easy to enforce because
data is stored in different independent files. Therefore unauthorized users can destroy the
Example 9: Any unauthorized user can access your files and can perform any fraudulent
5. Data dependence: Data is stored in the files and files are maintained to fulfill the
other. Therefore if any data item changes, then every
application program using that data must be updated. This is called data dependence.
Example 10: If an organization changes its employee IDs from 6 digits to 10 digits,
all the application programs that use this data item have to be modified.
6. Difficult to share data: The files that hold the data may be in different formats, so the
format of data stored in one file may differ from the format of data stored in another file. If
the data of two such files needs to be shared, the different data formats will
cause a problem. The solution to the problem is to develop an interface which further is a
Example 11: Gender of MBA students is stored as "1" and "0" (where "1" stands for
male and "0" stands for female), so the data type may be number, whereas gender of
MCA students is stored as "M" and "F" (where "M" stands for male and "F" stands for
female), so the data type may be character. If we want to calculate the total number of
male and female students in MBA and MCA, the difference in data types will create a problem.
7. Difficult to get quick response: Queries in the application program are written to meet
specific requirements. If any clause of the query changes, then it becomes difficult to
Example 12: Suppose there is a condition that only a student whose age is 35 to 40 years can
apply for a specific job. If this age criterion changes to 30 to 35 years, then the
respective changes have to be incorporated in all the related queries belonging to that
8. Concurrency problem: In a file system, when two or more users access the same data file
for read and write operations, a concurrency problem may arise which leaves the data
in an inconsistent state.
Example 13: Suppose a couple opens a joint bank account with a balance of Rs. 5000. After
some days the husband withdraws Rs. 500, leaving a balance of Rs. 4500. At the same time the
wife withdraws Rs. 700 under the impression that the balance is still Rs. 5000.
Since both transactions execute concurrently, the problem of concurrency
arises.
9. Inadequate to Represent Data Modeling of Real World: Data in the file system is
simply maintained to support only an application program. It does not show any
relationship among data in different files. Moreover, complex data cannot be defined in the
file system.
10. Difficulty in Data Representation from User's View: To create a useful application for
the user, it is necessary to combine the data of different files. But in a file system
independent and isolated data is recorded, and the relationship among the data is very hard to
determine. Therefore data in a file system does not meet the user's requirements.
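The concurrency problem of Example 13 can be illustrated with a small Python sketch. It is a simplified, sequential simulation and not real multi-user access: the first half mimics two programs that each read the balance before either writes it back, and the second half uses an in-memory SQLite database (chosen here only as a convenient stand-in for a DBMS) where each withdrawal is a single update on the stored value. The table and variable names are hypothetical.

```python
import sqlite3

# File-style handling: each user reads the balance, computes, then writes back.
balance_on_file = 5000
husband_read = balance_on_file          # both users read before either writes
wife_read = balance_on_file
balance_on_file = husband_read - 500    # husband writes 4500
balance_on_file = wife_read - 700       # wife overwrites it with 4300
lost_update_result = balance_on_file    # the withdrawal of Rs. 500 is lost

# DBMS-style handling: each withdrawal is one UPDATE applied to the stored value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 5000)")
for amount in (500, 700):
    with conn:  # commits on success, rolls back if an error occurs
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = 1", (amount,)
        )
dbms_result = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
conn.close()

print("File-style final balance:", lost_update_result)
print("DBMS-style final balance:", dbms_result)
```

In the first half the final balance reflects only one of the two withdrawals, which is the inconsistent state the example describes; in the second half both withdrawals are reflected.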
In order to remove all the above limitations of file based approach, a new approach was required
The database is a shared collection of logically related data, designed to meet the information
needs of an organization. A database is a computer-based record keeping system whose overall
purpose is to record and maintain information. The database is a single, large repository of data,
which can be used simultaneously by many departments and users. Instead of disconnected files
with redundant data, all data items are integrated with a minimum amount of duplication. The
database is no longer owned by one department but is a shared corporate resource. The database
holds not only the organization's operational data but also a description of this data. This
description of data is known as the Data Dictionary or Meta Data. It is the self describing nature of a
A database implies separation of physical storage from use of the data by an application program
to achieve program/ data independence. Using a database system, the user or programmer or
application specialist need not know the details of how the data are stored and such details are
transparent to the users. Changes can be made to data without affecting other components of the
system. These changes include change of data format or file structure or relocation from one
device to another.
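The idea that a database stores a description of its own data (the data dictionary or metadata) can be seen in a minimal sketch using Python's built-in `sqlite3` module. SQLite is used here only as an example system; the `student` table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL, phone TEXT)"
)

# sqlite_master is SQLite's catalog: it records the definition of every table.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [
    (row[1], row[2], bool(row[3]))
    for row in conn.execute("PRAGMA table_info(student)")
]
conn.close()

print("Tables described in the catalog:", tables)
for name, col_type, not_null in columns:
    print(f"column {name}: type={col_type}, NOT NULL={not_null}")
```

No application program had to be consulted to learn the structure of the table; the description was read from the database itself.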
1. Shared: data in a database are shared among different users and applications.
2. Persistence: Data in a database exist permanently in the sense the data can live beyond
3. Validity/ Integrity/ Correctness: Data should be correct with respect to the real world
5. Consistency: Whenever more than one data element in a database represents related real
world values, the values should be consistent with respect to the relationship.
6. Non-redundancy: No two data-items in a database should represent the same real world
entity.
7. Independence: Data at different level should be independent of each other so that the
To create, manage and manipulate data in databases, a management system known as database
of programs that enables users to define, create and maintain a database and provide controlled
access to the data. Defining a database involves specifying the data types, structures, and
constraints for the data to be stored in the database. Database may be defined as repository of
data for an organization such that it can be shared and integrated. Creating the database is the
process of storing the data itself on some storage medium that is controlled by the DBMS.
Manipulating a database includes such functions as querying the database to retrieve specific
data, updating the database to reflect changes in the real world, and generating reports from the
data. There are different types of DBMS ranging from small systems that run on personal
computers to huge systems that run on mainframes. The following are main examples of
database applications:
These systems allow users to create, update and extract information from their databases.
Compared to a manual filing system, the biggest advantages of a computerized database system
are speed, accuracy and accessibility. The other advantages of a DBMS are as follows.
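The three activities described above (defining, creating and manipulating a database) can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration under stated assumptions: SQLite stands in for a DBMS, and the `book` table, its columns and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Defining: specify the data types, structure and constraints.
conn.execute("""
    CREATE TABLE book (
        isbn  TEXT PRIMARY KEY,
        title TEXT NOT NULL,
        price REAL CHECK (price >= 0)
    )
""")

# Creating: store the data itself on a medium controlled by the DBMS.
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [("111", "Database Basics", 350.0), ("222", "File Systems", 200.0)],
)

# Manipulating: update to reflect a change, query specific data, build a report.
conn.execute("UPDATE book SET price = 300.0 WHERE isbn = '111'")
cheap = conn.execute(
    "SELECT title FROM book WHERE price <= 300 ORDER BY title"
).fetchall()
total = conn.execute("SELECT SUM(price) FROM book").fetchone()[0]
conn.close()

print("Books priced at 300 or less:", [title for (title,) in cheap])
print("Total price of all books:", total)
```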
A database management system has the following potential advantages, which are explained
below:
details of data representation and storage. The DBMS can provide an abstract view of the
2. Concurrent access and crash recovery: A DBMS schedules concurrent accesses to the
data in such a manner that users can think of the data as being accessed by only one user
3. Reduced application development time: Clearly, the DBMS supports many important
functions that are common to many applications accessing data stored in the DBMS.
This, in conjunction with the high-level interface to the data, facilitates quick
development of applications. Such applications are also likely to be more robust than
applications developed from scratch because many important tasks are handled by the
unnecessary duplication of data and effectively reduces the total amount of data storage
required. It also eliminates the extra processing necessary to trace the required data in a
redundancies that exist in the DBMS are controlled and the system ensures that these
6. Shared Data: A database allows the sharing of data under its control by any number of
application programs or users. For example, the applications for the public relations and
7. Integrity: Centralized control can also ensure that adequate checks are incorporated in
the DBMS to provide data integrity. Data integrity means that the data contained in the
database is both accurate and consistent. Therefore, data values being entered for the
storage could be checked to ensure that they fall within a specified range and are of the
correct format.
8. Security: Data is of vital importance to an organization and may be confidential. Such
confidential data must not be accessed by unauthorized persons. The DBA who has the
ultimate responsibility for the data in the DBMS can ensure that proper access procedures
are followed, including proper authentication schemes for access to the DBMS and
additional checks before permitting access to sensitive data. Different levels of security
9. Conflict Resolution: Since the database is under the control of the DBA, he/she should
resolve the conflicting requirements of various users and applications. In essence, the
DBA chooses the best file structure and access method to get optimal performance for the
10. Standards can be enforced: Since all access to the database must be through DBMS so
standards can be enforced. Standards may relate to naming of data, format of data,
structure of data etc. Standardizing stored data formats is usually desirable for the
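The integrity advantage listed above can be illustrated by revisiting the 10-digit phone number rule of Example 8. In the sketch below the rule is declared once, in the table definition, instead of in every application program. SQLite (via Python's `sqlite3`) is an assumed stand-in for a DBMS, and the table, column names and sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The integrity rule lives in the schema: exactly 10 characters, digits only.
conn.execute("""
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,
        phone   TEXT CHECK (length(phone) = 10 AND phone NOT GLOB '*[^0-9]*')
    )
""")

conn.execute("INSERT INTO student VALUES (1, '9876543210')")  # satisfies the rule

try:
    conn.execute("INSERT INTO student VALUES (2, '12345')")   # too short
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the DBMS refused the row, whichever program sent it

stored = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
conn.close()

print("Invalid phone number rejected:", rejected)
print("Rows stored:", stored)
```

Because the check is enforced centrally, a new application program that writes to the same table does not need its own copy of the rule.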
Although there are many advantages of DBMS, the DBMS may also have some minor
system is cost. In addition to the cost of purchasing or developing the software, the
hardware has to be upgraded to allow for the extensive programs and work spaces
required for their execution and storage. The processing overhead introduced by DBMS
to implement security, integrity, and sharing of the data causes a degradation of the
response and throughput times. An additional cost is that of migration from a
lack of duplication requires that the database be adequately backed up so that in the case
of failure the data can be recovered. Centralization also means that the data is accessible
from a single source. This increases the potential severity of security breaches and
disruption of the operation of the organization because of downtimes and failures. The
cooperating distributed databases resolves some of the problems resulting from failures
and downtimes.
3. Complexity of Backup and Recovery: Backup and recovery operations are fairly
database system, the data stored into file must be converted to database file. It is very
difficult and costly method to convert data files into database. For the purpose, we have
5. Cost of staff training: Most DBMSs are complex systems, so users need training
to use them. Training is required at all levels including programming
6. Database damage: In most of the organizations all data is integrated into a single
7. High cost of DBMS: Because a complete DBMS is very large and sophisticated piece of
provide better information, certain applications may be slower due to the integration of
data.
1.3.12 Difference between File System Approach and Database Management System
File System Approach | Database Management System
3. Large files are stored under this system. | 3. Fewer files are stored under this system.
4. Data is stored in the form of files. | 4. Data is stored in the form of tables.
6. The data may have redundancy under this system. | 6. Under this system there is reduced redundancy.
7. Data is isolated from other data. | 7. Data can be shared under this system.
8. There is no security and integrity of data. | 8. It maintains security and integrity of data.
9. Backup and recovery process is simple in | 9. Backup and recovery process is complex in
10. Its examples are C, COBOL. | 10. Its examples are Oracle, SQL.
1.3.13 DBMS Terminology
Database – A collection (or list) of information. A database is comprised of one or more lists
Tables – The view that displays the database as a combination of rows (records) and columns
(fields). The cells contain the bits and pieces of data for each record in each field. The first row
Field names – Identify the different categories in a database. The top row is reserved for field
names. Examples of field names are First name, last name, address, city, state, zip, phone
number.
Field – Categories in a database. Fields are displayed in columns. For Example, in a database,
the address field contains the address for each of the records. These are the bits and pieces of
data.
Records – Related information that is separated by columns or fields. A name and address are
considered one record in the database. A second name and address are a different record.
Cells - The intersection of columns and rows that contain the data for each record.
Data – All of the records of information in a database including the field names i.e. Data + Field
1. Data
2. Hardware
3. Software
4. Users
These components coordinate with each other to form an effective database system.
1. Data - It is a very important component of the database system. Most organizations
generate, store and process large amounts of data. The data acts as a bridge between the
machine parts, i.e. hardware and software, and the users who directly access it or
explained below:
a) User Data - It consists of a table(s) of data called Relation(s) where Column(s) are
called fields or attributes and rows are called Records for tables. A Relation must be
structured properly.
basically means "data about data". System Tables store the Metadata which includes.
- Number of fields and field Names
- Null Constraint
c) Application Metadata - It stores the structure and format of Queries, reports and
2. Hardware - The hardware consists of the secondary storage devices such as magnetic disks
(hard disk, zip disk, floppy disks), optical disks (CD-ROM), magnetic tapes etc. on
which data is stored together with the Input/ Output devices (mouse, keyboard,
printers), processors, main memory etc. which are used for storing and retrieving the
data in a fast and efficient manner. Since database can range from those of a single user
therefore proper care should be taken for choosing appropriate hardware devices for a
required database.
3. Software - The software part consists of the DBMS, which acts as a bridge between the user and
the database. In other words, it is the software that interacts with the users, application
programs, the database and the file system of a particular storage medium (hard disk,
magnetic tapes etc.) to insert, update, delete and retrieve data. To perform
operations such as insertion, deletion and update, we can use either query
languages like SQL, QUEL, Gupta SQL or application software such as Visual Basic,
Developer etc.
4. Users - Users are those persons who need the information from the database to carry out their
Executives etc. On the basis of the job and requirements made by them they are
provided access to the database totally or partially. The people who work with
Database users are those who interact with the database in order to query and update the
database, and generate reports. Database users are further classified into the following categories:
a) Naive users: The users who query and update the database by invoking some
already written application programs. For example, the owner of the bookstore enters the
details of various books in the database by invoking appropriate application program. The
b) Sophisticated users: The users, such as business analyst, scientist, etc., who are
familiar with the facilities provided by a DBMS interact with the system without writing
any application programs. Such users use database query language to retrieve information
c) Specialized users: The users who write specialized database programs, which are
different from traditional data processing applications, such as banking and payroll
management which use simple data types. Specialized users write applications such as
computer-aided design systems, knowledge-base and expert systems that store data
d) System analysts: The users who determine the requirements of the database users
(especially naive users) to create a solution for their business need, and focus on non-
technical and technical aspects. The non-technical aspects involve defining system
requirements, facilitating interaction between business users and technical staff, etc.
Technical aspects involve developing the specification for user interface (application
programs).
implement the specifications given by the system analysts, and develop application
programs. They can choose tools, such as rapid application development (RAD) to
develop the application program with minimal effort. The database application
programmer develops application program to facilitate easy data access for the database
users.
A Database Administrator (DBA) is a person who has central control over both data and application
programs. The responsibilities of a DBA vary depending upon the job description and corporate
and organization policies. Some of the responsibilities of a DBA are given here.
a) Schema definition and modification: The overall structure of the database is known as
database schema. It is the responsibility of the DBA to create the database schema by
executing a set of data definition statements in DDL. The DBA also carries out the
b) New software installation: It is the responsibility of the DBA to install new DBMS
software, application software, and other related software. After installation, the DBA
monitoring the security of the database system. It involves adding and removing users,
d) Data analysis: DBA is responsible for analyzing the data stored in the database, and
studying its performance and efficiency in order to effectively use indexes, parallel query
execution, etc.
e) Preliminary database design: The DBA works along with the development team during
the database design stage due to which many potential problems that can arise later (after
f) Physical organization modification: The DBA is responsible for carrying out the
g) Routine maintenance checks: The DBA is responsible for taking the database backup
periodically in order to recover from any hardware or software failure (if one occurs). Other
routine maintenance checks that are carried out by the DBA are checking data storage
and ensuring the availability of free disk space for normal operations, upgrading disk
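Two of the DBA responsibilities listed above, schema definition and modification through DDL statements and periodic backup, can be sketched as follows. This is an illustration under assumptions, not a prescribed procedure: SQLite (through Python's `sqlite3`) stands in for the DBMS, the `employee` table is hypothetical, and the backup target is a second in-memory database rather than a real backup device.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema definition: a data definition (DDL) statement creates the structure.
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO employee VALUES (1, 'Aryan')")
conn.commit()

# Schema modification: another DDL statement changes the structure later on.
conn.execute("ALTER TABLE employee ADD COLUMN department TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(employee)")]

# Routine maintenance: copy the whole database to a backup connection.
backup = sqlite3.connect(":memory:")
conn.backup(backup)
restored = backup.execute(
    "SELECT emp_id, name, department FROM employee"
).fetchall()

backup.close()
conn.close()

print("Columns after modification:", columns)
print("Rows found in the backup copy:", restored)
```

The existing row survives the schema change, with the new column left empty, and the backup copy carries both the modified structure and the data.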
The disk manager is part of the operating system of the host computer and all physical input and
output operations are performed by it. The disk manager transfers the block or page requested by
the file manager so that the latter need not be concerned with the physical characteristics of the
The data manager is the central software component of the DBMS. It is sometimes referred to as
the database control system. One of the functions of the data manager is to convert operations in
the user's queries coming directly via the query processor or indirectly via an application
program from the user's logical view to a physical file system. The data manager is responsible
for interfacing with the file system. In addition, the tasks of enforcing constraints to maintain the
consistency and integrity of the data, as well as its security, are also performed by the data
manager. It is also the responsibility of the Data Manager to provide the synchronization in the
simultaneous operations performed by concurrent users and to maintain the backup and recovery
operations.
Responsibility for the structure of the files and managing the file space rests with the file
manager. It is also responsible for locating the block containing the required record, requesting
block from the disk manager, and transmitting the required record to the data manager as shown.
The file manager can be implemented using an interface to the existing file subsystem provided
by the operating system of the host computer or it can include a file subsystem written especially
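The division of work between the data manager, file manager and disk manager described above can be sketched as three small Python classes. This is a toy model built on assumptions: the class and method names are hypothetical, blocks are plain Python lists, and the sample records are invented for the illustration.

```python
class DiskManager:
    """Performs the physical I/O: hands back a whole block by block number."""

    def __init__(self, blocks):
        self._blocks = blocks  # each block is a list of records

    def read_block(self, block_no):
        return self._blocks[block_no]


class FileManager:
    """Knows the file structure: locates the block that holds a record."""

    def __init__(self, disk, records_per_block):
        self._disk = disk
        self._records_per_block = records_per_block

    def read_record(self, record_no):
        block_no, offset = divmod(record_no, self._records_per_block)
        block = self._disk.read_block(block_no)  # request the block from the disk manager
        return block[offset]                     # pass only the required record upward


class DataManager:
    """Turns a request on the user's logical view into file operations."""

    def __init__(self, files):
        self._files = files

    def get_student_name(self, record_no):
        _roll_no, name = self._files.read_record(record_no)
        return name


# Hypothetical stored data: two blocks with two (roll_no, name) records each.
blocks = [[(1, "Aryan"), (2, "Arpit")], [(3, "Meera"), (4, "Ravi")]]
dm = DataManager(FileManager(DiskManager(blocks), records_per_block=2))
print(dm.get_student_name(2))
```

Each layer talks only to the layer directly below it, so the data manager never needs to know how blocks are laid out on the disk.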
1.4 Summary
Humans have been dealing with data and information for a long time; perhaps since the beginning of
civilization humans have been manipulating data. Since then, the give and take of information has been in practice,
To achieve the objective of this chapter, the origin of the database concept, as an extension of file systems, has
been discussed. Flaws of file systems, advantages of database systems and a comparative survey
components of database systems along with the role of different users are explained herewith.
1. Elmasri & Navathe: Fundamentals of Database systems, 3rd Edition, Addison Wesley,
New Delhi.
2. Korth & Silberschatz: Database System Concepts, 4th Edition, McGraw Hill
International Edition.
Delhi.
1. What is information? How does it differ from data? Define database management system.
4. Differentiate between
b) Data, Information and Knowledge
7. Discuss the role of data manager, file manager and disk manager in database management
system.
8. What are the important components of database management system? Explain database
users.
Chapter – 2: Database Systems Architecture, Functions &
Component Modules
Writer: Dr. Kanwal Garg
Vetter: Prof. Rajender Nath
Structure:
2.1 Introduction
2.2 Objective
2.3 Presentation of Content
2.3.1 Database Instances and Schemas
2.3.2 Three Level Architecture of DBMS
2.3.3 Mapping
2.3.4 Data Independence
2.3.5 Database Language and Interface
(i) DBMS Languages
(ii) DBMS Interface
2.3.6 DBMS Functions
2.3.7 Component Modules of DBMS
2.4 Summary
2.5 Suggested Reading/ Reference Material
2.6 Self Assessment Questions (SAQ)
2.1 Introduction
of interrelated data stored together with controlled redundancy to serve one or more applications
in an optimal fashion. The data are stored in such a fashion that they are independent of the
program to the people using the data. A common and controlled approach is used in adding data
The users of the database do not have to worry about the physical implementation and internal
working of the database. The database management system has different layers and the different
In this chapter, we present the three-tier architecture of a DBMS package, which evolved
from the traditional system where the whole database was tightly integrated. Mapping among
the different views/levels of the three-tier architecture is explained. Modification of data at the schema
level, which keeps data separate from all programs, is discussed. DBMS languages, interfaces,
functions and component modules are covered for a clear understanding of DBMS.
2.2 Objective
After going through this chapter, the student will have a clear understanding of how to separate the
user application from the physical database. The three-tier architecture of DBMS is presented to
provide a framework on which subsequent chapters can build. Such a framework is useful for
National Standard Institute/ Standards Planning and Requirement Committee) which explains
database as three views. The mapping defines the correspondence between the three view levels.
Modification of the schema in the database clarifies the points to be taken care of. To
communicate with the database, DBMS interfaces and languages are incorporated. A detailed
knowledge of DBMS functions and component modules is delivered in the last section of this
chapter.
2.3 Presentation of Content
2.3.1 Database Instances and Schemas
A database changes over time as information is inserted or deleted. The collection of
information stored in the database at a particular moment is called an instance of the database.
A schema diagram, as shown below, displays only the names of record types (entities) and the names of
data items (attributes) and does not show the relationships among the various files.
Schema of SupplierMaster
Schema of ClientMaster
Computers
The schema remains the same while the values filled into it change from instant to instant.
When the schema framework is filled in with data item values, it is referred to as an instance of the
schema. The data in the database at a particular moment of time is called a database state or
snapshot, which is also called the current set of occurrences or instances in the database.
In other words, the description of a database is called the database schema, which is specified
A schema diagram displays only some aspects of a schema, such as the names of record types
and data items and some types of constraints. Other aspects are not specified in the schema
diagram. As in the above figure there is neither the data types of attributes nor the relationship
The actual data in a database may change quite frequently. The data in the database at a particular
moment in time is called a database state or snapshot. It is also known as the current set of
occurrences or instances in the database. The DBMS is partially responsible for ensuring
that every state of the database is a valid state, that is, a state that satisfies the structure and
constraints specified in the schema. Hence, specifying a correct schema to the DBMS is extremely
important.
2.3.2 Three Level Architecture of DBMS
A database management system is a large software package designed to assist in managing, maintaining and
utilizing large collections of data. A DBMS is hence a general-purpose software system that
facilitates the processes of defining (task of internal view), constructing (task of conceptual view)
and manipulating (task of external view) databases for various applications. These
processes, their roles and implementations are discussed in detail in three
The three-level/ tier architecture as shown in Figure 2.1 is also known as three-schema
separate the user applications and the physical database. Three schema architecture is a
convenient tool with which users can visualize the schema levels in a database system. DBMS
architecture is a framework where the structure of DBMS is defined. The main aim of this
architecture is to achieve the characteristics by defining the abstract view of the data and by
The three-level architecture framework was suggested by ANSI/SPARC (American National
Standards Institute/ Standards Planning and Requirements Committee). The view at each level is
described by a schema. A schema is an outline or plan that describes the structure of the database. The
word scheme means a systematic plan for achieving some goal, and the words scheme and schema can be
used interchangeably. A subset of a schema is known as a subschema. It refers to the
user's view of the fields that he uses from the database. Each view accesses some portion of the database.
1. External Level: - The external Level is described by an external schema i.e. it consists of
definition of logical records and relationship in the external view. Each external schema
describes the part of the database that a particular user group is interested in and hides the rest of
the database from that user group. It also contains the method of deriving the objects in the
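[Figure 2.1: three-level architecture of DBMS, showing end users (external views), the conceptual view (community user view), the conceptual-internal mapping, and the internal view (storage view)]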
This is the highest level of abstraction, where only those parts of the entire database are included
which are of concern to a user. Despite the use of simpler structures at the logical level, some
complexity remains because of the large size of the database. Many users of the database system
will not be concerned with all this information. Instead, such users need to access only a part of
the database. The view level of abstraction is defined so that their interaction with the system is
simplified. The system may provide many views for the same database. Users can fulfill
all their demands using the part of the database provided by the view and may never need the entire database, so it is
called the "user's view" or simply a "view", which is complete and independent. The external view is written
2. Conceptual Level: - The conceptual level has a conceptual schema which represents the
structure of entire database for a community of users. Conceptual schema describes the records
and relationship included in the Conceptual view. The conceptual schema hides the details of
physical storage structures and concentrates on describing entities, data types, relationships, user
operations, and constraints. It also contains the method of deriving the objects in the conceptual
One conceptual view represents the entire database. There is only one conceptual view per
database. It is large, complex and sophisticated. Databases change over time as data is inserted
and deleted. The collection of information stored in the database at a particular moment is called
an instance of the database. The overall design of the database is called the database schema, and
schemas are changed infrequently, if at all. The description of data at this level is in a format independent
of its physical representation. It also includes features that specify the checks to retain data
consistency and integrity. The conceptual view is written in conceptual schema using conceptual
3. Internal Level: - The internal level has an internal schema. Internal level indicates how the
data will be stored and describes the data structures and access method to be used by the
database. It contains the definition of stored record and method of representing the data fields
This lowest level of abstraction describes how the data are stored in the database, and what
relationship exists among those data. The entire database is thus described in terms of a small
number of relatively simple structures. Although implementation of the simple structures at the
logical level may involve complex physical-level structures, the user of the logical level does not
need to be aware of this complexity. The internal view is written in internal schema using
2.3.3 Mapping
The processes of transforming requests and results between levels are called mappings. These
mappings may be time-consuming, so some DBMSs, especially those that are meant to support
small databases, do not support external views. Even in such systems, however, a certain amount
of mapping is necessary to transform requests between the conceptual and internal levels. The
mapping description is stored in the data dictionary. The DBMS is responsible for mapping between
External- Conceptual Mapping: A mapping between external and conceptual views gives the
correspondence among the records and relationship of the conceptual and external view. The
external and conceptual mapping tells the DBMS which objects on the conceptual level
correspond to the object requested on a particular user‘s external view. If changes are made to
either external view or conceptual view, then mapping must be changed accordingly.
Conceptual- Internal Mapping: The Conceptual- Internal mapping defines the correspondence
between the conceptual view and the internal view, i.e. the database stored on the physical
storage device. It describes how conceptual records are stored on and retrieved from the storage
device. This means that the Conceptual- Internal mapping tells the DBMS how the conceptual
records are physically represented. If the structure of the stored database is changed, then the
mapping must be changed accordingly. It is the responsibility of the DBA to manage such changes.
These mappings are used primarily for data independence. All details are recorded in these mappings
so as to make the overall view data independent. Changes to the mappings are the responsibility of the
DBA. In addition to the mappings and the three views, there are three more points of reference in the
architecture: the DBMS, the DBA and the user interface. An example for the
Conceptual View:
Name : Char(20);
Address : Varchar2(20);
Marks1 : Number(3);
Marks2 : Number(3);
Marks3 : Number(3);
TMarks : Number(3);
Grade : Char(2);
External View:
View N1 = Student
{
Name : Char(20);
TMarks : Number(3);
Grade : Char(2);
}
View N2 = Student
{
Name : Char(20);
Address : Varchar2(20);
Grade : Char(2);
}
Internal View:
Schema = Student
Offset =0
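As a rough illustration, the conceptual and external views above can be approximated with Python's standard sqlite3 module. This is only a sketch under stated assumptions: SQLite type names stand in for Char, Varchar2 and Number, the sample row is invented, and the internal view is left entirely to the storage engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual view: the complete Student record.
conn.execute("""
    CREATE TABLE Student (
        Name    CHAR(20),
        Address VARCHAR(20),
        Marks1  NUMERIC(3),
        Marks2  NUMERIC(3),
        Marks3  NUMERIC(3),
        TMarks  NUMERIC(3),
        Grade   CHAR(2)
    )
""")

# External views N1 and N2: each user group sees only part of the record.
conn.execute("CREATE VIEW N1 AS SELECT Name, TMarks, Grade FROM Student")
conn.execute("CREATE VIEW N2 AS SELECT Name, Address, Grade FROM Student")

# Hypothetical sample row, for illustration only.
conn.execute("INSERT INTO Student VALUES ('Asha', 'Delhi', 70, 80, 90, 240, 'A')")

n1_rows = conn.execute("SELECT * FROM N1").fetchall()
n2_rows = conn.execute("SELECT * FROM N2").fetchall()
print(n1_rows)
print(n2_rows)
```

The view definitions play the role of the external-conceptual mapping: the DBMS uses them to translate a request against N1 or N2 into a request against Student.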
The three views represented here give a broad idea of the database views. At conceptual level all
entries are made. At external level data availability is dependent on user. At internal level
2.3.4 Data Independence
The ability to modify a schema definition at one level of a database system without having to
change the schema at the next higher level is called data independence. Data independence
keeps data separated from all programs that make use of the data.
1. Logical data independence: It is the capacity to change the conceptual schema without
having to change external schemas or application programs. We may change the conceptual
schema to expand the database (by adding a record type or data item), to change constraints, or
to reduce the database (by removing a record type or data item).
Logical data independence thus gives us the freedom to change the conceptual schema
without worrying about the external schemas. For example, sometimes we may need to change the
logical schema by adding or removing fields or attributes from the database. With logical
Example 1: The addition or removal of new entity, attributes, and relationships to the conceptual
schema should be possible without having to change existing schemas or having to rewrite
existing application programs. Consider a relation i.e. Student (name, rollno, class)
Student
If one more attribute i.e. Marks, is added in to the existing relation i.e. Student then the structure
Student
In Figure 2.2, we may need to change the logical schema by adding or removing the fields/
attributes from the database. With logical data independence, the change is possible. The change
would be absorbed by the view definitions and mapping between the external and the conceptual
view.
2. Physical data independence: It is the capacity to change the internal schema without having
to change the conceptual schema. Hence, the external schemas need not be changed either.
Changes to the internal schema may be needed because of a different file organization or
storage structure, different storage devices, or a different indexing strategy, and should be possible without having to
Modifying indexes.
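Both kinds of data independence can be sketched with Python's standard sqlite3 module. The example below is an illustration under assumptions, not a definitive implementation: the view name StudentList, the index name and the sample rows are invented, and the Student relation follows Example 1 above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (name TEXT, rollno INTEGER, class TEXT)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [("Ravi", 1, "BCA"), ("Meena", 2, "MCA")],  # invented sample rows
)

# External view used by an existing application program.
conn.execute("CREATE VIEW StudentList AS SELECT name, rollno, class FROM Student")
query = "SELECT * FROM StudentList ORDER BY rollno"
original = conn.execute(query).fetchall()

# Logical data independence: the conceptual schema gains a Marks attribute.
conn.execute("ALTER TABLE Student ADD COLUMN marks INTEGER")
after_new_attribute = conn.execute(query).fetchall()

# Physical data independence: the internal level gains an index.
conn.execute("CREATE INDEX idx_student_rollno ON Student (rollno)")
after_new_index = conn.execute(query).fetchall()

print(original == after_new_attribute == after_new_index)
```

The application's query is never rewritten; the change to the conceptual schema is absorbed by the view definition, and the change to the internal level is absorbed by the DBMS itself.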
2.3.5 Database Language and Interface
The main objective of a DBMS is to allow its users to perform a number of operations on the database,
such as retrieval, deletion and modification of data, in abstract terms without knowing about the
physical representation of data. Therefore a DBMS must provide appropriate languages and
interfaces for each category of users. In this section we discuss the types of languages and
interfaces provided by a DBMS and the user categories targeted by each interface.
(i) DBMS Languages
A language is needed to describe the database to the DBMS as well as to provide facilities for
changing the database and for defining and changing physical data structures. Another language is
called Data Description/ Definition Language (DDL) and Data Manipulation Language (DML)
respectively. Each DBMS has a DDL as well as a DML. The two languages may be parts of a
unified database language. The DBMS languages are of three forms as explained below:
1. Host Language
These are subroutines called from one or more programming languages. For example, a
system may provide extensions to COBOL, FORTRAN, C, C++ etc. to enable the user to interact
with the database. The programming language that is extended is usually called the host
language.
2. Query Language
These are special-purpose languages that usually provide more powerful facilities to interact with
the database. These languages are often designed to be simple so that non-programmers may use
them easily. There are four types of database languages, also called SQL components, i.e.
Data Definition Language (DDL), Data Manipulation Language (DML), Data Control Language
(DCL) and Transaction Control Language (TCL).
DDL is a computer language for defining the types of data structures used in a database. DDL
statements are used to create, modify and remove database objects such as tables, indexes and
users. CREATE, ALTER, DROP, TRUNCATE and RENAME are the various DDL commands.
TRUNCATE- Remove all records from a table, including all space allocated to the
records.
DDL has a pre-defined syntax for describing data.
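The DDL commands listed above can be tried out with Python's standard sqlite3 module, as in the following minimal sketch. The table and column names are invented for illustration, and SQLite is only a stand-in for a full DBMS: it has no TRUNCATE statement, so that command is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def table_names(c):
    """Return the names of all tables currently defined in the database."""
    rows = c.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [r[0] for r in rows]

conn.execute("CREATE TABLE Client (client_no TEXT, name TEXT)")   # CREATE
conn.execute("ALTER TABLE Client ADD COLUMN city TEXT")           # ALTER
conn.execute("ALTER TABLE Client RENAME TO ClientMaster")         # RENAME
after_rename = table_names(conn)

# PRAGMA table_info lists one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(ClientMaster)")]

conn.execute("DROP TABLE ClientMaster")                           # DROP
after_drop = table_names(conn)

print(after_rename, columns, after_drop)
```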
DML is a family of computer languages used by computer programs or database users to retrieve,
insert, delete and update data in a database. Currently the most popular DML is that of SQL, which
is used to retrieve and manipulate data in a relational database. DML may be of two types, i.e.
procedural (the user specifies what data is needed and how to get it) and non-procedural (the user
specifies only what data is needed). DML performs operations like SELECT, INSERT, UPDATE and
DELETE. These commands are used for specific purposes as given below:
DELETE- Delete all records from a table; the space for the records remains.
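A minimal sketch of the four DML operations, again using Python's standard sqlite3 module with an invented Student table and invented rows. The last part contrasts the non-procedural style (state what is wanted) with a procedural style (state how to get it) for the same request.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (name TEXT, marks INTEGER)")

# INSERT, UPDATE and DELETE change the stored data.
conn.executemany(
    "INSERT INTO Student VALUES (?, ?)",
    [("Ravi", 55), ("Meena", 72), ("Arun", 38)],
)
conn.execute("UPDATE Student SET marks = marks + 5 WHERE name = 'Arun'")
conn.execute("DELETE FROM Student WHERE marks < 50")
remaining = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]

# Non-procedural: say only WHAT is needed.
declarative = conn.execute(
    "SELECT name FROM Student WHERE marks > 60 ORDER BY name"
).fetchall()

# Procedural style: also spell out HOW to get it, row by row.
procedural = []
for name, marks in conn.execute("SELECT name, marks FROM Student"):
    if marks > 60:
        procedural.append((name,))
procedural.sort()

print(remaining, declarative, procedural)
```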
DCL is a computer language and a subset of SQL, used to control access to the data in a database.
That is, a user can access data based on the privileges given to him. DCL statements are used
to provide a kind of security to the database. GRANT and REVOKE are the commands that
come under the purview of DCL. The purpose of using these commands is as follows:
TCL statements are used to manage the changes made by DML statements. They allow statements
to be grouped together into logical transactions. TCL statements are used both to undo the changes of a
transaction and to commit its changes to the database. COMMIT, ROLLBACK, SAVEPOINT and SET
TRANSACTION are the TCL commands.
ROLLBACK- Restore database to original since the last COMMIT
SAVEPOINT- Identify a point in a transaction to which you can later roll back
SET TRANSACTION- Change transaction options like isolation level and what rollback
segment to use.
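The interplay of COMMIT, ROLLBACK and SAVEPOINT can be sketched with Python's standard sqlite3 module. The Account table, the savepoint name and the rows are invented for illustration; SQLite does not provide SET TRANSACTION, so it is left out of this sketch.

```python
import sqlite3

# isolation_level=None switches off implicit transactions so that
# BEGIN, COMMIT and ROLLBACK can be issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Account (owner TEXT, balance INTEGER)")

conn.execute("BEGIN")
conn.execute("INSERT INTO Account VALUES ('A', 100)")
conn.execute("SAVEPOINT after_first")          # mark a point to return to
conn.execute("INSERT INTO Account VALUES ('B', 200)")
conn.execute("ROLLBACK TO after_first")        # undo only the second insert
conn.execute("COMMIT")                         # make the first insert permanent

conn.execute("BEGIN")
conn.execute("DELETE FROM Account")
conn.execute("ROLLBACK")                       # undo everything since the last COMMIT

rows = conn.execute("SELECT owner, balance FROM Account").fetchall()
print(rows)
```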
3. Data Sublanguage
In relational database theory, the term sublanguage, first used for this purpose by E. F. Codd in
1970, refers to a computer language used to define or manipulate the structure and contents of a
modern RDBMS's are QBE (Query by Example) and SQL (Structured Query Language). In
1985, Codd encapsulated his thinking in twelve rules which every database must satisfy in order
to be truly relational. The fifth rule is known as the Comprehensive data sublanguage rule, and
states.
the communication boundary between the DBMS and clients or to the abstraction provided by a
A DBMS interface hides the implementation of the functionality of the component it
encapsulates. An application poses a query to the database system with the help of SQL, a query
language. There, the corresponding answer (result set) is
prepared and, also with the help of SQL, given back to the application. This communication can
take place interactively or be embedded into another language. Working principle of a database
DBMS provides the User-friendly interfaces which may include the following:
1. Menu-Based Interfaces for Web Clients or Browsing: These interfaces present the
user with lists of options, called menus, which lead the user through the formulation of a request.
Menus do away with the need to memorize the specific commands and syntax of a query
language; rather, the query is composed step by step by picking options from a menu that is
can fill out all of the form entries to insert new data, or they fill out only certain entries, in which
case the DBMS will retrieve matching data for the remaining entries. Forms are usually designed
and programmed for naive users as interfaces to canned transactions. A form based interface is
3. Graphical User Interfaces: A graphical user interface (GUI) typically displays a schema
to the user in diagrammatic form. The user can then specify a query by manipulating the
diagram. In many cases, GUIs utilize both menus and forms. Most GUIs use a pointing device,
4. Interfaces for Parametric Users: Parametric users, such as bank tellers, often have a
small set of operations that they must perform repeatedly. Systems analysts and programmers
design and implement a special interface for each known class of naïve users. Usually, a small
set of abbreviated commands is included, with the goal of minimizing the number of keystrokes
5. Text Based Interface: For administering the database, and for other professional
users, it is possible to communicate with the DBMS directly in the query language (in
code form), for example through an SQL input/output window as shown in Figure 2.6. Text-based interfaces are very
powerful tools and allow a comprehensive interaction with a DBMS. However, the use of these
6. Interfaces for the DBA: Most database systems contain privileged commands that can
be used only by the DBA's staff. These include commands for creating accounts, setting system
parameters, granting account authorization, changing a schema, and reorganizing the storage
structures of a database.
Figure 2.6: Text Based Interface
some other language and attempt to "understand" them. A natural language interface usually has
its own "schema," which is similar to the database conceptual schema, as well as a dictionary of
important words. The natural language interface refers to the words in its schema, as well as to
the set of standard words in its dictionary, to interpret the request. If the interpretation is
successful, the interface generates a high-level query corresponding to the natural language
request and submits it to the DBMS for processing; otherwise, a dialogue is started with the user
2.3.6 DBMS Functions
There are many functions a Database Management System (DBMS) serves that are key
a business environment, the first task is to decide what type of DBMS one actually requires. A
DBMS performs several important functions that guarantee integrity and consistency of data in
the database. Most of these functions are transparent to end-users. There are the following
1) Ability to Update and Retrieve Data: This is a fundamental component of a DBMS and
essential to database management. Without the ability to view or manipulate data, there would be
no point in using a database system. Updating data in a database includes adding new records,
deleting existing records and changing information within a record. The user does not need to be
aware of how the DBMS structures this data; all the user needs to be aware of is the availability of
updating and/or retrieving information. The DBMS handles the processes and the structure of the
data on a disk.
2) Support Concurrent Updates: Concurrent updates occur when multiple users make updates
management as this component ensures that updates are made correctly and the end result is
accurate. Without DBMS intervention, important data could be lost and/or inaccurate data stored.
A DBMS uses features such as batch processing, locking, two-phase
locking, and time stamping to make certain that concurrent updates are done accurately. It is the
responsibility of the database management system to make sure that all updates are stored properly since the
3) Recovery of Data: In the event a catastrophe occurs, DBMS must provide ways to recover a
database so that data is not permanently lost. There are times when computers may crash, a fire
or other natural disaster may occur, or a user may enter incorrect information invalidating or
making records inconsistent. If the database is destroyed or damaged in any way, the DBMS
must be able to recover the correct state of the database, and this process is called Recovery. The
easiest way to do this is to make regular backups of information. This can be done at a set
structured time, so in the event a disaster occurs, the database can be restored to the state that it
of the data. The internal schema defines how the data should be stored by the storage
management mechanism and the storage manager interfaces with the operating system to access
5) Self- Describing Nature of a Database System: A database system contains not only the
database itself but also a complete definition or description of the database. This system is stored
6) Program- Data Independence: DBMS access programs are written independently of
any specific files. The structure of data files is stored in the DBMS catalog separately from the
access programs.
definitions. User application programs can operate on data by invoking these operations through
their names and arguments. Users do not care about how the operation is implemented.
8) Support of Multiple Views: Each user of the database may require a different perspective of
the database. A view may be a subset of the database or it may contain virtual data that is derived
unauthorized access, either intentional or accidental. It furnishes mechanisms to ensure that only
10) Database Access and Application Programming Interfaces: All DBMSs provide interfaces
to enable applications to use DBMS services. They provide data access via Structured Query
Language (SQL). The DBMS query language contains two components: (a) a Data Definition
11) Concurrency Control Service: Since DBMSs support sharing of data among multiple users,
they must provide a mechanism for managing concurrent access to the database. DBMSs ensure
that the database is kept in a consistent state and that the integrity of the data is preserved.
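The idea behind locking, one of the concurrency control techniques named earlier, can be sketched in plain Python. This is only an analogy under assumptions: the threads stand in for concurrent users, the shared variable balance stands in for a stored data item, and a real DBMS uses far more elaborate lock managers.

```python
import threading

balance = 0                      # shared data item
lock = threading.Lock()          # stands in for a DBMS lock on that item

def deposit(times):
    """Simulate one user applying many small updates to the shared item."""
    global balance
    for _ in range(times):
        with lock:               # only one "user" may read-modify-write at a time
            current = balance
            balance = current + 1

threads = [threading.Thread(target=deposit, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balance)
```

Because every read-modify-write happens while the lock is held, no update is overwritten by another user's stale value, and the final state is the same as if the users had run one after another.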
single user or application program, which accesses or changes the contents of the database.
Therefore, a DBMS must provide a mechanism to ensure either that all the updates
13) Backup and Recovery Management: The DBMS provides mechanisms for backing up data
periodically and recovering from different types of failures. This prevents the loss of data.
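A backup-and-restore cycle can be sketched with the backup method of Python's standard sqlite3 connections. The Student table and its row are invented, and in-memory databases are used only to keep the sketch self-contained; a real backup would be written to separate storage.

```python
import sqlite3

# The "live" database with one committed row.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE Student (name TEXT)")
live.execute("INSERT INTO Student VALUES ('Ravi')")
live.commit()

# Periodic backup: copy the whole database into another connection.
backup = sqlite3.connect(":memory:")
live.backup(backup)

# Simulated failure: the live data is lost after the backup was taken.
live.execute("DELETE FROM Student")
live.commit()
lost = live.execute("SELECT name FROM Student").fetchall()

# Recovery: rebuild a working database from the backup copy.
recovered = sqlite3.connect(":memory:")
backup.backup(recovered)
restored = recovered.execute("SELECT name FROM Student").fetchall()

print(lost, restored)
```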
2.3.7 Component Modules of DBMS
The DBMS software is partitioned into several modules. Each module or component is assigned
a specific operation to perform. Some of the functions of the DBMS are supported by the operating
system (OS), which provides basic services, and the DBMS is built on top of it. Figure 2.7 shows the
database system as a complex software system partitioned into several software
components that handle various tasks such as data definition and manipulation, security and data
integrity, data recovery and concurrency control, and performance optimization, as explained
below:
1) Data Definition: The DBMS provides functions to define the structure of the data. These
functions include defining and modifying the record structure, the data type of fields, and the
various constraints to be satisfied by the data in each field. It is the responsibility of the DBA to
define the database, and to make changes to its definition (if required) using the DDL and other
privileged commands. The DDL compiler component of the DBMS processes these schema
definitions, and stores the schema descriptions in the DBMS catalog (data dictionary). Other
DBMS components then refer to the catalog information as and when required.
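How schema definitions end up in a catalog can be observed in SQLite, used here through Python's standard sqlite3 module as a small stand-in for a full DBMS. SQLite keeps its catalog in a table called sqlite_master; the Supplier table and index names below are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL statements processed by the DBMS ...
conn.execute("CREATE TABLE Supplier (supplier_no TEXT, name TEXT NOT NULL)")
conn.execute("CREATE INDEX idx_supplier_name ON Supplier (name)")

# ... leave their descriptions in the catalog, which can itself be queried.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY type, name"
).fetchall()

for entry in catalog:
    print(entry)
```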
2) Data Manipulation: Once the data structure is defined, data needs to be manipulated. The
manipulation of data includes insertion, deletion, and modification of records. The functions that
perform these operations are also part of the DBMS. These functions can handle planned as well
I. The queries that are defined as a part of the application programs are known as planned
queries. The application programs are submitted to a pre-compiler, which extracts DML
commands from the application program and sends them to the DML compiler for
compilation. The rest of the program is sent to the host language compiler. The object
codes of both the DML commands and the rest of the program are linked and sent to the
II. The ad hoc queries that are executed as and when the need arises are known as
unplanned queries (interactive queries). These queries are compiled by the query
compiler, and then optimized by the query optimizer. The query optimizer consults the
data dictionary for statistical and other physical information about the stored data. The
optimized query is finally passed to the query evaluation engine for execution. The naive
users of the database can also query and update the database by using some already given
application program interfaces. The object code of these queries is also passed to query
3) Data Security and Integrity: The DBMS contains functions, which handle the security and
integrity of data stored in the database. Since these functions can be easily invoked by the
application, the application programmer need not code these functions in any PL/SQL program.
4) Concurrency and Data Recovery: The DBMS also contains some functions that deal with
the concurrent access of records by multiple users and the recovery of data after a system failure.
5) Performance Optimization: The DBMS has a set of functions that optimize the performance
of the queries by evaluating the different execution plans of a query and choosing the best among
them.
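The choice among execution plans can be observed in SQLite through its EXPLAIN QUERY PLAN statement, used here from Python's standard sqlite3 module. This is a sketch with an invented Student table and index name; the exact wording of the plan text differs between SQLite versions, so the code only looks for the index name in it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (name TEXT, rollno INTEGER)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?)",
    [(f"s{i}", i) for i in range(100)],  # invented sample rows
)

def plan(sql):
    """Return the optimizer's chosen plan as one readable string."""
    # The last column of each EXPLAIN QUERY PLAN row describes one step.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT name FROM Student WHERE rollno = 42"

plan_before = plan(query)                      # no index available: full scan
conn.execute("CREATE INDEX idx_rollno ON Student (rollno)")
plan_after = plan(query)                       # optimizer can now pick the index

print(plan_before)
print(plan_after)
```

The query text is identical in both cases; only the plan selected by the optimizer changes once a cheaper access path exists.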
6) Run Time Database Manager: The run time database manager is the central software
component of the DBMS, which interfaces with user-submitted application programs and
queries. It handles database access at run time. It converts operations in users' queries, coming
directly via the query processor or indirectly via an application program, from the user's logical
view to a physical file system. It accepts queries and examines the external and conceptual
schemas to determine what conceptual records are required to satisfy the user's request. It
enforces constraints to maintain the consistency and integrity of the data, as well as its security.
It also performs backup and recovery operations. The run time database manager is sometimes
referred to as the database control system and has the following components:
Authorization control: The authorization control module checks the authorization of users
Integrity checker: It checks the integrity constraints so that only valid data can be entered
Query optimizer: The query optimizer determines an optimal strategy for the query
execution.
Transaction manager: The transaction manager ensures that the transaction properties
Scheduler: It provides an environment in which multiple users can work on same piece of
7) Query processor: The query processor transforms user queries into a series of low-level
instructions. It is used to interpret the online user's query and convert it into an efficient series of
operations in a form capable of being sent to the run time data manager for execution. The query
processor uses the data dictionary to find the structure of the relevant portion of the database and
uses this information in modifying the query and preparing an optimal plan to access the
database.
8) Data Manager: The data manager is responsible for the actual handling of data in the
database. It provides recovery services so that the system is able to recover the data
after a failure. It includes the recovery manager and the buffer manager. The buffer manager is
responsible for the transfer of data between the main memory and secondary storage (such as
9) Database Engine: The Database Engine is the core service for storing, processing, and
securing data. The Database Engine provides controlled access and rapid transaction processing
to meet the requirements of the most demanding data consuming applications within your
enterprise. Use the Database Engine to create relational databases for online transaction
processing or online analytical processing data. This includes creating tables for storing data, and
database objects such as indexes, views, and stored procedures for viewing, managing, and
securing data.
10) Data dictionary: A data dictionary is a reserved space within a database which is used to
store information about the database itself. A data dictionary is a set of tables and views which
can only be read and never altered directly by users. Most data dictionaries contain different information about the
data used in the enterprise. In terms of the database representation of the data, the data dictionary
defines all schema objects including views, tables, clusters, indexes, sequences, synonyms,
procedures, packages, functions, triggers and many more. This ensures that all these things
follow one standard defined in the dictionary. The data dictionary also defines how much space
has been allocated for and/or is currently in use by all the schema objects. A data dictionary is
used when finding information about users, objects, schema and storage structures. Every time a
data definition language (DDL) statement is issued, the data dictionary is modified. A
User permissions
User statistics
11) Query Processor: A relational database consists of many parts, but at its heart are two major
components: the storage engine and the query processor. The storage engine writes data to and
reads data from the disk. It manages records, controls concurrency, and maintains log files. The
query processor accepts SQL syntax, selects a plan for executing the syntax, and then executes
the chosen plan. The user or program interacts with the query processor, and the query processor
in turn interacts with the storage engine. The query processor isolates the user from the details of
execution: The user specifies the result, and the query processor determines how this result is
DDL interpreter
DML compiler
12) Report writer: Also called a report generator, a report writer is a program, usually part of a database
management system, that extracts information from one or more files and presents the
information in a specified format. Most report writers allow you to select records that meet
certain conditions and to display selected fields in rows and columns. You can also format data
into pie charts, bar charts, and other diagrams. Once you have created a format for a report, you
can save the format specifications in a file and continue reusing it for new data.
In this way, the DBMS provides an environment that is both convenient and efficient to use
when there is a large volume of data and many transactions need to be processed concurrently.
2.4 Summary
In this chapter a DBMS architecture is presented that cleanly separates the three levels, with mappings
between the schemas to transform requests and results from one level to the next. Most DBMSs
do not separate the three levels completely. We used the three-schema architecture to define the
Next, the different types of user-friendly interfaces provided by a DBMS, and the users
with whom each interface is associated, were described. The main types of languages that a DBMS supports were also
explained, which gives a thorough knowledge of the high-level language that can be used as a
Finally, the main functionality of a DBMS and its different component modules were explained.
When deciding to implement a DBMS in a business environment, the first and most important
task is to decide what type of DBMS that business actually requires. It is also important to know
how many and which modules are actually required to fulfill the business need. Therefore,
selection of the appropriate module or component is required to perform the requisite operation.
2.5 Suggested Reading/ Reference Material
1. Elmasri & Navathe: Fundamentals of Database Systems, 3rd Edition, Addison Wesley,
New Delhi.
2. Korth & Silberschatz: Database System Concepts, 4th Edition, McGraw Hill International
Edition.
3. Raghu Ramakrishnan & Johannes Gehrke: Database Management Systems, 2nd edition,
Delhi.
2. Outline the three-level schema architecture of a DBMS; distinguish each level clearly.
3. What do you mean by mapping? Discuss the different types of mapping in the three-tier
architecture of a DBMS.
5. What do you mean by data independence? Discuss the different types of data
independence.
Chapter – 3: Entity-Relationship (ER) Modeling
3.1 Introduction
It is the responsibility of the database administrator (DBA) to perform the logical database
design, assigning the related data items of the database to columns of tables in a manner that
preserves desired properties. The most important test of a logical design is that tables and
attributes faithfully reflect relationships among objects in the real world, and that this remains
true after all likely future database updates. Databases with different data models have different
structures for representing data; in a relational database, the fundamental structure for representing data is what we
The DBA starts by studying some real-world enterprise whose operations need to be supported
on a computerized database system. After a thorough, expert examination of the system, the
DBA comes up with a list of data items and underlying data objects that must be tracked, along with a
number of rules or constraints concerning the interrelationships of these data items. For all these
purposes the DBA uses a data model to represent data items and their relationships, called Entity-
3.2 Objective
This chapter introduces a way to view real-world objects as entities and the relationships among
them, using the basic component of the E-R model, i.e. the entity-relationship diagram. The
objective of an entity-relationship diagram is to show the business rules that apply to an
organization's data. It contains entities, which are things of interest to a company, and
relationships, which are associations between entities. It also documents volumetric data so that
we know what the initial data storage requirements will be, together with the anticipated growth.
To design an E-R diagram, basic data structuring concepts, constraints, relationships, keys,
cardinality ratios etc. are elaborated so that these concepts can be used in designing the
conceptual schema for database applications. This design plan is used by a database
developer to implement specific database management software. This model can be used to
early stage and developed as the requirements of the database and its processing become better
needs and can serve as a schema diagram for the required system's database. A schema diagram
is any diagram that attempts to show the structure of the data in a database. Nearly all systems
of the methodology and nearly all CASE (Computer Aided Software Engineering) tools contain
the facility for drawing entity-relationship diagrams. An entity-relationship diagram could serve
as the basis for the design of the files in a conventional file-based system as well as for a schema
In 1976, Chen developed the Entity-Relationship (ER) model, defined as "a high-level data
model that is useful in developing a conceptual design for a database". An Entity-Relationship
(ER) diagram is an excellent communication tool, which can be used to confirm business
requirements and provide direction to the architecture and design team as they move forward
• Is used as the "target" for data movement mapping and helps ensure no data is overlooked;
• Provides direction to the architecture and design team to start physical database design; and
• Helps make important decisions about facts and dimensions required for business
intelligence purposes.
Creating an ER diagram, which is one of the first steps in designing a database, helps the
designer(s) to understand and specify the desired components of the database: the entities, the
attributes of the entities, and the relationships among them. These three categories
are considered to be sufficient to model the essentially static database parts of any organization's
(i) Entity: An entity is an object that exists and which is distinguishable from other objects.
An entity can be a person, a place, an object, an event, or a concept about which an organization
It is important to understand the distinction between an entity type, an entity instance, and an
entity set. An entity type defines a collection of entities that have the same attributes. An entity
instance is a single item in this collection. An entity set is a set of entity instances.
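The distinction can be sketched in code. In the minimal illustration below, a class stands in for the entity type, each object is an entity instance, and a Python set of those objects plays the role of the entity set. The class name Student, its two fields, and the second ID are assumptions made only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    """Entity type: describes the attributes every student entity has."""
    student_id: str
    name: str


# Entity instances: individual students (the second ID is made up).
s1 = Student("13-PGDCA-1100", "Aryan")
s2 = Student("13-PGDCA-1101", "Mukta")

# Entity set: the collection of instances of the entity type.
student_set = {s1, s2}
```

The frozen dataclass is used here only so that instances are hashable and can be placed in a set.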
Example 2: Let STUDENT be an entity type; a student with ID number 13-PGDCA-1100 is an entity
In the E-R diagram, we assign a name to each entity type. When assigning names to entity types,
we follow certain naming conventions. An entity name should be a concise singular noun that
captures the unique characteristics of the entity type. An E-R diagram depicts an entity type
using a rectangle with the name of the entity inside as shown in Figure 3.1.
[Figure 3.1: Entity types STUDENT, EMPLOYEE and DEPARTMENT, each drawn as a rectangle.]
An entity type may be of two kinds, i.e. strong entity and weak entity. Entity types that have a key
attribute (primary key) are called strong entity types. A strong entity type is also called a regular
entity type. The entity type STUDENT is a strong entity type, since it has StudentID as a key
attribute, as shown in Example 3. Entity types that do not have any key attribute are called
weak entity types. In Example 3, Class is a weak entity, since it does not have any key attribute.
A weak entity type is also called a child entity type or a subordinate entity type. In an E-R
diagram a strong entity is shown in a rectangular box, as in Figure 3.2 (a), and a weak entity
[Figure 3.2: Strong entity STUDENT and weak entity Class.]
Example 3: STUDENT = {Student Id, Name, Address, PhoneNo, Age, Dateofbirth, Language}
A particular value, such as 101 for StudentID or Aryan for Name in the STUDENT entity of
Example 3, is called a value of the attribute. Most of the data in a database consists of
values of attributes. The set of all possible values of an attribute is the attribute domain.
Sometimes the value of an attribute is unknown or missing, and sometimes a value is not
applicable. In such cases, the attribute can take the special value null.
1. Each word in a name starts with an uppercase letter followed by lowercase letters.
2. If an attribute name contains two or more words, the first letter of each subsequent word is
also in uppercase, unless it is an article or preposition, such as "a", "the", "of", or "about".
E-R diagrams depict an attribute inside an ellipse/oval and connect the ellipse/oval with a line to
the associated entity type. Figure 3.3 illustrates an E-R diagram of Student entity with some of
Note that the attributes shown in Figure 3.3 are of several types, which use different
notations. These include simple, composite, single-valued, multi-valued, stored, derived and
key attributes. In the upcoming subsections, we discuss the
a) Simple and Composite Attributes: A simple attribute
cannot be further divided into smaller components. A composite attribute, however, can be
divided into smaller subparts in which each subpart represents an independent attribute. Name in
this case is a composite attribute, since it can be further divided into smaller subparts. Similarly,
Address can also be a composite attribute. All other attributes, even those that are subcategories of
Name and Address, are simple attributes. Figure 3.3 presents the notation that depicts a composite
attribute. Simple and composite attributes are denoted by an oval/ellipse in an E-R diagram.
b) Single-Valued and Multi-Valued Attributes: Most attributes have a single value for an
entity instance; such attributes are called single-valued attributes. A multi-valued attribute, on the
other hand, may have more than one value for an entity instance.
[Figure 3.3: E-R diagram of the Student entity with attributes StudentID, Name (First Name, Middle Name, Last Name), PhoneNo, Languages, Dateofbirth and Age.]
Example 4: In Figure 3.3, Languages is an attribute of the STUDENT entity type. It stores the
names of the languages that a student speaks. Since a student may speak several languages,
it is a multi-valued attribute. An attribute like StudentID of the STUDENT entity type is a
single-valued attribute, because a student has only one StudentID. In the E-R diagram, we denote a multi-valued attribute with a
Note: The STUDENT entity is a strong entity, since it has StudentID as its key attribute.
c) Stored and Derived Attributes: The value of a derived attribute can be determined by
analyzing other attributes. In Figure 3.3 Age is a derived attribute and DateofBirth is a stored
attribute of the STUDENT entity type. The value of the Age attribute can be derived from the current
date and the attribute DateofBirth. An attribute whose value cannot be derived from the values of
other attributes is called a stored attribute. The derived attribute Age is not stored in the database.
Derived attributes are depicted in the E-R diagram with a dashed (dotted) ellipse/oval.
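As a small illustration of a derived attribute, the sketch below computes Age from a stored DateofBirth and a given current date instead of storing it. The function name derive_age and the sample dates are assumptions chosen for this example.

```python
from datetime import date


def derive_age(date_of_birth: date, today: date) -> int:
    """Derive Age (in completed years) from the stored DateofBirth."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not been reached yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


# Hypothetical student born on 15 June 2000.
age_before_birthday = derive_age(date(2000, 6, 15), date(2024, 6, 14))
age_on_birthday = derive_age(date(2000, 6, 15), date(2024, 6, 15))
```

Because the value changes with the current date, computing it on demand avoids keeping a stored copy up to date.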
attributes that uniquely identify an individual instance of an entity type. No two instances within
an entity set can have the same key attribute value. For the STUDENT entity shown in Figure
3.3, StudentID is the key attribute since each student identification number is unique. Name, by
contrast, cannot be an identifier because two students can have the same name. We underline key
(iii) Relationship Types: The first two major elements of entity-relationship diagrams are
entity types and attributes. The final element is the relationship type. Sometimes, the word 'types'
is dropped and relationship types are called simply 'relationships' but since there is a difference
between the terms, one should really use the term relationship type.
Real-world entities have relationships between them, and relationships between entities on the
of a network of entity types and connecting relationship types. A relationship type is a named
association between entities. Individual entities have individual relationships of the type between
entity-relationship diagram, this is generalized into entity types and relationship types. The entity
type PERSON is related to the entity type HOUSE by the relationship type OCCUPIES. There
are lots of individual persons, lots of individual houses, and lots of individual relationships
linking them.
There can be more than one type of relationship between entities. Entities in an organization do
not exist in isolation but are related to each other. Students take courses and each STUDENT
entity is related to the COURSE entity. Faculty members teach courses and each FACULTY
entity is also related to the COURSE entity. Consequently, the STUDENT entity is related to the
FACULTY entity through the COURSE entity. E-R diagrams can also illustrate relationships
relationship set is a grouping of all matching relationship instances, and the term relationship
Figure 3.4: The relationship between FACULTY and COURSE entities in an E-R diagram.
In an E-R diagram, relationship types are represented with diamond-shaped boxes connected by
straight lines to the rectangles that represent participating entity types. A relationship type is
given a name that is displayed in this diamond-shaped box and typically takes the form of a
present tense verb or verb phrase that describes the relationship. An E-R diagram may depict a
relationship as shown in Figure 3.4 between the entities FACULTY and COURSE.
(iv) Degree of Relationship: The number of entity sets that participate in a relationship is
Example 5: The degree of the relationship featured in Figure 3.4 is two because FACULTY and
COURSE are two separate entity types that participate in the relationship. The three most
common degrees of a relationship in a database are unary (degree 1), binary (degree 2), and
Let E1, E2, ..., En denote n entity sets and let R be the relationship. The degree of the
Example 6: Let two students be roommates who stay together in a hostel. Because they share the
same address, a unary relationship exists between them for the attribute Address in Figure 3.3.
Example 8: Consider a student using certain equipment for a project. In this case, the
STUDENT, PROJECT, and EQUIPMENT entity types relate to each other with ternary
(v) Cardinality of a Relationship: The term cardinal number refers to the number used in
counting. When we say cardinality of a relationship, we mean the ability to count the number of
Example 9: If the entity types A and B are connected by a relationship, then the maximum
cardinality represents the maximum number of instances of entity B that can be associated with
However, we don't need to assign a number value for every level of connection in a relationship.
In fact, the term maximum cardinality refers to only two possible values: one or many. While
this may seem too simple, the division between one and many allows us to categorize all of
the combinations possible in any relationship. The maximum cardinality value of a relationship,
then, allows us to define the four types of relationships possible between entity types A and B.
can be associated with a given instance of entity A. However, only one instance of entity A can
Example 10: While a customer of a company can make many orders, an order can only be
can be associated with a given instance of entity B. However, only one instance of entity B can
d) Many-to-Many Relationship: In a many-to-many relationship, many instances of entity
A can be associated with a given instance of entity B, and, likewise, many instances of entity B
Example 12: A machine may have different parts, while each individual part may be used in
different machines.
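The four maximum-cardinality categories can be illustrated by inspecting a set of relationship instances. The sketch below is only an illustration: the function classify_cardinality and the sample pairs are assumptions, and in practice cardinality is a design rule stated for the relationship type, not something inferred from whatever data happens to be present.

```python
from collections import Counter


def classify_cardinality(pairs):
    """Classify (a, b) relationship instances as '1:1', '1:M', 'M:1' or 'M:M'.

    The left side describes how many A instances one B can have,
    the right side how many B instances one A can have.
    """
    pairs = set(pairs)
    b_per_a = Counter(a for a, _ in pairs)   # number of B instances per A
    a_per_b = Counter(b for _, b in pairs)   # number of A instances per B
    left = "M" if any(n > 1 for n in a_per_b.values()) else "1"
    right = "M" if any(n > 1 for n in b_per_a.values()) else "1"
    return f"{left}:{right}"


# One customer places many orders; each order belongs to one customer.
customer_orders = [("c1", "o1"), ("c1", "o2"), ("c2", "o3")]
# A machine has many parts; a part is used in many machines.
machine_parts = [("m1", "p1"), ("m1", "p2"), ("m2", "p1")]

customer_order_kind = classify_cardinality(customer_orders)
machine_part_kind = classify_cardinality(machine_parts)
```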
(vi) Representing Relationship Types: Figure 3.5 displays how we represent different
relationship types in an E-R diagram. An entity on the one side of the relationship is represented
by a vertical line, "I", which intersects the line connecting the entity and the relationship. Entities
on the many side of a relationship are designated by a crowfoot as depicted in Figure 3.5.
(vii) Role Names and Recursive Relationships: Each entity type in a relationship plays a
particular role. The role name specifies the role that a participating entity type plays in the
relationship and explains what the relationship means. For example, in the relationship between
Employee and Department, the Employee entity type plays the employee role, and the
Department entity type plays the department or employer role. In most cases the role names do
not have to be specified, but in cases where the same entity participates more than once in a
Example 13: Let there be two entity types, MANAGER and ORGANIZATION. The
relationship name is manages. MANAGER plays the role of worker (employee) and
ORGANIZATION plays the role of owner (employer). Further employee manages the
In a recursive relationship, the same entity type participates more than once in a relationship
type, in different roles.
Example 14: In the Company schema, each employee has a supervisor, we need to include the
entity type participates twice in the relationship, once as an employee and once as a supervisor,
and therefore we can specify two roles, employee and supervisor as shown in Figure 3.6.
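One common way to realize such a recursive relationship in a relational table is a column that refers back to the same table. The sketch below uses Python's built-in sqlite3 module; the table name employee, the column supervisor_id and the sample names are assumptions for illustration, and the top-level employee is given no supervisor so that the chain can end.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("""
    CREATE TABLE employee (
        emp_id        INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        -- Same entity type in a second role: the supervisor is also an employee.
        supervisor_id INTEGER REFERENCES employee(emp_id)
    )
""")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", None), (2, "Ravi", 1), (3, "Meena", 1)],
)

# Join the table with itself: e plays the supervisee role, s the supervisor role.
supervision = conn.execute("""
    SELECT e.name, s.name
    FROM employee AS e
    JOIN employee AS s ON e.supervisor_id = s.emp_id
    ORDER BY e.emp_id
""").fetchall()
```

The two aliases in the query correspond to the two role names of the relationship.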
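[Figure 3.5: Representation of 1:1, 1:M, M:1 and M:M relationship types R between entities A and B.]
[Figure 3.6: Recursive relationship Supervises on the Employee entity type, with the roles Supervisor and Supervisee.]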
3.3.2 Relationship Constraints
Relationship types have certain constraints that limit the possible combination of entities that
Example 15: An example of a constraint is that if we have the entities Doctor and Patient, the
organization may have a rule that a patient cannot be seen by more than one doctor. This
constraint needs to be described in the schema. There are two main types of relationship
Binary relationships are relationships between exactly two entities. The cardinality ratio specifies
the maximum number of relationship instances that an entity can participate in. The possible
cardinality ratios for binary relationship types are 1:1, 1:N, N:1 and M:N. Cardinality ratios are
shown on ER diagrams by displaying 1, M and N at the diamond boxes. The ratio shown closest
to an entity represents the ratio the other entity has to that entity.
The participation constraint specifies whether the existence of an entity depends on its being
related to another entity via the relationship type. The constraint specifies the minimum number
of relationship instances that each entity can participate in. There are two types of participation
constraints:
a) Total Participation:
If an entity can exist only if it participates in at least one relationship instance, the
participation is called total participation, meaning that every entity in one set must be
An example would be the Employee and Department relationship. If company policy
states that every employee must work for a department, then an employee can exist only
if it participates in at least one relationship instance (i.e. an employee can't exist without
a department).
Total participation is represented by a double line, going from the relationship to the
dependent entity.
b) Partial Participation:
If only a part of the set of entities participate in a relationship, then it is called partial
participation.
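The two participation constraints can be sketched with relational tables. In the illustration below (Python's sqlite3 module), a NOT NULL foreign key forces every employee to belong to a department, which models total participation, while a separate manages table lists only some employees, which models partial participation. All table names, column names and sample rows are assumptions for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        -- Total participation: an employee row cannot exist without a department.
        dept_id INTEGER NOT NULL REFERENCES department(dept_id)
    );
    CREATE TABLE manages (
        -- Partial participation: only employees listed here manage a department.
        dept_id INTEGER PRIMARY KEY REFERENCES department(dept_id),
        emp_id  INTEGER NOT NULL REFERENCES employee(emp_id)
    );
""")
conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(1, "Asha", 1), (2, "Ravi", 1)])
conn.execute("INSERT INTO manages VALUES (1, 1)")

try:
    # An employee without a department violates total participation.
    conn.execute("INSERT INTO employee VALUES (3, 'Meena', NULL)")
    no_dept_accepted = True
except sqlite3.IntegrityError:
    no_dept_accepted = False

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
manager_count = conn.execute("SELECT COUNT(DISTINCT emp_id) FROM manages").fetchone()[0]
```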
Using the Company example, every employee will not be a manager of a department, so
For example, in the relationship Works_On, between the Employee entity and the
Department entity we would like to keep track of the number of hours an employee
relationship.
Another example is for the "manages" relationship between employee and department,
For some relationships (1:1, or 1:N), the attribute can be placed on one of the
participating entity types. For example the "Manages" relationship is 1:1, StartDate can
3.3.3 Keys
Keys are, as their name suggests, a key part of a relational database and a vital part of the
structure of a table. They ensure each record within a table can be uniquely identified by one or a
combination of fields within the table. They help enforce integrity and help identify the
relationship between tables. There are three main types of keys, i.e. candidate keys, primary keys
and foreign keys. In addition, there are super keys, alternate (or secondary) keys, which are
candidate keys not chosen as the primary key, and composite keys, as explained
below:
i) Candidate Key: A candidate key is any set of one or more columns whose combined
values are unique among all occurrences and the key cannot be further reduced. Since a null
value is not guaranteed to be unique, no component of candidate key is allowed to be null. There
can be any number of candidate keys in a table. Two properties must be satisfied by a candidate
key:
Irreducible: The attributes which are used to form the keys should not be further broken down
Candidate Key
StudentID | First Name | Last Name | Class | Marks
101 | Satvik | Juneja | PGDCA | 2900
Example 16: In Table 3.1, we might have a student_id that uniquely identifies
the students in a student table. This would be a candidate key. But in the same table we might
have the student's first name and last name that also, when combined, uniquely identify the
Once your candidate keys have been identified, you can select one to be your primary key
ii) Super Key: A super key is a combination of attributes that can uniquely identify a
database record. A table might have many super keys. Candidate keys are a special subset of
super keys that do not have any extraneous information in them. In other words, if we add another
attribute to a candidate key, the combination still satisfies the uniqueness property and
is known as a super key. The main properties are as follows:
Example 17: In Table 3.1, the attribute student_id acts as the candidate key, and it is also a
super key, as it satisfies the uniqueness property. If we add other attributes to that key, say
FirstName, LastName, Class or Marks, the combination still satisfies the uniqueness property,
so it is a super key.
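The difference between a super key and a candidate key can be checked mechanically on sample rows. The sketch below is illustrative only: the helper names is_super_key and is_candidate_key are assumptions, the second and third rows are made up, and a real key is a rule that must hold for every possible table content, not just for the rows at hand.

```python
from itertools import combinations


def is_super_key(rows, attrs):
    """True if the values of attrs are unique across all rows."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def is_candidate_key(rows, attrs):
    """True if attrs is a super key and no proper subset is one (irreducible)."""
    if not is_super_key(rows, attrs):
        return False
    return not any(
        is_super_key(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


# First row follows Table 3.1; the other rows are hypothetical.
rows = [
    {"StudentID": 101, "FirstName": "Satvik", "LastName": "Juneja", "Class": "PGDCA"},
    {"StudentID": 102, "FirstName": "Aryan", "LastName": "Juneja", "Class": "PGDCA"},
    {"StudentID": 103, "FirstName": "Aryan", "LastName": "Sharma", "Class": "PGDCA"},
]

id_is_candidate = is_candidate_key(rows, ("StudentID",))
id_name_is_super = is_super_key(rows, ("StudentID", "FirstName"))
id_name_is_candidate = is_candidate_key(rows, ("StudentID", "FirstName"))
full_name_is_candidate = is_candidate_key(rows, ("FirstName", "LastName"))
```

Here StudentID together with FirstName is a super key but not a candidate key, because StudentID alone is already unique.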
iii) Primary Key (PK): Primary keys are used to uniquely identify rows in a relational
database design. A primary key usually consists of a single table column, but may consist of multiple
columns as well. It is possible for a table to have more than one column with unique values,
however only one primary key can be defined. Each column with distinct values is called a
unique key. If we have more than one candidate key in our relation, we choose one of them
as the primary key. Primary keys can be defined at the time of table creation or can be added after
the table has been created. The following points should be kept in mind while making a primary key
a) No row can have an empty value (called null) in the primary key column.
b) The value of the primary key attribute must not be duplicated in any tuple/row.
c) The primary key should be composed of the minimum number of attributes that satisfies the
d) The value of the primary key remains the same during the lifetime of the relation.
Primary Key
Roll | Name | Class | Marks
103 | Aryan | PGDCA | 2875
103 | Mukta | PGDCA | 2870 (Invalid Entry)
Example 18: In Table 3.2, there are six tuples/rows in total. Row no. 5 is invalid
because Roll 103 is duplicated: the rows with the Name values Mukta and Aryan have the
same Roll, 103, which does not satisfy the properties of a primary key. Further, row no. 6 has a
null Roll. It is also an invalid entry, since a primary key does not allow null values.
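The two invalid entries of Example 18 can be reproduced with a small sketch using Python's sqlite3 module. The table and column names are assumptions, and the row with the null Roll is made up for illustration. Note that SQLite, for historical reasons, accepts NULL in most primary key columns unless NOT NULL is stated explicitly, so the sketch declares it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        roll       TEXT NOT NULL PRIMARY KEY,  -- NOT NULL stated explicitly for SQLite
        name       TEXT,
        class_name TEXT,
        marks      INTEGER
    )
""")


def insert(row):
    """Try to insert one row and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


results = [
    insert(("103", "Aryan", "PGDCA", 2875)),
    insert(("103", "Mukta", "PGDCA", 2870)),   # duplicate Roll
    insert((None, "Satvik", "PGDCA", 2900)),   # null Roll (hypothetical row)
]
```

Only the first row is stored; the other two are rejected by the primary key constraint.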
iv) Foreign key: A foreign key (FK) is a field or group of fields in a database record that
points to a key field or group of fields forming a key of another database record in some (usually
different) table. Usually a foreign key in one table refers to the primary key (PK) of another
table. This way references can be made to link information together and it is an essential part of
database normalization.
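A foreign key reference can be sketched as follows, again with Python's sqlite3 module. The faculty and course tables, their columns and the sample rows are assumptions for illustration; the point is only that a row whose foreign key value has no matching primary key in the referenced table is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE faculty (
        faculty_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE course (
        course_id  INTEGER PRIMARY KEY,
        title      TEXT NOT NULL,
        -- Foreign key: points to the primary key of the faculty table.
        faculty_id INTEGER REFERENCES faculty(faculty_id)
    );
""")
conn.execute("INSERT INTO faculty VALUES (1, 'Dr. Rao')")
conn.execute("INSERT INTO course VALUES (10, 'DBMS', 1)")

try:
    # Faculty 99 does not exist, so this reference would dangle.
    conn.execute("INSERT INTO course VALUES (11, 'Networks', 99)")
    dangling_allowed = True
except sqlite3.IntegrityError:
    dangling_allowed = False

course_count = conn.execute("SELECT COUNT(*) FROM course").fetchone()[0]
```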
v) Alternate Key: An alternate key or secondary key is a candidate key which is not
selected to be the primary key. In a relation there may be a number of attributes which uniquely
identify the rows of a table. These attributes are called candidate keys. Out of these candidate
keys one is selected as the primary key of the relation and the remaining candidate key left after
vi) Composite Key: A composite key is a key that consists of two or more attributes that
uniquely identify the rows of a relation. Composite keys are also known as compound, concatenated or
aggregate keys. A composite key must be irreducible, and it cannot contain null values.
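A composite key can be sketched with a table whose rows are identified only by a pair of columns. In the illustration below the enrollment table and its columns are assumptions; neither student_id nor course_id is unique on its own, but the combination is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL,
        course_id  INTEGER NOT NULL,
        PRIMARY KEY (student_id, course_id)   -- composite key
    )
""")
conn.execute("INSERT INTO enrollment VALUES (101, 1)")
conn.execute("INSERT INTO enrollment VALUES (101, 2)")  # same student, other course
conn.execute("INSERT INTO enrollment VALUES (102, 1)")  # other student, same course

try:
    # The pair (101, 1) already exists, so the composite key rejects it.
    conn.execute("INSERT INTO enrollment VALUES (101, 1)")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

row_count = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
```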
3.3.4 ER Model/ Diagram
The Entity-Relationship (ER) model was originally proposed by Peter Chen in 1976 as a way to unify
the network and relational database views. Simply stated, the ER model is a conceptual data
model that views the real world as entities and relationships. A basic component of the model is
the Entity-Relationship diagram, which is used to visually represent data objects. For the
• It maps well to the relational model. The constructs used in the ER model can easily be
• It is simple and easy to understand with a minimum of training. Therefore, the model can
be used by the database designer to communicate the design to the end user.
• In addition, the model can be used as a design plan by the database developer to
There are two techniques used for the purpose of data base designing from the system
specialized graphic technique that illustrates the interrelationships between entities in a database.
ER diagrams often use symbols to represent three different types of information. Boxes are
commonly used to represent entities, diamonds are used to represent relationships, and ovals/
ellipses are used to represent attributes. E-R models are designed diagrammatically using
Entity-Relationship diagrams, which represent the elements of the conceptual model. The overall
logical structure of a database can be expressed graphically by an E-R diagram. Table 3.3 shows
[Table 3.3: E-R diagram notations for entity type, attribute, key attribute, derived attribute, multivalued attribute, composite attribute, relationship type, total participation, and many-to-one relationship.]
The following are advantages of an E-R model:
1. Visual Representation: The foremost benefit of an ERD is that it provides a
visual representation of the design. An ERD is normally crucial for
coming up with an effective database design, because the patterns help the designer
focus on the way the database will work with all the data flows and interactions. It
is common for the ERD to be used together with data flow diagrams so as to attain a better visual
database and their relationship with each other. ERD normally uses symbols for representing
three varying kinds of information. Diamonds are used for representing the relationships, ovals
are usually used for representing attributes and boxes represent the entities. This allows a
3. Simple to Understand: An ERD is easy to understand and simple to create. In effect, the design
can be shown to the representatives for both approval and confirmation. The
representatives can also make their contributions to the design, allowing the possibilities of
4. High Flexibility: The ERD model is quite flexible to use, as other relationships can be derived
easily from the already existing ones. This can be done using other relational tables and
mathematical formulae.
1. No Industry Standard for Notation: There is no industry standard notation for developing
an E-R diagram.
2. Popular for High-Level Design: The E-R data model is especially popular for high-level design.
For each entity set and relationship set, there is a unique table which is assigned the name of the
corresponding set. Each table has a number of columns with unique names.
Step 1: For regular entity type E in ER schema, create a relation R that includes all the
primary key.
Step 2: For weak entity type W in ER schema, with owner entity type E, create a relation
primary key attributes of the relation Q for the owner entity type E. Primary key is
Step 3: For 1:1 relationship X, suppose S and T are the relations for the entity types
Step 4: For 1: N relationship Y, suppose S relation corresponds to the entity type at the N-
side and T relation corresponds to the entity type at the other side. Include
[Figure: E-R diagram fragment showing CUSTOMER, BRANCH and Account, with the attributes Account_no and Balance.]
Step 6: For multi-valued attributes A, create a new relation R that includes an attribute
corresponding to A. Include primary key of the relation of the entity type having
Step 7: For n-ary relationship type X, and n>2, create a new relation R, include primary
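A few of the mapping steps can be sketched as table definitions using Python's sqlite3 module. This is a minimal sketch under stated assumptions, not the complete procedure: the table and column names are made up, the treatment of the 1:N relationship (placing the primary key of the one-side relation as a foreign key in the N-side relation) is the usual textbook approach assumed here, and the multi-valued attribute follows Step 6 by getting its own relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Step 1: a regular (strong) entity type becomes a relation with its key.
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    -- Step 6: the multi-valued attribute Languages becomes its own relation,
    -- holding the attribute plus the primary key of the owning entity.
    CREATE TABLE student_language (
        student_id INTEGER NOT NULL REFERENCES student(student_id),
        language   TEXT NOT NULL,
        PRIMARY KEY (student_id, language)
    );
    -- Step 4 (assumed usual treatment): for a 1:N relationship, the N-side
    -- relation carries the key of the one-side relation as a foreign key.
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
    );
""")
conn.execute("INSERT INTO student VALUES (101, 'Aryan')")
conn.executemany("INSERT INTO student_language VALUES (?, ?)",
                 [(101, "Hindi"), (101, "English")])

languages = [row[0] for row in conn.execute(
    "SELECT language FROM student_language WHERE student_id = 101 ORDER BY language")]
```

Each language a student speaks is stored as a separate row, so the student relation itself stays single-valued.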
3.4 Summary
The Entity-Relationship (E-R) model is a high-level conceptual data model developed by Chen in
1976 to facilitate database design. In this chapter, we presented an overview of E-R
modeling. Different types of entities, attributes and the relationships among them were elaborated.
We also illustrated the key concepts that are important in the E-R design process, and
discussed how to construct an E-R diagram and how to map an E-R model into
relational tables.
1. Elmasri & Navathe: Fundamentals of Database systems, 3rd Edition, Addison Wesley,
New Delhi.
2. Korth & Silberschatz: Database System Concept, 4th Edition, McGraw Hill International
Edition.
3. Raghu Ramakrishnan & Johannes Gehrke: Database Management Systems, 2nd edition,
Delhi.
5. Bipin C. Desai: An Introduction to Database System, Galgotia Publication, New Delhi
1. What is E-R modeling? What are the components of E-R model? Discuss.
2. Differentiate between entity, entity type and entity set. Explain the different types of
entities.
3. What do you mean by attribute? Explore the different types of attributes with examples.
4. Outline the different notations and naming conventions used to represent an E-R diagram.
6. Distinguish between
Chapter – 4: Database Design: Case Studies
4.1 Introduction
designer's skill. A DBA can improve performance by adjusting some DBMS parameters, like the size
of the buffer pool or the frequency of checkpoints. The overall database design activity has to
undergo a systematic process called the design methodology. The overall process includes
conceptual and external schema design, that is created as a collection of relations and views
along with a set of integrity constraints, we must address performance goals through physical
database design, in which we design the physical schema. It is usually necessary to tune
4.2 Objective
The design process consists of two parallel activities. The first activity involves the design of
the data content and structure of the database. The second activity relates to the design of the
database application. These two activities strongly influence each other. Traditionally, database
design methodologies have focused on different phases, as discussed in the upcoming sections. These
phases are similar to software design phases, but need not strictly follow the sequence of these phases.
Proper database design is the only way that the database application will be efficient, flexible,
and easy to manage and maintain. An important aspect of database design is to use relationships
between tables instead of throwing all your data into one long flat file. Types of relationships
Using relationships to properly organize your data is called normalization. There are many levels
of normalization, but the primary levels are the first, second, and third normal forms. Each level
has a rule or two that you must follow. Following all the rules helps ensure that your database is
To take an idea from inception through to fruition, you should follow a design process. This
process essentially says, "Think before you act." Discuss rules, requirements, and objectives;
then create the final version of your normalized tables. The systematic process of designing a
and business needs of an organization, modeling the specified requirements, and realizing the
requirements using a database. The goal of designing a database is to produce an efficient,
high-quality, minimum-cost database. In large organizations, the database administrator (DBA) is
responsible for designing an efficient database system and for controlling the
database life-cycle process. The overall database design and implementation process consists of
several phases.
i) Requirement Collection and Analysis: This is the process of learning and analyzing the
expectations of the users for the new database application in as much detail as possible. A team
of analysts or requirement experts is responsible for carrying out the task of requirement
analysis. They review the current file processing system or DBMS, and interact with the
users extensively to analyze the nature of the business area to be supported and to justify the need
for data and databases. The initial requirements may be informal, incomplete, inconsistent, and
(OOA), data flow diagrams (DFDs), etc., are used to transform these requirements into better
structured form. This phase can be quite time-consuming; however, it plays the most crucial and
important role in the success of the database system. The result of this phase is the document
ii) Conceptual Database Design: In this phase, the database designer selects a suitable data
model and translates the data requirements resulting from previous phase into a conceptual
database schema by applying the concepts of chosen data model. The conceptual schema is
independent of any specific DBMS. The main objective of conceptual schema is to provide a
detailed overview of the organization. In this phase, a high-level description of the data and
constraints is developed. The entity-relationship (E-R) diagram is generally used to represent
the conceptual database design. The conceptual schema should be expressive, simple,
iii) Choice of a DBMS: The choice of a DBMS depends on many factors such as cost,
DBMS features and tools, underlying model, portability, and DBMS hardware requirements. The
technical factors that affect the choice of a DBMS are the type of DBMS (relational, object,
object-relational, etc.), storage structures and access paths that DBMS supports, the interfaces
available, the types of high-level query languages, and the architecture it supports (client/server,
parallel or distributed). The various types of costs that must be considered while choosing a
DBMS are software and hardware acquisition cost, maintenance cost, database creation and
iv) Logical Database Design: Once an appropriate DBMS is chosen, the next step is to map
the high-level conceptual schema onto the implementation data model of the selected DBMS. In
this phase, the database designer moves from an abstract data model to the implementation of the
database. In case of relational model, this phase generally consists of mapping the E-R model
v) Physical Database Design: In this phase, the physical features such as storage structures,
file organization, and access paths for the database files are specified to achieve good
performance. The various options for file organization and access paths include various types of
vi) Database System Implementation: Once the logical and physical database designs are
completed, the database system can be implemented. DDL statements of the selected DBMS are
used and compiled to create the database schema and database files, and finally the database is
vii) Testing and Evaluation: In this phase, the database is tested and fine-tuned for the
performance, integrity, concurrent access, and security constraints. This phase is carried out in
parallel with application programming. If the testing fails, various actions are taken such as
software or hardware.
Keep in mind that once the application programs are developed, the physical database design is
still relatively easy to change. The logical database design, however, is difficult to modify,
because a change may affect the queries (written using DML commands) embedded in the program
code. Thus, it is necessary to carry out the design process effectively before developing the
application programs. While designing a database schema, it is necessary to avoid two major
issues, namely, redundancy and incompleteness. These problems may lead to bad database design.
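The difference between a physical and a logical design change can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module, and the employee table, its columns, the index name and the sample rows are hypothetical names chosen for the demonstration, not part of any case study in this chapter.

```python
import sqlite3

# Hypothetical schema and rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " e_id INTEGER PRIMARY KEY,"
    " l_name TEXT NOT NULL,"
    " phone TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO employee (e_id, l_name, phone) VALUES (?, ?, ?)",
    [(1, "Sharma", "9800000001"), (2, "Kaur", "9800000002")],
)

# A DML query as it might be embedded in an application program.
QUERY = "SELECT e_id FROM employee WHERE l_name = ?"
before_index = conn.execute(QUERY, ("Kaur",)).fetchall()

# Physical design change: add an access path (an index).
# The embedded query is untouched and returns the same rows.
conn.execute("CREATE INDEX idx_employee_l_name ON employee (l_name)")
after_index = conn.execute(QUERY, ("Kaur",)).fetchall()

# Logical design change: the table is renamed.
# The embedded query now refers to a table that no longer exists.
conn.execute("ALTER TABLE employee RENAME TO staff")
try:
    conn.execute(QUERY, ("Kaur",))
    query_survives_logical_change = True
except sqlite3.OperationalError:
    query_survives_logical_change = False

print(before_index, after_index, query_survives_logical_change)
```

Adding the index leaves the application query as it was, whereas renaming the table forces every embedded query that mentions it to be rewritten.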
The following points must be kept in mind when drawing effective ER diagrams:
1. Identify all the relevant entities in a given system and determine the relationships among
these entities.
3. Provide a precise and appropriate name for each entity, attribute, and relationship in the
diagram. Terms that are simple and familiar always beat vague, technical-sounding words.
In naming entities, remember to use singular nouns. However, adjectives may be used to
distinguish entities belonging to the same class (part-time employee and full-time employee,
The Inventory System provides a complete set of methods to support inventory handling. All
users of the Inventory System need the same functionality to complete their varied tasks.
Notify the store of a customer's intent to purchase an item that is not currently in stock.
(backorder)
Notify the store of a customer's intent to purchase an item that has never been in stock.
(preorder)
preorder.
Decrease the number of items available for purchase, backorder, or preorder, perhaps
Determine when a specific item will be back in stock.
To draw an ER diagram of the Inventory system, the following components of the ER diagram are
taken care of:
Entity Purpose
the organization.
dealing with.
specific category.
customer.
2. Respective attributes of the entity along with their type and name of the constraint.
Attribute Constraint
MName, LName
Fax Number(10)
Email Varchar2(25)
PayMethod Char(10)
MName, LName
MName, LName
Fax Number(10)
Email Varchar2(25)
Supplierid Varchar2(3) Foreign Key
Discount Number(3,2)
Uprice Number(3,2)
ODetailID Varchar2(3) Foreign Key
Discount Number(3,2)
Specimen of an E-R diagram: The ER diagram given below is just one idea for solving the problem.
[Figure: E-R diagram of the Inventory System. Entities: Supplier, Product, Category, Customer,
Staff, Role, Order, OrderDetail, Payment. Relationship labels: Supplies, Belongs to, Registers,
Takes, Orders, Has, Contains, Having.]
(ii) Draw an E-R diagram of Payroll System:
A payroll system refers to the scheme that is used to pay employees in a firm. A payroll refers to
the financial records that relate to the payment of the employees. A payroll database is an
automated system that allows you to input employees' payroll information and compensate them
accordingly. The database may be a stand-alone system that enables only payroll operations, or
an integrated system that enables related business functions. Here we restrict our scope to the
stand-alone system. A stand-alone payroll database is a single payroll application that you use
to perform payroll tasks. This option may come in handy if you already have HR and accounting
solutions in place. An effective stand-alone database gives you a complete range of services that
allow you to fully manage your payroll activities. These include new-hire reporting, wage and
deduction calculations, check printing, direct deposit, wage garnishments, tax reporting and
keeping etc.
To draw an ER diagram of the Payroll system, the following components of the ER diagram are
taken care of:
Entity Purpose
Salary To record the complete detail of the salary to the employee.
2. Respective attributes of the entity along with their type and name of the constraint.
Attribute Constraint
L_NAM
Email Varchar2(20)
Phone Number(10) Not Null
Specimen of an E-R diagram: The ER diagram given below is just one idea for solving the problem.
[Figure: E-R diagram of the Payroll System. Entities: EMPLOYEE, COMPANY, BRANCH, DEPARTMENT,
EMPLOYER, SALARY. Relationship labels: WORKS IN, HAS, EMPLOYES, HEADED BY, GETS, BELONGS, PAYS.
Salary attributes shown: BASIC, ALLOWANCE, PERQS, TOT_SAL, TAX, NET_SAL.]
(iii) Draw an E-R diagram of Reservation System:
We now live in an era where practically everything is inextricable from the internet, including
business. It is now crucial that every business, no matter the sector, has a recognizable web
presence. This helps in organizing and reserving tours and other activity services online. An
online reservation system is "used to store and retrieve information about tour product, tour
product options or lodging facility and conduct transactions for booking it." As a case study,
we discuss the air reservation system as a reference. An airline needs to maintain multiple
types of information such as route information, aircraft information, schedule information, fare
To draw an ER diagram of the Reservation system, the following components of the ER diagram are
taken care of:
Entity Purpose
2. Respective attributes of the entity along with their type and name of the constraint.
Attribute Constraint
To Varchar2(15) Not Null
Lname
To Flight, Passenger 1 : N
Specimen of an E-R diagram: The ER diagram given below is just one idea for solving the problem.
[Figure: E-R diagram of the Air Reservation System. Entities: Flight, Passenger, Airplane,
Booking. Attributes shown include Flight no., From, To, Departure Date, Departure Time, Arrival
Date, Passenger id., Name (FName, Lname), Address, Model No., Capacity, Registration No., class,
Seat_No., Booking_Date, Online_payment.]
(iv) Draw an E-R diagram of Online Book Store:
Shopping for books online helps you find the best possible price for just about any book you
want. If you're in the market for rare, collectible or autographed books, it's much cheaper and
faster to search online than it would be to call up local used and independent bookstores that
carry these types of items. The features available on many online bookstores also allow you to
compare similar titles with the click of a mouse and read reviews from professionals and
customers. You can also resell your used books to get more cash in your pocket and to clear out
your cluttered bookshelf. It's never been easier to ensure you never get stuck with a crummy title
again. A quality online bookstore will have a good product selection, an easy-to-use -yet
1. Entities identified for drawing an ER diagram of Online Book Store are as follows:
Entity Purpose
stocked.
2. Respective attributes of the entity along with their type and name of the constraint.
Attribute Constraint
MName, LName
MName, LName
MName, LName
Specimen of an E-R diagram: The ER diagram given below is just one idea for solving the problem.
[Figure: E-R diagram of the Online Book Store. Entities: CUSTOMER, BOOKSTORE, ORDER, BOOKS,
BOOKS PDF, AUTHOR, WAREHOUSE. Relationship labels: VISITS, CONTAINS, PLACE ORDER, MAY HAVE,
WRITTEN BY, PUBLISHED BY, STOCKS.]
4.3.3 Some Other Specimen ER Diagram
(ii) An ER Diagram of University System:
(iii) An ER Diagram of an Organization System:
(iv) An ER Diagram of Banking System:
ER diagrams constitute a very useful framework for creating and manipulating databases. Some
First, ER diagrams are easy to understand and do not require a person to undergo
extensive training to be able to work with them efficiently and accurately. This means that
designers can use ER diagrams to easily communicate with developers, customers, and
Second, ER diagrams are readily translatable into relational tables which can be used to
Lastly, ER diagrams may be applied in other contexts such as describing the different
4.4 Summary
The entity-relationship diagrams in this chapter depict the major concepts and relationships
needed for managing the real-life case studies. Each is neither a complete data model depicting
every necessary relational database table, nor is it meant to be an exact design for
implementations of such real-life case studies. Alternate models may capture the necessary
attributes and relationships. Therefore, in this chapter an attempt has been made to design
some useful case studies which will assist developers with envisioning the complexity of the
environment that an ERM system must address, and ensure that crucial relationships and features
1. What are the different phases of database designing? Explain each in detail.
3. What are the uses of E-R diagram? Draw an E-R diagram of library system of an
Institute.
5. What steps should be kept in mind while designing an ER diagram of University system?
Discuss.
diagram.
Chapter – 5: Data Models
5.1 Introduction
Data models in a DBMS are frameworks that help you create and use databases. DBMS stands
for database management system. Various types of DBMS exist, differing in speed, flexibility
and implementation. Each type has advantages over the others, but no single type is superior in
every respect. The kind of structure and data you need determines which data model suits your
needs best.
A data model can be thought of as a flowchart or diagram that shows data relationships among
objects. It can be time-intensive to capture all the data in a model but this should not be rushed as
end users to control, maintain or create records in a data base. Primarily, features of DBMS
address database creation for record interrogation, queries and data extraction. The difference
between an application development environment and a DBMS system ranges from the
A data model not only describes the structure of the data, it also defines a set of operations that
can be performed on the data. A data model generally consists of data model theory, which is a
formal description of how data may be structured and used, and data model instance, which is a
practical data model designed for a particular application. The process of applying a data model
5.2 Objective
abstraction that concentrates on the essential, inherent aspects of an organization and ignores
the accidental properties. A data model represents the organization itself. It should provide the
basic concepts and notations that will allow database designers and end users to communicate
their understanding of the organizational data unambiguously and accurately. This chapter focuses
on the popular data models, i.e., the Hierarchical Data Model, Network Data Model, and Relational
Data Model. The terminologies of these models are discussed in detail using simple examples.
Some other data models that can also be used to represent data are explained as well. A data
model makes it easy to understand the actual meaning of data to ensure that:
The main objective of database system is to highlight only the essential features and to hide the
storage and data organization details from the user. This is known as data abstraction. A database
model provides the necessary means to achieve data abstraction. A database model or simply a
data model is an abstract model that describes how the data is represented and used. A data
model consists of a set of data structures and conceptual tools that is used to describe the
Depending on the concept they use to model the structure of the database, the data models are
categorized into three types, namely, high-level or conceptual data models, representational or
Conceptual data model describes the information used by an organization in a way that is
independent of any implementation-level issues and details. The main advantage of conceptual
data model is that it is independent of implementation details and hence, can be understood even
by the end users having non-technical background. The most popular conceptual data model is,
2) Low-level or physical data models - Physical data model describes the data in terms of
a collection of files, indices, and other storage structures such as record formats, record ordering,
and access paths. This model specifies how the database will be implemented in a particular DBMS
software such as Oracle or Sybase, by taking into account the facilities and constraints of a
given database management system. It also describes how the data is stored on disk and what
oriented) - The representational or implementation data models hide some data storage details
from the users; however, they can be implemented directly on a computer system. Representational
data models are used most frequently in all traditional commercial DBMSs. The various
Some data models are schematics which depict the manner in which data records are connected
or related within a file structure. These are called record or structural data models. Some data
models are used to identify the subjects of corporate data processing - these are called
entity-relationship data models. Still another type of data model is used for analytic purposes to help
the analyst to solidify the semantics associated with critical corporate or business concepts.
II. Communications tool to facilitate interaction among the designer, the applications
III. Good database design uses an appropriate data model as its foundation
Modern database implementation models were not created in a vacuum. Instead, they are the
end result of decades of evolution. This evolution has taken the form of a series of
progressively more sophisticated data models. Such models are designed by the researchers with
ii. It should be simple and expressible to design the data in the database.
Some of the common characteristics among data models are given as follows:
iii. Representation of real-world transformations (behavior) must be in compliance with
The hierarchical data model is the oldest type of data model, developed by IBM in 1968. This
data model organizes the data in a tree-like structure, in which each child node (also known as
a dependent) can have only one parent node. A database based on the hierarchical data model
comprises a set of records connected to one another through links. A link is an association
between two or more records. The top of the tree structure consists of a single node that does not
The root may have any number of dependents; each of these dependents may have any number
of lower level dependents. Each child node can have only one parent node and a parent node can
have any number of (many) child nodes. It, therefore, represents only one-to-one and one-to-
many relationships. The collection of same type of records is known as a record type. Figure 5.2
shows the hierarchical model of Online Book database. It consists of three record types, namely,
PUBLISHER, BOOK, and REVIEW. For simplicity, only few fields of each record type are
Advantages:
Disadvantages:
Figure 5.2: Hierarchical Model of Online Book database
In order to understand the hierarchical data model better, let us take the example of the sample
database consisting of supplier, parts and shipments. The record structure and some sample
records for supplier, parts and shipments elements are as given in following tables.
We assume that each row in Supplier table is identified by a unique SNo (Supplier Number) that
uniquely identifies the entire row of the table. Likewise each part has a unique Pno (Part
Number). Also we assume that no more than one shipment exists for a given supplier/part
Hierarchical View for the Suppliers-Parts Database
The tree structure has the parts record superior to the supplier record. That is, parts form the
parent and suppliers form the children. Each of the four trees in the figure consists of one part
record occurrence, together with a set of subordinate supplier record occurrences. There is one
supplier record for each supplier of a particular part. Each supplier occurrence includes the
corresponding shipment quantity.
For example, supplier S3 supplies 300 quantities of part P2. Note that the set of supplier
occurrences for a given part occurrence may contain any number of members, including zero (as in
the case of part P4). Part P1 is supplied by two suppliers, S1 and S2. Part P2 is supplied by three
suppliers, S1, S2 and S3, and part P3 is supplied only by supplier S1, as shown in the figure.
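The hierarchical view described above can be approximated with a nested Python structure. This is only a minimal sketch, not how a hierarchical DBMS stores records: the dictionary layout and the function names are assumptions made for illustration, and apart from the facts stated in the text (S1 is in Qadian, S2 ships 250 of P1, S3 ships 300 of P2) the cities and quantities are hypothetical placeholders.

```python
# Parts are parents; supplier occurrences are children.
# From the text: S1 is in Qadian, S2 ships 250 of P1, S3 ships 300 of P2.
# All other cities and quantities are hypothetical placeholders.
hierarchy = {
    "P1": [
        {"sno": "S1", "city": "Qadian", "qty": 100},
        {"sno": "S2", "city": "Amritsar", "qty": 250},
    ],
    "P2": [
        {"sno": "S1", "city": "Qadian", "qty": 200},
        {"sno": "S2", "city": "Amritsar", "qty": 150},
        {"sno": "S3", "city": "Ludhiana", "qty": 300},
    ],
    "P3": [{"sno": "S1", "city": "Qadian", "qty": 400}],
    "P4": [],  # a parent may exist without any children
}


def suppliers_of(part_no):
    """Top-down access: follow a single tree from its root."""
    return [child["sno"] for child in hierarchy[part_no]]


def parts_of(supplier_no):
    """Bottom-up access: every tree has to be scanned."""
    return [
        part_no
        for part_no, children in hierarchy.items()
        if any(child["sno"] == supplier_no for child in children)
    ]


print(suppliers_of("P2"))
print(parts_of("S2"))
```

The two functions answer mirror-image questions, yet only the first can follow the tree directly; the second has to visit every part.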
There are four basic operations, Insert, Update, Delete and Retrieve, that can be performed on
each model. Now, we consider in detail how these basic operations are performed in
Insert Operation: It is not possible to insert the information of a supplier, e.g. S4, who does
not supply any part, because a node cannot exist without a parent. In contrast, a part P5 that is
not supplied by any supplier can be inserted without any problem, because a parent can exist
without any child. So, we can say that the insert anomaly exists only for those children that
have no corresponding parent.
Update Operation: Suppose we wish to change the city of supplier S1 from Qadian to
Jalandhar. We then have to carry out two operations: search for S1 under each part, and
update every occurrence of S1 that is found. But if we wish to change the city of part
P1 from Qadian to Jalandhar, these problems do not occur, because there is only a single
entry for part P1 and no inconsistency can arise. So, we can say that update
anomalies exist only for children, not for parents, because children may have multiple entries
in the database.
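The insert and update anomalies described above can be reproduced on a small in-memory tree. The sketch below is illustrative only: the helper names insert_supplier and update_supplier_city are hypothetical, and every city other than Qadian is a placeholder.

```python
# Parts are parents; supplier occurrences are children (cities partly hypothetical).
hierarchy = {
    "P1": [{"sno": "S1", "city": "Qadian"}, {"sno": "S2", "city": "Amritsar"}],
    "P2": [
        {"sno": "S1", "city": "Qadian"},
        {"sno": "S2", "city": "Amritsar"},
        {"sno": "S3", "city": "Ludhiana"},
    ],
    "P3": [{"sno": "S1", "city": "Qadian"}],
    "P4": [],
}


def insert_supplier(tree, sno, city, part_no=None):
    """A child occurrence can only be stored under an existing parent."""
    if part_no is None or part_no not in tree:
        raise ValueError(f"supplier {sno} needs a parent part")
    tree[part_no].append({"sno": sno, "city": city})


def update_supplier_city(tree, sno, new_city):
    """Search every tree and change each occurrence; return the count."""
    changed = 0
    for children in tree.values():
        for child in children:
            if child["sno"] == sno:
                child["city"] = new_city
                changed += 1
    return changed


# Inserting a parent without children is unproblematic.
hierarchy["P5"] = []

# Inserting a supplier that supplies no part is rejected.
try:
    insert_supplier(hierarchy, "S4", "Batala")
    insert_error = None
except ValueError as err:
    insert_error = str(err)

# One logical change touches several stored occurrences of S1.
occurrences_changed = update_supplier_city(hierarchy, "S1", "Jalandhar")
print(insert_error, occurrences_changed)
```

If the update loop stopped early, some occurrences of S1 would still show the old city, which is exactly the inconsistency the text warns about.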
record. Hence, the only way to delete a shipment (or supplied quantity) is to delete the
corresponding supplier record. But such an action will lead to loss of information about the
supplier, which is not desired. For example, if supplier S2 stops supplying 250 quantity of part
P1, then the whole record of S2 has to be deleted under part P1, which may lead to loss of the
supplier's information. Another problem arises if we wish to delete the information of a part and
that part happens to be the only part supplied by some supplier. In the hierarchical model,
deletion of a parent also causes the deletion of its child records, and if a child occurrence is
the only occurrence in the whole database, then the information of that child record is also lost
with the deletion of the parent. For example, if we wish to delete the information of part P2,
then we also lose the information of suppliers S3, S2 and S1. The information of S2 and S1 can be
obtained from P1, but the information
Record Retrieval: Record retrieval methods for hierarchical model are complex and
asymmetric.
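The delete anomaly discussed above can be checked with a short sketch. It assumes the same supplier-part assignment as the text (P1: S1, S2; P2: S1, S2, S3; P3: S1; P4: none); the function name delete_part is a hypothetical helper.

```python
# Each part (parent) lists the suppliers (children) stored beneath it.
hierarchy = {
    "P1": ["S1", "S2"],
    "P2": ["S1", "S2", "S3"],
    "P3": ["S1"],
    "P4": [],
}


def delete_part(tree, part_no):
    """Delete a parent with its children; return suppliers lost entirely."""
    removed = set(tree.pop(part_no))
    still_present = {sno for children in tree.values() for sno in children}
    return removed - still_present


lost = delete_part(hierarchy, "P2")
print(sorted(lost))
```

Deleting P2 removes three child occurrences, but only S3 disappears from the database altogether, because S1 and S2 are still stored under other parts.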
(ii) Network Data Model
The first specification of network data model was presented by Conference on Data Systems
but complicated. In a network model the data is also represented by a collection of records, and
relationships among data are represented by links. However, the link in a network data model
represents an association between precisely two records. Thus, the complete network of
relationships is represented by several pairwise sets; in each set some (one) record type is owner
(at the tail of the network arrow) and one or more record types are members (at the head of the
relationship arrow). Usually, a set defines a 1: M relationship, although 1:1 is permitted. Like
hierarchical data model, each record of a particular record type represents a node. However,
unlike hierarchical data model, all the nodes are linked to each other without any hierarchy. The
main difference between hierarchical and network data model is that in hierarchical data model,
the data is organized in the form of trees and in network data model, the data is organized in the
form of graphs. Figure 5.3 shows the network model of Online Book database.
Advantages:
It includes data definition language (DDL) and data manipulation language (DML)
Disadvantages:
Navigational system yields complex implementation, application development and
management.
Considering again the sample supplier-part database, its network view is shown. In addition to
the part and supplier record types, a third record type is introduced, which we will call the
connector. A connector occurrence specifies the association (shipment) between one supplier and
one part. It contains data (quantity of the parts supplied) describing the association between
All connector occurrences for a given supplier are placed on a chain. The chain starts from a
supplier and finally returns to the supplier. Similarly, all connector occurrences for a given part
are placed on a chain starting from the part and finally returning to the same part.
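The connector idea can be sketched as follows. This is a simplification under stated assumptions: a real network DBMS links occurrences with pointer rings, whereas the sketch simply filters a list; the Connector class, the function names, the part names, and most cities and quantities are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Connector:
    """One shipment: links exactly one supplier with exactly one part."""
    sno: str
    pno: str
    qty: int


# Each supplier and each part is stored once (values partly hypothetical).
suppliers = {"S1": "Qadian", "S2": "Amritsar", "S3": "Ludhiana"}
parts = {"P1": "Nut", "P2": "Bolt", "P3": "Screw", "P4": "Cam"}
connectors = [
    Connector("S1", "P1", 100),
    Connector("S2", "P1", 250),
    Connector("S1", "P2", 200),
    Connector("S2", "P2", 150),
    Connector("S3", "P2", 300),
    Connector("S1", "P3", 400),
]


def supplier_chain(sno):
    """All connectors reachable from one supplier."""
    return [c for c in connectors if c.sno == sno]


def part_chain(pno):
    """All connectors reachable from one part."""
    return [c for c in connectors if c.pno == pno]


print([c.pno for c in supplier_chain("S1")])
print([c.sno for c in part_chain("P2")])
```

Both chains are built the same way, which is why access from the supplier side and from the part side is symmetric in this model.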
Operations on Network Model
Insert Operation: To insert a new record containing the details of a new supplier, we simply
create a new record occurrence. Initially, there will be no connector. The new supplier's chain
will simply consist of a single pointer starting from the supplier to itself.
For example, a supplier S4 that does not supply any part can be inserted in the network model as
a new record occurrence with a single pointer from S4 to itself. This is not possible in the
hierarchical model. Similarly, a new part that is not supplied by any supplier can be inserted.
Consider another case: if supplier S1 now starts supplying part P3 with quantity 100, then a new
connector containing 100 as the supplied quantity is added to the model and the pointer of S1
We can summarize that, unlike the hierarchical model, the network model has no insert anomalies.
Update Operation: Unlike the hierarchical model, where an update required a search and caused
many inconsistency problems, in a network model updating a record is a much easier
process. We can change the city of S1 from Qadian to Jalandhar without search or inconsistency
problems because the city for S1 appears at just one place in the network model. Similarly, same
Delete Operation: If we wish to delete the information of any part, say P1, then that record
occurrence can be deleted by removing the corresponding pointers and connectors, without
affecting the suppliers who supply that part; the model is modified as shown. Similarly,
In order to delete the shipment information, the connector for that shipment and its
corresponding pointers are removed without affecting supplier and part information.
For example, if supplier S1 stops the supply of part P1 with 250 quantity, the model is modified as
Retrieval Operation: Record retrieval methods for network model are symmetric but complex.
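The insert, update and delete behaviour described for the network model can be sketched with three separate collections. This is an illustration only: the dictionary keyed by (supplier, part) stands in for the connector chains, the delete_part helper is hypothetical, and all quantities and every city except Qadian and Jalandhar are placeholders.

```python
# Suppliers and parts are stored once; connectors hold the shipments.
suppliers = {"S1": "Qadian", "S2": "Amritsar", "S3": "Ludhiana"}
parts = {"P1", "P2", "P3", "P4"}
connectors = {
    ("S1", "P1"): 100,
    ("S2", "P1"): 250,
    ("S1", "P2"): 200,
    ("S2", "P2"): 150,
    ("S3", "P2"): 300,
    ("S1", "P3"): 400,
}

# Insert: a supplier with no shipments needs no parent record.
suppliers["S4"] = "Batala"

# Update: the city of S1 is stored in exactly one place.
suppliers["S1"] = "Jalandhar"

# Delete a shipment: only the connector goes; supplier and part remain.
del connectors[("S2", "P1")]


def delete_part(pno):
    """Remove a part and its connectors, leaving all suppliers intact."""
    parts.discard(pno)
    for key in [k for k in connectors if k[1] == pno]:
        del connectors[key]


delete_part("P2")
print(sorted(suppliers), sorted(parts), sorted(connectors))
```

After P2 is deleted, S3 no longer has any shipment but its own record survives, which contrasts with the hierarchical case.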
The relational data model was developed by E. F. Codd in 1970. In the relational data model,
unlike the hierarchical and network models, there are no physical links. All data is maintained in
the form of tables (generally, known as relations) consisting of rows and columns. Each row
(record) represents an entity and a column (field) represents an attribute of the entity. The
relationship between the two tables is implemented through a common attribute in the tables and
not by physical links or pointers. This makes the querying much easier in a relational database
system than in the hierarchical or network database systems. Thus, the relational model has
become more programmer-friendly and much more dominant and popular in both industrial and
academic scenarios. Oracle, Sybase, DB2, Ingres, Informix, and MS SQL Server are a few of the
popular relational DBMSs. Figure 5.4 shows the relational model of Online Book database.
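A relationship implemented through a common attribute, rather than a physical link, can be sketched with Python's built-in sqlite3 module. The table layout, column names and rows below are hypothetical and are not taken from Figure 5.4; they only show the mechanism.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE publisher (
        p_id  TEXT PRIMARY KEY,
        pname TEXT NOT NULL
    );
    CREATE TABLE book (
        isbn  TEXT PRIMARY KEY,
        title TEXT NOT NULL,
        p_id  TEXT NOT NULL REFERENCES publisher (p_id)  -- common attribute
    );
    -- Hypothetical sample rows.
    INSERT INTO publisher VALUES ('P001', 'Hypothetical Press');
    INSERT INTO publisher VALUES ('P002', 'Example House');
    INSERT INTO book VALUES ('001-1', 'Sample DBMS Title', 'P001');
    INSERT INTO book VALUES ('001-2', 'Sample C++ Title', 'P002');
    INSERT INTO book VALUES ('001-3', 'Sample Unix Title', 'P001');
    """
)

# The two tables are related only by matching p_id values at query time.
rows = conn.execute(
    "SELECT b.title, p.pname"
    " FROM book AS b JOIN publisher AS p ON b.p_id = p.p_id"
    " WHERE p.p_id = ?"
    " ORDER BY b.isbn",
    ("P001",),
).fetchall()
print(rows)
```

No pointer from a book to its publisher is stored; the join condition alone reconstructs the association.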
Properties of Relational Tables:
Advantages:
Powerful RDBMS isolates the end user from physical level details and improves
Entity: Publisher
Entity: Book
Entity: Review
R_ID Rating
A0002 6.0
A0006 7.5
Disadvantages:
Conceptual simplicity gives relatively untrained people the tools to use a good system
poorly, and if unchecked, it may produce the same data anomalies found in file systems.
model
When we compare data models such as the hierarchical model, network model, and relational
model, we can identify a number of differences in terms of data structures, data manipulation and
data integrity.
structure for storing data in a identify multiple branches based on relation
one root and a number like several trees which share dimensional table.
them
Data One to many or one to Allowed the network model to One to One,
relationships to many
relationships
Data Based on parent child A record can have many Based on relational
children.
as SQL)
manipulation are complex and complex and symmetric are simple and
asymmetric symmetric
Data integrity Cannot insert the Does not suffer from any Does not suffer from
parent.
Data integrity Multiple occurrences Free from update anomalies. Free from update
lead to problems of
inconsistency during
Data integrity Deletion of parent Free from deletion anomalies Free from deletion
child records
The Entity - Relationship Model (E-R Model) is a high-level conceptual data model developed
designing a successful database. A conceptual data model is a set of concepts that describe the
structure of a database and the associated retrieval and update transactions on the database. A
high-level model is chosen so that all the technical aspects are also covered. The E-R data model
grew out of the exercise of using commercially available DBMSs to model the database. The E-R
model is a generalization of the earlier commercial models such as the Hierarchical and
the Network Model. It also allows the representation of various constraints as well as their
relationships. The Entity-Relationship (E-R) Model is based on a view of the real world as a set
of objects called entities and relationships among entity sets, which are basically groups of
similar objects. A relationship between entity sets is represented by a named E-R relationship
and is of 1:1, 1:N or M:N type, which tells the mapping from one entity set to another. The
Entity-Relationship model has one important advantage: inasmuch as it is not DBMS-specific, and
is in fact not a DBMS model at all, data models can be developed by the
design team without first having to make a choice as to which DBMS to use.
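As one way to see what an M:N relationship type implies once a relational DBMS is eventually chosen, the sketch below maps a "written by" relationship between books and authors onto a separate linking table. It uses Python's sqlite3 module; the table names, keys and rows are hypothetical, and other mappings are possible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE author (a_id TEXT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE book (isbn TEXT PRIMARY KEY, title TEXT NOT NULL);
    -- The M:N relationship becomes its own table of key pairs.
    CREATE TABLE written_by (
        isbn TEXT NOT NULL REFERENCES book (isbn),
        a_id TEXT NOT NULL REFERENCES author (a_id),
        PRIMARY KEY (isbn, a_id)
    );
    -- Hypothetical sample rows.
    INSERT INTO author VALUES ('A1', 'First Author');
    INSERT INTO author VALUES ('A2', 'Second Author');
    INSERT INTO book VALUES ('B1', 'Sample Title One');
    INSERT INTO book VALUES ('B2', 'Sample Title Two');
    INSERT INTO written_by VALUES ('B1', 'A1');
    INSERT INTO written_by VALUES ('B1', 'A2');
    INSERT INTO written_by VALUES ('B2', 'A1');
    """
)

# One book can have many authors ...
authors_of_b1 = [
    row[0]
    for row in conn.execute(
        "SELECT a_id FROM written_by WHERE isbn = ? ORDER BY a_id", ("B1",)
    )
]
# ... and one author can have many books.
books_of_a1 = [
    row[0]
    for row in conn.execute(
        "SELECT isbn FROM written_by WHERE a_id = ? ORDER BY isbn", ("A1",)
    )
]
print(authors_of_b1, books_of_a1)
```

A 1:N relationship would not need the extra table; a foreign key column on the "many" side is usually enough.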
1. The E-R diagram used for representing E-R Model can be easily converted into Relations
2. The E-R Model is used for the purpose of good database design by the database developers so
3. It is helpful as a problem decomposition tool as it shows the entities and the relationship
4. It is inherently an iterative process. On later modifications, the entities can be inserted into this
model.
5. It is very simple and easy to understand by various types of users and designers because
Object DBMSs add database functionality to object-oriented programming languages. They
bring much more than persistent storage to programming language objects. Object DBMSs
extend the semantics of the C++, Smalltalk and Java object programming languages to provide
full-featured database programming capability, while retaining native language compatibility. A
major benefit of this approach is the unification of the application and database development into
a seamless data model and language environment. As a result, applications require less code, use
more natural data modeling, and have code bases that are easier to maintain. Object developers can write
According to Rao (1994), "The object-oriented database (OODB) paradigm is the combination of
object-oriented programming language (OOPL) systems and persistent systems. The power of
the OODB comes from the seamless treatment of both persistent data, as found in databases, and
In contrast to a relational DBMS, where a complex data structure must be flattened out to fit into
tables or joined together from those tables to form the in-memory structure, object DBMSs avoid
this mapping overhead when storing or retrieving a web or hierarchy of interrelated objects. This
one-to-one mapping of object programming language objects to database objects has two benefits
over other storage approaches: it provides higher performance management of objects, and it
enables better management of the complex interrelationships between objects. This makes object
DBMSs better suited to support applications such as financial portfolio risk analysis systems,
telecommunications service applications, World Wide Web document structures, design and
manufacturing systems, and hospital patient record systems, which have complex relationships
between data.
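The contrast between flattening a structure into rows and storing it as one unit can be illustrated loosely in Python. This is only an analogy and not an object DBMS: the standard pickle module stands in for object persistence, plain dictionaries stand in for objects, and the patient record is hypothetical.

```python
import pickle

# Hypothetical patient record with interrelated objects (a cyclic graph).
patient = {"name": "Hypothetical Patient", "visits": []}
for reason in ("check-up", "follow-up"):
    patient["visits"].append({"reason": reason, "patient": patient})

# Relational-style storage: the graph is flattened into rows ...
patient_rows = [(1, patient["name"])]
visit_rows = [(1, visit["reason"]) for visit in patient["visits"]]

# ... and has to be joined back together to rebuild the structure.
rebuilt = {"name": patient_rows[0][1], "visits": []}
for patient_id, reason in visit_rows:
    if patient_id == patient_rows[0][0]:
        rebuilt["visits"].append({"reason": reason, "patient": rebuilt})

# Object-style storage: the whole graph is written and read as one unit.
restored = pickle.loads(pickle.dumps(patient))

print([visit["reason"] for visit in restored["visits"]])
print(restored["visits"][0]["patient"] is restored)
```

In the restored copy each visit still points back to its patient without any join step, which is the kind of one-to-one mapping the paragraph above refers to.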
Advantages:
Disadvantages:
Slow development of standards caused vendors to supply their own enhancements, thus
Object/relational database management systems (ORDBMSs) add new object storage capabilities
to the relational systems at the core of modern information systems. These new facilities
integrate management of traditional fielded data, complex objects such as time-series and
geospatial data and diverse binary media such as audio, video, images, and applets. By
encapsulating methods with data structures, an ORDBMS server can execute complex analytical
and data manipulation operations to search and transform multimedia and other complex objects.
As an evolutionary technology, the object/relational (OR) approach has inherited the robust
of its object-oriented cousin. Database designers can work with familiar tabular structures and
Query and procedural languages and call interfaces in ORDBMSs are familiar: SQL3, vendor
procedural languages, and ODBC, JDBC, and proprietary call interfaces are all extensions of
RDBMS languages and interfaces. And the leading vendors are, of course, quite well known:
(viii) Semi structured Data Model
Unlike other data models, where every data item of a particular type must have the same set of
attributes, the semi structured data model allows individual data items of the same type to have
different sets of attributes. In the semi structured data model, the description of
the data (the schema) is contained within the data itself, which is why such data is sometimes called
self-describing data. In such databases there is no clear separation between the data and the schema,
so data of any type is allowed. The semi structured data model has recently emerged as an important
There are data sources, such as the Web, which are to be treated as databases; however,
There is a need for a flexible format for data exchange between heterogeneous databases.
Semi structured data model facilitates data exchange among heterogeneous data sources. It helps
to discover new data easily and store it. It also facilitates querying the database without knowing
(i) Useful for the Personnel: Now that you know the different data models, it is only
right that you know the advantages of data models in a DBMS. A DBMS installation is staffed by
database administrators and managers who oversee the entire operation of the DBMS. Their primary
duties are making sure the primary schedule is run daily and loading program releases, while
technicians and programmers have the job of finding errors in the software during testing.
(ii) Useful for Records Interrogation: Programs for records interrogation are designed to
provide information to end users through programs such as general inquiry programs,
report generators and Query. The Query program is the most popular one: it allows end users to
develop basic programming skills by constructing simple programs that extract data through a
query language processor. For records interrogation, Query programs are
quite powerful.
(iii) Useful for Catalog Programs: In a DBMS, end users are able to catalog their
favorite programs for deleting, editing or viewing data. Each user is able to copy routines to a
user-defined catalog file for managing databases. The catalog system is a personal tool
that lets end users run programs without having an applications specialist design programs
for them.
(iv) Useful for Accessing Data: Typically, a DBMS uses centralized databases, so there is no
need to create program access or to wait for intervention from a programmer. The record
structures and the database are already built into the software. The advantage in this area is
ready access to data records and structures.
5.4 Summary
Models are blueprints of a plan and play a major role in the success of any project. In fact, models are
more suitable than pictures for expressing an idea because they add logic and reasoning
to the picture. Following this idea, an early
proposal for a standard terminology and general architecture of a database system was
produced in 1971 by the DBTG (Data Base Task Group) appointed by the Conference on Data
Systems Languages (CODASYL). The DBTG recognized the need for a two-level approach
with a system view called the schema and a user view called the subschema. The American National
Standards Institute Standards Planning and Requirements Committee (ANSI-SPARC) in 1975 recognized the need
for a three-level approach with a system catalog. Therefore, they proposed the relational data model.
With technological improvements, and as the need arose, further models were introduced
1. Elmasri & Navathe: Fundamentals of Database systems, 3rd Edition, Addison Wesley,
New Delhi.
2. Korth & Silberschatz : Database System Concept, 4th Edition, McGraw Hill International
Edition.
4. C.J. Date: An Introduction to Database Systems, 7th Edition, Addison Wesley, New Delhi.
5. Alexis Leon, Mathews Leon: Database Management Systems, Vikas Publishing House
1. What do you mean by a Data Model? Discuss the different types of data models used.
2. Define data model. Under which categories data models are broadly classified? What is
3. Discuss the different data model along with their advantages and disadvantages.
4. Draw a comparative chart among Hierarchical, Network and Relational data models.
Chapter – 6: Relational Algebra
6.1 Introduction
Relational algebra is a formal language describing how new relations are created from old ones.
It is a useful tool for describing queries on a database management system. In addition to defining the data
structure and constraints, a data model must include a set of operations to manipulate the data. A
basic set of relational model operations constitutes the relational algebra. In modern relational
database management systems each relation is stored as a table. Each row in the table represents
one tuple of the relation, and each column one attribute. In that sense, we can think of
relational algebra as a language that can be used to describe operations for creating new tables
6.2 Objective
Relational algebra is a query language that is being used to explain basic relational operations
and their principles. Most of the currently used relational database management systems work
with SQL queries. Relational algebra is a good example of procedural language. This helps in
understanding the essential features of the relational model. Relational algebra explores the
various concepts of integrity that apply to the relational model. This chapter will teach us that
Relational algebra is a formal language describing how new relations are created from old ones.
It is a useful tool for describing queries on a database management system. In relational database
management systems each relation is stored as a table. Each row in the table represents one tuple
of the relation, and each column one attribute. In that sense, we can think of relational
algebra as a language that can be used to describe operations for creating new tables from
existing tables in a database management system. Given below are some of the important reasons
Similar to normal algebra (as in 2+3*x-y), except we use relations as values instead of
The inner, lower-level operations of a relational DBMS are, or are similar to, relational
Some advanced SQL queries require explicit relational algebra operations, most
Relations are seen as sets of tuples, which means that no duplicates are allowed. SQL
SQL is declarative, which means that you tell the DBMS what you want, but not how it is
to be calculated. A C++ or Java program is procedural, which means that you have to
state, step by step, exactly how the result should be calculated. Relational algebra is
The schema of a relation is similar to the format or structure of a table. The schema of a relation
is the set of attributes that forms a tuple for that relation. For example, the following describes
Student(student #, first name, last name, street address, city, state, zip, phone, major, GPA)
When we wish to obtain information from a database, we use a language like SQL to create a
query for the database management system to process. The response from the database will be a
result set with a particular schema. The definition above says "Relational algebra is a formal
language describing how new relations are formed from existing relations." If we think of the
tables in the database as the "existing relations", and the result set of the query as the "new
relations", then relational algebra is a language that can be used to describe database queries that
Relational algebra is a procedural query language, which takes instances of relations as input and
yields instances of relations as output. It uses operators to perform queries. An operator can be
either unary or binary. The algebra operation thus produces new relations, which can be further
manipulated using operations of the same algebra. Relational algebra specifies the operations to
perform on existing relation to derive result relations. It defines the complete schema for each of
the result relations. The relational algebraic operations can be divided into two basic groups.
1) Relational-Oriented Operations
2) Set-Oriented Operations
This group consists of operations developed specifically for relational databases; these include
SELECT, PROJECT and JOIN. These operations are explained in upcoming paragraphs:
This operation is used to select only those tuples from a relation that satisfy a selection
condition (predicate), as shown in the figure given below. It can be considered a filter on the rows of
a relation on the basis of certain criteria. In relational algebra the SELECT operation is denoted by
the symbol σ (sigma). In general, the SELECT operation expression is given by
σ <Selection Condition>(R)
where σ is the symbol that denotes the SELECT operation, and the selection condition is a Boolean
Consider a relation Employee. We can retrieve the rows of the employees who work for the
finance department and earn a salary of more than Rs. 25000. We can individually specify each
The Boolean expression specified in <selection condition> is made up of an attribute name, an
operator, and a constant value or another attribute name. In the above example, SALARY is the attribute
name, the operator is greater than (>) and 25000 is a constant value.
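The SELECT operation described above can be sketched in Python. This is only an illustration under stated assumptions: the relation is modelled as a list of dictionaries, and the attribute names NAME, DEPT and SALARY and the sample rows are hypothetical, not taken from the text.

```python
# Hypothetical EMPLOYEE relation: one dict per tuple (sample data, not from the text).
EMPLOYEE = [
    {"NAME": "Mukta", "DEPT": "FINANCE", "SALARY": 34000},
    {"NAME": "Satvik", "DEPT": "FINANCE", "SALARY": 22000},
    {"NAME": "Aryan", "DEPT": "RESEARCH", "SALARY": 41000},
]


def select(relation, predicate):
    """Return the tuples of `relation` for which `predicate` is true."""
    return [row for row in relation if predicate(row)]


# sigma DEPT = 'FINANCE' AND SALARY > 25000 (EMPLOYEE)
finance_high = select(
    EMPLOYEE, lambda t: t["DEPT"] == "FINANCE" and t["SALARY"] > 25000
)
print(finance_high)
```

The selected tuples keep every attribute of the original relation; only the number of rows changes.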
The SELECT operation is Unary; it means it is applied to a single relation. The degree of the
relation (number of displayed attributes) resulting from the SELECT operation is the same as
Example
Query: Retrieve the Id, Name, Age of Students who live in Kurukshetra.
STUDENT
Result
The project operation selects certain columns from the table and discards the other columns. The
projection operation is used to either reduce the number of attributes in the resultant relation or to
reorder attributes. The projection of a relation is defined as a projection of all tuples over some
set of attributes. In relational algebra PROJECT operation is denoted by the symbol π (pi).
In general, the PROJECT operation expression is given by
π <attribute list>(R)
where π is the symbol that denotes the PROJECT operation, and the attribute list is a list of attributes from
Consider a relation Employee. We can retrieve only some columns (say Name, Age and Salary
only) of employee table. Then the simplest PROJECT operation expression is given as follows:
ΠNAME,AGE,SALARY (EMPLOYEE)
ΠSEX,SSN (EMPLOYEE)
The result of the PROJECT operation has only the attributes specified in <attribute list>, in the
same order as they appear in the list. The PROJECT operation removes any duplicate tuples.
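A minimal Python sketch of the PROJECT operation follows. The STUDENT rows and the list-of-dictionaries representation are assumptions made for illustration; the point is that only the listed attributes survive, in the listed order, and that duplicate tuples are dropped.

```python
# Hypothetical STUDENT relation (sample data assumed for illustration).
STUDENT = [
    {"ID": 100, "NAME": "Vinod", "CITY": "Kurukshetra"},
    {"ID": 200, "NAME": "Mukta", "CITY": "Kurukshetra"},
    {"ID": 300, "NAME": "Aryan", "CITY": "Delhi"},
]


def project(relation, attributes):
    """Keep only `attributes`, in the given order, dropping duplicate tuples."""
    seen = set()
    result = []
    for row in relation:
        values = tuple(row[a] for a in attributes)
        if values not in seen:  # a relation is a set, so no duplicates
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result


cities = project(STUDENT, ["CITY"])           # pi CITY (STUDENT)
names_ids = project(STUDENT, ["NAME", "ID"])  # attributes reordered
print(cities)
print(names_ids)
```

Projecting on CITY returns two tuples from three rows, which is why the result of PROJECT can have fewer tuples than its input.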
The PROJECT operation is Unary; it means it is applied to a single relation. The number of
tuples in a relation resulting from PROJECT operation is always less than or equal to total
Example
STUDENT
ΠID,NAME (STUDENT)
Result
ID NAME
The Join operation is used to combine related tuples from two relations into single tuples. The
tuples from the operand relations that participate in the operation and contribute to the result are
related. The join operation allows the processing of relationships existing between the operand
In general, the JOIN operation expression is given by
R ⋈ <join condition> S
where ⋈ is the symbol that denotes the JOIN operation, and the join condition is a Boolean expression
Suppose we want to retrieve the name of the manager of each department. We need to combine each
department tuple with the employee tuple whose SSN matches the MGRSSN value in
The JOIN operation is binary; that is, it is always applied to two relations. The result of the
Generally, a JOIN operation is performed with equality comparisons only. Such a JOIN, where the
only comparison operator used is equality (=), is called an EQUIJOIN. The result of an EQUIJOIN
always has one or more pairs of attributes that have identical values.
If the two join attributes have the same name in both relations, such type of join is known as
Example
Query: Retrieve the information of student who enrolls in at least one course.
STUDENT
ID NAME CITY
100 Vinod Kaithat
ENROL
EID COURSEID
200 101
100 113
300 101
Result
There are several types of joins, but the most basic type is the Cartesian join. Other joins,
including the natural join, the equi-join and the theta join, are variations of the Cartesian join in
which special rules are applied. Cartesian product will be discussed later in this chapter. Rest of
a) Natural Join
A natural join is performed on two relations that share at least one attribute, and is defined as
follows:
The natural join of relation A with relation B is a new relation formed by matching all tuples
from relation A one by one with all tuples from relation B one by one, but only where the value
of the shared attributes are the same. Each shared attribute is only included once in the schema
of the result set. A natural join can only be performed on two relations that have at least one
shared attribute.
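The natural join definition above can be sketched as follows. The relations are modelled as lists of dictionaries, and the rows are hypothetical; in particular, the enrolment relation is assumed here to call its student attribute ID so that the two relations share an attribute name.

```python
# Hypothetical relations; the shared attribute is named ID in both.
STUDENT = [
    {"ID": 100, "NAME": "Vinod"},
    {"ID": 200, "NAME": "Mukta"},
    {"ID": 400, "NAME": "Aryan"},
]
ENROL = [
    {"ID": 200, "COURSEID": 101},
    {"ID": 100, "COURSEID": 113},
    {"ID": 300, "COURSEID": 101},
]


def natural_join(a, b):
    """Match tuples of `a` and `b` that agree on every shared attribute."""
    if not a or not b:
        return []
    shared = [attr for attr in a[0] if attr in b[0]]
    if not shared:
        raise ValueError("natural join needs at least one shared attribute")
    return [
        {**ra, **rb}  # each shared attribute appears only once in the result
        for ra in a
        for rb in b
        if all(ra[attr] == rb[attr] for attr in shared)
    ]


enrolled = natural_join(STUDENT, ENROL)
print(enrolled)
```

Student 400 has no enrolment and enrolment 300 has no student, so neither appears in the result.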
b) Theta Join
A theta join is similar to a Cartesian join, except that only those tuples are included that meet a
The theta join of relation A with relation B is a new relation formed by matching all tuples from
relation A one by one with all tuples from relation B one by one, but only where the tuples meet
a specified condition, called the theta predicate. If the relations share any attributes, then each
shared attribute is only included once in the schema of the result set.
The general symbol for a theta join is composite symbol, similar to the symbol for a natural join
subscripted with the Greek letter theta: ⋈θ .We would write C = A ⋈θ B. In practice, the theta
c) Equi-Join
An equi-join, which is similar to both a theta join and a natural join, is defined as follows:
The equi-join of relation A with relation B is a new relation formed by matching all tuples from
relation A one by one with all tuples from relation B one by one, but only where the tuples meet
a specified condition of equality, called the equi-join predicate. If the relations share any
attributes, then each shared attribute is only included once in the schema of the result set.
The symbolism for an equi-join is similar to the symbol for a natural join subscripted with the
equal sign: ⋈= .We would write C = A ⋈= B. Just as with the theta join, in practice the equal
The difference between an equi-join and a theta join is that the condition must be one of equality
in an equi-join. The difference between an equi-join and a natural join is that the two relations
Example:
We wish to match groups of people waiting for tables at a restaurant with the available tables, on
the condition that the number of people in the group equals the number of seats at the table.
Let W = the relation with data for all the groups waiting for tables
Let T = the relation with data for all of the available tables
M = W ⋈group.size = table.seats T
In this notation the attribute names are shown in their more complex form, relation.attribute, so
that group.size refers to the size attribute of the group relation, and table.seats refers to the seats
attribute of the table relation.
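The restaurant example can be tried out with a small Python sketch. The group names, table numbers and sizes below are invented sample data, and `theta_join` is a hypothetical helper that takes the join condition as a function; using an equality test as that condition makes it an equi-join.

```python
# Hypothetical data: W holds waiting groups, T holds available tables.
W = [
    {"group": "Garg", "size": 4},
    {"group": "Nath", "size": 2},
    {"group": "Rao", "size": 6},
]
T = [
    {"table": 1, "seats": 2},
    {"table": 2, "seats": 4},
    {"table": 3, "seats": 4},
]


def theta_join(a, b, predicate):
    """Pair every tuple of `a` with every tuple of `b`; keep pairs satisfying `predicate`."""
    return [{**ra, **rb} for ra in a for rb in b if predicate(ra, rb)]


# M = W join (group.size = table.seats) T
M = theta_join(W, T, lambda g, t: g["size"] == t["seats"])
print(M)
```

Replacing the equality test with any other comparison (for example `g["size"] <= t["seats"]`) turns the same call into a general theta join.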
In general, several relational algebra operations are applied one after another. In this situation we
can write the operations as a single nested algebra expression, or apply one operation at a time by
creating intermediate results. For example, to retrieve the first name, last name, and salary of all
employees who work in the department 'FINANCE', we must apply the SELECT and
PROJECT operations. The combined relational algebra operation can be given as follows:
These are the traditional set theory operations: UNION, INTERSECTION, SET
DIFFERENCE and CARTESIAN PRODUCT. They are binary operations; that is, they are
applied to two sets. Except for CARTESIAN PRODUCT, these operations can be applied only to relations that have the same number
of attributes with the same data types. This condition is called union compatibility. In other words, two
relations R and S are said to be union compatible if they have the same degree n and each pair of
(i) Union
The Union of relation A and relation B is a new relation containing all of the tuples contained in
either relation A or relation B. Union can only be performed on two relations that have the same
schema. The symbol for union is ∪. In relational algebra we would write something like R3 =
R1 ∪ R2. In other words, Union is a binary operation performed on two relations; the new
table contains all the rows from both tables, but a row that appears in both tables is shown only
once in the resulting table. The two tables must be of the same degree, i.e. the same number of columns and
Example:
R1
Mukta 1 34
Satvik 2 25
Aryan 3 18
R2
Arpit 5 13
Sidhant 10 17
Prerna 15 41
Satvik 2 25
Aryan 3 18
R3 = R1 ∪ R2
Mukta 1 34
Satvik 2 25
Aryan 3 18
Arpit 5 13
Sidhant 10 17
Prerna 15 41
The union operation is both commutative and associative.
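The union example can be reproduced with Python sets of tuples. The sketch below uses the R1 and R2 rows from the example; the `union_compatible` check is a simplified assumption (same number of columns and the same Python type in each position) rather than a full schema comparison.

```python
R1 = {("Mukta", 1, 34), ("Satvik", 2, 25), ("Aryan", 3, 18)}
R2 = {
    ("Arpit", 5, 13), ("Sidhant", 10, 17), ("Prerna", 15, 41),
    ("Satvik", 2, 25), ("Aryan", 3, 18),
}


def union_compatible(r, s):
    """Same degree, and the same Python type in each column position."""
    signatures = {tuple(type(v) for v in row) for row in r | s}
    return len(signatures) <= 1


def union(r, s):
    """R ∪ S for union-compatible relations; duplicate tuples appear once."""
    if not union_compatible(r, s):
        raise ValueError("relations are not union compatible")
    return r | s


R3 = union(R1, R2)
print(sorted(R3))
print(union(R1, R2) == union(R2, R1))  # commutative
```

The two tuples that occur in both relations are stored once, so R3 has six tuples rather than eight.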
(ii) Intersection
Like union, intersection means pretty much the same thing in relational algebra that it does in
The Intersection of relation A and relation B is a new relation containing all of the tuples that are
contained in both relation A and relation B. Intersection can only be performed on two relations
that have the same schema. The symbol for intersection is ∩. In relational algebra we would
Example:
R1
Mukta 1 34
Satvik 2 25
Aryan 3 18
R2
Arpit 5 13
Sidhant 10 17
Prerna 15 41
Satvik 2 25
Aryan 3 18
R3 = R1 ∩ R2
Satvik 2 25
Aryan 3 18
Unlike union, however, intersection is not considered as a basic operation, but a derived
operation, because it can be derived from the basic operations. We will look at difference next,
R1 ∩ R2 = R1 – (R1 – R2)
For our purposes, however, it really doesn't matter that intersection is a derived operation.
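The identity R1 ∩ R2 = R1 – (R1 – R2) can be checked with a short Python sketch that uses only set difference. The rows are the R1 and R2 tuples from the intersection example; the function name `intersection` is just a label chosen for the illustration.

```python
R1 = {("Mukta", 1, 34), ("Satvik", 2, 25), ("Aryan", 3, 18)}
R2 = {
    ("Arpit", 5, 13), ("Sidhant", 10, 17), ("Prerna", 15, 41),
    ("Satvik", 2, 25), ("Aryan", 3, 18),
}


def intersection(r, s):
    """Derived form: r ∩ s = r - (r - s), using only set difference."""
    return r - (r - s)


R3 = intersection(R1, R2)
print(sorted(R3))
print(R3 == (R1 & R2))  # agrees with Python's built-in set intersection
```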
The difference operation also means pretty much the same thing in relational algebra that it does
The difference between relation A and relation B is a new relation containing all of the tuples
that are contained in relation A but not in relation B. Difference can only be performed on two
relations that have the same schema. The symbol for difference is the same as the minus sign -
Example:
R1
Mukta 1 34
Satvik 2 25
Aryan 3 18
R2
Arpit 5 13
Sidhant 10 17
Prerna 15 41
Satvik 2 25
R3 = R1 - R2
Mukta 1 34
Aryan 3 18
R3 = R2 – R1
Arpit 5 13
Sidhant 10 17
Prerna 15 41
A - B ≠ B - A
(A - B) - C ≠ A - (B - C)
(iv) Cartesian Join
The PRODUCT operation combines information from two relations pair wise on tuples. The
Cartesian product of two relations is the concatenation of tuples that belong to the two relations.
In other words the Cartesian product of two relations results in a new relation that includes every
It is not required that the two tables should be union compatible or that are of same degree. It is a
Specify the Cartesian product of two sets X (for example the points on X-axis) and Y (for
example the points on Y-axis), denoted X * Y, is the set of all possible ordered pairs whose first
component is a member of X and whose second component is a member of Y (e.g. the whole of
Imagine that we have two sets, one composed of letters and one composed of numbers, as
follows:
S1 = { a, b, c, d} and S2 = { 1,2,3}
The cross product of the two sets is a set of ordered pairs, matching each value from S1 with
S1 x S2 = { (a,1), (a,2), (a,3), (b,1), (b,2), (b,3), (c,1), (c,2), (c,3), (d,1), (d,2), (d,3) }
One exception is with the empty set, which acts as a "zero" and for equal sets.
In relational algebra, the cross product of two relations is also called the Cartesian Product or
Cartesian Join.
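Both the set example and the relational version can be sketched in Python. `itertools.product` from the standard library builds the ordered pairs of S1 and S2; the hypothetical helper `cartesian_product` concatenates tuples of two relations, here using the R1 rows and the ID/Hobby rows from the example that follows.

```python
from itertools import product

S1 = ["a", "b", "c", "d"]
S2 = [1, 2, 3]
pairs = list(product(S1, S2))  # every value of S1 paired with every value of S2
print(len(pairs), pairs[:3])

R1 = [("Mukta", 1, 34), ("Satvik", 2, 25), ("Aryan", 3, 18)]
R2 = [(101, "Music"), (102, "Dance"), (103, "Cricket")]


def cartesian_product(r, s):
    """Concatenate every tuple of `r` with every tuple of `s`."""
    return [tr + ts for tr in r for ts in s]


R1xR2 = cartesian_product(R1, R2)
print(len(R1xR2))
```

The size of the result is always the product of the input sizes: 4 x 3 = 12 pairs for the sets, and 3 x 3 = 9 tuples for the relations.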
Example
R1
Mukta 1 34
Satvik 2 25
Aryan 3 18
R2
ID Hobby
101 Music
102 Dance
103 Cricket
R1* R2
Aryan 3 18 102 Dance
Although Cartesian Joins form the conceptual basis for all other joins, they are rarely used in
actual database management systems because they often result in a relation with a large amount
of data. Consider the case of a table with data for 40,000 students, with each row needing 300
bytes of storage space, and a table for 2,000 advisors, with each row needing 200 bytes. The
two original tables would need about 12,000,000 and 400,000 bytes of storage space (12
megabytes and 400 kilobytes). The Cartesian join of these two would have 80,000,000 records,
each needing about 500 bytes of storage space, for a total of 40,000,000,000 bytes (40 gigabytes).
Another reason that Cartesian joins are not used often is this: What is the value of a Cartesian
The other types of joins, which are based on the Cartesian join, are used more often, and are
(v) Division
The division is a binary operation that is written as R ÷ S. The result consists of the restriction of
tuples in R to the attribute names unique to R, i.e. in the header of R but not in the header of S,
for which it holds that all their combinations with tuples in S are present in R.
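Division is easier to follow on a small case. The sketch below assumes the simplest shape, R(x, y) ÷ S(y), with R stored as a set of pairs and S as a set of values; the student and course names are hypothetical sample data.

```python
# Hypothetical data: which student has taken which course.
TAKES = {
    ("Mukta", "DBMS"), ("Mukta", "OS"),
    ("Satvik", "DBMS"),
    ("Aryan", "DBMS"), ("Aryan", "OS"), ("Aryan", "Networks"),
}
REQUIRED = {"DBMS", "OS"}


def divide(r, s):
    """R(x, y) ÷ S(y): the x values that are paired in R with every y in S."""
    candidates = {x for (x, _) in r}
    return {x for x in candidates if all((x, y) in r for y in s)}


print(sorted(divide(TAKES, REQUIRED)))
```

Satvik is excluded because the pair (Satvik, OS) is missing from the dividend, which is exactly the "all combinations are present" condition in the definition.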
The division is very useful for a special kind of query such as "Retrieve the name of the student
6.3.5 Sample Queries Using Relational Algebra:
Consider the following relations, with their respective attributes and tuples, for solving queries in
relational algebra. Only one specimen tuple has been taken for each relation. You can add more tuples as
required.
Employee
Fname Mname Lname SSN BDate Address Sex Salary SuperSSN Dno.
. . . . . . . . . .
. . . . . . . . . .
Department
. . . .
. . . .
Dept_Location
DNumber DLocation
5 Old Campus
. .
. .
Works_On
12345 1 32
. . .
. . .
Project
. . . .
. . . .
Dependent
. . . . .
. . . . .
QUERY 1
Retrieve the name and address of all employees who work for the 'Research' department.
This query could be specified in other ways; for example, the order of the JOIN and SELECT
operations could be reversed, or the JOIN could be replaced by a NATURAL JOIN after
QUERY 2
For every project located in 'Stafford', list the project number, the controlling department
number, and the department manager's last name, address, and birth date.
QUERY 3
Find the names of employees who work on all the projects controlled by department number 5.
QUERY 4
Make a list of project numbers for projects that involve an employee whose last name is 'Smith',
SMITH_MGR_PROJS (PNO) ← Π PNUMBER (SMITH_MANAGED_DEPTS * PROJECT)
QUERY 5
List the names of all employees with more than two dependents.
This query cannot be done in the basic (original) relational algebra. We have to use the
We assume that dependents of the same employee have distinct DEPENDENT_NAME values.
T2 ← σ NO_OF_DEPS > 2 (T1)
QUERY 6
This is an example of the type of query that uses the MINUS (SET DIFFERENCE) operation.
QUERY 7
MGRS(SSN) ← Π MGRSSN(DEPARTMENT)
The same query can in general be specified in many different ways. For example, the operations
can often be applied in various orders. In addition, some operations can be used to replace others;
for example, the INTERSECTION operation in Query 7 can be replaced by a NATURAL JOIN.
6.3.6 Equivalence
The same relational algebraic expression can be written in many different ways. The order in
A × B ⇔ B × A
A ∩ B ⇔ B ∩ A
A ∪ B ⇔ B ∪ A
πa1(A) ⇔ πa1(πa1,etc(A))
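Two of these equivalences can be checked on sample data with a short Python sketch. The relations A and B, their attribute names, and the helpers `project`, `cross` and `as_set` are assumptions made for the illustration; relations are compared as sets of tuples so that row order and attribute order do not matter.

```python
# Hypothetical relations used only to exercise the equivalences.
A = [{"a1": 1, "a2": "x"}, {"a1": 1, "a2": "y"}, {"a1": 2, "a2": "x"}]
B = [{"b1": 10}, {"b1": 20}]


def project(relation, attributes):
    """Projection with duplicate removal."""
    unique = {tuple(row[a] for a in attributes) for row in relation}
    return [dict(zip(attributes, values)) for values in sorted(unique)]


def cross(r, s):
    """Cartesian product of two relations with disjoint attribute names."""
    return [{**x, **y} for x in r for y in s]


def as_set(relation):
    """Order-independent view of a relation, for comparing results."""
    return {frozenset(row.items()) for row in relation}


# pi a1 (A)  versus  pi a1 (pi a1,a2 (A))
same_projection = project(A, ["a1"]) == project(project(A, ["a1", "a2"]), ["a1"])
# A x B  versus  B x A
same_product = as_set(cross(A, B)) == as_set(cross(B, A))
print(same_projection, same_product)
```

A check on one data set is not a proof, but it shows what "equivalent" means here: the two expressions return the same relation.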
While equivalent expressions always give the same result, some may be much easier to evaluate
than others. When any query is submitted to the DBMS, its query optimizer tries to find the
Relational Algebra: The result of every expression is a relation. It has a rigorous foundation and
simple semantics. It is used for reasoning, query optimization, etc. Relational algebra is the
mathematical basis for relational databases developed by E.F. Codd. It is a kind of set theory that
gives a solid, provable framework for software design that involves lots of data that must be
managed.
SQL: The Structured Query Language (SQL) is the common language of most database software
such as MySQL, PostgreSQL, Oracle, DB2, etc. This language translates relational theory into
practice, but imperfectly: SQL is a loose implementation of relational theory
and has been further modified in its actual implementation by the Relational Database
Management System (RDBMS) software that uses it. It is a superset of relational algebra. It has
convenient formatting features and provides aggregate functions. It has complicated semantics.
It is an end-user language.
6.4 Summary
Most commercial relational database systems offer a query language. A query language is a
language in which a user requests information from the database. Query languages are of
two types: procedural and non-procedural. Relational
algebra is a procedural query language. It is an offshoot of first-order logic and deals with a set of
relations closed under operators. Operators operate on one or more relations to yield a relation.
Relational algebra is pure mathematics: an algebraic structure grounded in mathematical logic and set
theory.
1. Elmasri & Navathe: Fundamentals of Database systems, 3rd Edition, Addison Wesley,
New Delhi.
2. Korth & Silberschatz : Database System Concept, 4th Edition, McGraw Hill International
Edition.
4. C.J.Date: An Introduction to Databases Systems, 7th Edition, Addison Wesley, New
Delhi.
1. What is Relational Algebra? How many types of operations does it support? Define each.
4. What are the uses of Relational Algebra? Explain the different set-oriented operations
5. Discuss equivalence in relational algebra. How would you compare relational algebra
with SQL?
Chapter – 7: An Introduction to SQL
Writer: Dr. Kanwal Garg
Vetter: Prof. Rajender Nath
Structure:
7.1 Introduction
7.2 Objective
7.3 Presentation of Content
7.3.1 About SQL
7.3.2 SQL Data Types
7.3.3 SQL Commands
(i) Data Definition Language (DDL)
(ii) Data Manipulation Language (DML)
(iii) Transaction Control Language (TCL)
(iv) Data Control Language (DCL)
7.3.4 Views in SQL
(i) Creating Views
(ii) Updating a View
7.3.5 Queries and Sub-Query in SQL
7.3.6 Constraints in SQL
7.3.7 SQL Indexes
7.3.8 Sample SQL queries examples
7.4 Summary
7.5 Suggested Reading/ Reference Material
7.6 Self Assessment Questions (SAQ)
7.1 Introduction
SQL is a data sublanguage used to organize, manage and retrieve data from a relational database,
which is managed by an RDBMS. The origin of SQL and the development of relational databases
followed the same evolutionary path. Dr. E.F. Codd, an IBM researcher, developed the relational database
concept in June 1970. SQL was conceived at IBM's San Jose Research Laboratory in the mid
1970s as a database language for the new relational database model. In the late 1970s IBM was
ready to develop a relational database system, SQL/DS RDBMS. Upon the news of this
development, vendors rushed to develop their own RDBMSs. A small company, Relational
Software Incorporated, beat IBM to the market with its own RDBMS. Relational Software
7.2 Objective
SQL gives you everything you need to create, maintain and control your database. Some users
will never have to create a database and will be concerned only with the querying facilities of SQL
found in the DML. Others will not only need to create databases but will also have to
maintain and administer them. For these users, SQL provides DDL, DML and DCL.
SQL stands for Structured Query Language. SQL is used to communicate with a database.
According to ANSI (American National Standards Institute), it is the standard language for
relational database management systems. SQL statements are used to perform tasks such as
updating data in a database or retrieving data from a database. Some common relational database
management systems that use SQL are: Oracle, Sybase, Microsoft SQL Server, Access, Ingres,
etc.
Oracle uses tables to store information in rows and columns. Each column can
contain only one type of data, which we must define. A data type is an attribute that specifies the type
of data that the object can hold. The data types fall into the following categories:
Varchar(size)/Varchar2(size): Variable-length character string. Max size is specified in parentheses.
SQL commands are instructions, coded into SQL statements, which are used to communicate
with the database to perform specific tasks, functions and queries on data.
SQL commands can be used not only for searching the database but also to perform various other
functions: for example, you can create tables, add data to tables, modify data, drop
tables, and set permissions for users. SQL commands are grouped into four major categories
(i) Data Definition Language (DDL) - These SQL commands are used for creating,
modifying, and dropping the structure of database objects. The commands are CREATE,
TRUNCATE- Remove all records from a table, including all spaces allocated for the
(ii) Data Manipulation Language (DML) - These SQL commands are used for storing,
retrieving, modifying, and deleting data. These Data Manipulation Language commands
(iii) Transaction Control Language (TCL) - These SQL commands are used for managing
changes affecting the data. These commands are COMMIT, ROLLBACK, and
SAVEPOINT.
SAVEPOINT- Identify a point in a transaction to which you can later roll back.
(iv) Data Control Language (DCL) - These SQL commands are used for providing security
The create table statement is used to create a new table. Here is the format of a simple create
table statement:
Syntax:
Note: You may have as many columns as you'd like, and the constraints are optional [ ] =
optional.
To create a new table, enter the keywords create table followed by the table name, followed by
an open parenthesis, followed by the first column name, followed by the data type for that
important to make sure you use an open parenthesis before the beginning table, and a closing
parenthesis after the end of the last column definition. Make sure you separate each column
definition with a comma. All SQL statements should end with a ";".
The table and column names must start with a letter and can be followed by letters, numbers, or
underscores - not to exceed a total of 30 characters in length. Do not use any SQL reserved
keywords as names for tables or column names (such as "select", "create", "insert", etc).
Data types specify what the type of data can be for that particular column. If a column called
"Last_Name", is to be used to hold names, then that particular column should have a "varchar/
What are constraints? When tables are created, it is common for one or more columns to have
constraints associated with them. A constraint is basically a rule associated with a column that
the data entered into that column must follow. For example, a "primary key" constraint specifies
that no two records can have the same value in a particular column. They must all be unique and
cannot have null values. The other two most popular constraints are "not null" which specifies
Example:
create table employee
(first varchar(15),
last varchar(20),
age number(3),
address varchar(30),
city varchar(20),
state varchar(20));
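The statement above can be tried out without an Oracle installation by running it through Python's standard sqlite3 module. This is a sketch under stated assumptions: SQLite is used only as a stand-in and treats type names such as varchar(15) and number(3) loosely, the column names first and last are double-quoted because FIRST and LAST are keywords in some SQL dialects, and the dept table is a hypothetical addition that shows a primary key constraint rejecting a duplicate value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

conn.execute(
    """
    create table employee
    ("first" varchar(15),
     "last" varchar(20),
     age number(3),
     address varchar(30),
     city varchar(20),
     state varchar(20))
    """
)
# PRAGMA table_info is SQLite-specific; item 1 of each row is the column name.
columns = [row[1] for row in conn.execute("pragma table_info(employee)")]
print(columns)

# Hypothetical second table with a primary key and a not null constraint.
conn.execute(
    "create table dept (dno integer primary key, dname varchar(20) not null)"
)
conn.execute("insert into dept values (1, 'Finance')")
try:
    conn.execute("insert into dept values (1, 'Research')")  # duplicate key
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```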
(b) ALTER TABLE STATEMENT
ALTER TABLE command can be used to add, delete or modify columns in an existing table.
Syntax:
Example:
Syntax:
Example:
Syntax:
Example:
(c) DROP TABLE STATEMENT
Syntax:
Example:
The drop table command is used to delete a table and all rows in the table. To delete an entire
table including all of its rows, issue the drop table command followed by the table_name. Drop
table is different from deleting all of the records in the table. Deleting all of the records in the
table leaves the table including column and constraint information. Dropping the table removes
The rename command changes the name of a table to a new name. The data in the table is
not lost.
Syntax:
Example:
Truncate command removes all rows from a table, but the table structure and its columns,
constraints, indexes and so on remains. In SQL, truncate table command quickly removes all
Syntax:
truncate table "table_name";
Example:
The SELECT is used to query the database and retrieve selected data that match the specific
Syntax:
FROM "table_name"
[WHERE Clause]
[GROUP BY clause]
[HAVING clause]
[ORDER BY clause];
= Equal
Note: You may use as many clauses as you'd like; the clauses shown in [ ] are optional.
Example:
select * from employee;
In a table, a column may contain many duplicate values; and sometimes you only want to list the
different (distinct) values. The DISTINCT keyword can be used to return only distinct (different)
values.
Syntax:
Example:
The insert statement is used to insert or add a row of data into the table. There are two basic
Syntax1:
first_column,...last_column)
values (first_value,...last_value);
In the example below, the column name first will match up with the value 'Satvik', and the column
Syntax2:
You may not need to specify the column(s) name in the SQL query if you are adding values for
all the columns of the table. But make sure the order of the values is in the same order as the
columns in the table. The SQL INSERT INTO syntax would be as follows:
insert into "table_name" values (first_value1,second_value2,...last_value);
Example 1:
Example 2:
To insert records into a table, enter the keywords insert into followed by the table name,
followed by an opening parenthesis, the list of column names separated by commas, and
a closing parenthesis, followed by the keyword values, followed by the list of values
enclosed in parentheses. The values that you enter will be held in the rows and they will match up
with the column names that you specify. Strings should be enclosed in single quotes, and
The update statement is used to update or change records that match specified criteria. This is
Syntax1:
update "tablename"
[,"nextcolumn" = "newvalue2"...]
In the above syntax, the WHERE clause is included only when we want to update the table data
based on a specified condition. If we want to update the attribute value for all the rows of a
Syntax2:
update "tablename"
[,"nextcolumn" = "newvalue2"...]
Example1:
update employee
Note: The example 1 will replace attribute (first) value for the above first tuples as Aryan only
Example2:
update employee
Note: Example 2 sets the value of the attribute (first) to Aryan for all tuples.
The delete statement is used to delete records or rows from the table.
Syntax:
Example:
Note: if you leave off the where clause, all records will be deleted.
To delete an entire record/row from a table, enter "delete from" followed by the table name. If
the delete statement is followed by a where clause containing conditions, then only
those rows for which the where condition is true will be deleted.
(a) Commit
The commit statement saves all changes made to the database since the last commit or rollback
command. In Oracle, changes made to the database are not permanent until you tell Oracle to
make them permanent. The commit statement makes permanent any change to the database during
Syntax:
Commit;
Example:
Commit;
(b) Rollback
The rollback statement is the reverse of the commit statement. It undoes some or all database
changes made during the current transaction. The rollback command is the transaction control
command used to undo transactions that have not already been saved to the database. The rollback
command can only be used to undo transactions since the last Commit or Rollback command was
issued.
Syntax:
Rollback;
Example:
Rollback;
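The effect of commit and rollback can be tried out with a short script. The following is only a minimal sketch: it assumes Python's built-in sqlite3 module and an in-memory SQLite database standing in for Oracle, and the employee table with its two columns is illustrative.

```python
import sqlite3

# Assumption: SQLite (in memory) stands in for Oracle; the table is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (first TEXT, age INTEGER)")

conn.execute("INSERT INTO employee VALUES ('Satvik', 30)")
conn.commit()      # the inserted row is now permanent

conn.execute("UPDATE employee SET first = 'Aryan'")
conn.rollback()    # undo every change made since the last commit

# The update was rolled back, so the committed name is still stored.
names = [row[0] for row in conn.execute("SELECT first FROM employee")]
print(names)
```

Only the work done after the last commit is undone by the rollback; the committed insert survives.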
(c) Savepoint
A Savepoint is a special mark inside a transaction that allows all commands executed
after it was established to be rolled back, restoring the transaction state to what it was at the time
of the Savepoint. The Savepoint statement defines a Savepoint within a transaction. Changes
made after a Savepoint can be undone at any time prior to the end of the transaction. A
(a) Grant
This statement deals with who has access to your database. The GRANT command enables you
to grant privileges to users. You can grant privileges for viewing, adding, deleting, referencing and
using a table.
If you decide that Mukta can see your tables in your database, you would use GRANT SELECT
statement. If you desire, you can allow her to insert or update your data in tables. The command
Syntax:
Example:
Grant Select on Book to Mukta; (Here Book is an entity on which privilege has been granted for
Mukta.)
(b) Revoke
Whatever you grant, however, you may also revoke. The REVOKE statement enables you to
Syntax:
Example:
Revoke Select on Book from Mukta; (Here Book is an entity on which the privilege has been revoked
for Mukta.)
A view is nothing more than a SQL statement that is stored in the database with an associated
name. A view is actually a composition of a table in the form of a predefined SQL query. A view
can contain all rows of a table or selected rows from a table. A view can be created from one or
many tables, depending on the SQL query written to create the view.
Views, which are kind of virtual tables, allow users to do the following:
Structure data in a way that users or classes of users find natural or intuitive.
Restrict access to the data such that a user can see and (sometimes) modify exactly what
Summarize data from various tables which can be used to generate reports.
Database views are created using the CREATE VIEW statement. Views can be created from a
single table, multiple tables, or another view. To create a view, a user must have the appropriate
Syntax:
FROM table_name
WHERE [condition];
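As an illustration of the CREATE VIEW syntax, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The employee table, its columns, and the view name senior_employee are assumptions made for this example only.

```python
import sqlite3

# Assumption: illustrative table and data, held in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (first TEXT, city TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Satvik", "Panipat", 30), ("Aryan", "Amritsar", 45), ("Mukta", "Panipat", 52)],
)

# The view stores the query, not the data: it is evaluated when it is selected from.
conn.execute(
    "CREATE VIEW senior_employee AS "
    "SELECT first, city FROM employee WHERE age > 40"
)

rows = conn.execute(
    "SELECT first, city FROM senior_employee ORDER BY first"
).fetchall()
print(rows)
```

The view exposes only two columns and only the rows that satisfy its WHERE condition, which is how a view can restrict what a user sees.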
Note: You can include multiple tables in your SELECT statement in very similar way as you use
The query may not contain GROUP BY or HAVING.
All NOT NULL columns from the base table must be included in the view in order for
So if a view satisfies all the above-mentioned rules then you can update a view.
A query is a means to retrieve meaningful information from the database. There are different
ways to execute query as discussed earlier in this chapter. A Subquery or Inner query or Nested
query is a query within another SQL query and embedded within the WHERE clause.
A sub-query is used to return data that will be used in the main query as a condition to further
Sub-queries can be used with the SELECT, INSERT, UPDATE, and DELETE statements along
with the operators like =, <, >, >=, <=, IN, BETWEEN etc.
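A sub-query in a WHERE clause might look like the sketch below. It assumes Python's built-in sqlite3 module and an illustrative employee table; the inner query computes a single value that the outer query uses as its condition.

```python
import sqlite3

# Assumption: illustrative table and data in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (first TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Satvik", 30), ("Aryan", 45), ("Mukta", 52)],
)

# Inner query: average age of all employees.
# Outer query: employees whose age is above that average.
older = conn.execute(
    "SELECT first FROM employee "
    "WHERE age > (SELECT AVG(age) FROM employee) "
    "ORDER BY first"
).fetchall()
print(older)
```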
A sub-query can have only one column in the SELECT clause, unless multiple columns
are in the main query for the sub-query to compare its selected columns.
An ORDER BY clause cannot be used in a sub-query, although the main query can use
an ORDER BY. The GROUP BY clause can be used to perform the same function as the
ORDER BY in a sub-query.
Sub-queries that return more than one row can only be used with multiple value
The SELECT list cannot include any references to values that evaluate to a BLOB,
The BETWEEN operator cannot be used with a sub-query; however, the BETWEEN
SQL constraints are restrictions, in the form of rules, on the values a column may hold; they
must remain true while data in the column is inserted, updated or deleted.
Constraints can be specified when the table is first created with the CREATE TABLE
statement, or later when the structure of an existing table is modified with the ALTER TABLE
statement.
SQL constraints are used to implement the rules of the table. If any action
violates the rules defined by the constraints, that action is aborted.
Some constraints can be used along with the SQL CREATE TABLE statement.
1. Primary key
2. Foreign key
3. Unique key
4. Not null constraint
5. Check constraint
6. Default constraint
(i) Primary Key: The primary key of a relational table uniquely identifies each record in the
table. It can be a normal attribute that is guaranteed to be unique; for example, in a school
two students may have the same name, but no two students have the same roll number.
(ii) Foreign Key: One of the most important concepts in database is creating relationships
between database tables. These relationships provide a mechanism for linking data stored in
multiple tables and retrieving it in an efficient manner. In order to create a link between two
tables we must specify a foreign key in one table that references a column in another table.
(iii) Unique Key: A unique key constraint is used to make sure that there is no duplicate value
in that column. Both the unique key and the primary key enforce uniqueness of a column, but there is
one difference between them: in SQL Server, a unique key constraint allows one null value, whereas a
primary key does not allow null values. A table can have only one primary key but more than one
unique key.
(iv) Not null constraint: The not null constraint is used to prevent the insertion of a null value in
that column. It is used for columns whose value must always be supplied.
(v) Check Constraint: This constraint is used to check a value at the time of insertion; for example,
the salary of any employee is always greater than zero. So we can create a check constraint on the
employee table that accepts only salary values greater than zero.
(vi) Default Constraint: The Default constraint is used to set a specific value of column if
we are not passing the value at the time of insertion. Through this constraint we set the default
value of column.
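The six constraints above can be seen in action with a small experiment. This is a sketch only: it assumes Python's built-in sqlite3 module and an in-memory SQLite database rather than Oracle or SQL Server, and the department and employee tables, their columns, and the helper rejected() are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
)
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"                       # primary key
    " first TEXT NOT NULL,"                              # not null
    " salary INTEGER CHECK (salary > 0),"                # check
    " city TEXT DEFAULT 'Panipat',"                      # default
    " dept_id INTEGER REFERENCES department(dept_id))"   # foreign key
)
conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.execute(
    "INSERT INTO employee (emp_id, first, salary, dept_id) VALUES (1, 'Satvik', 500, 1)"
)

def rejected(sql):
    """Hypothetical helper: True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

insert = "INSERT INTO employee (emp_id, first, salary, dept_id) VALUES "
results = {
    "duplicate primary key": rejected(insert + "(1, 'Aryan', 100, 1)"),
    "null in not-null column": rejected(insert + "(2, NULL, 100, 1)"),
    "check violated": rejected(insert + "(3, 'Aryan', 0, 1)"),
    "unknown foreign key": rejected(insert + "(4, 'Aryan', 100, 99)"),
    "duplicate unique value": rejected("INSERT INTO department VALUES (2, 'Sales')"),
}
default_city = conn.execute("SELECT city FROM employee WHERE emp_id = 1").fetchone()[0]
print(results, default_city)
```

Each violating statement is aborted by the database, while the row that omitted city receives the default value.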
An index in SQL is created on an existing table to retrieve rows quickly. When there are thousands
of records in a table, retrieving information can take a long time. Therefore indexes are created
on columns which are accessed frequently, so that the information can be retrieved quickly.
Indexes can be created on a single column or a group of columns. When an index is created, it
first sorts the data and then it assigns a ROWID for each row.
ON table_name (column_name1,column_name2...);
In Oracle there are two types of SQL index namely, implicit and explicit.
They are created when a column is explicitly defined with a PRIMARY KEY or UNIQUE KEY
constraint.
NOTE:
1) Even though SQL indexes are created to access the rows in the table quickly, they slow down
DML operations like INSERT, UPDATE, DELETE on the table, because both the indexes and the table
are updated when a DML operation is performed. So, use indexes only on columns
Relational Diagram
Write the SQL code that will create the table structure for a table named EMP_1. This
EMP_INITIAL CHAR(1),
EMP_HIREDATE DATE,
JOB_CODE CHAR(3),
EMP_YEARS NUMBER(3),
Write the SQL code to enter the first two rows for EMP_1 table.
INSERT INTO EMP_1 VALUES ('101', 'News', 'John', 'G', '08-Nov-00', '502');
INSERT INTO EMP_1 VALUES ('102', 'Senior', 'David', 'H', '12-Jul-89', '501');
After inserting multiple rows in table EMP_1, the records in the table are shown as below:
Assuming that the data shown in the EMP_1 table have been entered, write the SQL code
SELECT *
FROM EMP_1
Write the SQL code that will save the changes made to the EMP_1 table.
COMMIT;
Write the SQL code to change the job code to 501 for the person whose personnel number
is 107. After you have completed the task, examine the results, and then reset the job code
UPDATE EMP_1
To see the changes:
SELECT *
FROM EMP_1
To reset, use
ROLLBACK;
Write the SQL code to delete the row for the person named William Smithfield, who was
hired on June 22, 2004 and whose job code classification is 500.
Write the SQL code that will restore the data to its original status; that is, the table should
contain the data that existed before you made the changes in Questions 5 and 6.
ROLLBACK;
Write the SQL code to create a copy of EMP_1, naming the copy EMP_2. Then write the
SQL code that will add the attributes EMP_PCT and PROJ_NUM to its structure. The
EMP_PCT is the bonus percentage to be paid to each employee. The new attribute
characteristics are:
There are two ways to get this job done. The two possible solutions are shown next.
Solution A:
EMP_INITIAL CHAR(1),
Solution B:
Write the SQL code to enter an EMP_PCT value of 3.85 for the person whose employee
UPDATE EMP_2
Using a single command sequence, write the SQL code that will enter the project number
UPDATE EMP_2
Using a single command sequence, write the SQL code that will enter the project number
higher.
UPDATE EMP_2
Assume the table looks as shown below after the above database operations:
Write the SQL code that will change the PROJ_NUM to 14 for those employees who were
hired before January 1, 1994 and whose job code is at least 501.
UPDATE EMP_2
Write the SQL code required to list all employees whose last names start with Smith. In
other words, the rows for both Smith and Smithfield should be included in the listing.
SELECT *
FROM EMP_2
7.4 Summary
The Structured Query Language, or SQL, is one of the most powerful tools available today when
it comes to working with data sets and getting the information you need from databases. SQL is a
bit different from programming in that you ask the database for the information you
are looking for without regard for how that information will be retrieved. You only state the
information you want in the manner in which you want it. The SQL engine will take care of
1. Abbey, Abramson & Corey: Oracle 8i-A Beginner's Guide, Tata McGraw Hill Publishing
Company Ltd.
2. Ivan Bayross: SQL, PL/SQL-The Program Language of ORACLE, BPB Publication,
New Delhi.
3. Elmasri & Navathe: Fundamentals of Database systems, 3rd Edition, Addison Wesley,
New Delhi.
4. Korth & Silberschatz: Database System Concept, 4th Edition, McGraw Hill International
Edition.
1. What are SQL commands? Discuss the different components of SQL with syntax and
suitable examples.
2. What are SQL DDL commands? How are integrity constraints achieved in SQL?
3. What is the difference between DROP and DELETE command? Explain with examples.
Chapter – 8: Functional Dependencies
Writer: Dr. Kanwal Garg
Vetter: Prof. Rajender Nath
Structure:
8.1 Introduction
8.2 Objective
8.3 Presentation of Content
8.3.1 Functional Dependency
8.3.2 Importance of Dependencies
8.3.3 Types of Functional Dependency
8.3.4 Closure Set of Functional Dependency
8.3.5 Armstrong's Axioms
8.3.6 Minimal Functional Dependencies or Irreducible Set of Dependencies
8.4 Summary
8.5 Suggested Reading/ Reference Material
8.6 Self Assessment Questions (SAQ)
8.1 Introduction
The purpose of database design is to arrange the data fields into an organized structure that
captures the relationships among them and stores information without unnecessary redundancy. In fact,
redundancy and database consistency are the most important logical criteria in database design.
A bad database design may result in repetitive data and information and an inability to
represent desired information. It is therefore important to examine the relationships that exist
among the data of an entity to refine the database design. In the present chapter, functional
dependency concepts are discussed with the aim of achieving minimum redundancy without
8.2 Objective
To develop a good description of the data, its relationships and constraints, it is necessary to
understand the concept of functional dependency (FD). FD produces a stable set of relations that
is a faithful model for the enterprise. Such models are highly flexible. It also helps in reducing
redundancy, saving memory space and to maintain consistency among the data. Database
dependencies are important to understand because they provide the basic building blocks used in
database normalization.
Defining functional dependencies is an important part of relational database design and contributes
to normalization.
Functional dependency is a relationship that exists when one attribute uniquely determines
another attribute. If R is a relation with attributes X and Y, a functional dependency between the
attributes is represented as X → Y, which specifies that Y is functionally dependent on X. Here X is the
determinant set and Y is the dependent attribute. Each value of X is associated with precisely one
Y value.
Functional dependency defines Boyce-Codd normal form and third normal form in
tuple and determines the value of all other attributes in the relation. In some cases, functionally
The left-hand set of functional dependency cannot be reduced, since this may change the
Reducing any of the existing functional dependency might change the content of the set
"X" such that for a given value of a determinant "Y" the value of the attribute "X" is uniquely
defined.
The concept of describing the whole database as a single universal relation schema.
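Formal definition:
A functional dependency X → Y between two sets of attributes X and Y
that are subsets of R specifies a constraint on the possible tuples that can form a relation
state r of R. The constraint is that, for any two tuples t1 and t2 in r that have t1[X] =
t2[X], they must also have t1[Y] = t2[Y].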
o That is, the values of the Y component of a tuple in r depend on the values of the X
component.
o The set of attributes X is called the left-hand side of the FD, and Y is called the
right-hand side.
o X functionally determines Y in a relation schema R if and only if, whenever two tuples of
r(R) agree on their X-value, they must necessarily agree on their Y-value.
If X is a candidate key of R, then there exists a FD from X to Y for any subset of attributes Y
of R.
Relation extensions r(R) that satisfy the functional dependency constraints are called legal
particular legal relation state (extension) r of R. That is, a functional dependency must hold
In a Functional Dependency:
X is a determinant
X determines Y
Y is functionally dependent on X
X→Y
X → Y is trivial if Y ⊆ X
As per Figure 8.1 given below, the statement "C# is a determinant of Cname, Ccity and Cphone" can
also be written as "Cname, Ccity and Cphone are functionally dependent on C#". Given a particular
value of C#, there exists precisely one corresponding value for each of Cname, Ccity and
Cphone. This is more clearly seen via the following functional dependency diagram:
Similarly, in Figure 8.2, "(C#, P#, Date) is a determinant of Qnt" can also be stated as "Qnt is
functionally dependent on the set of attributes (C#, P#, Date)". The set of attributes is also
Example 1:
Student_ID there will be only one address. Therefore the FD will be written as:
Student_ID → Student_address
but vice-versa is not true, because several students can live at one address.
Example 2:
In an Employee relation, Social Security Number determines Employee Name and Salary,
because corresponding to one SSN there will be only one Emp_name and Salary. Therefore the FD
but vice-versa is not true, because several employees can have the same name and salary.
A B
1 1
2 4
3 9
4 16
2 4
7 9
8 10
Since each value of A is associated with one and only one value of B, we have
A→B
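Whether a dependency is satisfied by a given set of tuples can be checked mechanically. The sketch below is an illustration in Python; the function name fd_holds and the dictionary representation of rows are assumptions, not part of the text. Note that it tests only one relation state: a dependency that holds here is merely consistent with the data, since an FD is a property of the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if every pair of rows that agrees on lhs also agrees on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for this lhs value; any later
        # row with the same lhs value must carry the same rhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True

# The (A, B) pairs from the table above.
pairs = [(1, 1), (2, 4), (3, 9), (4, 16), (2, 4), (7, 9), (8, 10)]
rows = [{"A": a, "B": b} for a, b in pairs]

a_determines_b = fd_holds(rows, ["A"], ["B"])
b_determines_a = fd_holds(rows, ["B"], ["A"])
print(a_determines_b, b_determines_a)
```

For this data A → B is satisfied, while B → A is not, because B = 9 appears with both A = 3 and A = 7.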
A B
1 1
2 4
3 9
4 16
2 4
7 9
8 9
As per the definition of Functional dependency, "An attribute in a relational model is said to be
functionally dependent on another attribute in the table if it can take only one value for a given
value of the attribute upon which it is functionally dependent." Since for A = 3 there is
Supplier Table
S1 Sumit 20 Panipat
S2 Ankit 10 Amritsar
S3 Amit 10 Amritsar
Part Table
P2 Bolt Green 17 Amritsar
Shipment Table
S1 P1 270
S1 P2 300
S1 P3 700
S2 P1 270
S2 P2 700
S2 P2 300
Status - Status of the city e.g. A grade cities may have status 10, B grade cities
Here, Sname is FD on Sno. Because, Sname can take only one value for the given value of Sno
(e.g. S1) or in other words there must be one Sname for supplier number S1.
FD is represented as:
Sno→ Sname
Similarly, city and status are also FD on Sno, because for each value of Sno there will be only
FD is represented as:
Sno → City
Sno → Status
In this case Qty is FD on combination of Sno, Pno because each combination of Sno and Pno
Dependency Diagrams
A dependency diagram consists of the attribute names and all functional dependencies in a given
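[Dependency diagram for the Supplier table: attributes Sno, Sname, City, Status]
Sno → Sname
Sname → Sno
Sno → City
Sno → Status
Sname → City
Sname → Status
City → Status
[Dependency diagram for the Part table: attributes Pno, Pname, Color, Weight]
Pno → Pname
Pno → Color
Pno → Weight
[Dependency diagram for the Shipment table: attributes Sno, Pno, Qty]
The following functional dependencies exist in the Shipment table: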
The two most important things to remember about functional dependency (fd) are:
(1) FDs are determined by the meaning of the attributes and their role in the "real world" which
(2) FDs are in turn used to group the attributes together to form the normalized relations of the
database.
Which describe the arrangement of room and time for classes taught by the instructors.
These attributes are used to model part of the "real world" in which we have classes at certain
time in certain room lectured by certain instructors. We have certain constraints about the objects
(such as class, time, etc.) in the "real world" and such constraints are in turn represented in terms
of functional dependencies.
(2) At any time and in a given room, there is at most one class being taught there.
(a) C → I
(b) TR → C
(c) CT → R
(d) IT →C
Given below is a database which has some tuples violating the above functional dependencies.
It is easy to see that the tuples marked (a), (b), (c), (d) violate the functional dependencies (a),
Database dependencies are important to understand because they provide the basic building
For a table to be in second normal form (2NF), there must be no case of a non-prime
attribute in the table that is functionally dependent upon a proper subset of a candidate key.
For a table to be in third normal form (3NF), every non-prime attribute must have a non-
For a table to be in fourth normal form (4NF), it must have no multivalued dependencies.
When all the non-key attributes of a relation 'R' are dependent on the key attributes, the
The term full functional dependency is used to describe the minimum set of attributes in the
determinant of an FD. The rules for full functional dependency are that if the set of attributes Y
are to be fully dependent on the set of attributes X, the following must hold:
Example 1:
Let A and B be two attributes of a relation 'R', where B is a non-key attribute which is
functionally dependent on A (key attribute), but not on any proper subset of A, i.e. A → B. If we
remove any attribute from A, the functional dependency no longer
holds.
Example 2:
Consider a relation Employee with attributes (Emp_id, Emp_name, Emp_addr, Emp_phone).
Here Emp_id is the primary key attribute and all other attributes are non-key
attributes which are fully functionally dependent on the primary key. We can say that the relation is in
[Dependency diagram: Emp_id determines Emp_name, Emp_addr and Emp_phone]
Partial functional dependency is defined as follows: if A and B are attributes of a relation 'R', B is
partially functionally dependent on A (A → B) if there is some attribute that can be removed from A
Example: Consider a relation Employee_project with attributes (Ecode, Pcode and Dept).
Ecode and Pcode together form the composite key and Dept is a non-key attribute. Here
Ecode, Pcode → Dept states that Dept is functionally dependent upon the composite key. If
we remove Pcode from the composite key and Ecode → Dept still holds, then we can say that the
composite)".
For the Transaction relation, we may now say that:
Cname is not fully functionally dependent on (C#, P#, Date), it is only partially dependent on it
A transitive functional dependency can occur only in a relation that has three or more attributes.
Let A, B and C be three attributes of a relation 'R'. Suppose all three of the following
conditions hold:
A→ B
A Multi-Value Dependency (MVD) occurs when two or more independent multi-valued facts
about the same attribute occur within the same table. It means that if in a relation R having A, B
and C as attributes, B and C are multi-valued facts about A, which is represented as A →→ B and
A →→ C, then a multi-value dependency exists only if B and C are independent of each other.
If t1 and t2 are tuples such that t1.X = t2.X, then there are tuples t3 and t4 such that
1. t1.X = t3.X = t4.X
where Z = R - (X U Y)
Example: Imagine a car company that manufactures many models of car, but
always makes both red and blue colors of each model. If you have a table that contains the model
name, color and year of each car the company manufactures, there is a multivalued dependency
in that table. If there is a row for a certain model name and year in blue, there must also be a
A functional dependency X→Y is said to be a trivial Functional Dependency if Y, the right hand
Example: The functional dependency (Ecode, Pcode)→Ecode is trivial because the set{Ecode}
(for the R.H.S. of the functional dependency) is a subset of (Ecode, Pcode) (for the L.H.S. of the
functional dependency).
On the other hand, the functional dependency (Ecode, Pcode) → Ecode, Ename is NON-trivial
because the set {Ecode, Ename } is NOT a subset of the attribute set {Ecode, Pcode }.
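The subset test behind this definition is a one-liner. A minimal sketch in Python follows; the function name is_trivial is an assumption, and attribute sets are represented as Python sets.

```python
def is_trivial(lhs, rhs):
    """An FD lhs -> rhs is trivial when rhs is a subset of lhs."""
    return set(rhs) <= set(lhs)

# The two examples discussed above.
first = is_trivial({"Ecode", "Pcode"}, {"Ecode"})            # subset: trivial
second = is_trivial({"Ecode", "Pcode"}, {"Ecode", "Ename"})  # Ename is extra: non-trivial
print(first, second)
```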
Let a relation 'R' have some functional dependencies 'F' specified. The closure of F (usually
written as F+) is the set of all functional dependencies that may be logically derived from 'F'.
Often 'F' is the set of most obvious and important functional dependencies and F+, the closure,
is the set of all the functional dependencies including F and those that can be deduced from F.
The closure is important and may, for example, be needed in finding one or more candidate keys
of the relation.
To determine the set X+ of attributes that are functionally determined by X based on "R", X+ is
Output: X+.
(1) X+ ← X.
(2) Repeat
found ← false;
for each functional dependency Y → Z in F do
if (Y ⊆ X+) and not (Z ⊆ X+)
then begin
found ← true;
X+ ← X+ ∪ Z;
end;
until found = false.
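The repeat loop above translates almost line for line into code. The following Python sketch is one possible rendering, under the assumptions that each attribute is a single letter (so a string such as "AB" stands for the set {A, B}) and that F is given as a list of (left, right) pairs; the function name attribute_closure is hypothetical. The dependencies used are those of the worked example for R(ABCDE) later in this chapter.

```python
def attribute_closure(attrs, fds):
    """Compute X+ : all attributes functionally determined by attrs under fds."""
    closure = set(attrs)
    found = True
    while found:                      # repeat ... until nothing new was added
        found = False
        for lhs, rhs in fds:
            # Apply Y -> Z only if Y is already inside X+ and Z adds something.
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                found = True
    return closure

# F' from the R(ABCDE) example, one pair per dependency.
fds = [("AB", "C"), ("ABC", "D"), ("AE", "B"),
       ("AE", "C"), ("BC", "A"), ("BC", "E")]

ab_plus = attribute_closure("AB", fds)
a_plus = attribute_closure("A", fds)
print(sorted(ab_plus), sorted(a_plus))
```

Because AB+ contains every attribute of R, AB is a candidate key here, whereas A alone determines nothing beyond itself.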
To determine a systematic way to infer dependencies, we must discover a set of inference rules
that can be used to infer new dependencies from a given set of dependencies. William W.
Armstrong (1974) established a set of rules which can be used to infer the functional
Table 8.1 Inference Rules
Inference Rule | Axiom Name | Axiom | Example
IR1 | Reflexivity | if a is a set of attributes and b ⊆ a, then a → b | SSN,Name → SSN
 | | ca → cb | Phone
IR4 | Union or Additivity | if a → b and a → c hold, then a → bc holds | SSN → Name and SSN → Zip, then SSN → Name,Zip
IR5 | Decomposition or Projectivity | if a → bc holds, then a → b and a → c hold | SSN → Name,Zip, then SSN → Name and SSN → Zip
 | | | Amount
Inference rules IR1 through IR3 are known as Armstrong’s inference rules.
schema R, any dependency that we can infer from F by using IR1 through IR3 holds in every
By complete, we mean that using IR1 through IR3 repeatedly to infer dependencies until no
more dependencies can be inferred results in the complete set of all possible dependencies
To determine F+ (as shown in section 8.3.4), we need rules for deriving all functional
dependencies that are implied by F. A set of rules that may be used to infer additional
dependencies was proposed by Armstrong in 1974 as shown in Table 8.1. These rules (or
axioms) are a complete set of rules in that all possible functional dependencies may be derived
The reflexivity rule is the simplest (almost trivial) rule. It states that each subset of X is
For example: {Employee ID, Employee Address} → {Employee Address} is trivial, here
The augmentation rule is also quite simple. It states that if Y is determined by X then a set of
attributes W and Y together will be determined by W and X together. Note that we use the
notation WX to mean the collection of all attributes in W and X and write WX rather than the
For example: Rno → Name; Class and Marks is a set of attributes and act as
The transitivity rule is perhaps the most important one. It states that if X functionally determines
For example: Rno →City and City →Status, then Rno →Status should be holding true.
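One practical way to test whether a dependency follows from a given set F is to compute the closure of its left-hand side and check that the right-hand side lies inside it. The Python sketch below illustrates this on the transitivity example; the function names closure and implies are assumptions made for this illustration.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under the dependencies in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def implies(fds, lhs, rhs):
    """True if lhs -> rhs can be inferred from fds."""
    return rhs <= closure(lhs, fds)

# Given: Rno -> City and City -> Status.
fds = [({"Rno"}, {"City"}), ({"City"}, {"Status"})]

forward = implies(fds, {"Rno"}, {"Status"})    # follows by transitivity
backward = implies(fds, {"Status"}, {"Rno"})   # does not follow
print(forward, backward)
```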
Further axioms may be derived from the above although the above three axioms are sound and
complete in that they do not generate any incorrect functional dependencies (soundness) and they
do generate all possible functional dependencies that can be inferred from F (completeness). The
1. X→Y , Given
2. X→Z, Given
3. X→XZ, Augment 2 by X
4. XZ→YZ, Augment 1 by Z
1. X→YZ, Given
2. YZ→Y, Reflexivity
Based on the above axioms and the functional dependencies specified for relation student, we
Often a very large list of dependencies can be derived from a given set F, since Rule 1 itself will
lead to a large number of dependencies. Since we have seven attributes (sno, Sname, address,
cno, cname, instructor, office), there are 128 (that is, 2^7) subsets of these attributes. These 128
subsets could form 128 values of X in functional dependencies of the type X → Y. Of course,
each value of X will then be associated with a number of values for Y (Y being a subset of X),
leading to several thousand dependencies. These large numbers of dependencies are not
Although we could follow the present procedure and compute the closure of F to find all the
functional dependencies, the computation requires exponential time and the list of dependencies
is often very large and therefore not very useful. There are two possible approaches that can be
taken to avoid dealing with the large number of dependencies in the closure. One is to deal with
one attribute or a set of attributes at a time and find its closure (i.e. all functional dependencies
relating to them). The aim of this exercise is to find what attributes depend on a given set of
attributes and therefore ought to be together. The other approach is to find the minimal covers.
In discussing the concept of equivalent FDs, it is useful to define the concept of minimal
dependencies so that only the minimal numbers of dependencies need to be enforced by the
A set S of functional dependencies is irreducible if it has the following three properties:
Each left set of a functional dependency of S is irreducible. It means that removing any one
attribute from the left set will change the content of S (S will lose some information).
Sets of functional dependencies with these properties are also called canonical or minimal.
Let F1 and F2 be two sets of functional dependencies. If F1 ≡ F2, then we say that F1 is a cover of
F2 and F2 is a cover of F1. We also say that F1 covers F2 and vice versa. It is easy to show that
which the right hand side of each fd has only one attribute.
(2) The left hand side of each fd does not have any redundant attribute, i.e., for every fd X → A
in F where X is a composite attribute, and for any proper subset Z of X, the functional
(3) F is reduced (without redundant fd's). This means that for every X → A in F, the set F - {X
→ A} is NOT equivalent to F.
It is easy to see that for each set F of functional dependencies, there exists a set of functional
(1) Let F' = {X → A | X → A ∈ F and A is a single attribute}. For each fd X → A1,A2, ..., An ∈ F
(n > 1), put the fd's X → A1, X → A2, ..., X → An into F', where Ai is a single attribute.
(2) While
Z → A ∈ (F')+,
do
replace X → A with Z → A.
It is important to note that for the above algorithm, the ordering between step (2) and step (3) is
critical. If you first perform step (3) and then perform step (2) of the algorithm, the resulting set
It should be pointed out that for a set of functional dependencies F, there may be more than one
minimal cover of F.
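The three steps can be sketched in Python as follows. This is an illustrative rendering rather than a definitive implementation: it assumes single-letter attribute names, represents each dependency as a (left, right) pair of strings, and tries left-hand-side attributes from last to first, a choice that is arbitrary and affects which minimal cover is returned. The function names are hypothetical. The input is the set F' obtained after step (1) in the R(ABCDE) example worked through next.

```python
def closure(attrs, fds):
    """Attributes determined by attrs; each fd is (frozenset lhs, single attribute)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and rhs not in result:
                result.add(rhs)
                changed = True
    return result

def minimal_cover(fds):
    # Step 1: split every right-hand side into single attributes.
    cover = []
    for lhs, rhs in fds:
        for attr in rhs:
            fd = (frozenset(lhs), attr)
            if fd not in cover:
                cover.append(fd)

    # Step 2: drop attributes from a left-hand side when the smaller
    # left-hand side still determines the right-hand side.
    for i, (lhs, rhs) in enumerate(cover):
        for attr in sorted(lhs, reverse=True):
            smaller = lhs - {attr}
            if smaller and rhs in closure(smaller, cover):
                lhs = smaller
                cover[i] = (lhs, rhs)
    cover = list(dict.fromkeys(cover))  # remove duplicates created by step 2

    # Step 3: drop every fd that the remaining fds already imply.
    for fd in list(cover):
        rest = [other for other in cover if other != fd]
        if fd[1] in closure(fd[0], rest):
            cover = rest
    return cover

fds = [("AB", "C"), ("ABC", "D"), ("AE", "B"),
       ("AE", "C"), ("BC", "A"), ("BC", "E")]
result = minimal_cover(fds)
for lhs, rhs in result:
    print("".join(sorted(lhs)), "->", rhs)
```

Step (2) is deliberately run before step (3), in line with the remark above about their ordering. Trying the attributes in a different order in step (2) can yield a different, equally valid minimal cover.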
(1) Break down the right hand side of each fd. After performing step (1) in the algorithm, we
(2) Eliminate redundancy in the left hand side. The fd CD → A is replaced by C → A. This is
(3) Remove redundant fd's. The fd C → D is eliminated because it can be derived from C → B
Let the relation R be R(ABCDE) and the set of functional dependencies be F = {AB → C,
(1) Break down the right hand side of each fd. F' = {AB → C, ABC → D, AE → B,
AE → C, BC → A, BC → E}.
(2) Eliminate redundancy in the left hand side. The fd ABC → D is replaced by AB → D
because AB+ = ABCDE, and hence the attribute C in fd ABC → D is redundant. Note that we
could also replace ABC → D by BC → D, because BC+ = BCADE. But we do NOT need to
include both AB → D and BC → D in F': one of the two is sufficient. No other composite left
hand side of the fd's can be reduced further, and thus we get F' = {AB → C, AB → D, AE → B,
AE → C, BC → A, BC → E}.
(3) Remove redundant fd's. The fd AE → C is redundant because we can derive it from AE →
Note: If we choose to replace ABC → D by BC→D in step (2) above, we would get an
8.4 Summary
given relation. It is a kind of integrity constraint that generalizes the concept of a key. Let X and
Y be two attributes of a relation. Given the value of X, if there is only one value of Y
a property of the relation schema R, not of a particular legal relation state 'r' of 'R'. Hence an FD
cannot be inferred automatically from a given relation state 'r' but must be defined explicitly by
1. Elmasri & Navathe: Fundamentals of Database systems, 3rd Edition, Addison Wesley,
New Delhi.
2. Korth & Silberschatz : Database System Concept, 4th Edition, McGraw Hill International
Edition.
3. Abbey, Abramson & Corey: Oracle 8i-A Beginner's Guide, Tata McGraw Hill Publishing
Company Ltd.
4. S.K.Singh: Database Systems Concept, Design and Applications, 2006, Pearson Education,
ISBN: 81-7758-567-3.
8.6 Self Assessment Questions (SAQ)
1. What is Functional Dependency? Define the term full and partial functional dependency
3. What do you mean by functional dependency? Discuss the different types of functional
dependency.
Chapter – 9: Normalization
Writer: Dr. Kanwal Garg
Vetter: Prof. Rajender Nath
Structure:
9.1 Introduction
9.2 Objective
9.3 Presentation of Content
9.3.1 Bad Database Design
9.3.2 Database Anomalies
9.3.3 Normalization
(i) Rules of Normalization
9.3.4 Normal Forms
(i) First Normal Form (1NF)
(ii) Second Normal Form (2NF)
(iii) Third Normal Form (3NF)
(iv) Boyce-Codd Normal Form (BCNF)
9.3.5 Example First and Second Normal Form
9.4 Summary
9.5 Suggested Reading/ Reference Material
9.6 Self Assessment Questions (SAQ)
9.1 Introduction
Normalization is a rigorous design tool that is based on the mathematical theory of relations
which will result in very practical operational implementations. A properly normalized set of
relations actually simplifies the retrieval and maintenance processes and the effort spent in
were simply seen as file structures of some vague file system, then the power and flexibility of
RDBMS cannot be exploited to the full. Good database design, needless to say, is important.
that a database structure is suitable for general purpose querying and free from database
anomalies. Dr. E. F. Codd, the inventor of the relational model, introduced the concept of
normalization and what we now know as the first normal form in 1970. Dr. Codd went on to
define second and third normal forms in 1971, and Codd and Raymond F. Boyce defined the
9.2 Objective
Normalization is a logical database design technique that involves organizing the data into more
than one table. Normalization improves performance by reducing redundancy in database tables.
The basic objective of normalization is to reduce redundancy, which means that information is to
be stored only once in a relation. Storing information several times wastes storage
space and increases the total size of the data stored. The normalization process has certain
goals:
Eliminating the columns that are not dependent on key attribute.
E. F. Codd identified certain structural features in a relation which create retrieval and update
problems. Suppose we start off with a relation with a structure and details like:
This is a simple and straightforward design. It consists of one relation where we have a single
tuple for every customer and under that customer we keep all his transaction records about parts,
up to a possible maximum of 9 transactions. For every new transaction, we need not repeat the
customer details (of name, city and telephone), we simply add on a transaction detail.
We have set a limit of 9 (or whatever reasonable value) transactions per customer. What
For customers with fewer than 9 transactions, it appears that we have to store null values in
The transactions appear to be kept in ascending order of P#s. What if we have to delete,
for customer Codd, the part numbered 1? Should we move the part numbered 2 up (or
rather, left)? If we did, what if we decide later to re-insert part 2? The additions and
Let us try to construct a query to "Find which customer(s) bought P# 2". The query would have
to access every customer tuple and, for each tuple, examine each of its transactions looking for
Alternatively, why don't we re-structure our relation such that we do not restrict the number of
This way, a customer can have just any number of Part transactions without worrying about any
upper limit or wasted space through null values (as it was with the previous structure).
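The difference between the two structures can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table names customer_wide and customer_flat, the column names, the customer numbers and the reduction to three part slots (instead of nine) are all hypothetical; only the customers Codd and Martin and the part numbers come from the discussion above.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# First design (assumed layout): one tuple per customer with a fixed
# number of part slots; unused slots hold NULL.
con.execute(
    "CREATE TABLE customer_wide ("
    "cno INTEGER PRIMARY KEY, cname TEXT, p1 INTEGER, p2 INTEGER, p3 INTEGER)"
)
con.executemany(
    "INSERT INTO customer_wide VALUES (?, ?, ?, ?, ?)",
    [(1, "Codd", 1, 2, None), (2, "Martin", 1, None, None)],
)

# Restructured design (assumed layout): one tuple per (customer, part).
con.execute(
    "CREATE TABLE customer_flat ("
    "cno INTEGER, cname TEXT, pno INTEGER, PRIMARY KEY (cno, pno))"
)
con.executemany(
    "INSERT INTO customer_flat VALUES (?, ?, ?)",
    [(1, "Codd", 1), (1, "Codd", 2), (2, "Martin", 1)],
)

# "Find which customer(s) bought P# 2"
# The wide design must test every part slot explicitly.
wide = con.execute(
    "SELECT cname FROM customer_wide WHERE p1 = 2 OR p2 = 2 OR p3 = 2"
).fetchall()

# The flat design needs a single condition, whatever the number of parts.
flat = con.execute("SELECT cname FROM customer_flat WHERE pno = 2").fetchall()

print(wide, flat)
```

Both queries return the same customer, but the first one grows with the number of slots, while the second stays the same however many parts a customer buys.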
Constructing a query to "Find which customer(s) bought P# 2" is not as cumbersome as before as
It seems a waste of storage to keep repeated values of Cname, Ccity and Cphone.
If C# 1 were to change his telephone number, we would have to ensure that we update ALL
occurrences of C# 1's Cphone values. This means updating tuple 1, tuple 2 and all other
inconsistent state.
Suppose we now have a new customer with C# 4. However, there is no part transaction yet
with the customer as he has not ordered anything yet. We may find that we cannot insert this
new information because we do not have a P# which serves as part of the 'primary key' of a
Suppose the third transaction has been canceled, i.e. we no longer need information about 25 of
P# 1 being ordered on 26 Jan. We thus delete the third tuple. We are then left with the following
relation:
But then, suppose we need information about the customer "Martin", say the city he is located in.
Unfortunately, information about Martin was held only in that tuple, so deleting the entire tuple
because of its P# transaction means that we have also lost all information about Martin
As illustrated in the above instances, we note that badly designed, un-normalized relations waste
A serious problem with using such a relation as a base relation is that it suffers from anomalies
such as insertion, deletion and update anomalies, as explained below. To understand these
anomalies, let us consider a relation 'Deptt' with attributes {Deptt_id, Deptt_name,
Deptt_course}.
Deptt
102 DCSA M.SC
202 UIET IT
Insertion Anomalies: It may not be possible to store information unless some other
information is stored as well, e.g. if a department offers one more course, say M.Phil.,
we cannot enter this data into the table until a student opts for this course.
Deletion Anomalies: It may not be possible to delete some information without losing
some other information as well, e.g. if a department wants to close a specific course, we
cannot do so until the records of all the students who were offered that course are deleted.
Update Anomalies: If one copy of repeated data is updated, an inconsistency is created
unless all copies are similarly updated, e.g. if we want to change the name of a department,
we cannot do so consistently until the name is also changed in every other table storing
that data.
9.3.3 Normalization
It is a technique used to design relational databases. In order to design a relational model we have
Normalization can be defined as a process during which redundant
Normalization is typically a refinement process after the initial exercise of identifying the data
objects that should be in the database, identifying their relationships and defining the tables
Normalization is a specific relational database analysis and design technique used to model
groups of related data within an organization. Its purpose is to ensure data stored within the
database adheres to best practices by following a set of rules with the purpose of eliminating
redundancies and optimizing the process of information retrieval. Normalization leaves us with a
structure that groups like data into relational models referenced by keys and linked to other
Normalization is represented by a logical set of steps that follow simple rules that are applied to
each stage of the modeling process. At the highest level the stages are separated into something
Initially there were only three normal forms, First Normal Form (1NF), Second Normal Form
(2NF) and Third Normal Form (3NF), but over time three more were added. In general terms the
first three are more commonly used in database modeling. The additional three identify
potential redundancies that could be considered but which, when applied in practice, can lead
In addition there is the Un-Normalized Form (UNF) which, though not generally
considered part of the normalization rules, is representative of the very first stages of the
normalization process.
We can identify each of the normal forms as follows and will define each in detail thereafter:
1. Un-Normalized Form (UNF) – Data Modeling
directly linked to the database design. Some rules that should be followed to achieve a good
Each table should store data for a single type of entity.
Normalization depends on certain specified constraints and rules that support Codd's
RDBMS rules. One of the constraints between two sets of attributes from the database is the
towards normalization.
The normal forms are applicable to individual tables; to say that an entire database is in a normal
form is to say that all of its tables are in that normal form. In the upcoming sections we are
Normalization works through a series of stages called normal forms. In a good database design
we need some guidance to decompose a relation into smaller relations. To provide such
guidance, several normal forms have been proposed. The normal forms based on functional
dependency are first normal form (1NF), second normal form (2NF), third normal form (3NF)
and Boyce-Codd Normal Form (BCNF). All these normal forms are based on the primary key.
It states that the domain of an attribute must include only atomic values and that the value of any
attribute in a tuple must be a single value from the domain of that attribute. Hence, 1NF
disallows having a set of values, a tuple of values or a combination of both as an attribute value for
a single tuple.
In other words, "A relation is said to be in the first normal form, if every attribute of that relation
Consider the following non-normalized relation.
Employee
System.
The above relation does not fulfill the definition of first normal form, because the attribute
{Project} is a multi-valued attribute. There are three techniques to convert this non-normalized
1. Horizontal Expansion:
Expand the number of attributes if the maximum number of values is known for an attribute. For
example, if it is known that at most three projects can be allocated to one employee, then we
can create new attributes {Project1, Project2, Project3}, one for each value of the Project attribute.
Employee
Engineering System
Information
System
103 Satvik System Analyst Null Null
and
Programming
This solution has the disadvantage of introducing null values, if most employees have fewer
2. Vertical Expansion:
Expand the key attribute so that there will be a separate tuple in the original relation for each value
of the project.
Employee
This solution has the disadvantage of introducing redundant data in the relation. Therefore this
To convert the non-normalized relation into a normalized table, we may remove the attribute that
violates the definition of 1NF and place that attribute in a separate relation along with the
primary key. The primary key of this relation is an attribute or a set of attributes that uniquely
defines a tuple in a relation. This decomposes the non-normalized relation into two relations
Emp_Proj
101 Arpit 10
101 Arpit 11
101 Arpit 12
102 Aryan 13
102 Aryan 14
103 Satvik 15
Proj_Info
ProjectId Project_Name
10 Networking
11 Software Engineering
12 Operating System
14 Marketing
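The conversion of a multi-valued attribute into atomic values can be sketched in Python. This is a minimal illustration only: the employee and project numbers are those of the Emp_Proj rows above, while the nested-list representation of the non-normalized relation and the function name to_first_normal_form are assumptions made for the example.

```python
# Non-normalized rows (assumed representation): the project attribute
# holds a list of project ids, so it is not atomic.
employees = [
    (101, "Arpit", [10, 11, 12]),
    (102, "Aryan", [13, 14]),
    (103, "Satvik", [15]),
]


def to_first_normal_form(rows):
    """Emit one (emp_id, name, project_id) tuple per project value,
    so that every attribute value in the result is atomic."""
    return [
        (emp_id, name, project_id)
        for emp_id, name, projects in rows
        for project_id in projects
    ]


emp_proj = to_first_normal_form(employees)
for row in emp_proj:
    print(row)
```

Each output tuple carries a single project id, which matches the rows of Emp_Proj; the project names themselves stay in Proj_Info and are reached through the project id.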
Of the above three techniques, the third one is superior because it does not have redundancy and
(ii) Second Normal Form (2NF)
A relation is said to be in the second normal form if it is in the first normal form and all non-key
attributes are fully functionally dependent on the key. The concepts of full functional
dependency and partial functional dependency are explained in Chapter 8, Section 3.2.
Order
In the above relation "Order", the attributes Orderno and Itemno together form a composite key,
and the attributes Orderdate and Units are the non-key attributes. In this relation Orderdate is
functionally dependent on Orderno, because for each value of Orderno we have a unique value of
Orderdate. But for each value of Itemno there can be more than one value of Orderdate. For
example, for Itemno value 1000, we have two values of Orderdate, i.e. '01/01/2014' and
'02/01/2014'. Hence Orderdate is not functionally dependent on Itemno: it depends on only a part
of the composite key (Orderno). Therefore, this relation "Order" is not in second normal form.
For the relation to be in second normal form, the non-key attributes must be fully functionally
dependent on the whole of the primary key. To convert this non-
Find and remove the attributes that are functionally dependent on only a part of the key
and not on the whole key, and place them in a different table.
Group the remaining attributes.
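The two steps above can be sketched in Python on the Order example. This is an illustration under stated assumptions: the Orderno and Orderdate values and Itemno 1000 come from the text, whereas the other Itemno values, all Units values and the helper names determines and project are hypothetical. Note also that a functional dependency is a property of the schema, so the determines check only shows whether this sample is consistent with a dependency.

```python
def determines(rows, lhs, rhs):
    """True if, in these sample rows, equal lhs values always come with
    equal rhs values (the sample is consistent with lhs -> rhs)."""
    seen = {}
    for r in rows:
        key = tuple(r[a] for a in lhs)
        value = tuple(r[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def project(rows, attrs):
    """Project the rows onto attrs, dropping duplicate tuples."""
    return list(dict.fromkeys(tuple(r[a] for a in attrs) for r in rows))


# Assumed sample state of the relation Order (key: Orderno + Itemno).
order = [
    {"Orderno": 100, "Itemno": 1000, "Orderdate": "01/01/2014", "Units": 5},
    {"Orderno": 100, "Itemno": 1001, "Orderdate": "01/01/2014", "Units": 2},
    {"Orderno": 101, "Itemno": 1000, "Orderdate": "02/01/2014", "Units": 7},
    {"Orderno": 102, "Itemno": 1002, "Orderdate": "03/01/2014", "Units": 1},
    {"Orderno": 103, "Itemno": 1003, "Orderdate": "04/01/2014", "Units": 4},
]

# Orderdate depends on a part of the key (Orderno), not on Itemno.
print(determines(order, ["Orderno"], ["Orderdate"]))
print(determines(order, ["Itemno"], ["Orderdate"]))

# Step 1: move the partially dependent attribute out with its determinant.
orderinfo = project(order, ["Orderno", "Orderdate"])
# Step 2: group the remaining attributes with the whole key.
iteminfo = project(order, ["Orderno", "Itemno", "Units"])
```

After the split, each order date is stored once per order in orderinfo instead of once per ordered item.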
To convert this relation into 2NF, we must remove the attributes that are not fully functionally
dependent on the whole key and place them in a different table along with the attribute that they
are functionally dependent on. In the above example, since Orderdate is not fully functionally
dependent on the whole of the key, i.e. Orderno + Itemno, we can place Orderdate along
with Orderno in a separate table called Orderinfo, and the attributes Orderno, Itemno and Units
Orderinfo
Orderno Orderdate
100 01/01/2014
101 02/01/2014
102 03/01/2014
103 04/01/2014
Iteminfo
Hence the resultant relation fulfills the definition of second normal form. Therefore normalized
(iii) Third Normal Form (3NF)
A relation is said to be in third normal form when it is already in the second normal form and
all the non-key attributes of the relation are independent of all other non-key fields of the same
table. In other words, it requires that data stored in a table be functionally dependent directly
on the primary key, and not transitively through any other field in the table. Concept of
Hostel
In this relation, IdNo is the primary key attribute and all other non-key attributes are
functionally dependent on it. So, it is in the second normal form. However, a non-key attribute,
i.e. Course, is also functionally dependent on another non-key attribute, i.e. Hostelno.
Therefore this relation is not in 3NF. To convert this non-normalized relation into normalized
relations, following
Find and remove the attributes that are functionally dependent on attributes that are not
Student_Detail
Course
Course Hostelno
MCA 1
MSc 2
BSc 3
MBA 4
Hence the resultant relation fulfills the definition of third normal form. Therefore normalized
The BCNF was introduced as the simpler form of 3NF, because the 3NF was inadequate in some
Where the multiple candidate keys (multi attribute key) are composite; and
Where the multiple candidate keys (multi attribute key) are overlapped it means it has at
A relation is said to be in BCNF; if it is in 3NF and no dependency of an attribute of a multi
Assume that a relation has more than one possible multi attribute key. Assume further that the
multi attribute keys have a common attribute. If an attribute of a composite key is dependent on
Consider a relation "Teacher", where a teacher can work in more than one department, the
percentage of time spent in each department is recorded, and each department has only one HOD.
Teacher
In the above relation, TeacherId and Department form a composite key. The attributes HOD
and PercentTime are functionally dependent on this composite key. TeacherId and
HOD also form a composite key, and the attributes Department and PercentTime are functionally
dependent on it. Further, HOD is functionally dependent on Department. Hence
In order to normalize the relation into BCNF, we have to create new relations from the old
relation by breaking it into two sub-tables, i.e. the relations "Department" and "HOD".
Department
100 Computers 50
200 Mathematics 60
200 Physics 40
300 History 30
HOD
Department HOD
Hence the resultant relation fulfills the definition of BCNF. Therefore normalized relations are
achieved as above.
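The BCNF condition used in this example, that every determinant must be a candidate key (or contain one), can be checked mechanically. The following Python sketch is illustrative only: the function names closure and bcnf_violations are assumptions, and the dependencies are the ones stated above for the relation Teacher and for its two sub-tables.

```python
def closure(attrs, fds):
    """Return the closure of the attribute set attrs under the given fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the non-trivial fds whose determinant does not determine
    every attribute of the relation (i.e. is not a superkey)."""
    all_attrs = set(attributes)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and not closure(lhs, fds) >= all_attrs
    ]


teacher_attrs = ("TeacherId", "Department", "HOD", "PercentTime")
teacher_fds = [
    (("TeacherId", "Department"), ("HOD", "PercentTime")),
    (("TeacherId", "HOD"), ("Department", "PercentTime")),
    (("Department",), ("HOD",)),
]

# The original relation: Department -> HOD has a non-key determinant.
print(bcnf_violations(teacher_attrs, teacher_fds))

# The two sub-tables after decomposition.
department_violations = bcnf_violations(
    ("TeacherId", "Department", "PercentTime"),
    [(("TeacherId", "Department"), ("PercentTime",))],
)
hod_violations = bcnf_violations(
    ("Department", "HOD"),
    [(("Department",), ("HOD",))],
)
```

For Teacher the only reported dependency is Department → HOD, while both sub-tables produce an empty list, which agrees with the decomposition shown above.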
Where attribute Sid is the primary key, Sname is the student name, Phone is the student's phone
number and Courses-taken is a table that contains course-id, course-description, credit hours and
grade for each course taken by the student. A more precise definition of table Course-taken is:
According to the definition of first normal form, relation Student-courses is not in first normal
form because one of its attributes, Courses-taken, is itself a table and is not a simple attribute.
To clarify it more assume the above tables contain the data as shown below:
Student-courses
St-100-Course-taken
St-200-Course-taken
St-300-Course-taken
1. Insertion anomaly means that some data cannot be inserted in the database. For
example we cannot add a new course to the database of example-1, unless we insert a student
2. Update anomaly means we have data redundancy in the database and to make any
modification we have to change all copies of the redundant data or else the database will
contain incorrect data. For example in our database we have the Course description
taken tables. To change its description to "New Database Concepts" we have to change it in
all places. Indeed one of the purposes of normalization is to eliminate data redundancy in the
database.
3. Deletion anomaly means that deleting some data causes other information to be lost. For example
if student Russell is deleted from St-100-Course-taken table we also lose the information that
To convert the above structure to first normal form relations, all non-simple attributes
combining each row of Student-courses with all rows of its corresponding course table that was
taken by that specific student. Following is Student-courses table in first normal form.
Grade)
To check whether the resultant table fulfills the properties of the normal forms.
Notice that the primary key of this table is a composite key made up of two parts; Sid and
Course-id. Note that pk1 following an attribute indicates that the attribute is the first part of the
primary key and pk2 indicates that the attribute is the second part of the primary key.
Student-courses
Examination of the above Student-courses relation reveals that Sid does not uniquely identify a
row (tuple) in the relation hence cannot be the primary key. For the same reason Course-id
cannot be the primary key. However the combination of Sid and Course-id uniquely identifies a
row in Student-courses. Therefore (Sid, Course-id) is the primary key of the above relation.
The primary key determines every attribute. For example if you know both Sid and Course-id for
any student you will be able to retrieve Sname, Phone, Course-description, Credit-hours and
Grade, because these attributes are dependent on the primary key. Figure 1 below is the graphical
representation of the functional dependency between the primary key and attributes of the above
relation.
Note that the attribute to the right of the arrow is functionally dependent on the attribute to the
left of the arrow. Thus the combination (Sid, Course-id) is the determinant (that determines other
attributes) and attributes Sname, Phone, Course-description, Credit-hours and Grade are
dependent attributes.
other attributes. In addition to (Sid, Course-id) there are two other determinants in the above
Student-courses relation. These are the Sid and Course-id attributes. Note that Sid alone determines
both Sname and Phone, and attribute Course-id alone determines both Credit-hours and
Course_description attributes.
Attribute Grade is fully functionally dependent on the primary key (Sid, Course-id) because both
parts of the primary key are needed to determine Grade. On the other hand, both Sname and
Phone attributes are not fully functionally dependent on the primary key, because only a part of
the primary key namely Sid is needed to determine both Sname and Phone. Also attributes
Credit-hours and Course-Description are not fully functionally dependent on the primary key
The new relation Student-courses still suffers from all three anomalies for the following reasons:
1. The relation contains redundant data (Note Database_Concepts as the course description
2. The relation contains information about two entities Student and course.
Following is the detailed description of the anomalies that relation Student-courses suffers from.
1. Insertion anomaly: We cannot add a new course such as IS247 with course description
programming techniques to the database unless we add a student who takes the course.
2. Update anomaly: If we change the course description for IS380 from Database Concepts to
New_Database_Concepts we have to make changes in more than one place or else the
database will be inconsistent. In other words, in some places the course description will be
New_Database_Concepts and in any place where we forgot to make the changes the
3. Deletion anomaly: If student Russell is deleted from the database we also lose information
The above observation indicates that having a single table Student-courses for our database
causes problems (anomalies). Therefore we break the table into smaller tables to get a higher
attributes to be fully functionally dependent on the primary key. To do that we need to project
(that is, we break it down into two or more relations) the Student-courses table into two or more tables.
However projections may cause problems. To avoid such problems it is important to keep
attributes, which are dependent on each other in the same table, when a relation is projected to
smaller relations. Following this principle and examination of Figure-1 indicate that we should
PROJECT Student-courses ON (Sid, Sname, Phone) creates a table call it Student. The relation
PROJECT Student-courses ON (Sid, Course-id, Grade) creates a table call it Student-grade. The
Courses (Course-id::pk, Course-Description)
100 IS380 A
100 IS416 B
200 IS380 B
200 IS416 B
200 IS420 C
300 IS417 A
All three of these relations are in second normal form. Examination of these relations shows that we
have eliminated the redundancy in the database. Now relation Student contains information only
related to the entity student, relation Courses contains information related to the entity course only,
and the relation Student-grade contains information related to the relationship between these two
entities.
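The three PROJECT operations, and the fact that joining the results gives back the original relation, can be sketched in Python. This is an illustration under stated assumptions: the Sid, Course-id and Grade values are the Student-grade rows above, Russell and the IS380 description come from the text, and every other name, phone number, course description and credit-hour value is hypothetical, as are the underscore spellings of the attribute names.

```python
COLUMNS = ["Sid", "Sname", "Phone", "Course_id",
           "Course_description", "Credit_hours", "Grade"]

# Assumed sample state of the 1NF relation Student-courses.
raw = [
    (100, "Student A", "555-0100", "IS380", "Database Concepts", 3, "A"),
    (100, "Student A", "555-0100", "IS416", "Course IS416", 3, "B"),
    (200, "Student B", "555-0200", "IS380", "Database Concepts", 3, "B"),
    (200, "Student B", "555-0200", "IS416", "Course IS416", 3, "B"),
    (200, "Student B", "555-0200", "IS420", "Course IS420", 3, "C"),
    (300, "Russell", "555-0300", "IS417", "Course IS417", 3, "A"),
]
rows = [dict(zip(COLUMNS, r)) for r in raw]


def project(relation, attrs):
    """PROJECT relation ON attrs: keep the listed attributes and
    drop duplicate tuples."""
    return list(dict.fromkeys(tuple(r[a] for a in attrs) for r in relation))


student = project(rows, ["Sid", "Sname", "Phone"])
courses = project(rows, ["Course_id", "Course_description", "Credit_hours"])
student_grade = project(rows, ["Sid", "Course_id", "Grade"])

# Join the three projections back together on Sid and Course_id.
student_by_id = {s[0]: s[1:] for s in student}
course_by_id = {c[0]: c[1:] for c in courses}
rejoined = [
    dict(zip(COLUMNS, (sid, *student_by_id[sid], cid, *course_by_id[cid], grade)))
    for sid, cid, grade in student_grade
]

print(len(student), len(courses), len(student_grade))
print(rejoined == rows)
```

Each student and each course is now stored once, and because dependent attributes were kept together with their determinants, the join reproduces the original rows without loss.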
1. Insertion anomaly: Now a new course with course-id IS247 and its course-description can be
inserted into the table Courses. Equally, we can add any new students to the database by adding
their id, name and phone to the Student table. Therefore our database, which is made up of these
2. Update anomaly: Since redundancy of the data was eliminated, no update anomaly can
occur. To change the course-description for IS380 only one change is needed in table
Courses.
3. Deletion anomaly: The deletion of student Russell from the database is achieved by deleting
Russell's records from both Student and Student-grade relations, and this does not have any
side effect because the course IS417 remains untouched in the table Courses.
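The three cases above can be tried out with Python's built-in sqlite3 module. This is a sketch under stated assumptions, not a definitive schema: the column types, Russell's Sid of 300, his phone number and the description given to IS417 are hypothetical, and the relation names are written with underscores so that they are valid SQL identifiers.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Student (Sid INTEGER PRIMARY KEY, Sname TEXT, Phone TEXT);
CREATE TABLE Courses (Course_id TEXT PRIMARY KEY, Course_description TEXT);
CREATE TABLE Student_grade (
    Sid INTEGER REFERENCES Student(Sid),
    Course_id TEXT REFERENCES Courses(Course_id),
    Grade TEXT,
    PRIMARY KEY (Sid, Course_id)
);
""")
con.execute("INSERT INTO Student VALUES (300, 'Russell', '555-0300')")
con.executemany(
    "INSERT INTO Courses VALUES (?, ?)",
    [("IS380", "Database Concepts"), ("IS417", "Course IS417")],
)
con.execute("INSERT INTO Student_grade VALUES (300, 'IS417', 'A')")

# 1. Insertion: a course can be added although no student takes it yet.
con.execute("INSERT INTO Courses VALUES ('IS247', 'Programming Techniques')")

# 2. Update: the description is stored once, so one row is changed.
cur = con.execute(
    "UPDATE Courses SET Course_description = 'New Database Concepts' "
    "WHERE Course_id = 'IS380'"
)
updated_rows = cur.rowcount

# 3. Deletion: removing Russell does not remove the course IS417.
con.execute("DELETE FROM Student_grade WHERE Sid = 300")
con.execute("DELETE FROM Student WHERE Sid = 300")
remaining_courses = [
    row[0] for row in con.execute("SELECT Course_id FROM Courses ORDER BY Course_id")
]

print(updated_rows, remaining_courses)
```

The update touches a single row, and all three courses, including the new IS247 and Russell's former course IS417, are still present after the student is deleted.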
9.4 Summary
The normal forms of relational database theory provide criteria for determining a table‘s degree
of vulnerability to logical inconsistencies and anomalies. The higher the normal form applicable
normalization is to produce a stable set of relations that is a faithful model for operations of the
enterprises.
1. Elmasri & Navathe: Fundamentals of Database systems, 3rd Edition, Addison Wesley,
New Delhi.
2. Raghu Ramakrishnan & Johannes Gehrke: Database Management Systems, 2nd edition,
Delhi.
2. What do you understand by database anomalies? Write the procedure to generate First
normal form.
3. "Every relation in BCNF is also in 3NF, but a relation in 3NF is not necessarily in
4. What do you mean by Normalization? Discuss the normal forms based on Primary key.
Chapter – 10: An Introduction to MS-Access
Writer: Dr. Kanwal Garg
Vetter: Prof. Rajender Nath
Structure:
10.1 Introduction
10.2 Objective
10.3 Presentation of Content
10.3.1 Interface Elements in MS-Office Access 2007
10.3.2 Tool Bars and Their Icons
(i) Getting Started with Microsoft Office Access
(ii) The Ribbon
10.1 Introduction
Microsoft Access is a relational database management system that comes as a part of the Microsoft
Office suite. MS-Access is a graphical user interface (GUI) application that is easy to use,
yet powerful enough to manage large volumes of data. It generally manages data related to
different environments like scientific, inventory, financial, payroll, education, hospitality and
various other environments. MS- Access can be used at a client end or at a server end, in a client
10.2 Objective
The default file extension in MS-Access 2007 is .accdb; earlier versions used .mdb. In a single
MS-Access file, we can create multiple database objects, i.e. tables, queries, forms, reports,
data access pages, macros and modules. MS-Access 2007 comprises a number of elements that
define how we interact with the product. These elements were chosen to help you find the
commands you need faster. The most
significant interface element in MS- Access 2007 is called the Ribbon. The Ribbon is the strip
across the top of the program window that contains groups of commands. The Office Fluent
Ribbon provides a single home for commands and is the primary replacement for menus and
toolbars. On the Ribbon are tabs that combine commands in ways that make sense. In Office
Access 2007, the main Ribbon tabs are Home, Create, External Data, and Database Tools. Each
tab contains groups of related commands, and these groups surface some of the additional new
GUI elements, such as the gallery, which is a new type of control that presents choices visually.
Getting Started with Microsoft Office Access: The page that is displayed when you
start Access from the Windows Start button or from a desktop shortcut.
The Office Fluent Ribbon: The area at the top of the program window where you can
choose commands.
Contextual Command Tab: A command tab that appears depending on your context, that is, the
object that you are working on or the task that you are performing.
Gallery: A control that displays a choice visually so that you can see the results that you
will get.
Quick Access Toolbar: A single standard toolbar that appears on the Ribbon and offers
Navigation Pane: The area on the left side of the window that displays your database
objects. The Navigation Pane replaces the Database window from earlier versions of Access.
Tabbed Documents: Your tables, queries, forms, reports, pages, and macros are displayed
as tabbed documents.
Status Bar: The bar at the bottom of the program window that displays status information
Mini Toolbar: An on-object element that transparently appears above text that you have
When you start Office Access 2007 by clicking the Windows Start button or a desktop shortcut
(but not when you click on a database), the Getting Started with Microsoft Office Access page
appears as shown in Figure 10.1. This page shows what you can do to get started in Office
Access 2007.
The Office Fluent Ribbon is the primary replacement for menus and toolbars and provides the
main command interface in MS-Office Access 2007. One of the main advantages of the Ribbon
is that it consolidates, in one place, those tasks or entry points that used to require menus,
toolbars, task panes, and other GUI components to display. This way, you have only one place in
When you open a database, the Ribbon appears at the top of the main MS-Office Access 2007
The Ribbon contains a series of command tabs that contain commands as shown in Figure 10.2.
In MS-Office Access 2007, the main command tabs are Home, Create, External Data, and
Database Tools. Each tab contains groups of related commands, and these groups surface some
of the additional new GUI elements, such as the gallery, which is a new type of control that
The commands on the Ribbon take into account the currently active object. For example, if you
have a table opened in Datasheet view and you click Form on the Create tab, in the Forms group,
MS-Office Access 2007 creates the form, based on the active table. That is, the name of the
You can use keyboard shortcuts with the Ribbon. All of the keyboard shortcuts from an earlier
version of MS-Access continue to work. The Keyboard Access System replaces the menu
accelerators from earlier versions of Access. This system uses small indicators with a single
letter or combination of letters that appear on the Ribbon and indicate what keyboard shortcut
activates the control underneath. When you have selected a command tab, you can browse the
1. Start MS-Access.
The following table shows a representative sampling of the tabs and the commands available on
each tab. The tabs and the commands available change depending on what you are doing.
Home Select a different view.
Work with records (Refresh, New, Save, Delete, Totals, Spelling, More).
Find records.
Create a list on a SharePoint site and a table in the current database that links
Export data.
Database Tools Launch the Visual Basic editor or run a macro.
In MS- Access we have a variety of options for creating/ opening a database. Such options are
given below. We can open a New Blank Database, can create a New Database from a featured
template, create a new database from a Microsoft Office Online Template, and open a recently
1. Start Access from the Start menu or from a shortcut. The Getting Started with Microsoft
2. On the Getting Started with Microsoft Office Access page, under New Blank Database,
3. In the Blank Database pane, in the File Name box, type a file name or use the one that is
4. Click Create.
The new database is created, and a new table is opened in Datasheet view.
Figure 10.3: Creating a New Database
Once you have created a blank database with a database name, you can create the following six
Queries - a command for viewing or analyzing data in different ways or a result of the
command.
Reports - an object that presents the data in an organized way according to your specification.
Macros - a set of one or more actions that each perform a particular operation, such as
opening a form or printing a report. Macros can help you to automate common tasks. For
example, you can run a macro that prints a report when a user clicks a command button.
Module - a collection of small programs and procedures that are stored together as a unit.
(ii) Create a New Database from a Featured Template
1. Start Access from the Start menu or from a shortcut. The Getting Started with Microsoft
2. On the Getting Started with Microsoft Office Access page, under Featured Online
3. In the File Name box, type a file name or use the one that is provided for you.
4. Optionally, check the Create and link your database to a Windows SharePoint Services
5. Click Create (or) Click Download. MS- Access will create a new database from the
1. Start Access from the Start menu or from a shortcut. The Getting Started with Microsoft
2. On the Getting Started with Microsoft Office Access page, in the Template Categories
pane, click a category and then, when the templates in that category appear, click a
template.
3. In the File Name box, type a file name or use the one that is provided for you.
4. Click Download.
1. Start Access.
2. On the Getting Started with Microsoft Office Access page, under Open Recent Database,
click the database that you want to open. MS-Access will open the database.
10.3.4 Creating a Table:
To create a blank (empty) table in datasheet view, on the Ribbon you can:
Figure 10.5 shows a Datasheet View with column headings ID and Add New Field across the top
of the datasheet. Data can be entered directly into it. After you enter data and press the Enter key, the
column heading Add New Field automatically changes to Field1 and the next column‘s heading
becomes Add New Field. At the same time, an ID number will be assigned to that row. When
you save the new datasheet, Microsoft Access will analyze your data and automatically assign
the appropriate data type and format for each field. Because the names of each field are not
a) Renaming Fields:
1. Place the cursor over the column heading you want to rename and double click. The column
heading will appear highlighted and the cursor will be blinking (edit mode).
2. Type the name you want to use and then press the Enter key.
3. Repeat the first two steps for the second column, and so on.
Just as a column corresponds to a field, a row corresponds to a record. Now we are ready to
add the information. For example, if we are building a database for a company, the first table we may
have is Employee, and the fields of Employee may include SSN, LastName, FirstName, and so
b) Summarizing Datasheet View
In Design View you can add new fields, define how each field appears or handles data, and
create a primary key. To create a blank (empty) table in Design View, you can:
Click Create > Table Design as shown in Figure 10.4; Design View, as shown in Figure 10.8,
will appear.
In this view, we can specify detailed properties for each field. This includes the length and
type of information used in the field. But if we were to enter data into the table, we must use
Datasheet View or Forms. The design view for the example Employee table mentioned
There are three columns on the top portion of the window. The Field Name is the name of the
fields. For example, SSN, FirstName, LastName are proper field names for the Employee
table. The name for a field must follow MS Access object-naming rules. The Data Type is
It provides a list of data types that we can choose from, including Text, Memo, Number, Date,
and so on. The Description column allows us to describe the field and it is optional. This allows
new users to easily understand the specifications and meaning of your fields. Table 10.2
You can set up properties of fields in the Field Properties window at the bottom half pane. Table
Before we save the table and quit, we need to specify the primary key. In our Employee table,
SSN is a good choice for the primary key. To define SSN as the primary key, click the Field Selector, as
shown in Figure 10.8, for the SSN field. The Field Selector is the gray bar on the left side of the Table
Design grid by each field. When we click here, the whole row appears highlighted. Then choose
Edit > Primary Key from the menu or click the Primary Key button (i.e. the key symbol, shown in Figure
10.9) on the toolbar in Design View; a key symbol will appear on the Field Selector. Save the
Table 10.3: Data Types in MS Access
(c) Summarizing Design View
To create a Contacts, Tasks, Issues, Events or Assets table, you might want to start with the table
templates for these subjects that come with Office Access 2007. To choose a template for your
10.3.5 Relationships
The tables in a database may be linked to each other by the creation of relationships between
specific fields in the database. These relationships can be viewed in the Relationships window:
Figure 10.11: Relationship
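
As a rough illustration of what a relationship between two tables enforces, the sketch below uses Python's built-in sqlite3 module in place of Access. The Department table, the DeptID field and the sample rows are hypothetical; only the Employee table with its SSN and LastName fields comes from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces relationships when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE Department (DeptID INTEGER PRIMARY KEY, DeptName TEXT)")
conn.execute("""
    CREATE TABLE Employee (
        SSN      TEXT PRIMARY KEY,
        LastName TEXT,
        DeptID   INTEGER REFERENCES Department(DeptID)  -- the relationship
    )
""")
conn.execute("INSERT INTO Department VALUES (1, 'Sales')")
conn.execute("INSERT INTO Employee VALUES ('111-22-3333', 'Smith', 1)")

# An employee pointing at a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO Employee VALUES ('444-55-6666', 'Jones', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print("Orphan record rejected:", orphan_rejected)
```

The linked field (DeptID here) plays the role of the line drawn between two tables in the Relationships window.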
Microsoft Access has a wizard named the Table Wizard that will create a table for you. This
wizard gives you suggestions about what type of table you can create (for example, a Mailing
List table, a Students table, a Tasks table, and so on) and gives you many different possible
names for fields within these tables. To use the Table Wizard to create a table, follow these
steps:
2. In the Database window, click Tables under Objects, and then click New.
If you want to modify the table that the Table Wizard creates, open the table in Design view
(ii) Creating a Table by Entering Data in a Datasheet
In Microsoft Access, you can also create a table by just entering data into columns (fields) in a
datasheet. If you enter data that is consistent in each column (for example, only names in one
column, or only numbers in another column), Access will automatically assign a data type to the
fields. To create a table by just entering data in a datasheet, follow these steps:
2. In the Database window, click Tables under Objects, and then click New.
3. In the New Table dialog box, double-click Datasheet View. A blank datasheet is
4. Rename each column that you want to use. To do so, double-click the column name, type
You can insert additional columns at any time. To do so, click in the column to the right
of where you want to insert a new column, and then on the Insert menu, click Column.
5. Enter your data in the datasheet. Enter each kind of data in its own column. For example,
if you are entering names, enter the first name in its own column and the last name in a
separate column. If you are entering dates, times, or numbers, enter them in a consistent
format. If you enter data in a consistent manner, Microsoft Access can create an
appropriate data type and display format for the column. For example, for a column in
which you enter only names, Access will assign the Text data type; for a column in which
you enter only numbers, Access will assign a Number data type. Any columns that you
6. When you have added data to all the columns that you want to use, click Save on the File
menu.
7. Microsoft Access asks you if you want to create a primary key. If you have not entered
data that can be used to uniquely identify each row in your table, such as part numbers or
ID numbers, it is recommended that you click Yes. If you have entered data that can
uniquely identify each row, click No, and then specify the field that contains that data as
your primary key in Design view after the table has been saved. To define a field as your
primary key after the table has been saved, follow these steps:
a. Open the table that Access created from the data that you entered in datasheet in
Design view.
b. Select the field or fields that you want to define as the primary key.
To select one field, click the row selector for the desired field.
To select multiple fields, hold down the CTRL key, and then click the row
If you want the order of the fields in a multiple-field primary key to be different from the
order of those fields in the table, click Indexes on the toolbar to display the Indexes
window, and then reorder the field names for the index named Primary Key.
As mentioned earlier, Microsoft Access will assign data types to each field (column)
based on the kind of data that you entered. If you want to customize a field's definition
(iii) Creating a Table by Entering Data in Design View
If you want to create the basic table structure yourself and define all the field names and data
types, you can create the table in Design view. To do so, follow these steps:
2. In the Database window, click Tables under Objects, and then click New.
4. In the <Table Name>: Table dialog box, define each of the fields that you want to include
a. Click in the Field Name column, and then type a unique name for the field.
b. In the Data Type column, accept the default data type of Text that Access assigns or
click in the Data Type column, click the arrow, and then select the data type that you
want.
c. In the Description column, type a description of the information that this field will
contain. This description is displayed on the status bar when you are adding data to
the field, and it is included in the Object Definition of the table. The description is
optional.
d. Once you have added some fields, you may need to insert a field between two other
fields. To do so, click in the row below where you want to add the new field, and then
on the Insert menu, click Rows. This creates a blank row in which you can add a new
field.
To add a field to the end of the table, click in the first blank row. After you have
added all the fields, define a primary key field before saving your table. A primary
key is one or more fields whose value or values uniquely identify each record in a
e. Select the field or fields that you want to define as the primary key. To select one
field, click the row selector for the desired field. To select multiple fields, hold down
the CTRL key, and then click the row selector for each field.
If you want the order of the fields in a multiple-field primary key to be different from the order
of those fields in the table, click Indexes on the toolbar to display the Indexes dialog box, and
then reorder the field names for the index named Primary Key.
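
The effect of a primary key can be sketched outside Access as well. The example below is a minimal sketch using Python's built-in sqlite3 module, not Access itself; the Assignment table, its fields and the sample rows are hypothetical. It shows a single-field key rejecting a duplicate SSN, and a multiple-field key whose field order differs from the order of the fields in the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Employee (
        SSN       TEXT PRIMARY KEY,   -- single-field primary key
        LastName  TEXT,
        FirstName TEXT
    )
""")
conn.execute("INSERT INTO Employee VALUES ('111-22-3333', 'Smith', 'Ann')")

# A second record with the same SSN violates the primary key.
try:
    conn.execute("INSERT INTO Employee VALUES ('111-22-3333', 'Jones', 'Bob')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Multiple-field primary key: the key order (SSN, ProjectID) differs
# from the order in which the fields appear in the table.
conn.execute("""
    CREATE TABLE Assignment (
        ProjectID INTEGER,
        SSN       TEXT,
        Hours     REAL,
        PRIMARY KEY (SSN, ProjectID)
    )
""")
# Column 5 of table_info holds each field's position within the key (0 = not in key).
info = [row for row in conn.execute("PRAGMA table_info(Assignment)") if row[5] > 0]
key_order = [row[1] for row in sorted(info, key=lambda row: row[5])]

print(duplicate_rejected, key_order)
```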
You do not have to define a primary key, but it is usually a good idea. If you do not define a
Primary key, Microsoft Access asks if you want Access to create one for you when you save the
table. When you are ready to save your table, on the File menu, click Save, and then type a
One of the most useful features of Access is its ability to interface with data from many other
programs. In fact, it's difficult to summarize in a single article all the ways in which you can
move data into and out of Access. For example, here are just a few ways in which you might use
To accumulate and store data over the long term, occasionally exporting data to other
(i) External Data Operations in Access
In many programs, you use the Save As command to save a document in another format, so that
you can open it in another program. In Access, however, the Save As command is not used in the
same way. You can save Access objects as other Access objects, and you can save Access
databases as earlier versions of Access databases, but you cannot save an Access database as,
say, a spreadsheet file. Likewise, you cannot save a spreadsheet file as an Access file (.accdb).
Instead, you use the commands on the External Data tab in Access to import or export data
(ii) Types of Data That Access Can Import, Link To, Or Export
A quick way to learn about the data formats that Access can import or export is to open a
database and then explore the External Data tab on the ribbon.
1. The Import & Link (1 given in Figure 10.12) group displays icons for the data formats
2. The Export (2 given in Figure 10.12) group displays icons for all the formats that Access
3. In each group, you can click More (3 given in Figure 10.12) to see more formats that
If you don't see the exact program or data type that you need, chances are your data can be
exported by the other program into a format that Access understands. For example, most
programs can export columnar data as delimited text, which is then easily imported into Access.
The following table shows which formats can be imported into, linked to, or exported out of
Access:
format
Excel
Access
Server)
(delimited or fixed-
width)
attachments)
Microsoft Office Word. Import: No, but you can save a Word file as a text file. Link: No, but you can save a
Word file as a text file. Export: Yes (you can export as Word Merge or as Rich
note)
Outlook.
1. Open the database that you want to import or link data into.
2. On the External Data tab, click the type of data that you want to import or link to. For
3. In most cases, Access starts the Get External Data wizard. In the wizard, you may be
Indicate whether the first row contains column headings, or whether it should be treated
as data.
Choose whether to import the structure only, or the structure and the data together.
If importing, specify whether you want Access to add a new primary key to the new
4. On the last page of the wizard, Access usually asks you if you want to save the details of
the import or link operation. If you think you'll need to perform the same operation on a
recurring basis, select the Save import steps check box, fill in the information, and then
click Close. Then, you can click Saved Imports on the External Data tab to re-run the
operation.
After you have completed the wizard, Access notifies you of any problems that might have
occurred during the import process. In some cases, Access might create a new table called
Import Errors, which contains any data that it was unable to import successfully. You can
examine the data in this table to try to find out why the data did not import correctly.
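
The same import decisions (treating the first row as column headings, and setting aside rows that fail to import) can be sketched in a few lines. This is only an illustration using Python's standard csv and sqlite3 modules, not the Access wizard; the Computers table, the ImportErrors layout and the sample text are assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical delimited text, as another program might export it.
raw = io.StringIO(
    "Manufacturer,Type,Cost\n"
    "Dell,Laptop,1200\n"
    "HP,Desktop,n/a\n"
    "Apple,Laptop,1500\n"
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Computers (Manufacturer TEXT, Type TEXT, Cost REAL)")
conn.execute("CREATE TABLE ImportErrors (RowNumber INTEGER, Reason TEXT)")

reader = csv.reader(raw)
header = next(reader)  # the first row is treated as column headings, not data
for row_number, row in enumerate(reader, start=2):
    try:
        manufacturer, kind, cost = row
        conn.execute(
            "INSERT INTO Computers VALUES (?, ?, ?)",
            (manufacturer, kind, float(cost)),  # float() fails on non-numeric text
        )
    except ValueError as exc:
        # Keep a record of what could not be imported, and why.
        conn.execute("INSERT INTO ImportErrors VALUES (?, ?)", (row_number, str(exc)))

imported = conn.execute("SELECT COUNT(*) FROM Computers").fetchone()[0]
failed = [r[0] for r in conn.execute("SELECT RowNumber FROM ImportErrors")]
print(imported, "rows imported; problem rows:", failed)
```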
2. In the Navigation Pane, select the object that you want to export the data from. You can
export data from table, query, form, and report objects, although not all export options are
On the External Data tab, click the type of data that you want to export to. For example, to
export data in a format that can be opened by Microsoft Excel, click Excel.
In most cases, Access starts the Export wizard. In the wizard, you may be asked for
information such as the destination file name and format, whether to include formatting
4. On the last page of the wizard, Access usually asks you if you want to save the details of
the export operation. If you think you will need to perform the same operation on a
recurring basis, select the Save export steps check box, fill in the information, and then
click Close. Then, you can click Saved Exports on the External Data tab to re-run the
operation.
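
An export to delimited text, which most spreadsheet programs can open, might look like the following sketch. It uses Python's standard csv and sqlite3 modules rather than the Access Export wizard; the export_csv helper, the Computers table and its rows are hypothetical.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Computers (Manufacturer TEXT, Type TEXT, Cost REAL)")
conn.executemany(
    "INSERT INTO Computers VALUES (?, ?, ?)",
    [("Dell", "Laptop", 1200.0), ("HP", "Desktop", 800.0)],
)

def export_csv(connection, query, out):
    """Write the result of a query to 'out' as comma-delimited text."""
    cursor = connection.execute(query)
    writer = csv.writer(out)
    writer.writerow([column[0] for column in cursor.description])  # heading row
    writer.writerows(cursor)                                       # data rows

buffer = io.StringIO()  # stands in for the destination file
export_csv(
    conn,
    "SELECT Manufacturer, Type, Cost FROM Computers ORDER BY Manufacturer",
    buffer,
)
exported_text = buffer.getvalue()
print(exported_text)
```

Because the helper takes a query rather than a table name, it can export a table or the result of a saved query, much as the Navigation Pane lets you pick either kind of object.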
10.4 Summary
MS-Access is a powerful RDBMS that is used to create and manage databases. It is a
graphical user interface application that makes it easy to manage large
volumes of data. It has many built-in features to assist you in constructing and viewing
information related to different environments such as scientific, inventory, financial, payroll,
education, hospitality and various others. The information can be viewed, sorted,
manipulated, retrieved and printed in various ways. MS-Access gives you a platform where you
can retrieve information quickly and accurately. The default file extension in Access 2007 is .accdb;
earlier versions used .mdb.
1. http://www.officetutorials.com
1. Explain the steps to create table in design view. Discuss the process of creating relationships.
4. Discuss the format and program whose data may be imported in and exported to MS-Access.
Chapter – 11: Database Operation in MS-Access
Writer: Dr. Kanwal Garg
Vetter: Prof. Rajender Nath
Structure:
11.1 Introduction
11.2 Objective
11.3 Presentation of Content
11.3.1 Queries
(i) Creating Queries
(ii) Query Wizard: A Select Query
(iii) Design View of an Existing Query
(iv) Creating a Query Totally in Design View
(v) Use a Query Wizard to Create a Crosstab Query
(vi) Create a Parameter Query in Design View
(vii) Creating Action Queries in Design View
(viii) Make-Table Queries
11.3.2 Reports
(i) Views
(ii) Report Wizard
(iii) Report Tool
(iv) Report Design
11.3.3 Forms
(i) New Form Options
(ii) Design View of Forms
(iii) Design View Form Sections
(iv) Design View Info
11.4 Summary
11.5 Suggested Reading/ Reference Material
11.6 Self Assessment Questions (SAQ)
11.1 Introduction
This chapter covers the essential tools and operations, such as queries, forms and reports, of any DBMS
or RDBMS package. Not all the information is required at one time; one always needs
selective data. A query can therefore filter data from a single table or a group of related tables.
Forms provide an interactive way to enter data into a table. We can view, modify or delete data
stored in the table by using a form. The user can choose the design of the form from various
Reports are used to present data in a predefined or user-defined format. They are generally
prepared for presenting data in hard copy form using a printing device. Reports take data from
database tables and present it in the way the user wants. One can group data on certain fields or
11.2 Objective
The aim of this chapter is to make the student familiar with different database operations such as
queries, reports and forms. For this purpose, the author of this chapter provides an overview of
MS-Access 2007 concentrating on the said aspects. The screen-shots are provided at the
11.3.1 Queries
A query is a way to define a permanent filter to retrieve data or to create an action that operates
on records. Queries are also called dynasets, for dynamic subsets of a table.
A Select Query retrieves and displays records from tables according to what field you pick
and what criteria you place on the query.
A Crosstab Query will display sums, counts, and averages from one field in a table and
show this in a datasheet with fields on the left and across the top.
An Action Query performs operations on the records to match your criteria and include
A Parameter Query prompts you for information to use to activate the query. It can help
A query may be created at any time after you have a table. Access to all query options is found
on the Create Ribbon. Click the Create Ribbon and the Query Wizard button. From this point on
Select the fields you need to show in your query and send
them across.
table or query and add the fields to the list already chosen.
In the next box choose whether you want Detail which shows
Click Summary and then Summary Options to see the numerical fields listed and the option
for calculations.
The Detail/Summary screen only appears if one of your fields has numerical data.
Give the new query a name and choose to open the query or go directly into Design view.
Click Finish. The result looks just like a datasheet, but it gives you filtered data, on command, without
redoing a filter.
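
The idea of a saved select query that filters "on command" resembles a view in SQL databases. The sketch below is an analogy using Python's built-in sqlite3 module, not Access itself; the Computers table, the Laptops view and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Computers (Manufacturer TEXT, Type TEXT, Cost REAL)")
conn.executemany(
    "INSERT INTO Computers VALUES (?, ?, ?)",
    [("Dell", "Laptop", 1200.0), ("HP", "Desktop", 800.0), ("Apple", "Laptop", 1500.0)],
)

# Save the filter once, under a name, like a saved select query.
conn.execute("""
    CREATE VIEW Laptops AS
    SELECT Manufacturer, Cost FROM Computers WHERE Type = 'Laptop'
""")

before = conn.execute("SELECT * FROM Laptops").fetchall()

# New data is picked up the next time the saved query runs;
# the filter does not have to be defined again.
conn.execute("INSERT INTO Computers VALUES ('Lenovo', 'Laptop', 900.0)")
after = conn.execute("SELECT * FROM Laptops").fetchall()

print(len(before), "laptops before,", len(after), "after")
```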
You may click on the design view icon to edit this query further with design view of a query.
Note that in Access 2007 a table and a query can also make grand totals. You no longer need
In query design view a query grid shows the fields you have selected and the field list from the
If you open an existing query, the object it is based on shows in the area at the top.
The query properties are down the left side of the grid and change depending on the type of
query selected.
Add more fields to your query either by clicking on an empty field cell down arrow and
choosing a field to add, or by dragging the field directly from the field list at the top and
Make the query do an alphanumeric or numeric sort on any chosen field by clicking on the
If the box in the show line is not checked this field will not show on your finished query.
A hidden field can still be used as a sort field or a limiting field if criteria are set.
Queries can pull data from multiple tables or queries. If you have a true relational database,
your query may be used to pull together all data from all tables into one large query.
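
In SQL terms, pulling data from more than one table is a join on the related field. The following is a minimal sketch with Python's built-in sqlite3 module, not Access; the Department table, the DeptID field and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Department (DeptID INTEGER PRIMARY KEY, DeptName TEXT)")
conn.execute("CREATE TABLE Employee (SSN TEXT PRIMARY KEY, LastName TEXT, DeptID INTEGER)")
conn.executemany("INSERT INTO Department VALUES (?, ?)", [(1, "Sales"), (2, "Accounts")])
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [("111-22-3333", "Smith", 2), ("444-55-6666", "Jones", 1)],
)

# One query, two tables: each employee is matched with the
# department whose DeptID equals the employee's DeptID.
rows = conn.execute("""
    SELECT e.LastName, d.DeptName
    FROM Employee AS e
    JOIN Department AS d ON d.DeptID = e.DeptID
    ORDER BY e.LastName
""").fetchall()

for last_name, dept_name in rows:
    print(last_name, "works in", dept_name)
```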
Click on the Show Table icon and select a second table or query.
Choose the field from the popup list in the field cell or drag any field from the table or query
Select the table(s) or query to base your new query on. Close the pop-up "Show Table" box
to continue.
A special Query Tools/ Design Ribbon opens on the ribbon bar when you close the Show
Table box.
To create a simple query, select the fields to query and set the criteria. Drag and drop fields
to the bottom grid or choose them from the drop down list that appears when you click on the
Query options listed will depend on the query type you have started.
Create an expression on the criteria line to filter out unwanted data or type in the exact data
If you click the View button on the Ribbon, you return to datasheet view to PREVIEW your
You must close Design View of the query and choose to SAVE the query before you can
really be finished with the design and ready to run the query or use your tables or reports.
Choose to turn your simple query into an action or crosstab query by choosing that option
from the Query type group on the Query Tools/ Design Ribbon.
Crosstab queries will display sums, counts, and averages from one field in a table and show this
in a datasheet with fields on the left and across the top. Use the wizard to help make your
If you want the fields "grouped", use more than one field. The grouping follows whichever field is
chosen first.
The last step is to choose the field you want calculated in the crosstab.
Next step is the selection of the function to use. Choose from average, count, first, last,
Click next and name your crosstab query. Run the query to see how it looks.
If the query doesn't include the data you want, start over and do it again.
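
To make the shape of a crosstab concrete, the sketch below builds one by hand: row headings down the left, column headings across the top, and a calculated value in each cell. It uses Python's built-in sqlite3 module and a dictionary, not the Access wizard; the Computers table, its fields and the sample rows are hypothetical.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Computers (Manufacturer TEXT, Type TEXT, Cost REAL)")
conn.executemany(
    "INSERT INTO Computers VALUES (?, ?, ?)",
    [
        ("Dell", "Laptop", 1200.0),
        ("Dell", "Laptop", 1000.0),
        ("Dell", "Desktop", 800.0),
        ("HP", "Desktop", 700.0),
    ],
)

# Step 1: calculate one value (here a sum) per row-heading/column-heading pair.
totals = conn.execute("""
    SELECT Manufacturer, Type, SUM(Cost)
    FROM Computers
    GROUP BY Manufacturer, Type
""")

# Step 2: pivot so that Manufacturer runs down the left and Type across the top.
crosstab = defaultdict(dict)
for manufacturer, kind, total in totals:
    crosstab[manufacturer][kind] = total

column_headings = sorted({kind for cells in crosstab.values() for kind in cells})
print("Manufacturer", *column_headings)
for manufacturer in sorted(crosstab):
    cells = [crosstab[manufacturer].get(kind, "") for kind in column_headings]
    print(manufacturer, *cells)
```

Swapping SUM for COUNT or AVG in the query corresponds to choosing a different function in the wizard.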
Click on design view to see how the fields and info are set up.
o You can't put criteria below a value field. Put the criteria under a second field and make it
o Save and re-open the query. Access creates extra fields and moves criteria when
necessary.
o If you create an incorrect expression, Access tries to correct your expression or gives you
a warning message and refuses to save if you cannot correct the problem.
o Do not use the crosstab query wizard if you are querying multiple tables. Use Design
View instead to create the query and then choose crosstab query from the query types on
A parameter is a question set to ask for criteria before running the query. It can be added to any
existing query. Make a copy of the select query we created, and turn it into a parameter query.
Open a query in design view, and check to be sure all desired fields are chosen.
Type in a question requesting the needed parameter as the criteria of that same field.
Example: [Enter the computer type:] on the criteria line under the Type field. Include
brackets!
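
Outside Access, the same idea is usually expressed with a placeholder that is filled in when the query runs. The sketch below uses Python's built-in sqlite3 module, where the placeholder is a question mark rather than a bracketed prompt; the computers_of_type function, the Computers table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Computers (Manufacturer TEXT, Type TEXT)")
conn.executemany(
    "INSERT INTO Computers VALUES (?, ?)",
    [("Dell", "Laptop"), ("HP", "Desktop"), ("Apple", "Laptop")],
)

def computers_of_type(connection, computer_type):
    """Run the same query with whatever type the user supplies."""
    return connection.execute(
        # The '?' plays the role of [Enter the computer type:] on the criteria line.
        "SELECT Manufacturer FROM Computers WHERE Type = ? ORDER BY Manufacturer",
        (computer_type,),
    ).fetchall()

print(computers_of_type(conn, "Laptop"))
print(computers_of_type(conn, "Desktop"))
```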
query.
The action queries must all be started in Design View as per the steps given below:
Add the table or query to use in your query and close the Show Table window.
Choose the action query type you want from the Query Tools/ Design Ribbon which opens
Start a new Design Query, and choose the table or query to base it on.
Select the Make-Table option from the query types on the design ribbon.
Give your new table a name and tell whether you are adding it to the current database,
Choose the fields to add to the new table. (Add
purchased.)
is not affected.
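
A make-table query corresponds roughly to CREATE TABLE ... AS SELECT in SQL. The sketch below uses Python's built-in sqlite3 module rather than Access; the table names, the four field names (Manufacturer, Type, Cost, DatePurchased) and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Computers (ID INTEGER PRIMARY KEY, Manufacturer TEXT, "
    "Type TEXT, Cost REAL, DatePurchased TEXT)"
)
conn.executemany(
    "INSERT INTO Computers (Manufacturer, Type, Cost, DatePurchased) VALUES (?, ?, ?, ?)",
    [("Dell", "Laptop", 1200.0, "2004-03-01"), ("HP", "Desktop", 800.0, "2006-07-19")],
)

# Build a new table from the chosen fields of the original table.
conn.execute("""
    CREATE TABLE ComputerCopy AS
    SELECT Manufacturer, Type, Cost, DatePurchased
    FROM Computers
""")

original_rows = conn.execute("SELECT COUNT(*) FROM Computers").fetchone()[0]
copied_rows = conn.execute("SELECT COUNT(*) FROM ComputerCopy").fetchone()[0]
copied_fields = [row[1] for row in conn.execute("PRAGMA table_info(ComputerCopy)")]
print(original_rows, copied_rows, copied_fields)
```

The original table keeps all of its rows and fields; only the selected fields are carried into the new table.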
Table queries include (a) delete queries, (b) append query and (c) update query as discussed
below:
a) Delete Queries:
Start a new Design Query and choose the newly made table as the base.
Choose only fields needed for criteria from the new table.
Set criteria on the date purchased field, so that computers with date purchased before
Note that "#" signs will appear around the date automatically if you forget to add them.
Save and name the query. Check for the "#" signs.
Run the Delete query and then look at the table. The oldest computers are gone.
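
In SQL, a delete query is a DELETE statement with the criteria in its WHERE clause. The following sketch uses Python's built-in sqlite3 module, not Access; the ComputerCopy table, the sample rows and the cut-off date are hypothetical. Dates are stored as year-month-day text so that they compare correctly, which is an assumption of this example and not Access's own date format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ComputerCopy (Manufacturer TEXT, DatePurchased TEXT)")
conn.executemany(
    "INSERT INTO ComputerCopy VALUES (?, ?)",
    [("Dell", "2003-05-10"), ("HP", "2006-01-15"), ("Apple", "2004-08-01")],
)

# Remove every computer purchased before the (hypothetical) cut-off date.
cursor = conn.execute(
    "DELETE FROM ComputerCopy WHERE DatePurchased < ?", ("2005-01-01",)
)
deleted = cursor.rowcount  # number of records removed

remaining = conn.execute("SELECT Manufacturer FROM ComputerCopy").fetchall()
print(deleted, "deleted; remaining:", remaining)
```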
b) Append Queries:
Start a Design Query and choose the original table as a base to pull your data from.
Next you are asked to select the table to append the data onto. Choose the one created by
Choose the same four fields from the original table. (Add manufacturer, computer type,
Set the criteria so computers with a date purchased before 11/22/2005 will append.
(<#11/22/2005#)
Run the Append query and then look at the computers appended to the table. All
When you use an Append query, be sure the fields of data match in the two tables.
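
An append query corresponds roughly to INSERT INTO ... SELECT in SQL. The sketch below uses Python's built-in sqlite3 module, not Access; the table names, field names and sample rows are hypothetical, and the Access criterion <#11/22/2005# is written as a year-month-day text comparison, which is an assumption of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
fields = "Manufacturer TEXT, Type TEXT, Cost REAL, DatePurchased TEXT"
conn.execute(f"CREATE TABLE Computers ({fields})")
conn.execute(f"CREATE TABLE OldComputers ({fields})")  # matching fields in both tables
conn.executemany(
    "INSERT INTO Computers VALUES (?, ?, ?, ?)",
    [
        ("Dell", "Laptop", 1200.0, "2004-03-01"),
        ("HP", "Desktop", 800.0, "2006-07-19"),
        ("Apple", "Laptop", 1500.0, "2005-11-21"),
    ],
)

# Append only the computers purchased before 22 November 2005.
conn.execute("""
    INSERT INTO OldComputers (Manufacturer, Type, Cost, DatePurchased)
    SELECT Manufacturer, Type, Cost, DatePurchased
    FROM Computers
    WHERE DatePurchased < '2005-11-22'
""")

appended = conn.execute(
    "SELECT Manufacturer FROM OldComputers ORDER BY Manufacturer"
).fetchall()
print(appended)
```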
c) Update Queries:
Start a new Design Query, and choose the appended make-table as a base.
Select the Update Query option from the query type group.
Choose only the field you are updating and the field you need to set the criteria. (If you
are not limiting criteria, you won't need the second field.)
Set the criteria. If you need to refer back to data in the table to check for criteria, click to
open the table from the Navigation Pane. If necessary, block and paste the data from the
field you want to update. Example: use manufacturer: "Dell" for the criteria.
When you paste criteria in, "equals" is understood as the given and the quotation marks
Criteria with periods in it may confuse Access and require you to put in the quotation
Set the field update to information. Example: [cost] + 1000; adds 1000 to each cost
amount on the Dells. The dollar sign and decimal are unneeded and will be ignored. DO
NOT use a comma in the dollar amount. Just use the field name plus the amount to
increase or the field name minus to reduce a cost: [cost] + 1000 or [cost]-500.
Save and name the query. Run the Update query, and then look at the computer costs that
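
The update described in these steps (add 1000 to the cost of every Dell) can be written as a single SQL UPDATE statement. The sketch below uses Python's built-in sqlite3 module rather than Access; the Computers table and its sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Computers (Manufacturer TEXT, Cost REAL)")
conn.executemany(
    "INSERT INTO Computers VALUES (?, ?)",
    [("Dell", 1200.0), ("HP", 700.0), ("Dell", 800.0)],
)

# "Update To": [cost] + 1000, with the criteria manufacturer = "Dell".
cursor = conn.execute(
    "UPDATE Computers SET Cost = Cost + 1000 WHERE Manufacturer = ?", ("Dell",)
)
updated = cursor.rowcount  # number of records changed

costs = conn.execute(
    "SELECT Manufacturer, Cost FROM Computers ORDER BY Cost"
).fetchall()
print(updated, "records updated:", costs)
```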
11.3.2 Reports
An Access Report is a formatted, stylized way to print out any part of your database information.
Information in a report can be sorted, queried, formatted, calculated, or summarized. Your report
(i) Views
View options have changed in Access 2007. Check out the options on these four views. Each
Report View gives the Access tabbed view of your finished report
lined up with any other tabbed open objects from the database.
Print Preview takes you to the new print preview interface with
setting up and modifying a report than ever seen before. You can do almost anything to fix
Design View has also changed in 2007, but still looks similar to previous versions. You can
use it to add more controls, edit control sources, and change properties.
Click the Create ribbon and look at the Report Group. You'll see several options for creating
reports. Click the Report Wizard Tool and let Access lead you through the steps.
If you click on the table or query you wish to base the report on, it will
be given in the selection box. You may still change to another table
or query.
Go down the field list choosing which fields you need and
Choose a sort order for any field except the grouping field
not show.
count as well.
Decide whether you want Summary Only (one grand total, no list) or the Detail and Summary
The layout of your data is the next set of options. Also choose page orientation and whether
The report name becomes the title of the report. It can be changed later if needed.
Choose to preview the report, or choose to modify it, which takes you to Design View. Click Finish.
If the report doesn't look right, delete it, and start the wizard again.
If you do not choose to use groupings, the wizard gives you options for columnar, tabular, or
justified reports. You can make a report look more like a set of forms.
Go to File\Page Setup to modify margins, orientation and set the number of columns.
Switch to Layout View or Design View to make necessary modifications to the report before
Click the print icon or File\Print for all other printing options.
The old AutoReport of MS-Access 2003 is gone, but the new Report tool in MS-Access 2007
creates a report just as easily. It instantly lays out all the fields of the
table or query you have selected to base it on. The report also opens in Layout View, which gives
you full editing options. Select a table or query first, and then click Report on the Create ribbon.
All existing fields in the chosen table or query appear on the new report showing in Layout
Any extra fields may be manually selected and deleted in Layout View.
Field column sizes may be increased or decreased by dragging the edges of a field.
Three Report Layout Tools (Contextual Tabs) used in modifying your report open automatically
as given below:
Figure 11.10: Report layout - I
Format: fonts, formatting, grouping, totals, gridlines, logos, page numbers, auto formatting
Arrange: control layout options, alignment, positions, and property sheet tools are on the
arrange ribbon.
Page Setup: change paper size, orientation, margins, columns, and other page set options.
Turn on totals in your report or add a page number. The list of options is on the ribbon.
Click the Add Existing Fields button to get the Field List Pane turned on. Extra fields may be
added by dragging onto the report from the Field List pane.
The Auto format gallery has an extensive set of report styles to click and apply.
When you are finished making design changes to your report, click on Report View to see
If you initially start your report in Design View, it will not have a Record Source associated with
it, and you will have to manually assign one. All other report methods allow you to choose the
Right-click the box at the upper left corner of the ruler bars with a black button in it. You'll
select the report and the shortcut menu that pops up gives you the Properties option. Click to
If you select properties, but do not see the word REPORT in selection type, click the drop
The first item in the properties box asks for a Record Source. Click the down arrow to see the
list, and choose the table or query you want for the report source.
Now you are ready to open the field list and add controls to your design grid.
When a control is added there are two pieces, the label and the control data. When you select
either item they can be moved together or separately. You can see dark blocks in the upper left
corners, but only one item has the yellow selection box showing. The items move together when
the compass is anywhere else on the box except on the dark block of the upper left corner.
To move the label only into the Page Header section, do the following:
On the toolbar, click the paste icon or press Ctrl+V, and the label appears.
The simplest way is to use the Report Wizard. If you use grouping the next window will give
you a button for summary options. Choose from the choices and view your report. To add a
• Drag the bottom edge of the grid down to create space for the expression.
• Select the text box icon and draw the box in the section you want.
• Delete the label or move it to the left to use as a label box with the word "Sum" in it.
• The expression =Sum ([fieldname]) may be typed directly into the text box. For average type
=Avg ([fieldname]).
• Put the sum expression in the group footer for subtotals on your groups.
• Put the sum expression in the report footer for a grand total on the final page.
• Open the properties of the text box, as a Format must also be set there. Use the drop
down menu to select from the list: Currency, Long integer, etc.
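
What the Sum and Avg expressions calculate in a group footer and in the report footer can be sketched as follows. This is an illustration with Python's built-in sqlite3 module, not an Access report; the Computers table, the choice of Manufacturer as the grouping field and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Computers (Manufacturer TEXT, Cost REAL)")
conn.executemany(
    "INSERT INTO Computers VALUES (?, ?)",
    [("Dell", 1200.0), ("Dell", 800.0), ("HP", 700.0)],
)

# Group footer: =Sum([Cost]) and =Avg([Cost]) are evaluated once per group.
subtotals = conn.execute("""
    SELECT Manufacturer, SUM(Cost), AVG(Cost)
    FROM Computers
    GROUP BY Manufacturer
    ORDER BY Manufacturer
""").fetchall()

# Report footer: the same Sum expression evaluated over every record.
grand_total = conn.execute("SELECT SUM(Cost) FROM Computers").fetchone()[0]

for manufacturer, total, average in subtotals:
    # The format specification stands in for the Format property of the text box.
    print(f"{manufacturer}: Sum = {total:,.2f}  Avg = {average:,.2f}")
print(f"Grand total: {grand_total:,.2f}")
```

The same expression gives a subtotal or a grand total depending only on the section in which it is placed.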
• Create a Text Box by clicking that icon, and using the mouse to "draw" a rectangle.
• Click inside and type in the following code: ="Page " & [Page] & " of " & [Pages].
• Create a Text Box by clicking that icon, and using the mouse to "draw" a rectangle.
Locate a file and insert it directly into the report. Access now has the capability to shrink a
picture to fit whatever size area you have. If the graphic is added to the report header it appears
only on page one. If it is added to the page header, it appears on every page.
• Lines may be added to your document in design view and layout view. Some
11.3.3 Forms
An Access Form may be created to use as a simple interface to input records one at a time. It can
Now when you click Form all fields from the chosen object are added to a basic form in
Layout View.
The two Form Layout Tools Ribbons appear with more formatting and arrangement options.
Layout View of a form is similar to the option and design features you used in Report
Layout. Add an auto format, labels, graphics, backgrounds, or other options to your form.
Split Form: It creates a columnar form and includes the datasheet on a split screen with all fields
Edit the table as well or move back and forth.
Blank Form – Creates a blank form in Layout View with the table field list turned on. You may
Move and resize your chosen fields.
Form Wizard – Allows you to step through the process answering the questions and
creating a custom form of only the items you choose. Under More Forms follow the
following steps:
Decide the arrangement and position of the info on the form from columnar, tabular,
datasheet or justified.
Form Design – Design brings up the blank design grid similar to the report design where you
Use the Form tool or Form wizard to create an easy form. If you need to seriously edit the form,
it would be simpler to edit the original fields and then recreate the form. Use Design View to do
Use Design Form to create a personalized, custom form. Most forms are small and do not require
more than the detail section of the form. Creating a Form from design view is a complex task.
Click on Create\ Form Design to open the form.
Double click the black square dot in the upper left corner.
The Property Sheet opens and you must choose the Record Source.
Now you are ready to click Add existing fields and to see the pane open with all table fields.
Form Header and Footer are not normally turned on, but may be found under Form Design
Tools\Arrange.
A title for your form and any graphic image you may wish to add may be put in either the
The Detail section is for the actual data you need to fill in for your table.
As in a report, each item on the form is represented in Design View by a control.
Unbound Controls, such as a label or an unbound text box, are not tied to a field. You can perform a calculation in an unbound control.
Calculated Controls display values that are computed from an expression; most forms do not use them.
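The difference between a stored field and a calculated value can be sketched in a few lines of Python. This is only an illustration of the idea, not Access code; the record, its field names (Quantity, UnitPrice), and the line_total function are hypothetical.

```python
from decimal import Decimal

# Hypothetical record, as a bound form would read it from its record source.
record = {"Quantity": 3, "UnitPrice": Decimal("19.50")}

def line_total(rec):
    """Calculated value: derived from other fields when displayed, never stored."""
    return rec["Quantity"] * rec["UnitPrice"]

total = line_total(record)
print(total)
```

The total is recomputed from the current field values each time it is needed, so the record itself never contains a stored total that could fall out of date.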
To edit any part of your form, follow these steps in Design View:
• Add or remove controls from your form.
• From the Command Button Wizard click through the Category options and choose an action.
• Select all controls on your form with the pointer, and apply different fonts, font sizes,
• Resize the form and move the controls to the right half of the form to have room to insert a
• Add lines and boxes to the form. Click the line/border width icon to change the line width.
• Use AutoFormat to add a style, or right-click to apply a fill color to the background.
• If you change any field to a lookup field after your form (or report) is created, the form will
not automatically update this field. You will need to recreate the form or change it yourself in
• Open the field list and drag the new lookup field onto your form.
• Add a combo box or list box to a form in design view by dragging the new field from the
field list onto the design view grid. Properties are set for you.
• One common problem with many forms is the size of the fill-in box. This is based on the
size of the field in the original table. Remember that Access 2007 has gone back to 255 characters as
the default field size. To decrease the field size, go back to the properties of each field while editing
in table Design View. You will have to recreate the form to finish resizing.
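The effect of Field Size on what a fill-in box can accept can be sketched as a simple length check. The Python fragment below is an illustration only and does not use Access; the 255-character default comes from the text above, while the fits_field function and the sample values are hypothetical.

```python
# Default text field size cited above for Access 2007.
DEFAULT_TEXT_FIELD_SIZE = 255

def fits_field(value, field_size=DEFAULT_TEXT_FIELD_SIZE):
    """Return True if the text fits within the given field size."""
    return len(value) <= field_size

# A short value fits a field reduced to 20 characters.
print(fits_field("Amritsar", field_size=20))

# A 300-character value does not fit the 255-character default.
print(fits_field("x" * 300))
```

Reducing the field size in the table, as described above, lowers the limit that every form based on that table has to respect.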
11.4 Summary
The new user interface in Office Access 2007 comprises a number of elements that define how
you interact with the product. These new elements were chosen to help you master Access, and
to help you find the commands that you need faster. The new design also makes it easy to
discover features that otherwise might have remained hidden beneath layers of toolbars and
menus. And you will get up and running faster, thanks to the new Getting Started with Microsoft
Office Access page, which provides you with quick access to our new getting started experience,
The most significant new interface element is called the Ribbon, which is part of the Microsoft
Office Fluent user interface. The Ribbon is the strip across the top of the program window that
contains groups of commands. The Office Fluent Ribbon provides a single home for commands
and is the primary replacement for menus and toolbars. On the Ribbon are tabs that combine
commands in ways that make sense. In Office Access 2007, the main Ribbon tabs are Home,
Create, External Data, and Database Tools. Each tab contains groups of related commands, and
these groups surface some of the additional new UI elements, such as the gallery, which is a new
type of control that presents choices visually. Queries, Reports, and Forms in MS Access are the
database objects that help the user retrieve meaningful information from the
database. A Query is a request to retrieve data from a table that satisfies a particular condition.
Forms provide an environment where the user can insert, edit, and modify records.
Data stored in the database can be reported to top-level management in the form of a report,
11.5 Suggested Reading/ Reference Material
3. Sandra Nees, Microsoft Access 2007: Forms and Reports, Creator and Presenter Booth
Library, EIU.
4. Sandra Nees, Microsoft Access 2007: Queries, Creator and Presenter Booth Library, EIU.
4. What do you mean by reports? What are the uses of reports? Discuss the procedure of
5. What are Forms? Why are forms used in MS Access? How are they different from tables?