
Database Management Systems
Some important Definitions:

• Data: Data are raw facts. The word “raw” is used to indicate
that the facts have not yet been processed to reveal their
meaning.
For example: Ram, 5

• Information: Information is data that has been given meaning
by way of relational connection.
For example: Ram is 5 years old

• Database: A database is a collection of related information
that is organized so that it can easily be accessed, managed,
and updated.
– Ex. the names, telephone numbers and addresses of all the people
you know
• Database Management System (DBMS): It is a software
system that allows access to data contained in a database.
Objective of DBMS
• To provide a convenient and effective method
of defining, storing and retrieving the
information contained in the database.
Database Management System
Components of Database system
The major components of Database system are:
• Data
• Hardware
• Software
• Users
Some more definitions
• Entity: In relation to a database, an entity is a single person, place, or thing
about which data can be stored.

• Entity Set: A group of similar objects of concern to an organization, for
which it maintains data.

• Attribute: Each entity has attributes, or particular properties that describe
the entity.
For example: Student is an entity, and the class in which the student studies is an attribute.

• Domain: In database technology, a domain is the description of an
attribute's allowed values.
For example: If a class can only be semester 1 to 6, then the values 1 to 6 form the domain of
the attribute class.
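A minimal sketch of these terms in Python. The class name `Student` and the constant `SEMESTER_DOMAIN` are illustrative assumptions, not part of the slides; the attribute is called `semester` because `class` is a reserved word in Python.

```python
from dataclasses import dataclass

# Assumed domain of the "semester" attribute: whole numbers 1 to 6.
SEMESTER_DOMAIN = range(1, 7)


@dataclass
class Student:
    """One entity; a list of Student objects would be an entity set."""
    name: str       # attribute
    semester: int   # attribute whose values are limited by a domain

    def __post_init__(self):
        # Reject any value that lies outside the attribute's domain.
        if self.semester not in SEMESTER_DOMAIN:
            raise ValueError(f"semester {self.semester} is outside the domain 1..6")


ram = Student(name="Ram", semester=5)      # accepted
try:
    Student(name="Shyam", semester=9)      # rejected: 9 is not in the domain
except ValueError as err:
    print(err)
```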
Architecture of DBMS
The architecture of DBMS is divided into three levels:
• External or user level
• Conceptual or global level
• Internal or physical level

The view at each of these levels is described by a scheme. A
scheme is an outline or plan that describes the records and
relationships existing in the view. In database literature, we
use the word schema instead of scheme.
In other words,
Schema is a description of data at some level (e.g., tables,
attributes, constraints, domains)
EXTERNAL LEVEL (highest level)

• The user’s view of the database.
• Consists of a number of different external views of the DB.
• Describes part of the DB for particular group of users.
• Provides a powerful and flexible security mechanism by
hiding parts of the DB from certain users. The user is not
aware of the existence of any attributes that are missing
from the view.
• It permits users to access data in a way that is customized to
their needs, so that the same data can be seen by different
users in different ways, at the same time.
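One way to picture an external view is an SQL view that leaves out some attributes. The sketch below uses Python's built-in `sqlite3` module; the table `employee`, its columns and the view name `employee_directory` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, city TEXT, salary INTEGER);
    INSERT INTO employee VALUES ('Ram', 'Delhi', 50000);
    -- External view for a hypothetical directory application: no salary column.
    CREATE VIEW employee_directory AS SELECT name, city FROM employee;
""")

cursor = conn.execute("SELECT * FROM employee_directory")
view_columns = [col[0] for col in cursor.description]
view_rows = cursor.fetchall()
print(view_columns)   # the salary attribute is not visible through the view
print(view_rows)
```

SQLite has no user accounts, so this only shows how a view hides an attribute; restricting who may query the base table is done with the access controls of the particular DBMS.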
CONCEPTUAL LEVEL
• The logical structure of the entire database.
• What data is stored in the database.
• The relationships among the data.
• Complete view of the data requirements of the organization, independent
of any storage consideration.
• Represents:
- entities, attributes, relations
- constraints on data
- semantic information on data
- security, integrity information
• Supports each external view: any data available to a user must
be contained in, or derivable from the conceptual level.
INTERNAL LEVEL
• Physical representation of the DB on the computer.
• How the data is stored in the database.
• Physical implementation of the DB to achieve
optimal run–time performance and storage space
utilization.
- Storage space allocation for data
- Record description for storage
- Record placement
- Data compression, encryption
Example of three level architecture
Database Instance
The data in the Database at any particular
point in time.
Mapping between Views
A view can be used to limit the portion of the
database that is known and accessible to a
given application.

Two mappings are required in a database
system with three different views:
1. A mapping between the external and
conceptual views gives the correspondence
among the records and relationships of the
external and conceptual views. The external view
is an abstraction of the conceptual view. It
describes the contents of the database as
perceived by the user or application program
of that view. The user of the external view
sees and manipulates a record corresponding
to the external view.
2. There is a mapping from conceptual view to
an internal view. The conceptual view is an
abstraction of the internal view. Mapping
between the conceptual and the internal
levels specifies the method of deriving the
conceptual record from the physical database.
Example to show mapping
Data Independence
The ability to modify a scheme definition in one level without
affecting a scheme definition in a higher level is called data
independence.

There are two kinds:

Logical data independence
• The ability to modify the conceptual scheme without
causing application programs to be rewritten.
• Immunity of external schemas to changes in the
conceptual schema.
• Usually needed when the logical structure of the
database is altered.
For example, the addition or removal of
entities, attributes, or relationships in the
conceptual schema should be possible
without having to change existing external
schemas or having to rewrite existing
application programs.
Physical data independence
• The ability to modify the internal scheme without
having to change the conceptual or external
schemas.
• Modifications at this level are usually to improve
performance.
For example, a change to the internal schema, such
as using different file organization or storage
structures, storage devices, or indexing strategy,
should be possible without having to change the
conceptual or external schemas.
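Physical data independence can be sketched with Python's built-in `sqlite3` module: an index is added at the internal level while the query text and its result stay the same. The table `student` and the index name `idx_student_roll` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER, name TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(1, "Ram"), (2, "Sita"), (3, "Gita")])

query = "SELECT name FROM student WHERE roll_no = 2"
before = conn.execute(query).fetchall()

# Internal-level change: add an index (a new access path).
conn.execute("CREATE INDEX idx_student_roll ON student (roll_no)")
after = conn.execute(query).fetchall()

print(before == after)   # the query text and its result are unchanged
```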
Classification of DBMS users

The users of a database system can be
classified in the following groups, depending
upon their degree of expertise or the mode of
their interactions with the DBMS:
• End users
• Application Programmers
• Database Administrator
End users

• The term end-user is used to describe the
person who accesses the database in the
course of their day-to-day work.
Application Programmers
• Professional programmers who are
responsible for developing application
programs or user interfaces utilized by naïve
users fall into this category.
• For example: Programs could be written in
HTML.
Database Administrator
• Centralized control of the database is exerted
by a person or group of persons under the
supervision of a high level administrator. This
person or group is referred as database
administrator(DBA).
• They are the users who are the most familiar
with the database and are responsible for
creating, modifying and maintaining its three
levels.
Functions of DBA
DBA performs following functions:
1. Defining the conceptual schema: It is the
data administrator’s job to decide exactly
what information is to be held in the
database i.e. to identify the entities of
interest to the enterprise and to identify the
information to be recorded about those
entities.
Functions of DBA contd.
2. Defining the internal schema: DBA is
responsible for the definition and
implementation of the internal level, including
the storage structure and access methods to
be used for the optimum performance of the
DBMS.
Functions of DBA contd.
3. Liaising with users: It is the business of the DBA to
liaise with users to ensure that the data they need is
available, and to specify the external views of
the various users and applications.

4. Defining security and integrity constraints: DBA is
responsible for granting permissions to the users of
the database so that it is not accessible to
unauthorized users. Moreover, measures are taken to
maintain the integrity of the database, i.e. to prevent the entry of
invalid data.
Functions of DBA contd.
5. Defining dump and Reload policies: The DBA
must define and implement an appropriate
damage control scheme, typically involving
(a) Periodic unloading or “dumping” of the
database to backup storage.
(b) Reloading the database when necessary
from the most recent dump.
Functions of DBA contd.
6. Monitoring performance and responding to
changing requirements: DBA is responsible for
organizing the system in such a way as to get
the performance that is best for the
enterprise and making the appropriate
adjustments as per changing needs.
DBMS Facilities/DBMS Languages
Two main types of facilities are provided by a
DBMS:
• The data definition facility or data definition
language(DDL)
• The data manipulation facility or data
manipulation language(DML)
DDL
• It can be used to define the conceptual schema.
• It includes all the entity sets and their associated
attributes as well as the relationships among the
entity sets.
• It also includes any constraints that have to be
maintained.
• All the above definitions are known as the metadata
i.e., data about the data in the database.
• Metadata is stored in compiled form known as data
dictionary, directory or system catalog.
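As a small illustration with Python's built-in `sqlite3` module: a DDL statement defines a table with constraints, and SQLite records that definition in its own catalog table, `sqlite_master`. The table `student` and its columns are assumed for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define a table, its attributes, and constraints on them.
conn.execute("""
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    )
""")

# SQLite keeps its metadata in a catalog table called sqlite_master.
catalog = conn.execute("SELECT type, name FROM sqlite_master").fetchall()
print(catalog)
```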
DML
• The language used to manipulate data in the
database is called data manipulation language.
• It involves retrieval of data from the database,
insertion of new data into the database and deletion
or modification of existing data.
• A data manipulation operation that requests the
retrieval of data from the database is known as a query;
a query is a statement in the DML.
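The four kinds of DML operation can be sketched with SQL statements run through Python's built-in `sqlite3` module. The table `student` and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

# Insertion of new data
conn.execute("INSERT INTO student VALUES (1, 'Ram')")
conn.execute("INSERT INTO student VALUES (2, 'Sita')")
# Modification of existing data
conn.execute("UPDATE student SET name = 'Ram Kumar' WHERE roll_no = 1")
# Deletion of existing data
conn.execute("DELETE FROM student WHERE roll_no = 2")
# Retrieval (a query)
rows = conn.execute("SELECT roll_no, name FROM student").fetchall()
print(rows)
```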
Components of DBMS
The major components are:
• Data Definition Language Compiler
• Data Manager
• File Manager
• Disk Manager
• Query Processor
• Telecommunication System
• Data Files
• Data Dictionary
• Access aids
Components of DBMS
Data Definition Language Compiler
• The DDL compiler converts the data definition
statements into a set of tables.
• These tables contain the metadata concerning
the database and are in a form that can be
used by other components of the DBMS.
Data Manager
• It is the central software component of the DBMS.
• It converts operations in the user’s queries coming
directly via the query processor or indirectly via an
application program from the user’s logical view to a
physical file system.
• It is responsible for interacting with the file system.
• It enforces constraints to maintain the integrity and
security of data.
File Manager
• Responsible for the structure of the files and
managing the file space.
• Responsible for locating the block containing
the required record, requesting this block
from disk manager and transmitting the
required block to the data manager.
Disk Manager
• It is part of the operating system, and all
physical input and output operations are
performed by it.
• It transfers the block or page requested by the
file manager.
Query Processor
• It is used to interpret the user’s query and
convert it into an efficient series of operations
in a form capable of being sent to the data
manager for execution.
• It uses data dictionary to find the structure of
the relevant portion of the database.
Telecommunication System
• Online users of a computer system
communicate with it by sending and receiving
messages over communication lines. These
messages are routed via an independent
software system called a telecommunication
system.
Data Files
Data files contain the data portion of the
database.
Data Dictionary
• Data Dictionary is a collection of data
describing the content, source, definition,
structure, and business and derivation rules
regarding the data within an organization. It is
also called Metadata. Metadata is “data about
the data”.
Access Aids
To improve the performance of a DBMS, a set of
access aids in the form of indexes are usually
provided in a database system.
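The idea behind an index can be sketched in plain Python: instead of scanning every record, a lookup structure maps a key value to the position of its record. The record layout and the function names are hypothetical, and a real DBMS would typically use structures such as B-trees rather than a dictionary.

```python
# Records as they might sit in a data file (order is arbitrary).
records = [
    {"roll_no": 17, "name": "Ram"},
    {"roll_no": 4, "name": "Sita"},
    {"roll_no": 23, "name": "Gita"},
]


def find_by_scan(roll_no):
    """Without an index: examine the records one by one."""
    for record in records:
        if record["roll_no"] == roll_no:
            return record
    return None


# The index maps a key value to the position of its record.
index = {record["roll_no"]: pos for pos, record in enumerate(records)}


def find_by_index(roll_no):
    """With an index: jump straight to the record's position."""
    pos = index.get(roll_no)
    return records[pos] if pos is not None else None


print(find_by_scan(23) == find_by_index(23))
```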
Advantages of DBMS
1. Reduction of Redundancies
2. Shared data
3. Integrity
4. Security
5. Conflict Resolution
6. Data independence
Reduction of Redundancies

Centralized control of data by the DBA avoids
unnecessary duplication of data and effectively
reduces the total amount of data storage
required.
Shared data
A database allows the sharing of data under
its control by any number of application
programs or users.
Integrity
Data integrity means that the data contained
in the database is both accurate and
consistent. Therefore, data values being
entered for storage can be checked to
ensure that they fall within a specified range and
are of the correct format.
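A range check of this kind can be sketched with a `CHECK` constraint, here through Python's built-in `sqlite3` module. The table `student` and the range 5 to 100 are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        name TEXT NOT NULL,
        age  INTEGER CHECK (age BETWEEN 5 AND 100)   -- assumed valid range
    )
""")

conn.execute("INSERT INTO student VALUES ('Ram', 5)")          # within range
try:
    conn.execute("INSERT INTO student VALUES ('Shyam', 500)")  # out of range
except sqlite3.IntegrityError as err:
    print("rejected:", err)

stored = conn.execute("SELECT name, age FROM student").fetchall()
print(stored)
```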
Conflict Resolution
Since the database is under the control of the
DBA, she or he should resolve the conflicting
requirements of various users and
applications.
Security

It includes:
• Authentication
• Authorization
Disadvantages of DBMS
• Problems associated with centralization
• Cost of software/hardware and migration
• Complexity of backup and recovery
Some definitions
• Record/Row/Tuple: In a database, a record holds all the
information about one item or subject. Data is stored in records;
each record consists of a group of related data values.
• Record Type: Records are classified into record types, where each
record type describes the structure of a group of records that store
the same type of information.
• Field/Column/Attribute: A field is an item of information which is
stored for all records in a database.
• Table: A table consists of rows and columns of information.
Example
Here employee is record type with fields
name, address, city, phone#
EMPLOYEE
Data Models
• A collection of tools for describing
– data
– data relationships
– data semantics
– data constraints
• Entity-Relationship model
• Relational model
Importance of Data models

• Data models are representations, usually graphical, of
complex real-world data structures.
• Facilitate interaction among the designer, the
applications  programmer and the end user.
• End-users have different views and needs for data.
• Data model organizes data for various users.
Data Model
A 'data model' is the theoretical foundation of a
database and fundamentally determines in
which manner data can be stored, organized
and manipulated in a database system. It
thereby defines the infrastructure offered by a
particular database system.
Record based data models
A record based data model is used to specify the
overall logical structure of the database. In this
model the database consists of a number of fixed-format
records of different types. Each record type defines a fixed
number of fields having a fixed length.
There are three principal types of record based data
model. They are:
1. Hierarchical data model.
2. Network data model.
3. Relational data model.
Hierarchical Model
• In a hierarchical model, data is organized into a tree-like structure.
• It was the first DBMS model.
• One of the first hierarchical databases, Information Management
System (IMS), was developed by North American Rockwell
Company and IBM. IMS became the world’s leading mainframe
hierarchical database system in the 1970s and early 1980s.
• General structure
Hierarchical model contd.
• The database keeps track of the different record
types, their attributes, and the hierarchical
relationships between them.
• The attribute which assigns records to levels in the
database structure is called the key.
• This model has a parent-child relationship.
• The record type at the top of the tree is known as
the root.
• Root may have any number of dependents, each of
these may have any number of lower level
dependents, and so on.
Hierarchical model contd.
Depicts a set of one-to-many (1:M)
relationships between a parent and its
children segments
- Each parent can have many children.
- Each child has only one parent.
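The parent and child rule can be sketched as a small tree in Python. The class `Segment` and the record type names (COLLEGE, DEPARTMENT, STUDENT, TEACHER) are hypothetical; this illustrates the structure only and is not how IMS stores data.

```python
class Segment:
    """A record (segment) in a tree: one parent, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # at most one parent
        self.children = []        # any number of children
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Access path from the root down to this segment."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return " > ".join(reversed(names))


root = Segment("COLLEGE")                  # the root has no parent
dept = Segment("DEPARTMENT", parent=root)
student = Segment("STUDENT", parent=dept)
teacher = Segment("TEACHER", parent=dept)

print(student.path())
```

Every segment is reached by navigating down from the root, which is why programs written against this model depend on the shape of the tree.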
Hierarchical model
Advantages of hierarchical model
• Conceptual Simplicity
• Database Security
• Database Integrity
• Efficiency
Conceptual simplicity
Since the database is based on the
hierarchical structure, the relationship
between the various layers is logically simple.
Thus the design of a hierarchical database is
simple.
Database Security
Hierarchical model was the first database
model that offered the data security that is
provided and enforced by the DBMS.
Database Integrity
Since the hierarchical model is based on the
parent/child relationship, there is always a
link between the parent segment and the
child segments under it. The child segments
are always automatically referenced through their
parent.
Efficiency
Hierarchical database model is a very efficient
one when the database contains a large
number of 1:N relationships.
Disadvantages of Hierarchical Model
• Complex implementation
• Difficult to manage
• Lacks structural independence
• Complex applications programming and use
• Implementation limitations
Complex Implementation
Although the hierarchical database model is
conceptually simple and easy to design it is
quite complex to implement. The database
designers should have very good knowledge
of the physical data storage characteristics.
Difficult to Manage
If you make any changes in the database
structure of a hierarchical database, then you
need to make the necessary changes in all the
application programs that access the
database. Thus maintaining the database and
the applications can become very difficult.
Lacks structural independence
Structural independence exists when
changes to the database structure do not
affect the DBMS’s ability to access data. The hierarchical model lacks it: if
the physical structure is changed, the
applications will also have to be modified.
Complex applications programming and use
Due to the structural dependence and the
navigational structure, the application
programmers and the end users must know
precisely how the data is distributed physically
in the database in order to access data. This
requires knowledge of complex pointer
systems.
Implementation limitations
Many of the common relationships do not
conform to the 1:N format required by the
hierarchical model.
Network Model
• Developed in the mid 1960s as part of the work of CODASYL
(Conference on Data Systems Languages), which
proposed the programming language COBOL (1960) and
then the network model (1971).
• General Structure
Network Model contd.
• In network database terminology, a relationship is a
set. Each set is made up of at least two types of
records: an owner/superior/parent record and a
member/dependent/child record.
• Links are hierarchical relations between records of
two types.
• A given record type may have any number
of superiors.
• A superior can have any number of immediate
dependents.
• This model can therefore depict many-to-many relationships.
Network Model contd.
• The network model was evolved to handle non-
hierarchical relationship.
• In the network data model, the database consists of
a collection of set-type occurrences.
• Each set-type occurrence has one occurrence of
OWNER RECORD, with zero or more occurrences of
MEMBER RECORDS.
• The member sets belonging to different owners are
disjoint.
To define a network database one needs to define:
(a) the database record types which consist of data items, and
(b) the set-types. 
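A rough sketch of record types and set types in Python. The record names (supplier, customer, order) and the helper functions `connect` and `owners_of` are hypothetical. Within one set type a member belongs to a single owner, but the same member can take part in a second set type with a different owner.

```python
from collections import defaultdict

# Set types: each maps an owner record to the list of its member records.
supplies = defaultdict(list)   # owner: a supplier, members: orders
places = defaultdict(list)     # owner: a customer, members: orders


def connect(set_type, owner, member):
    """Add a member to the owner's set occurrence."""
    # Member sets of different owners are disjoint within one set type.
    for existing_owner, members in set_type.items():
        if member in members and existing_owner != owner:
            raise ValueError(f"{member} already has owner {existing_owner}")
    if member not in set_type[owner]:
        set_type[owner].append(member)


connect(supplies, "supplier-S1", "order-101")
connect(places, "customer-C7", "order-101")   # second owner, via another set type


def owners_of(member):
    """Navigate from a member record to every owner it is linked to."""
    return [owner
            for set_type in (supplies, places)
            for owner, members in set_type.items()
            if member in members]


print(owners_of("order-101"))
```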
Example of Network model
Advantages of Network model
• Conceptual Simplicity
• Capability to handle more relationship types
• Ease of data access
• Data integrity
• Database Standards
Capability to handle more relationship types
Network model can handle one to many as
well as many to many relationships.
Ease of data access

An application can access an owner record
and all the member records within a set, and if
one member in the set has two owners, then
one can move from one owner to another.
Data integrity
The network model does not allow a member
to exist without an owner. Thus a user must
first define the owner record and then the
member record. This ensures data integrity.
Disadvantages of Network model
• System Complexity
• Absence of structural independence
System Complexity
In a network model, data are accessed one
record at a time. This makes it essential for
the database designers, administrators, and
programmers to be familiar with the internal
data structures to gain access to the data.
Therefore, a user friendly database
management system cannot be created using
the network model.
Relational Model
• The relational model was introduced by E.F. Codd in
1970 as a way to make database management systems
more independent of any particular application.
• The data is stored in two-dimensional tables (rows and
columns).
• The data is manipulated based on the relational theory
of mathematics.
• General Structure
• Four key terms are used extensively in
relational database models: relations,
attributes, tuples and domains.
• A relation is a table with columns and rows.
• The named columns of the relation are called
attributes.
• Domain is the set of values the attributes are
allowed to take.
• Rows of the tables are known as tuples.
KEY
• Tables can also have a designated single
attribute or a set of attributes that can act as a
"key", used to uniquely identify each tuple in the table.
• A key that is chosen to uniquely identify a row
in a table is called a primary key.
• Keys are commonly used to join or combine data
from two or more tables.
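A short sketch of both uses of a key, with Python's built-in `sqlite3` module. The tables `department` and `employee` and their columns are assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT,
        dept_id INTEGER REFERENCES department (dept_id)
    );
    INSERT INTO department VALUES (10, 'Sales');
    INSERT INTO employee VALUES (1, 'Ram', 10);
""")

# The primary key rejects a second row with the same emp_id.
try:
    conn.execute("INSERT INTO employee VALUES (1, 'Shyam', 10)")
except sqlite3.IntegrityError as err:
    print("rejected:", err)

# Keys are used to combine rows from the two tables.
joined = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employee AS e JOIN department AS d ON e.dept_id = d.dept_id
""").fetchall()
print(joined)
```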
Example
Example
• Relational model supports 1:1, 1:m and m:n relationships.
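In relational tables an m:n relationship is commonly represented by a third table that holds one row per related pair. The sketch below uses Python's built-in `sqlite3` module; the tables `student`, `course` and `enrolment` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course  (code TEXT PRIMARY KEY, title TEXT);
    -- One row per (student, course) pair represents the m:n relationship.
    CREATE TABLE enrolment (
        roll_no INTEGER REFERENCES student (roll_no),
        code    TEXT    REFERENCES course (code),
        PRIMARY KEY (roll_no, code)
    );
    INSERT INTO student VALUES (1, 'Ram'), (2, 'Sita');
    INSERT INTO course  VALUES ('DB', 'Databases'), ('OS', 'Operating Systems');
    INSERT INTO enrolment VALUES (1, 'DB'), (1, 'OS'), (2, 'DB');
""")

# One student takes many courses, and one course has many students.
courses_of_ram = conn.execute(
    "SELECT code FROM enrolment WHERE roll_no = 1 ORDER BY code").fetchall()
students_in_db = conn.execute(
    "SELECT roll_no FROM enrolment WHERE code = 'DB' ORDER BY roll_no").fetchall()
print(courses_of_ram, students_in_db)
```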
Advantages of Relational model
• Structural Independence
• Conceptual Simplicity
• Design, Implementation, maintenance and
usage ease
• Ad hoc query capability
Structural Independence

The relational model does not depend on a
navigational data access system, thus freeing
the database designers, programmers and end
users from learning the details of data
storage. Changes in the database structure do
not affect the data access.
Ad hoc query capability
The presence of very powerful, flexible and
easy to use query capability is one of the main
reasons for the immense popularity of the
relational model.

The query language of the relational model is
SQL (Structured Query Language).
Disadvantages of Relational Model
• Hardware overheads
• Ease of design can lead to bad design
• ‘Information island’ phenomenon
Hardware overheads
Relational database systems hide the
implementation complexities and the physical
data storage details from the users.

To make things easier for the users,
relational database systems need more
powerful hardware: computers and data
storage devices.
Ease of design can lead to bad design
The relational database is an easy-to-design and
use system.
The design inefficiencies will not come to light
when the database is designed and when there
is only a small amount of data.
As the database grows, the poorly designed
database will slow the system down and will
result in performance degradation and data
corruption.
‘Information island’ phenomenon
• Since relational database systems are easy to
implement and use, too many people or departments may create their
own databases and applications.
• These information islands will prevent the
information integration that is essential for the
smooth and efficient functioning of the organization.
• These individual databases will also create problems
like data inconsistency, data duplication and data
redundancy.
Traditional File System Vs DBMS
• Self contained nature of database systems (database contains
both data and meta-data).
• Data Independence: application programs and queries are
independent of how data is actually stored.
• Data sharing.
• Controlling redundancies and inconsistencies.
• Secure access to database; Restricting unauthorized access.
• Enforcing Integrity Constraints.
• Backup and Recovery from system crashes.
• Support for multiple-users and concurrent access
DBMS compared with a file-processing system:
• A database management system coordinates both the physical and the
logical access to the data. A file-processing system coordinates only the
physical access.
• A database management system reduces the amount of data duplication by
ensuring that a physical piece of data is available to all programs authorized to
have access to it. Data written by one program in a file-processing system may
not be readable by another program.
• A database management system is designed to allow flexible access to data
(i.e., queries). A file-processing system is designed to allow predetermined
access to data (i.e., compiled programs).
• A database management system is designed to coordinate multiple users
accessing the same data at the same time. A file-processing system is usually
designed to allow one or more programs to access different data files at the
same time. In a file-processing system, a file can be accessed by two programs
concurrently only if both programs have read-only access to the file.
[Figure: File system approach. program-1 (data description-1) works on File-1, program-2 (data description-2) on File-2, and program-3 (data description-3) on File-3.]
[Figure: DBMS approach. Application programs 1 to 3, each with data semantics, share one database through a layer labelled Description, Manipulation and Control.]
E-R Model
• E-R model is a high level conceptual data model
developed by Chen in 1976 to facilitate database
design.
• A conceptual data model is a set of concepts that
describe the structure of a database and the
associated retrieval and update transactions on the
database.
• The major components of E-R model are:
• Entities
• Attributes
• Relationships
Entities
• An entity is viewed as the atomic real world
item.
• A collection of similar entities is known as
entity set.
For example: employees
• Entities are represented as rectangles.
Attributes
Each entity has attributes i.e. the particular
properties that describe it. A particular entity
will have a value for each of its attributes.
Attributes are represented as ellipses.
• Attributes can be classified as:
• Simple
• Composite
• Single-valued
• Multi-valued
• Derived
Simple attributes
• Attributes that are not divisible are called
simple or atomic attributes.
• It is an attribute composed of a single
component with an independent existence.
• Example: the roll number of a student.
Composite attributes
• Composite attributes can be divided into
smaller subparts, which represent more basic
attributes with independent meanings.
• For example, the address attribute of the
employee entity can be divided into street
number, area, city, pincode.
Single valued attributes
When an attribute holds a single value for a
particular entity it is called single valued
attribute.
For example semester attribute for a
particular class of students.
Multi valued attributes
• It is one that holds multiple values for a
particular entity.
• It may have lower and upper bounds on the
number of values allowed for each individual
entity.
• For example, a student entity can have
multiple values for the hobby attribute. Suppose
a student may have a minimum of one
hobby and a maximum of five hobbies.
Stored and Derived attribute
• In some cases two (or more) attribute values are
related, for example, the Age and Birthdate
attributes of a person.
• For a particular person we can determine the value
of Age from the current date and the Birthdate . The
Age attribute is hence called a derived attribute and
Birthdate is called a stored attribute.
• A derived attribute is one that represents a value
that is derivable from the value of a related attribute
or set of attributes, not necessarily in the same
entity.
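The attribute kinds described above can be sketched together in Python. The classes `Address` and `Student` and the sample values are made up for the example; `age` takes the current date as an argument so that the derived value is reproducible.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Address:
    """Composite attribute: made of smaller component attributes."""
    street_number: str
    area: str
    city: str
    pincode: str


@dataclass
class Student:
    roll_no: int                                  # simple, single-valued attribute
    birthdate: date                               # stored attribute
    address: Address                              # composite attribute
    hobbies: list = field(default_factory=list)   # multi-valued attribute

    def __post_init__(self):
        # Bounds on the multi-valued attribute: one to five hobbies.
        if not 1 <= len(self.hobbies) <= 5:
            raise ValueError("a student must have between 1 and 5 hobbies")

    def age(self, today):
        """Derived attribute: computed from birthdate, never stored."""
        had_birthday = (today.month, today.day) >= (self.birthdate.month, self.birthdate.day)
        return today.year - self.birthdate.year - (0 if had_birthday else 1)


ram = Student(1, date(2015, 6, 1), Address("12", "MG Road", "Pune", "411001"), ["chess"])
print(ram.age(date(2020, 6, 1)))
```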
E-R Diagram Conventions for
representing attributes and entities
• The entities are represented by a rectangular box
with the name of the entity in the box.
• An attribute is shown as an ellipse attached to a
relevant entity by a line and labeled with the
attribute name.
• The entity name is written in uppercase whereas the
attribute name is written in lowercase.
• The primary keys(key attributes) are underlined.
• The attributes are connected using lines to the
entities. If the attribute is simple or single valued a
single line is used.
• If the attribute is derived a dashed ellipse is used.
• If it is multivalued then double ellipse is used.
• If the attribute is composite, its component
attributes are shown as ellipses emanating from the
composite attribute.
Example of various attributes
Relationships
• A relationship is an association between
entities.
• The relationships that exist between the
entities relates data items to each other in a
meaningful way.
• In addition the relationship entity could have
attributes of its own.
• Usually the relationship name is an active verb,
but passive verbs are also used.
• Relationships are represented by diamond shaped
symbols with the relationship name inside the diamond.
Terms associated with entities and
relationships
The terms are:
o Degree
o Connectivity
o Cardinality
o Dependency
o Participation
Degree

• The degree of a relationship indicates the number of
associated entities. It can be classified as:
– Unary
– Binary
– Ternary
– N-ary
A unary relationship exists when an association is maintained
within a single entity. For example, subjects
may be prerequisites for other subjects.
A binary relationship exists when two entities are associated.
A ternary relationship exists when three entities are associated.

Examples
Binary relationship
Ternary relationship
Connectivity
• Relationships can be classified as one to one,
one to many and many to many. The term
connectivity is used to describe this
relationship.

• A relationship’s connectivity is represented by
a 1, M or N next to the related entity.
one to one (1:1)
one to many (1:M)
many to many (M:N)
Cardinality

• The cardinality of a relationship is the number of
instances of entity B that can be associated with
entity A.
• There is a minimum cardinality and a maximum
cardinality for each relationship, with an unspecified
maximum cardinality being shown as N.
• Cardinality limits are usually derived from the
organization’s policies or external constraints.
Example

DEPARTMENT (1) has (M) EMPLOYEE
The company policy does not allow more than
100 employees in a department. Therefore
the cardinality rule governing the
DEPARTMENT-EMPLOYEE relationship is
expressed as “One department can have a
maximum of 100 employees.”
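A maximum cardinality such as this one can be enforced in application code. The sketch below is one possible way to do it; the class `Department` and the constant `MAX_EMPLOYEES` are hypothetical names.

```python
MAX_EMPLOYEES = 100   # maximum cardinality taken from the company policy


class Department:
    def __init__(self, name):
        self.name = name
        self.employees = []

    def add_employee(self, employee_name):
        # Enforce the maximum cardinality of the DEPARTMENT-EMPLOYEE relationship.
        if len(self.employees) >= MAX_EMPLOYEES:
            raise ValueError(f"{self.name} already has {MAX_EMPLOYEES} employees")
        self.employees.append(employee_name)


sales = Department("Sales")
for number in range(1, 101):
    sales.add_employee(f"employee-{number}")   # employees 1 to 100 are accepted

try:
    sales.add_employee("employee-101")         # the 101st is refused
except ValueError as err:
    print(err)
```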
Dependency

• Entities are classified as weak and strong
entity types.
• Weak Entity
• A Weak Entity type is dependent on the existence of
another entity type.
• They do not have key attributes of their own.
• They are also known as child, dependent or
subordinate entities.
• A weak entity is represented by a double lined
rectangle.
• Strong Entity
• A Strong Entity type is not dependent on the existence
of another entity type.
• They do have key attributes of their own.
• They are also known as parent, owner or dominant
entities.

To link strong entities with weak entities, we use
identifying relationships, represented by a double
diamond.
Example
INVOICE (Invoice#, Date, Customer#)
INVOICE LINE (Invoice#, Part#, Quantity)
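The dependence of INVOICE LINE on INVOICE can be sketched with a foreign key and a composite primary key, using Python's built-in `sqlite3` module. The column names, the sample values and the `ON DELETE CASCADE` rule are assumptions for illustration; the slide gives only the two record layouts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE invoice (
        invoice_no  INTEGER PRIMARY KEY,
        date        TEXT,
        customer_no INTEGER
    );
    CREATE TABLE invoice_line (
        invoice_no INTEGER REFERENCES invoice (invoice_no) ON DELETE CASCADE,
        part_no    INTEGER,
        quantity   INTEGER,
        PRIMARY KEY (invoice_no, part_no)   -- owner's key plus the line's own part number
    );
    INSERT INTO invoice VALUES (1, '2024-01-15', 42);
    INSERT INTO invoice_line VALUES (1, 500, 3);
""")

# A line cannot exist without its owning invoice.
try:
    conn.execute("INSERT INTO invoice_line VALUES (99, 500, 1)")
except sqlite3.IntegrityError as err:
    print("rejected:", err)

# Deleting the owner removes its dependent lines.
conn.execute("DELETE FROM invoice WHERE invoice_no = 1")
remaining = conn.execute("SELECT COUNT(*) FROM invoice_line").fetchone()[0]
print(remaining)
```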
Participation
• There are two ways an entity can participate in a
relationship-totally or partially. The participation is
also known as mandatory or optional.
• Total/Mandatory participation
• The participation is mandatory if an entity’s existence
requires the existence of an associated entity in a
particular relationship.
• Total participation of an entity is represented by
double lines.
• Optional entity/partial participation
• The participation is said to be optional if the occurrence
of one entity does not require the occurrence of
another corresponding entity in a relationship.
• An optional entity is shown by drawing a small circle (O)
on the side of the optional entity.
Example
Example of total participation
Aggregation
• Used when we have to model a relationship
involving (entity sets and) a relationship set.
• Aggregation allows us to treat a relationship set
as an entity set for purposes of participation in
(other) relationships.
• treat a relationship as an entity
Consider the ER diagram
ER diagram with aggregation
