Module 1 DBMS


MODULE 1

Introduction to Databases
Database Management System
Subject Code: 21CS53
CONTENTS
• Introduction to Databases
– Introduction
– Characteristics of database approach
– Advantages of using the DBMS approach
– History of database applications
• Overview of Database Languages and Architectures
– Data Models, Schemas, and Instances
– Three schema architecture and data independence
– Database languages and interfaces
– The Database System environment
• Conceptual Data Modelling using Entities and Relationships
– Entity types, Entity sets, attributes, roles, and structural constraints
– Weak entity types
– ER diagrams
– Examples
– Specialization and Generalization
Introduction to Database
• Data are known facts that can be recorded and that have implicit meaning.
• A database is a collection of related data.
• Example: the names, telephone numbers, and addresses of people can be
recorded in an indexed address book, or stored on a hard drive using a
computer and software such as Microsoft Access or Excel. This collection
of related data with an implicit meaning is a database.
Properties of Database
• A database represents some aspect of the real world, sometimes
called the mini world or the universe of discourse (UoD).
• Changes to the mini world are reflected in the database.
• A database is a logically coherent collection of data with some
inherent meaning.
• A database is designed, built, and populated with data for a
specific purpose.
• It has an intended group of users and some preconceived
applications in which these users are interested.
• A database can be of any size and complexity.
• A database may be generated and maintained manually or it may
be computerized.
DATABASE MANAGEMENT SYSTEM (DBMS)
A database management system (DBMS)
is a collection of programs that enables users to
create and maintain a database.
“The DBMS is a general-purpose
software system that facilitates the processes
of defining, constructing, manipulating, and
sharing databases among various users and
applications.”
• Defining a database involves specifying the data
types, structures, and constraints of the data to
be stored in the database.
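• Constructing the database is the process of
storing the data on some storage medium that is
controlled by the DBMS.
• Manipulating a database includes querying it to
retrieve specific data, updating it to reflect
changes in the mini world, and generating reports
from the data.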
• Sharing a database allows multiple users and
programs to access the database simultaneously.
• An application program accesses the database
by sending queries or requests for data to the
DBMS.
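The processes named above can be sketched with Python's built-in sqlite3 module standing in for the DBMS. This is only an illustration: the contact table, its columns, and the sample rows are assumptions based on the address-book example, not part of the original slides.

```python
import sqlite3

# An in-memory SQLite database stands in for "the DBMS" in this sketch.
conn = sqlite3.connect(":memory:")

# Defining: specify the data types, structure, and constraints.
# (Table and column names are hypothetical.)
conn.execute(
    "CREATE TABLE contact ("
    " name TEXT NOT NULL,"
    " phone TEXT NOT NULL,"
    " address TEXT)"
)

# Constructing: store the data on a medium controlled by the DBMS.
conn.executemany(
    "INSERT INTO contact (name, phone, address) VALUES (?, ?, ?)",
    [
        ("Asha", "555-0101", "12 Lake Road"),
        ("Ravi", "555-0102", "7 Hill Street"),
    ],
)
conn.commit()

# Manipulating: update the data and query it. The application program
# only sends requests; the DBMS decides how to carry them out.
conn.execute("UPDATE contact SET phone = ? WHERE name = ?", ("555-0199", "Ravi"))
rows = conn.execute("SELECT name, phone FROM contact ORDER BY name").fetchall()
print(rows)
```

The program never opens or parses a data file itself; every access goes through a request to the DBMS.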
[Figure: A simplified database system environment]
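Traditional File Processing vs. DBMS

• In traditional file processing, data definition is
typically part of the application programs
themselves. Hence, these programs are
constrained to work with only one specific
database, whose structure is declared in the
application programs.
• A DBMS, by contrast, keeps the data definition in
its catalog, separate from the programs, so the
same DBMS software can work with many different
databases.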
Characteristics of the Database Approach
• Self-describing nature of a database system.
• Insulation between programs and data, and data
abstraction.
• Support of multiple views of the data.
• Sharing of data and multiuser transaction
processing.
1. Self-Describing Nature of a Database System
• A fundamental characteristic of the database approach is that the
database system contains not only the database itself but also a complete
definition or description of the database structure and constraints.
• The definition is stored in the DBMS catalog, which contains information
such as the structure of each file, the type and storage format of each data
item, and various constraints on the data.
• The information stored in the catalog is called meta-data, and it describes
the structure of the primary database.
• Whenever a request is made to access, say, the Name of a STUDENT record,
the DBMS software refers to the catalog to determine the structure of the
STUDENT file and the position and size of the Name data item within a
STUDENT record.
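As a rough illustration of a catalog, SQLite keeps its own meta-data in the sqlite_master table and exposes column descriptions through PRAGMA table_info. The sketch below assumes a STUDENT table with hypothetical columns; other DBMSs store and expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical STUDENT file; only the Name item comes from the slides.
conn.execute(
    "CREATE TABLE STUDENT ("
    " Name TEXT,"
    " Student_number INTEGER,"
    " Class INTEGER,"
    " Major TEXT)"
)

# sqlite_master is SQLite's catalog: one row of meta-data per table,
# index, or view stored in the database.
catalog_rows = conn.execute("SELECT type, name FROM sqlite_master").fetchall()

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(STUDENT)")]

print(catalog_rows)
print(columns)
```

The description of STUDENT lives in the database itself, so any program can ask the DBMS for the structure instead of hard-coding it.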
2. Insulation between Programs and Data, and Data Abstraction
• In file processing, the structure of data is embedded
in the application programs, so any change to the
structure of a file may require changing all programs
that access that file.
• For example, a file access program may be written in
such a way that it can access only STUDENT records of
one fixed structure. If we want to add another piece of
data to each STUDENT record, say the birth date, such a
program will no longer work and must be changed.
• By contrast, in a DBMS environment, we only
need to change the description of STUDENT
records in the catalog to reflect the inclusion
of the new data item Birth_date; no programs
are changed.
• DBMS access programs do not require such
changes in most cases, because the structure of
data files is stored in the DBMS catalog separately
from the access programs. We call this
property program-data independence.
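The Birth_date example can be tried out with sqlite3. This is a minimal sketch under assumptions: the student_names function plays the role of an existing access program, and the column Student_number and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (Name TEXT, Student_number INTEGER)")
conn.execute("INSERT INTO STUDENT VALUES ('Smith', 17)")


def student_names(connection):
    """Existing access program: it names only the data item it needs."""
    query = "SELECT Name FROM STUDENT ORDER BY Name"
    return [row[0] for row in connection.execute(query)]


before = student_names(conn)

# Change the description of STUDENT records in the catalog
# by adding the new data item Birth_date.
conn.execute("ALTER TABLE STUDENT ADD COLUMN Birth_date TEXT")
conn.execute("INSERT INTO STUDENT VALUES ('Brown', 8, '2001-05-14')")

# The access program runs unchanged against the new record structure.
after = student_names(conn)
print(before, after)
```

A file-processing program that read fixed-size STUDENT records would have needed editing at this point; here only the catalog entry changed.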
3.Support of Multiple Views of the Data
• A database typically has many types of users, each of whom
may require a different perspective or view of the database.
• A view may be a subset of the database or it may contain
virtual data that is derived from the database files.
• A multiuser DBMS whose users have a variety of distinct
applications must provide facilities for defining multiple views.
• For example, one user of the database may be interested only
in accessing and printing the transcript of each student.
• A second user, who is interested only in checking that students
have taken all the prerequisites of each course for which they
register, may require a different view.
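A view for the transcript user could look like the following sketch. The tables STUDENT and GRADE_REPORT, their columns, and the sample rows are assumptions made for the example; the view TRANSCRIPT holds no data of its own and is derived from the stored tables each time it is queried.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE GRADE_REPORT (
    Student_number INTEGER,
    Course_number TEXT,
    Grade TEXT
);
INSERT INTO STUDENT VALUES (17, 'Smith'), (8, 'Brown');
INSERT INTO GRADE_REPORT VALUES
    (17, 'CS1310', 'B'),
    (8, 'CS1310', 'A'),
    (8, 'MATH2410', 'A');

-- Virtual data: the view is defined by a query over the stored tables.
CREATE VIEW TRANSCRIPT AS
    SELECT s.Name, g.Course_number, g.Grade
    FROM STUDENT AS s
    JOIN GRADE_REPORT AS g ON g.Student_number = s.Student_number;
""")

# The transcript user works with the view as if it were a table.
transcript = conn.execute(
    "SELECT Course_number, Grade FROM TRANSCRIPT"
    " WHERE Name = 'Brown' ORDER BY Course_number"
).fetchall()
print(transcript)
```

A second view, for example one listing prerequisites per registered course, could be defined over the same stored data without duplicating it.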
4. Sharing of Data and Multiuser Transaction Processing
• A multiuser DBMS, as its name implies, must allow multiple users to
access the database at the same time.
• This is essential if data for multiple applications is to be integrated
and maintained in a single database.
• A fundamental role of multiuser DBMS software is to ensure that
concurrent transactions operate correctly and efficiently.
• A transaction is an executing program or process that includes one
or more database accesses, such as reading or updating of database
records.
• The DBMS must enforce several transaction properties. The isolation
property ensures that each transaction appears to execute in
isolation from other transactions, even though hundreds of
transactions may be executing concurrently.
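To make the notion of a transaction concrete, the sketch below groups two updates into one unit with sqlite3. It illustrates the all-or-nothing behaviour of a transaction (atomicity), which is one of the properties a DBMS enforces; it does not demonstrate isolation between concurrent users. The account table and the amounts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(connection, source, target, amount):
    """One transaction: both updates are recorded, or neither is."""
    try:
        # The connection context manager commits on success and
        # rolls back if an exception is raised inside the block.
        with connection:
            connection.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, target),
            )
            connection.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 1, 2, 500)  # would overdraw account 1, so it is undone
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
print(ok, failed, balances)
```

In the failed transfer the credit to account 2 had already been executed, yet it is rolled back together with the rejected debit.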
Database Users
Users may be divided into two groups:
• Those who actually use and control the
database content, and those who design,
develop, and maintain database applications,
called “Actors on the Scene”.
• Those who design and develop the DBMS
software and related tools, and the computer
systems operators, called “Workers Behind the
Scene”.
Actors on the Scene
• Database Administrators.
• Database Designers.
• End Users.
• System Analysts and Application Programmers (Software Engineers).
Database Administrator
• Chief administrator, who oversees and
manages the database system (including the
data and software).
• Duties include authorizing users to access the
database, coordinating/monitoring its use,
acquiring hardware/software for upgrades, etc.
• The DBA is accountable for problems such as
security breaches and poor system response
time.
Database Designers
• Responsible for identifying the data to be stored
and for choosing an appropriate way to organize
it.
• Database designers typically interact with each
potential group of users and develop views of the
database that meet the data and processing
requirements of these groups.
• The final database design must be capable of
supporting the requirements of all user groups.
End Users
• These are persons who access the database for
querying, updating, and report generation. There are
several categories of end users:
– Casual end users: use database occasionally, needing
different information each time; use query language to
specify their requests; typically middle- or high-level
managers.
– Naive/Parametric end users: the biggest group of users;
they frequently query and update the database using standard
transactions (canned transactions) that have been carefully
programmed and tested in advance. Example: bank tellers
checking account balances.
– Sophisticated end users: include engineers, scientists,
business analysts, and others who thoroughly familiarize
themselves with the facilities of the DBMS in order to
implement their own applications to meet their complex
requirements.
– Stand-alone users: maintain personal databases by using
ready-made program packages that provide easy-to-use
menu-based or graphics-based interfaces. Example: a user of
a tax package that stores a variety of personal financial data
for tax purposes.
System Analysts and Application Programmers (Software Engineers)

• System Analysts: determine the needs of end
users, especially naive and parametric users,
and develop specifications for canned
transactions that meet these needs.
• Application Programmers: implement, test,
document, and maintain programs that satisfy
the specifications mentioned above.
Workers Behind the Scene

• DBMS system designers and implementers.
• Tool developers.
• Operators and maintenance personnel.
DBMS System Designers and Implementers
• A DBMS is a very complex software system that
consists of many components or modules,
including modules for:
– implementing the catalog
– query language processing
– interface processing
– accessing and buffering data
– controlling concurrency
– handling data recovery and security
• DBMS system designers and implementers are
responsible for designing and implementing the DBMS
modules and interfaces as a software package.
• Tool developers design and implement tools that
facilitate database modelling and design,
database system design, and improved
performance.
• Operators and maintenance personnel
(system administration personnel) are
responsible for the actual running and
maintenance of the hardware and software
environment for the database system.
Advantages of Using the DBMS Approach
• Controlling Redundancy.
• Restricting Unauthorized Access.
• Providing Storage Structures and Search Techniques for
Efficient Query Processing.
• Providing Backup and Recovery
• Providing Multiple User Interfaces.
• Representing Complex Relationships among Data.
• Enforcing Integrity Constraints.
• Permitting Inferencing and Actions Using Rules.
• Additional Implications of Using the Database Approach
• Controlling Redundancy
• Data redundancy leads to wasted storage space,
duplication of effort, and a higher likelihood of
introducing inconsistency.
• A DBMS should provide the capability to automatically
enforce the rule that no inconsistencies are introduced
when data is updated.
• Restricting Unauthorized Access
• A DBMS should provide a security and authorization
subsystem, which the DBA uses to create accounts and
to specify account restrictions. Then, the DBMS should
enforce these restrictions automatically.
• Providing Storage Structures and Search Techniques
for Efficient Query Processing
• Database systems must provide capabilities for efficiently
executing queries and updates.
• The query processing and optimization module of the DBMS is
responsible for choosing an efficient query execution plan for each
query based on the existing storage structures.
• Providing Backup and Recovery
• A DBMS must provide facilities for recovering from hardware or
software failures.
• The backup and recovery subsystem of the DBMS is responsible for
recovery.
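• If the system fails in the middle of a transaction, the recovery
subsystem can restore the database to the state it was in before the
transaction started, or it could ensure that the transaction is
resumed from the point at which it was interrupted so that its full
effect is recorded in the database.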
• Providing Multiple User Interfaces
• Because many types of users with varying levels of
technical knowledge use a database, a DBMS should
provide a variety of user interfaces.
• For example, query languages for casual users,
programming language interfaces for application
programmers, forms and/or command codes for
parametric users.
• Representing Complex Relationships Among Data
• A DBMS must have the capability to represent a variety
of complex relationships among the data, to define new
relationships as they arise, and to retrieve and update
related data easily and efficiently.
• Enforcing Integrity Constraints
• Most database applications have
certain integrity constraints that must hold for
the data.
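A few common kinds of integrity constraint can be declared and enforced as in the sketch below. The STUDENT and DEPARTMENT tables, their columns, and the allowed range for Class are assumptions chosen for illustration; the point is only that the DBMS, not the application, rejects data that violates the declared rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE DEPARTMENT (Dept_name TEXT PRIMARY KEY);
CREATE TABLE STUDENT (
    Student_number INTEGER PRIMARY KEY,           -- must be unique
    Name TEXT NOT NULL,                           -- must have a value
    Class INTEGER CHECK (Class BETWEEN 1 AND 5),  -- must be in range
    Major TEXT REFERENCES DEPARTMENT (Dept_name)  -- must exist
);
INSERT INTO DEPARTMENT VALUES ('CS');
INSERT INTO STUDENT VALUES (17, 'Smith', 1, 'CS');
""")


def rejected(sql):
    """Return True if the DBMS refuses the statement as a violation."""
    try:
        with conn:
            conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


results = [
    rejected("INSERT INTO STUDENT VALUES (18, NULL, 1, 'CS')"),        # no name
    rejected("INSERT INTO STUDENT VALUES (19, 'Brown', 9, 'CS')"),     # bad class
    rejected("INSERT INTO STUDENT VALUES (17, 'Jones', 2, 'CS')"),     # duplicate key
    rejected("INSERT INTO STUDENT VALUES (20, 'Lee', 2, 'PHYSICS')"),  # unknown major
    rejected("INSERT INTO STUDENT VALUES (21, 'Kumar', 2, 'CS')"),     # valid row
]
print(results)
```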
• Permitting Inferencing and Actions Using Rules
• In a deductive database system, one may
specify declarative rules that allow the
database to deduce new data.
When Not to Use a DBMS?
• DBMS may involve unnecessary overhead costs that would not be there in
traditional file processing.
• The overhead costs of using a DBMS are due to the following:
• High initial investment in hardware, software, and training.
• Overhead for providing security, concurrency control, recovery,
and integrity functions.
• Therefore, it may be more desirable to use regular files under
the following circumstances:
• Simple, well-defined database applications that are not expected
to change at all.
• Embedded systems with limited storage capacity, where a
general-purpose DBMS would not fit.
• No multiple-user access to data.
A Brief History of Database Applications
• Early Database Applications
– The Hierarchical and Network Models were introduced in the
mid-1960s and dominated during the 1970s.
• Relational Model based Systems
– The relational model was originally introduced in 1970 and
was researched and experimented with at IBM Research
and several universities.
– Relational DBMS products emerged in the 1980s.
• Data on the Web and E-commerce Applications
– Script programming languages such as PHP and
JavaScript allow generation of dynamic Web pages
that are partially generated from a database.
– They also allow database updates through Web pages.
• New functionality is being added to DBMSs in the
following areas:
– Scientific applications
– XML (eXtensible Markup Language)
– Image storage and management
– Audio and video data management
– Data warehousing and data mining
– Spatial data management
– Time series and historical data management
Question Bank
1. Define the database and briefly explain the implicit properties of the database.
2. Define the following terms:
i) data ii) database
iii) DBMS iv) program-data independence
v) Canned transaction.
(Canned transactions: standard types of queries and updates that have been
carefully programmed and tested in advance. Naive end users use them to
query and update the database constantly.)
3. Briefly discuss the advantages of using the DBMS.
4. Discuss the main characteristics of the database approach. How does it differ
from traditional file systems?
5. What are the different types of database end users? Discuss the main activities of
each.
Overview of Database Languages and Architectures
Data Models
• A data model is a collection of concepts that
can be used to describe the structure of a
database.
• Structure of a database means the data types,
relationships, and constraints that apply to the
data.
Categories of Data Models
• High-level or conceptual data models.
• Low-level or physical data models.
• Representational or implementation data models.
High-level or conceptual data models.
• Provides concepts that are close to the way
many users perceive data.
• Uses concepts such as entities, attributes, and
relationships.
• An entity represents a real-world object or concept,
such as an employee or a project from the mini world
that is described in the database.
Low-level or Physical data models.
• Provides concepts that describe the details of
how data is stored on the computer storage
media, typically magnetic disks.
• Concepts provided by low-level data models
are generally meant for computer specialists,
not for end users.
Representational (or implementation) data models
• Provide concepts that may be easily
understood by end users but that are not too
far removed from the way data is organized in
computer storage.
• In between high level and low level.
• Hides many details of data storage on disk but
can be implemented on a computer system
directly.
Schemas
• The description of a database is called the
database schema, which is specified during
database design and is not expected to change
frequently.
Three schema architecture and data independence

• The goal of the three-schema architecture is
to separate the user applications from the
physical database.
An example of the three levels

External:
Customer_Loan
Cust_ID : 101
Loan_No : 1011
Amount_in_Dollars : 8755.00

Conceptual:
CREATE TABLE Customer_Loan (
    Cust_ID NUMBER(4),
    Loan_No NUMBER(4),
    Amount_in_Dollars NUMBER(7,2))

Internal:
Cust_ID TYPE = BYTE (4), OFFSET = 0
Loan_No TYPE = BYTE (4), OFFSET = 4
Amount_in_Dollars TYPE = BYTE (7), OFFSET = 8
• The internal level has an internal schema
• Describes the physical storage structure of the database.
• Uses a physical data model and describes the complete details of
data storage and access paths for the database.
• The conceptual level has a conceptual schema
• Describes the structure of the whole database for a community of
users.
• Hides the details of physical storage structures and concentrates on
describing entities, data types, relationships, user operations, and
constraints.
• A representational data model is used to describe the conceptual schema
when a database system is implemented.
• External or View Level
• Describes the part of the database that a particular user group is
interested in and hides the rest of the database from that user group.
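Data Independence

We can define two types of data independence:

• Logical data independence: the capacity to change the conceptual
schema without having to change external schemas or application
programs.
• Physical data independence: the capacity to change the internal
schema without having to change the conceptual schema.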
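Physical data independence means that a change at the internal level leaves the conceptual schema, and the programs written against it, untouched. The sketch below tries this with the Customer_Loan example: adding an index changes only the access paths. The index name and the second sample row are assumptions, and SQLite's INTEGER and REAL types stand in for NUMBER.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer_Loan ("
    " Cust_ID INTEGER,"
    " Loan_No INTEGER,"
    " Amount_in_Dollars REAL)"
)
conn.executemany(
    "INSERT INTO Customer_Loan VALUES (?, ?, ?)",
    [(101, 1011, 8755.00), (102, 1012, 1200.50)],
)

# The application query is written against the conceptual schema only.
QUERY = "SELECT Loan_No, Amount_in_Dollars FROM Customer_Loan WHERE Cust_ID = ?"

before = conn.execute(QUERY, (101,)).fetchall()

# Internal-level change: add an access path (hypothetical index name).
conn.execute("CREATE INDEX idx_loan_cust ON Customer_Loan (Cust_ID)")

# Same query text, same answer; only the way the DBMS finds the rows
# may have changed.
after = conn.execute(QUERY, (101,)).fetchall()
print(before, after)
```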
Database Languages and Interfaces
• The DBMS must provide appropriate
languages and interfaces for each category of
users.
• Once the design of a database is completed
and a DBMS is chosen to implement the
database, the first step is to specify conceptual
and internal schemas for the database and
any mappings between the two.
• Data Definition Language (DDL)
• Storage Definition Language (SDL)
• View Definition Language (VDL)
• Data Manipulation Language (DML)
• Host language
Data Definition Language (DDL)
• The data definition language (DDL) is used by the DBA
and by database designers to define schemas.
Storage Definition Language (SDL)
• When a clear separation is maintained between the
conceptual and internal levels, the DDL is used to
specify the conceptual schema only.
• The storage definition language (SDL) is used to specify
the internal schema.
• The mappings between the two schemas may be
specified in either one of these languages.
View Definition Language (VDL)
• View definition language is used to specify
user views and their mappings to the
conceptual schema.
• In relational DBMSs, SQL is used in the role of
VDL to define user or application views as
results of predefined queries.
Data Manipulation Language (DML)
• Data manipulation languages (DMLs) are used to
perform manipulation operations such as
retrieval, insertion, deletion, and modification
of the data.
There are two main types of DMLs:
• High-level or nonprocedural DML
– Can be used on its own to specify complex database
operations concisely.
• Low-level or procedural DML
– Must be embedded in a general-purpose programming
language.
• Host language
– Whenever DML commands, whether high level or low level,
are embedded in a general-purpose programming language,
that language is called the host language.
• A high-level DML used in a standalone interactive
manner is called a query language.
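The four DML operations and the idea of a host language can be illustrated with SQL embedded in Python. In this sketch Python is the host language and SQL is the high-level DML; the EMPLOYEE table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (Emp_no INTEGER PRIMARY KEY, Name TEXT, Salary INTEGER)"
)

# Insertion
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [(1, "Asha", 50000), (2, "Ravi", 42000), (3, "Meena", 61000)],
)

# Modification
conn.execute("UPDATE EMPLOYEE SET Salary = Salary + 3000 WHERE Emp_no = ?", (2,))

# Deletion
conn.execute("DELETE FROM EMPLOYEE WHERE Emp_no = ?", (1,))

# Retrieval: the statement says what is wanted, not how to find it.
high_paid = conn.execute(
    "SELECT Name, Salary FROM EMPLOYEE WHERE Salary >= ? ORDER BY Salary DESC",
    (45000,),
).fetchall()

# Host-language side: ordinary Python code processes the result rows.
total = sum(salary for _, salary in high_paid)
print(high_paid, total)
```

Typed directly into an interactive SQL shell, the same SELECT statement would be an example of the DML used as a query language.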
DBMS Interfaces
User-friendly interfaces provided by a DBMS
may include the following:
• Menu-Based Interfaces for Web Clients or Browsing.
• Forms-Based Interfaces.
• Graphical User Interfaces.
• Natural Language Interfaces.
• Speech Input and Output.
• Interfaces for Parametric Users
• Interfaces for the DBA.
ER Modeling - Notations
• Entity: an object or concept about which the business user wants to store
information.
• Weak entity: an entity that depends on another entity to exist. Example:
Order Item depends upon Order Number for its existence. Without Order Number
it is impossible to identify Order Item uniquely.
• Attribute: a property or characteristic of an entity.
• Key attribute: the unique, distinguishing characteristic of the entity.
• Multi-valued attribute: an attribute that can have more than one value. For
example, an employee entity can have multiple skill values.
• Derived attribute: an attribute that is based on another attribute. For
example, an employee's monthly salary is based on the employee's basic salary
and house rent allowance.
• Relationship: illustrates how two entities share information in the
database structure.
• Weak relationship: the notation used to connect a weak entity with others.
• Cardinality: specifies how many instances of an entity relate to one
instance of another entity. M and N both represent ‘MANY’ and 1 represents
‘ONE’.
[Figure: Customer, Account and Transaction entities joined by relationships
labelled with cardinalities 1 and M]
• In some cases, entities can be self-linked. For example, employees can
supervise other employees.
[Figure: Employee entity linked to itself through the supervise relationship]
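Composite attribute
[Figure: Employee entity with attributes E#, Name, DOB, Designation and
Address; Address is a composite attribute made up of floor and building]

Unary Relationship
[Figure: Employee entity linked to itself through the Manages relationship]

Role names
• Role names may be added to make the meaning more explicit.
[Figure: Employee linked to itself through Manages, with role names Manager
and subordinate and cardinalities 1 and M]

Binary Relationship
[Figure: Employee (M) Works for Department (1)]

Ternary Relationship
[Figure: Doctor, Patient and Medicine linked through Prescription]

Relationship participation
[Figure: Employee (1) head of department (1), with one side marked partial
and the other marked total]

Attributes of a Relationship
[Figure: Prescription relationship among Doctor, Patient and Medicine, with
relationship attributes dosage and Number of days]

Weak entity
[Figure: Employee (E#) 1 has N dependant (Id, name)]
The dependant entity is represented by a double-lined rectangle and
the identifying relationship by a double-lined diamond.

Company Database
[Figure: ER diagram of the Company database]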
Case Study – ER Model For a college DB
Assumptions:

• A college contains many departments
• Each department can offer any number of courses
• Many instructors can work in a department
• An instructor can work only in one department
• For each department there is a Head
• An instructor can be head of only one department
• Each instructor can take any number of courses
• A course can be taken by only one instructor
• A student can enroll for any number of courses
• Each course can have any number of students
Steps in ER Modeling
• Identify the Entities

• Find relationships

• Identify the key attributes for every Entity

• Identify other relevant attributes

• Draw complete E-R diagram with all attributes including Primary Key

• Review your results with your Business users


Step 1: Identify the Entities
• DEPARTMENT

• STUDENT

• COURSE

• INSTRUCTOR


Step 2: Find the relationships

• One course is enrolled by multiple students and one student enrolls for multiple
courses, hence the cardinality between course and student is many to many.

COURSE (M) -- ENROLLED BY -- (N) STUDENT

• The department offers many courses and each course belongs to only one department,
hence the cardinality between department and course is one to many.

DEPARTMENT (1) -- OFFERS -- (M) COURSE

• One department has multiple instructors and one instructor belongs to one and only
one department, hence the cardinality between department and instructor is one to
many.

DEPARTMENT (1) -- HAS -- (M) INSTRUCTOR


Step 2: Find the relationships (Cont.)

• Each department has one “Head of department” and one instructor can be
“Head of department” for only one department, hence the cardinality is one to one.

DEPARTMENT (1) -- HEADED BY -- (1) INSTRUCTOR

• One course is taught by only one instructor, but the instructor teaches many courses, hence
the cardinality between course and instructor is many to one.

COURSE (M) -- IS TAUGHT BY -- (1) INSTRUCTOR

Step 3: Identify the key attributes

• Deptname is the key attribute for the Entity “Department”, as it identifies the
Department uniquely.
• Course# (CourseId) is the key attribute for “Course” Entity.
• Student# (Student Number) is the key attribute for “Student” Entity.
• Instructor Name is the key attribute for “Instructor” Entity.

Step 4: Identify other relevant attributes

• For the department entity, the relevant attribute is location
• For the course entity: course name, duration, prerequisite
• For the instructor entity: room#, telephone#
• For the student entity: student name, date of birth
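One way to check that the entities, keys, and cardinalities from Steps 1 to 4 fit together is to write them down as tables. The slides do not cover mapping an ER model to tables, so the following is only a sketch under assumptions: the table and column names, the sample rows, and the choice to place each relationship as a foreign key or a separate table are all illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE DEPARTMENT (
    DeptName TEXT PRIMARY KEY,
    Location TEXT,
    -- HEADED BY (1:1): UNIQUE stops one instructor heading two departments
    Head TEXT UNIQUE REFERENCES INSTRUCTOR (InstructorName)
);
CREATE TABLE INSTRUCTOR (
    InstructorName TEXT PRIMARY KEY,
    Room TEXT,
    Telephone TEXT,
    -- HAS (1:M): each instructor works in exactly one department
    DeptName TEXT NOT NULL REFERENCES DEPARTMENT (DeptName)
);
CREATE TABLE COURSE (
    CourseId TEXT PRIMARY KEY,
    CourseName TEXT,
    Duration TEXT,
    Prerequisite TEXT,
    -- OFFERS (1:M) and IS TAUGHT BY (M:1)
    DeptName TEXT NOT NULL REFERENCES DEPARTMENT (DeptName),
    InstructorName TEXT REFERENCES INSTRUCTOR (InstructorName)
);
CREATE TABLE STUDENT (
    StudentNo INTEGER PRIMARY KEY,
    StudentName TEXT,
    DateOfBirth TEXT
);
-- ENROLLED BY (M:N) needs a table of its own
CREATE TABLE ENROLLED (
    CourseId TEXT REFERENCES COURSE (CourseId),
    StudentNo INTEGER REFERENCES STUDENT (StudentNo),
    PRIMARY KEY (CourseId, StudentNo)
);

INSERT INTO DEPARTMENT VALUES ('CSE', 'Block A', NULL);
INSERT INTO INSTRUCTOR VALUES ('Rao', 'A-101', '555-0111', 'CSE');
UPDATE DEPARTMENT SET Head = 'Rao' WHERE DeptName = 'CSE';
INSERT INTO COURSE VALUES ('21CS53', 'DBMS', '1 semester', NULL, 'CSE', 'Rao');
INSERT INTO STUDENT VALUES (1, 'Anil', '2003-04-01'), (2, 'Bina', '2003-09-15');
INSERT INTO ENROLLED VALUES ('21CS53', 1), ('21CS53', 2);
""")

enrolled = conn.execute(
    "SELECT s.StudentName FROM ENROLLED AS e"
    " JOIN STUDENT AS s ON s.StudentNo = e.StudentNo"
    " WHERE e.CourseId = '21CS53' ORDER BY s.StudentName"
).fetchall()

# An instructor can be head of only one department.
try:
    with conn:
        conn.execute("INSERT INTO DEPARTMENT VALUES ('ECE', 'Block B', 'Rao')")
    second_head_rejected = False
except sqlite3.IntegrityError:
    second_head_rejected = True

print(enrolled, second_head_rejected)
```

Each one-to-many relationship becomes a foreign key on the "many" side, while the many-to-many ENROLLED BY relationship needs its own table.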

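Step 5: Draw the E-R diagram

[Figure: ER diagram for the college database]
• Department (Department Name, Location)
• Course (Course#, Course Name, Duration, Pre Requisite)
• Instructor (Instructor Name, Room#, Telephone#)
• Student (Student#, Student Name, Date of Birth)
• Department Offers Course (1:N)
• Department Has Instructor (1:N)
• Department Headed by Instructor (1:1)
• Course Is taught by Instructor (N:1)
• Course Enrolled by Student (many to many, as found in Step 2)
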
Case Study – Online Retail Application

• Draw an ER diagram of an Online Retail Application which allows customers to purchase
items from a retail shop.

• A customer can register to purchase an item. The customer will provide a bank account
number and bank name (the customer may have multiple account numbers).

• After registration, each customer will have a unique customer Id, user id, and password.

• A customer can purchase one or more items in different quantities. The items can be of
different classes based on their price.

• Based on the quantity, price of item, and discount (if any) on the purchased items, the bill
will be generated. A bank account number is required to settle the bill.

• The application also records information about the suppliers who supply the items to the
retail shop. The retail shop may place orders for items based on some statistics
it maintains about different items.
Steps in ER Modeling
Step 1: Identify the Entities
• CUSTOMER

• ITEM

• SUPPLIER

• BILL

Step 2: Find the relationships

• A customer can purchase an item and each purchase corresponds to a bill. So it is a
ternary relationship among customer, item, and bill.

• An item can be ordered from one or more suppliers. One supplier may take orders for many items. So
there is a many to many relationship between item and supplier.

• One customer can pay many bills and one bill can be paid by only one customer. So there is a one to
many relationship between customer and bill.

Step 3: Identify the key attributes

• Customer entity will be identified by CustomerId
• Item entity will be identified by ItemId
• Supplier entity will be identified by SupplierId
• Bill entity will be identified by BillId

Step 4: Identify other relevant attributes of Entities and
Relationships

• For Customer entity the relevant attributes will be
(CustomerId, CustomerName, DateOfRegistration, UserId,
Password, AccountNo)

• For Item entity the relevant attributes will be
(ItemId, ItemName, UnitOfMeasurement, UnitPrice, Discount,
QuantityOnHand, SupplierId, ReOrderLevel, ReOrderQuantity, Class)

• For Supplier entity the relevant attributes will be
(SupplierId, SupplierName, SupplierContactNo)

• For Bill entity the relevant attributes will be
(BillId, AccountNo, BillAmount, BillDate)
Step 4: Identify other relevant attributes of
Entities and Relationships (Cont.)

• For Purchases relationship the relevant attributes will be
(QtyPurchased, NetPrice)

• For OrderedTo relationship the relevant attributes will be
(QtyOfOrder, OrderDate, DeliveryDate, DeliveryStatus)

• For Pays relationship the relevant attributes will be
(AccountNo)
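Attributes such as QtyOfOrder and OrderDate belong to the OrderedTo relationship rather than to Item or Supplier alone. A rough sketch of how a many to many relationship with its own attributes could be stored is given below; the table layout, the choice of primary key (which assumes at most one order per item, supplier, and date), and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ITEM (ItemId INTEGER PRIMARY KEY, ItemName TEXT, UnitPrice REAL);
CREATE TABLE SUPPLIER (SupplierId INTEGER PRIMARY KEY, SupplierName TEXT);

-- One row per order: the relationship attributes live here.
CREATE TABLE ORDERED_TO (
    ItemId INTEGER REFERENCES ITEM (ItemId),
    SupplierId INTEGER REFERENCES SUPPLIER (SupplierId),
    QtyOfOrder INTEGER,
    OrderDate TEXT,
    DeliveryDate TEXT,
    DeliveryStatus TEXT,
    PRIMARY KEY (ItemId, SupplierId, OrderDate)
);

INSERT INTO ITEM VALUES (1, 'Pen', 10.0), (2, 'Notebook', 45.0);
INSERT INTO SUPPLIER VALUES (10, 'Alpha Traders'), (20, 'Beta Supplies');
INSERT INTO ORDERED_TO VALUES
    (1, 10, 500, '2024-01-05', '2024-01-12', 'Delivered'),
    (1, 20, 300, '2024-01-06', NULL, 'Pending'),
    (2, 10, 200, '2024-01-05', '2024-01-12', 'Delivered');
""")

# One item can be ordered from many suppliers ...
suppliers_of_pen = conn.execute(
    "SELECT s.SupplierName FROM ORDERED_TO AS o"
    " JOIN SUPPLIER AS s ON s.SupplierId = o.SupplierId"
    " WHERE o.ItemId = 1 ORDER BY s.SupplierName"
).fetchall()

# ... and one supplier can take orders for many items.
items_from_alpha = conn.execute(
    "SELECT i.ItemName FROM ORDERED_TO AS o"
    " JOIN ITEM AS i ON i.ItemId = o.ItemId"
    " WHERE o.SupplierId = 10 ORDER BY i.ItemName"
).fetchall()

print(suppliers_of_pen, items_from_alpha)
```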

Step 5: Draw complete E-R diagram with all attributes

[Figure: ER diagram for the Online Retail Application]
• Customer (CustomerId, CustomerName, DateOfRegistration, UserId, Password, AccountNo)
• Bill (BillId, BillDate)
• Item (ItemId, ItemName, Class, QtyOnHand, UnitPrice, UnitOfMeasurement,
ReOrderLevel, ReOrderQuantity, Discount, SupplierId)
• Supplier (SupplierId, SupplierName, SupplierContactNo)
• Customer Pays Bill (1:N), with relationship attribute AccountNo
• Purchases, with relationship attributes QtyPurchased and NetPrice
• Item OrderedTo Supplier (M:N), with relationship attributes QtyOfOrder,
OrderDate, DeliveryDate, OrderStatus
Merits and Demerits of ER Modeling
Merits
• Easy to understand. Represented in the business users' language. Can be understood by
non-technical specialists.
• Intuitive and helps in physical database creation.
• Can help in database design.
• Gives a higher-level description of the system.

Demerits
• Physical design derived from an E-R model may have some amount of redundancy,
which may lead to inconsistency.
(This will be discussed when we study Normalization on day two)
• Sometimes diagrams may lead to misinterpretation because of the limited information
present in the diagram.
Summary of ER Modeling
• Most application errors are caused by miscommunication between the application
user and the designer and between the designer and the developer.

• It is always better to represent business findings as a picture to avoid
miscommunication.

• It is practically impossible for business users to review the complete
requirements document.

• An E-R diagram is one of the many ways to represent business findings in pictorial format.

• E-R modeling will also help the database design.

• E-R modeling has some amount of inconsistency and anomalies associated with it.

Summary
• Traditional File Approach
• Advantages of a DBMS
• Three layers of abstraction
• Users of DBMS
• Database Models
• Types of Databases
• Relational Model Basics
• Keys
• Conceptual Design
– ER Modelling
– ER Modelling Notations
– ERD Case study
– Merits & Demerits of ER Modeling
