Module-1 Database System Concepts and Architecture


1

BCSE302L – Database
Management
Systems
Module 1: Database System Concepts
& Architecture
Dr. K.P. Vijayakumar,
VIT Chennai
2

Module 1
Module:1 Database System Concepts and Architecture
Need for database systems – Characteristics of Database
Approach – Advantages of using DBMS approach - Actors on
the Database Management Scene: Database Administrator -
Classification of database management systems - Data Models
– Schemas and Instances - Three-Schema Architecture - The
Database System Environment - Centralized and Client/Server
Architectures for DBMSs – Overall Architecture of Database
Management Systems.
Basic Definitions 3

 Data
 Database
 Database management System (DBMS)

Image Source: https://www.theteflcentre.com/news/skills-reading-6-tasks-for-reading-activities-matching-words-to-definitions
Basic Definitions 4

 Data
Known facts that can be recorded and
have an implicit meaning.

Image Source: https://analyticsindiamag.com/10-best-data-cleaning-tools-get-data/


Basic Definitions 5

 Database
A collection of related data.

Image Source: https://tdwi.org/articles/2019/03/11/dtw-all-five-database-requirements-for-digital-transformation.aspx
Basic Definitions 6

 Database Management System


 Collection of programs that enables users
to create and manipulate data.
 General-purpose software system
 Provides a way to store and retrieve
database information that is both
convenient and efficient

Image Source: https://www.webinhindi.com/2019/04/what-is-dbms-in-hindi.html


Basic Definitions 7

 Database Management System


 General purpose software system
 Defining – data types, structures, and
constraints
 Constructing – process of storing the data
 Manipulating – Querying, Updating
 Sharing databases – multiple users
 Database definition
 Database Catalog or Dictionary or meta-data

Image Source: Elmasri


Need/Purpose of the Database System 8
Disadvantages of File Processing System
 Data Redundancy and inconsistency
 Difficulty in accessing data
 Data Isolation – scattered data
 Integrity Problems -consistency constraints
 Atomicity Problems – all or none
 Concurrent Access Anomalies
 Security Problems
Need/Purpose of the Database System 9
Disadvantages of File Processing System
 Data Redundancy

Student File
Name : Ram
Reg.No.: 22BCE1001
Dept : SCOPE
Year : 2
Address: ABC
Mobile: 9123456789
Courses:
…

Hostel File
Name : Ram
Reg.No.: 22BCE1001
Dept : SCOPE
Year : 2
Address: ABC
Mobile: 9123456789
Block : A
Room No: 101
…
Need/Purpose of the Database System 10
Disadvantages of File Processing System
 Data inconsistency
Changed student address is not reflected in the Hostel file

Student File
Name : Ram
Reg.No.: 22BCE1001
Dept : SCOPE
Year : 2
Address: XYZ
Mobile: 9123456789
Courses:
…

Hostel File
Name : Ram
Reg.No.: 22BCE1001
Dept : SCOPE
Year : 2
Address: ABC
Mobile: 9123456789
Block : A
Room No: 101
…
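A minimal sketch of how a DBMS avoids this kind of inconsistency, assuming Python's built-in sqlite3 module. The table names student and hostel and their columns are hypothetical. The address is stored once, and the hostel record only refers to the student by registration number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (
        reg_no  TEXT PRIMARY KEY,
        name    TEXT,
        address TEXT
    );
    -- The hostel table stores only hostel facts plus a reference to the student.
    CREATE TABLE hostel (
        reg_no  TEXT REFERENCES student(reg_no),
        block   TEXT,
        room_no INTEGER
    );
    INSERT INTO student VALUES ('22BCE1001', 'Ram', 'ABC');
    INSERT INTO hostel  VALUES ('22BCE1001', 'A', 101);
""")

# The address changes in exactly one place.
conn.execute("UPDATE student SET address = 'XYZ' WHERE reg_no = '22BCE1001'")

# The hostel office reads the address through a join, so it sees the new value.
hostel_rows = conn.execute("""
    SELECT s.name, s.address, h.block, h.room_no
    FROM hostel h JOIN student s ON s.reg_no = h.reg_no
""").fetchall()
print(hostel_rows)
```

Because there is no second copy of the address, there is nothing left to fall out of date.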
Need/Purpose of the Database System 11
Disadvantages of File Processing System
 Difficulty in accessing data

Student File
Name : Ram
Reg.No.: 22BCE1001
Dept : SCOPE
Year : 2
Address: ABC
Mobile: 9123456789
Courses:
…
Name : Vijay
Reg.No.: 22BCE1002
Dept : SCOPE
Year : 2
Address: ABC
Mobile: 9123456788
Courses:

Program 1 : Extract and display the list of students who live in Chennai
Program 2 : Extract and display the list of students who registered for the DBMS course
…
Program N : …
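With a DBMS, each of these "programs" can shrink to a declarative query. The sketch below assumes Python's built-in sqlite3 module. The tables student and registration, the city column, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (reg_no TEXT PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE registration (reg_no TEXT, course TEXT);
    -- Sample rows are invented for illustration.
    INSERT INTO student VALUES ('22BCE1001', 'Ram', 'Chennai');
    INSERT INTO student VALUES ('22BCE1002', 'Vijay', 'Vellore');
    INSERT INTO registration VALUES ('22BCE1002', 'DBMS');
""")

# "Program 1" becomes a single query.
in_chennai = [row[0] for row in conn.execute(
    "SELECT name FROM student WHERE city = 'Chennai' ORDER BY name")]

# "Program 2" is another query over the same data; no new file-handling code.
in_dbms = [row[0] for row in conn.execute("""
    SELECT s.name FROM student s
    JOIN registration r ON r.reg_no = s.reg_no
    WHERE r.course = 'DBMS' ORDER BY s.name""")]

print(in_chennai, in_dbms)
```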

Need/Purpose of the Database System 12
Disadvantages of File Processing System
 Data Isolation – scattered data
Retrieving the appropriate data is difficult.

Student File
Name : Vijay
Reg.No.: 22BCE1001
Dept : SCOPE
Year : 2
Address: ABC
Mobile: 9123456789
Email: vijay22@vit.ac.in
Courses:

Hostel File
Name : Vijay
Reg.No.: 22BCE1001
Dept : SCOPE
Year : 2
Block : A
Room No: 101
Dues in Rs: 15000

Need/Purpose of the Database System 13
Disadvantages of File Processing System
 Integrity Problems – consistency constraints

Customer File
Name : Ram
AC.No.: 221001
Branch : VIT
Address: ABC
Mobile: 9123456789
Balance : 100000
…
Name : Vijay
Ac. No.: 221002
Branch : VIT
Year : 2
Address: ABC
Mobile: 9123456788
Balance : 150000

Constraint 1 : Balance must never fall below zero
Constraint 2 : Balance must never fall below 5000
…
Constraint N : …
Add/update appropriate code in the various application programs
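In a DBMS, a constraint such as Constraint 1 can be declared once in the schema instead of being coded into every application program. A minimal sketch, assuming Python's built-in sqlite3 module; the table name account and the helper withdraw are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        ac_no   INTEGER PRIMARY KEY,
        name    TEXT,
        balance INTEGER CHECK (balance >= 0)  -- Constraint 1, declared once
    )
""")
conn.execute("INSERT INTO account VALUES (221001, 'Ram', 100000)")
conn.commit()

def withdraw(ac_no, amount):
    """Return True if the withdrawal succeeds, False if the DBMS rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE ac_no = ?",
                (amount, ac_no))
        return True
    except sqlite3.IntegrityError:
        return False

ok = withdraw(221001, 40000)        # leaves 60000
rejected = withdraw(221001, 70000)  # would make the balance negative
balance = conn.execute(
    "SELECT balance FROM account WHERE ac_no = 221001").fetchone()[0]
print(ok, rejected, balance)
```

Changing the rule to Constraint 2 would mean changing only the CHECK clause, not the application code.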


Need/Purpose of the Database System 14
Disadvantages of File Processing System
 Atomicity Problems – all or none

Initial balances:
A - 50000
B - 40000

Transaction to transfer $5000 from account A to account B:
1. read(A)
2. A := A – 5000
3. write(A)
4. read(B) ----- Failure
5. B := B + 5000
6. write(B)

 If the transaction fails after step 3 and before
step 6, money will be “lost”, leading to an
inconsistent database state.

Should be:
A - 45000
B - 45000

Due to failure (inconsistent):
A - 45000
B - 40000
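The all-or-none behaviour can be sketched with a transaction that is rolled back when a failure occurs midway. This assumes Python's built-in sqlite3 module. The table account, the helper names, and the simulated failure are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("A", 50000), ("B", 40000)])
conn.commit()

def transfer(amount, fail_midway=False):
    """Move `amount` from A to B as one transaction."""
    try:
        with conn:  # commit if the block finishes, roll back if it raises
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = 'A'",
                (amount,))
            if fail_midway:
                # Simulated crash between write(A) and write(B).
                raise RuntimeError("failure after step 3")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = 'B'",
                (amount,))
    except RuntimeError:
        pass  # the debit to A has already been rolled back

def balances():
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM account"))

transfer(5000, fail_midway=True)
after_failure = balances()   # unchanged: the partial debit was undone
transfer(5000)
after_success = balances()
print(after_failure, after_success)
```

After the failed attempt the balances are still 50000 and 40000, so no money is lost; only the complete transfer changes them.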
Need/Purpose of the Database System 15
Disadvantages of File Processing System
 Concurrent Access Anomalies

Course Registration File
Courses:
DBMS:
Total : 60
Occupied : 59
Left : 1
…

Student 1 and Student 2 access the file at the same time:
Student 1 : Occupied is viewed as 59 and registers for the DBMS course
Student 2 : Occupied is viewed as 59 and registers for the DBMS course

Leads to an incorrect or inconsistent state
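The anomaly arises because each student reads the seat count and writes it back in separate steps. One way to sketch the fix is to let the DBMS perform the check and the update as a single statement. The example assumes Python's built-in sqlite3 module; the course table and register helper are hypothetical, and the two requests run one after the other here rather than truly in parallel.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course (code TEXT PRIMARY KEY, total INTEGER, occupied INTEGER)")
conn.execute("INSERT INTO course VALUES ('DBMS', 60, 59)")
conn.commit()

def register(code):
    """Claim a seat only if one is still free; return True on success."""
    with conn:
        cur = conn.execute(
            "UPDATE course SET occupied = occupied + 1 "
            "WHERE code = ? AND occupied < total",
            (code,))
    # rowcount is 1 if the seat was claimed, 0 if the course was already full.
    return cur.rowcount == 1

student1 = register("DBMS")
student2 = register("DBMS")
occupied = conn.execute(
    "SELECT occupied FROM course WHERE code = 'DBMS'").fetchone()[0]
print(student1, student2, occupied)
```

Only one registration succeeds and Occupied never exceeds Total, because no request acts on a stale value of 59.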


Need/Purpose of the Database System 16
Disadvantages of File Processing System
 Security Problems
 Enforcing security constraints is
difficult in a file processing system

Customer File
Name : Ram
AC.No.: 221001
Branch : VIT
Address: ABC
Mobile: 9123456789
Balance : 100000

Name : Vijay
Ac. No.: 221002
Branch : VIT
Year : 2
Address: ABC
Mobile: 91234567
Balance : 150000

Unauthorized User

Database Applications 17

 Banking: transactions
 Airlines: reservations, schedules
 Universities: registration, grades
 Sales: customers, products, purchases
 Online retailers: order tracking,
customized recommendations
 Manufacturing: production, inventory,
orders, supply chain
 Human resources: employee records,
salaries, tax deductions
Characteristics of Database Approach 18
1. Self-describing nature of a database system
2. Insulation between Programs and Data, and Data
Abstraction
3. Support of Multiple Views of the Data
4. Sharing of Data and Multiuser Transaction
Processing

Source: Ramez Elmasri and Shamkant B. Navathe


Characteristics of Database Approach 19

 Self-describing nature of a database system:
 A DBMS catalog stores the description of a
particular database (e.g. data structures, types,
and constraints)
 The description is called meta-data.
 This allows the DBMS software to work with
different database applications.
Image Source: Ramez Elmasri and Shamkant B. Navathe,
https://www.quora.com/How-should-I-write-my-self-description-in-SSB-interviews
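The catalog can be inspected like any other data. The sketch below uses SQLite through Python's built-in sqlite3 module as one concrete example of a self-describing system; the student table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        name   TEXT NOT NULL,
        reg_no TEXT PRIMARY KEY,
        dept   TEXT
    )
""")

# sqlite_master is SQLite's catalog: one row per table, index, view or trigger.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
columns = [(row[1], row[2])
           for row in conn.execute("PRAGMA table_info(student)")]

print(tables, columns)
```

The program learns the structure of the database by asking the DBMS, rather than having the record layout hard-coded as in file processing.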
Characteristics of Database Approach 20

 Insulation between programs and data:
 Program-data independence
 Allows changing data structures and storage
organization without having to change the DBMS
access programs.
 Program-operation independence
 The implementation (or method) of the operation is
specified separately and can be changed without
affecting the interface
Program-data independence + Program-operation independence = Data Abstraction
Image Source: Ramez Elmasri and Shamkant B. Navathe,
https://www.quora.com/How-should-I-write-my-self-description-in-SSB-interviews
Characteristics of Database Approach 21
 Data Abstraction:
 Conceptual representation
 A conceptual representation of the
STUDENT records
 Hides details such as how the data
is stored or how the operations are
implemented.
 A data model
 Is a type of data abstraction
 Hides storage and implementation details
 Presents the users with a conceptual view of
the database.
Image Source: https://www.hitechnectar.com/blogs/data-abstraction-level/
Characteristics of Database Approach 22
 Support of multiple views of the data:
 Each user may see a different view of the
database, which describes only the data
of interest to that user.

Customer File
Name : Ram
AC.No.: 221001
Branch : VIT
Address: ABC
Mobile: 9123456789
Balance : 100000

Name : Vijay
Ac. No.: 221002
Branch : VIT
Year : 2
Address: ABC
Mobile: 91234567
Balance : 150000

Manager can access all customers' details
Cashier can access only Ac.No. and Balance
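The manager and cashier views can be sketched with a SQL view, assuming Python's built-in sqlite3 module. The names customer and cashier_view are hypothetical. SQLite has no user accounts, so this shows only how a view narrows what is visible; a multi-user DBMS would additionally grant the cashier access to the view alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        ac_no INTEGER PRIMARY KEY, name TEXT, branch TEXT,
        address TEXT, mobile TEXT, balance INTEGER
    );
    INSERT INTO customer VALUES (221001, 'Ram',   'VIT', 'ABC', '9123456789', 100000);
    INSERT INTO customer VALUES (221002, 'Vijay', 'VIT', 'ABC', '91234567',   150000);
    -- Hypothetical view exposing only what the cashier needs.
    CREATE VIEW cashier_view AS SELECT ac_no, balance FROM customer;
""")

# The manager queries the full table; the cashier queries the view.
manager_rows = conn.execute("SELECT * FROM customer ORDER BY ac_no").fetchall()
cashier_rows = conn.execute("SELECT * FROM cashier_view ORDER BY ac_no").fetchall()

print(manager_rows)
print(cashier_rows)
```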

Characteristics of Database Approach 23
 Sharing of data and multi-user
transaction processing:
 Allowing a set of concurrent users to
retrieve from and to update the database.
 Concurrency control within the DBMS
guarantees that each transaction is correctly
executed or aborted
 Recovery subsystem ensures each
completed transaction has its effect
permanently recorded in the database
 OLTP (Online Transaction Processing) is a
major part of database applications. This
allows hundreds of concurrent transactions
to execute per second.
Advantages of Database Approach 24

 Controlling redundancy
 Duplication, wastage of storage space,
inconsistency
 Restricting unauthorized access to data.
 Providing persistent storage for program
Objects
 Providing backup and recovery services.

Image Source: https://www.ringlead.com/blog/the-benefits-of-using-database-management-systems


Advantages of Database Approach 25

 Providing Storage Structures (e.g. indexes)
and searching techniques for efficient Query
Processing (Query Processing and
Optimization)
 Providing multiple user interfaces -GUI
 Representing complex relationships among
data.

Image Source: https://www.ringlead.com/blog/the-benefits-of-using-database-management-systems


Advantages of Database Approach 26

 Enforcing integrity constraints on the
database.
 Drawing inferences and actions using
rules
 Triggers, stored procedures
 Potential for enforcing standards.
 Reduced application development time.

Image Source: https://www.ringlead.com/blog/the-benefits-of-using-database-management-systems


Advantages of Database Approach 27

 Flexibility to change data structures.


 Availability of up to date information.
 Economies of scale:
 Wasteful overlap of resources and
personnel can be avoided by
consolidating data and applications
across departments

Image Source: https://www.ringlead.com/blog/the-benefits-of-using-database-management-systems


Database Users 28

 Actors on the scene
 Those who actually use the database content
and control and monitor the database
• Database Administrators
• Database Designers
• End Users
• Casual, Naive or parametric,
• Sophisticated, Standalone users
• System Analysts and Application Programmers
 Workers behind the Scene
 Those who design, develop and administer
the database and the operation of the DBMS software
and system environment
• DBMS system designers and implementers
• Tool developers
• Operators and maintenance personnel

Image Source: https://www.iconfinder.com/icons/85409/users_icon


Actors on the Scene 29

 Database administrators:
 Responsible for authorizing access to the
database
 coordinating and monitoring its use
 acquiring software and hardware resources
as needed.
Image Source: https://www.dataversity.net/so-you-want-to-be-a-database-administrator/
Actors on the Scene 30

 Database Designers:
 Responsible for defining the content, the
structure, the constraints, and functions or
transactions against the database. They must
communicate with the end-users and
understand their needs.
Image Source: https://www.cybertec-postgresql.com/en/services/postgresql-design/postgresql-database-modeling/
Actors on the Scene 31

 End-users: They use the database for queries and
reports, and some update its content. End-
users can be categorized into:
 Casual: access the database occasionally, when
needed
 Naive or Parametric: they make up a large
section of the end-user population and repeatedly
run predefined ("canned") transactions.
 Examples are bank tellers or reservation
clerks, who do this for an entire shift
of operations.

Image Source: http://dinesql.blogspot.com/2015/10/types-of-database-end-users.html


Actors on the Scene 32

 Sophisticated:
 Business analysts, scientists, engineers,
others thoroughly familiar with the system
capabilities.
 Many use tools in the form of software
packages that work closely with the stored
database.
Image Source: https://knowyourmeme.com/photos/1698903-computer-reaction-faces
Actors on the Scene 33

 Stand-alone:
 Mostly maintain personal databases using
ready-to-use packaged applications.
 An example is a tax program user that creates its
own internal database.
 Another example is a user that maintains an
address book.
Image Source: https://www.yourdictionary.com/stand-alone-pc
Workers behind the Scene 34

 DBMS System Designers and Implementers


 Design and implement DBMS modules and interfaces as a
software package.
 Tool Developers
 They design and implement tools which include software
packages that facilitate database modelling and design,
database system design and improved performance.
Image Source: https://www.yourdictionary.com/stand-alone-pc
Workers behind the Scene 35

 Operators and Maintenance Personnel


 Responsible for actual running and maintenance of the
hardware and software environment for the database
system.
Image Source: https://www.yourdictionary.com/stand-alone-pc
36

Classification of DBMS
 Criteria to classify DBMS
 The data model on which the DBMS is
based.
 Relational, object, hierarchical, network
and XML models
 The number of users supported by the
system
 Single user, multiuser
37

Classification of DBMS
 Criteria to classify DBMS
 The number of sites over which the
database is distributed
 Centralized, Distributed DBMS,
Homogeneous DDBMS, Heterogeneous
DDBMS, Federated DBMS or multidatabase
system
Figure panels: Distributed DBMS, Homogeneous DDBMS, Heterogeneous DDBMS, Federated DBMS
Image Source: bisma, Slideshare, TutorialRide
38

Classification of DBMS
 Criteria to classify DBMS
 Cost
 Open source – MySQL, PostgreSQL
 Proprietary – Oracle, MS SQL Server,IBM DB2
 Types of access path options for storing files
 File structure
 General or Special purpose
 General purpose
 meets the need of as many applications as possible
 Example: EMail
 Special purpose
 Airline reservation, railway reservation
 OLTP – large number of concurrent transactions without delay
Data Models 39

 Data Model:
 A set of concepts to describe the
structure of a database
 Data types
 Relationships
 Constraints

Student
Name      Reg.No.      Dept
Ram       22BCSE1000   ECM
Srinath   22BCSE1101   ECM
Sam       21BCSE1102   ECM

Relationship

Course
Name   Code       Dept   Reg.No.
DBMS   BCSE302L   CSE    22BCSE1000
SE     CSE3001    CSE    22BCSE1101
IWP    CSE3002    CSE    21BCSE1102

Image Source: https://powerpivotpro.com/2016/02/data-modeling-power-pivot-power-bi/
Data Model Operations 40
 Operations on the data model may
include
 basic model operations (e.g. generic
insert, delete, update, retrieve)
 user-defined operations (e.g.
compute_student_gpa).
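The two kinds of operations can be sketched as follows, assuming Python's built-in sqlite3 module. The grade_report table, its columns, the sample rows, and the credit-weighted formula used for compute_student_gpa are all assumptions made for illustration; the slide names the operation but does not define it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE grade_report "
    "(student_number INTEGER, credits INTEGER, grade_points REAL)")

# Basic model operations: insert, update, delete (retrieve is used below).
conn.executemany("INSERT INTO grade_report VALUES (?, ?, ?)",
                 [(17, 4, 9.0), (17, 3, 8.0), (17, 2, 5.0)])
conn.execute("UPDATE grade_report SET grade_points = 10.0 WHERE credits = 4")
conn.execute("DELETE FROM grade_report WHERE credits = 2")

def compute_student_gpa(student_number):
    """User-defined operation: credit-weighted average of grade points."""
    total_points, total_credits = conn.execute(
        "SELECT SUM(credits * grade_points), SUM(credits) "
        "FROM grade_report WHERE student_number = ?",
        (student_number,)).fetchone()
    return total_points / total_credits

gpa = compute_student_gpa(17)
print(round(gpa, 2))
```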
41
Categories of Data Model
 Conceptual or High-level or semantic data models:
 Provide concepts that are close to the way many users perceive data.
 ER Model - Entity, Attribute, relationship
 Implementation or representational data models:
 Provide concepts that are easily understood by end users and hide many details of data storage
on disk
 Relational model – records
 Physical or low-level or internal data models:
 Provide concepts that describe details of how data is stored in the computer storage
 Record format, record ordering, access path
Schema vs Instances 42

 Database Schema:
 description of a database

Student={Name,Student_number,
class,Major}
Course={Course_Name,Course_number,
Credit_balance, Department}

Schema vs Instances 43

 Schema Diagram:
 An illustrative display of (most aspects
of) a database schema.
Schema vs Instances 44

 Schema Construct:
 A component of the schema or an
object within the schema

 Ex: STUDENT, COURSE, SECTION,


GRADE_REPORT, PREREQUISITE
Schema vs Instances 45

 Database State (Database Instance):
 The actual data
stored in a database at a particular
moment in time.
 Also called occurrence or snapshot
Database Schema vs State 46

 Empty State:
 The database contains no data

Student
Name   Reg.No.   Dept

 Initial Database State:
 When the database is initially loaded
with initial data.

Student
Name   Reg.No.     Dept
Ram    20BCE1100   CSE

 Valid State:
 A state that satisfies the structure and
constraints of the database.

Student
Name      Reg.No.     Dept
Ram       20BCE1100   CSE
Srinath   20BCE1101   CSE
Ram       20BCE1102   CSE

Database Schema vs State 47
 Distinction

Database Schema | Database State
The database schema changes very infrequently | The database state changes every time the database is updated
Schema is also called intension. | State is also called extension.
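The distinction can be observed directly: the schema stays fixed while the state changes with every update. A minimal sketch, assuming Python's built-in sqlite3 module; the helper functions schema and state are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (name TEXT, reg_no TEXT PRIMARY KEY, dept TEXT)")

def schema():
    """Intension: the column names of the student table."""
    return [row[1] for row in conn.execute("PRAGMA table_info(student)")]

def state():
    """Extension: the rows stored at this moment."""
    return conn.execute("SELECT * FROM student ORDER BY reg_no").fetchall()

schema_before, empty_state = schema(), state()
conn.execute("INSERT INTO student VALUES ('Ram', '20BCE1100', 'CSE')")
initial_state = state()
conn.execute("INSERT INTO student VALUES ('Srinath', '20BCE1101', 'CSE')")
later_state = state()
schema_after = schema()

print(schema_before == schema_after, empty_state, initial_state, later_state)
```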
Three Schema Architecture 48
 Convenient tool with which the user can visualize
the schema levels in a DB system.
 It is also called ANSI/SPARC architecture or 3-
level architecture.
 It is used to
 describe the structure of a specific database system.
 separate the user applications from the physical
database.
 External or View level:
 Describes various user views (part of a DB)
 uses a representational data model
Figure labels: View1 … ViewN
Three Schema Architecture 49
Figure labels: View1 … ViewN
 Conceptual or Logical schema:
 Describe structure of whole database
 Describe - entities, data types, relationships, user
operations, and constraints
 uses a representational data model
 Hides the details of physical storage structures
 Internal Level or internal schema:
 PHYSICAL storage structure of DB and access
paths(index).
 uses a physical data model
50

Three Schema Architecture

View1 (by student)
Name    EId   Dept
Vijay   1     CSE
Ram     2     CSE
Sam     3     CSE

View2 (by Vijay)
Name    EmpId   Dept   DoB   Salary
Vijay   1       CSE    …     x

View3 (by HoD/Dean)
Name    EmpId   Dept   DoB   Salary
Vijay   1       CSE    …     x
Ram     2       CSE    …     Y

Underlying table
Name    EmpId   Dept   DoB   Salary
Vijay   1       CSE    …     x
Ram     2       CSE    …     Y
Sam     3       ECE    ….    z
51
Data Independence
 Data independence is the property of a DBMS
that allows the database schema at one level of
a database system to be changed without
requiring a change to the schema at the next
higher level.
 Types of Data Independence
 Physical Data Independence
 Logical Data Independence
52
Physical Data Independence
 Physical Data Independence: the ability to change
the internal schema without having to change the
conceptual schema, i.e. the physical storage
structures or devices can be changed without an
effect on the conceptual schema.
 Example - creating additional access structures
 Index
 Quick access
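The index example can be sketched as follows, assuming Python's built-in sqlite3 module; the table contents and the index name idx_student_reg_no are hypothetical. The query text is identical before and after the index is created, and so is its result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, reg_no TEXT, dept TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [("Ram", "20BCE1100", "CSE"), ("Srinath", "20BCE1101", "CSE")])

QUERY = "SELECT name FROM student WHERE reg_no = '20BCE1101'"

result_before = conn.execute(QUERY).fetchall()

# Internal-schema change: add an access path.
# The table definition and the query are untouched.
conn.execute("CREATE INDEX idx_student_reg_no ON student (reg_no)")

result_after = conn.execute(QUERY).fetchall()
print(result_before, result_after)
```

Only the way the DBMS locates the row may change; the application sees the same conceptual schema.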
Benefits of Physical Data Independence 53
 Due to Physical independence, any of the below
changes will not affect the conceptual layer:
 Using a new storage device like Hard Drive or
Magnetic Tapes
 Modifying the file organization technique in the
Database
 Switching to different data structures.
 Changing the access method
 Changes to compression techniques or hashing
algorithms.
 Change of Location of Database
 Example : from C: Drive to D: Drive
54
Logical Data Independence
 Logical Data Independence is the ability to change the
conceptual schema without changing the external
schemas or application programs.
 Compared to physical data independence, logical data
independence is more difficult to achieve.
 Due to logical independence, none of the changes below
will affect the external layer:
 Adding, modifying or deleting an attribute, entity or
relationship is possible without rewriting existing
application programs.
 Merging two records into one.
 Breaking an existing record into two or more
records.
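A minimal sketch of logical data independence, assuming Python's built-in sqlite3 module. The employee table, the view student_view (standing in for an external schema), and the application query are hypothetical. A new attribute is added to the base table, and the existing application query keeps working unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, emp_id INTEGER PRIMARY KEY, dept TEXT);
    INSERT INTO employee VALUES ('Vijay', 1, 'CSE');
    -- External schema used by an existing application.
    CREATE VIEW student_view AS SELECT name, emp_id, dept FROM employee;
""")

APP_QUERY = "SELECT name, dept FROM student_view WHERE emp_id = 1"
before = conn.execute(APP_QUERY).fetchall()

# Conceptual-schema change: a new attribute is added to the base table.
conn.execute("ALTER TABLE employee ADD COLUMN salary INTEGER")

after = conn.execute(APP_QUERY).fetchall()
print(before, after)
```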
Centralized Database Management System 55

 A centralized database is stored at a single location such as a
mainframe computer.
 It is maintained and modified from that location only and
usually accessed over a network connection such as a LAN or
WAN.
 The centralized database is used by organizations such as
colleges, companies, banks etc.
Merits and De-merits of Centralized DBMS 56
Advantages :
 The Data Integrity is maximized as the whole database is
stored at a single physical location. It is easier to coordinate
the data and it is as accurate and consistent as possible.
 The Data Redundancy is minimal in the centralized
database. All the data is stored together and not scattered
across different locations. So, there is no redundant data
available.
 Since all the data is in one place, there can be stronger
security measures around it. So, It is much more secure.
 Data is easily portable because it is stored at the same
place.
 It is cheaper than other types of databases as it requires
less power and maintenance.
 All the information can be easily accessed from the same
location and at the same time.
Merits and De-merits of Centralized DBMS 57
Disadvantages :
 Since all the data is at one location, it takes more time
to search and access it. If the network is slow, this
process takes even more time.
 There is a lot of data access traffic for the centralized
database. This may create a bottleneck situation.
 Since all the data is at the same location, if multiple
users try to access it simultaneously it creates a
problem. This may reduce the efficiency of the system.
 If there are no database recovery measures in place
and a system failure occurs, then all the data in the
database will be destroyed.
Client-Server Database Management System 58
 A client does not share any of its resources,
but requests a server’s content or service
function.
 Clients therefore initiate communication
sessions with servers which await incoming
requests.
 Examples of computer applications that use
the client–server model are Email, network
printing, and the World Wide Web.

Image Source: guru99.com


Merits and De-merits of Client-Server DBMS 59
Advantages :
 Centralization – Access, Resources, and
Data Security are controlled through server.
 Scalability – Any element can be upgraded
when needed.
 Flexibility – New technology can be easily
integrated into the system.
 Interoperability – All components work
together.
Merits and De-merits of Client-Server DBMS 60
Disadvantages :
 Dependability – When servers go down,
operations will cease.
 Lack of Mature Tools – for administration.
 Lack of Scalability – Network OSs are not
very scalable.
 Network Congestion.
DBMS Architecture 61

 DBMS design depends upon its architecture.


 Client/server architecture consists of many PCs
and a workstation which are connected via the
network.
 Client/server architecture is used to deal with a
large number of PCs, web servers, database
servers and other components that are
connected with networks.
Image Source: medium.com, guru99.com
Three Layers of an Application 62
 Presentation Layer
 Business Layer
 Data Layer
Example (login):
• Get Username, Password, Captcha (presentation layer)
• Get Username, Password from data layer
• Compare
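The login example can be sketched as three small functions, one per layer. This is a simplified illustration in Python: the function names, the in-memory dictionary standing in for a database, and the sample credentials are invented, the captcha step is omitted, and a real system would store password hashes rather than plain text.

```python
# Data layer: stands in for a database table of credentials (invented data).
_USERS = {"ram": "secret123"}

def fetch_password(username):
    """Data layer: return the stored password, or None if the user is unknown."""
    return _USERS.get(username)

def authenticate(username, password):
    """Business layer: compare the submitted password with the stored one."""
    stored = fetch_password(username)
    return stored is not None and stored == password

def login_form(username, password):
    """Presentation layer: turn the business result into a message for the user."""
    return "Welcome" if authenticate(username, password) else "Login failed"

print(login_form("ram", "secret123"))
print(login_form("ram", "wrong"))
```

Each layer calls only the layer directly below it, so the storage behind fetch_password could be replaced without touching login_form.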
1- Tier Architecture 63

 In this architecture, the database is directly
accessed by the user.
 Any changes will be performed over the database
itself.
 Used for development of the local application,
where programmers can directly communicate
with the database for the quick response.

Image Source: guru99.com


2 - Tier Architecture 64

 It is based on the client-server model, but applications
on the client end can directly communicate with the
database at the server side.
 For this interaction, APIs such as ODBC and JDBC are used.
The user interfaces and application programs run
on the client side.
 The server side is responsible for providing
functionality such as query processing and transaction
management.
 To communicate with the DBMS, client-side
application establishes a connection with the server
side.

Image Source: guru99.com


3 - Tier Architecture 65

 The 3-Tier architecture contains another layer between the
client and server.
 In this architecture, client can’t directly communicate with the
database at the server side.
 The application on the client-end interacts with an application
server which further communicates with the database system.
 End user has no idea about the existence of the database
beyond the application server and the database also has no idea
about any other user beyond the application.
 The 3-Tier architecture is used for large web applications.
Image Source: guru99.com
DBMS Components/System Structure
66
DBMS Components 67

 Users
 DBA - Responsibilities
 Defines
 the Schemas
 Storage structure and access method
 Grant user authority
 Integrity constraint
 Monitor performance and respond to changes in
requirements
 Casual user
 Occasionally access data via interactive query interface
 Application Programmer
 Parametric users
DBMS Components 68

 DDL Compiler :
 Processes schema definitions, specified in the DDL, and
stores descriptions of the schemas in the DBMS catalog
 System Catalog/Data Dictionary:
 It defines the structure of the database.
 names and sizes of tables, names and data types of data
items, storage details of each table, mapping information
among schemas, constraints, indexes, triggers etc.
 Query Compiler:
 Queries are parsed and validated for correctness of the
query syntax, the names of tables and data elements
 Query Optimizer:
 rearrangement and possible reordering of operations,
elimination of redundancies, and use of correct algorithms
and indexes during execution
DBMS Components 69

 Precompiler:
 extracts DML commands from an application program
written in a host programming language
 DML :
 DML processor must interact with the query processor to
generate the appropriate code
 Query processor :
 It transforms user queries into a series of low level
instructions.
 It is used to interpret the online user’s query and convert it
into an efficient series of operations in a form capable of
being sent to the run time data manager for execution.
DBMS Components 70

 Stored data manager:


 uses basic operating system services for carrying out low-
level input/output (read/write) operations between the
disk and main memory.
 concurrency control and backup and recovery
systems
 Concurrency-control manager controls the interaction
among the concurrent transactions, to ensure the
consistency of the database
 Recovery system ensures database remains in a consistent
state despite system failures, power failures and OS crashes
and transaction failures.
71
