
Database Systems

Introduction
Basic Definitions
Database: A collection of related data.

Data: Known facts that can be recorded and have an
implicit meaning.

Mini-world: Some part of the real world about which
data is stored in a database. For example, student
grades and transcripts at a university.

Database Management System (DBMS): A software
package/system to facilitate the creation and
maintenance of a computerized database.

Database System: The DBMS software together with
the data itself. Sometimes, the applications are also
included.
Typical DBMS Functionality

 Define a database: in terms of data types,
structures and constraints.
 Construct or load the database on a
secondary storage medium.
 Manipulate the database: querying,
generating reports, insertions, deletions and
modifications to its content.
 Concurrent processing and sharing by a set
of users and programs, while keeping all data
valid and consistent.
 Other features:
 Protection or security measures to prevent
unauthorized access
 Presentation and visualization of data
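The define, construct and manipulate steps above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the student table, its columns and the sample rows are assumptions, and an in-memory database is used for brevity (passing a file path to connect would place it on secondary storage instead).

```python
import sqlite3

# In-memory database for the sketch; table and column names are assumed.
conn = sqlite3.connect(":memory:")

# Define: data types, structure and constraints.
conn.execute(
    "CREATE TABLE student ("
    " student_number INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " class_year INTEGER CHECK (class_year BETWEEN 1 AND 5))"
)

# Construct: load the initial data.
conn.executemany(
    "INSERT INTO student (student_number, name, class_year) VALUES (?, ?, ?)",
    [(17, "Smith", 1), (8, "Brown", 2)],
)

# Manipulate: modify the content, then query it.
conn.execute("UPDATE student SET class_year = 2 WHERE student_number = 17")
rows = conn.execute(
    "SELECT name, class_year FROM student ORDER BY student_number"
).fetchall()
print(rows)
conn.close()
```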
Example of a Database
 Mini-world for the example: Part of a
UNIVERSITY environment.
 Some mini-world entities:
 STUDENTs
 COURSEs
 SECTIONs (of COURSEs)
 (academic) DEPARTMENTs
 INSTRUCTORs

 Note: The above could be expressed in the
ENTITY-RELATIONSHIP data model.
Fig 1: Sample database
 Some mini-world relationships:
 SECTIONs are of specific COURSEs
 STUDENTs take SECTIONs
 COURSEs have prerequisite COURSEs
 INSTRUCTORs teach SECTIONs
 COURSEs are offered by DEPARTMENTs
 STUDENTs major in DEPARTMENTs
Main Characteristics of the Database Approach

 Self-describing nature of a database system
 Insulation between programs and data
 Support of multiple views of the data
 Sharing of data and multiuser transaction
processing
Main Characteristics of the Database....
 1. Self-describing nature of a database system:
A DBMS catalog stores the description of the
database. (The description is called meta-data).
- This allows the DBMS software to work with
different databases.
- In traditional file processing, data definition is
typically part of the application programs
themselves.
- Programs are constrained to work with only
one specific database, whose structure is declared
in the application programs.
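As a rough illustration of a self-describing database, the sketch below reads the catalog of an SQLite database through Python's sqlite3 module. SQLite keeps its meta-data in the sqlite_master table and exposes column descriptions through PRAGMA table_info; the course table itself is an assumed example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course ("
    " course_number TEXT PRIMARY KEY,"
    " course_name TEXT,"
    " credit_hours INTEGER)"
)

# The catalog lists the objects that exist in the database.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info yields one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(course)")]

print(catalog)
print(columns)
conn.close()
```

A generic program can discover the structure this way instead of having it declared in its own source code.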
2. Insulation between programs and data:
 In traditional file processing, the structure of a data file
is embedded in the application programs, so any
changes to the structure of a file may require
changing all programs that access that file.
 By contrast, DBMS access programs do not require
such changes in most cases. The structure of data
files is stored in the DBMS catalog separately from
the access programs.
 A DBMS allows changing data storage structures and
operations without having to change the DBMS
access programs. This property is called program-data
independence.
 For example, a file access program may be
written in such a way that it can access only
STUDENT records of the structure shown in
Figure 1.4.
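A minimal sketch of program-data independence, assuming a hypothetical student table and an added birth_date column: the access program names only the columns it needs, so it keeps working after the record structure changes.

```python
import sqlite3

def student_names(conn):
    """Access program: refers to the data by column name, not by record layout."""
    return [row[0] for row in conn.execute("SELECT name FROM student ORDER BY name")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (17, 'Smith')")
names_before = student_names(conn)

# Structural change: the new column is recorded in the catalog;
# the access program above is left untouched.
conn.execute("ALTER TABLE student ADD COLUMN birth_date TEXT")
names_after = student_names(conn)

print(names_before, names_after)
conn.close()
```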
 Program-operation independence:
- In some types of database systems, users can define
operations on data as part of the database definitions.
- An operation is specified in two parts:
- Interface: includes the operation name and the data
types of its arguments (or parameters)
- Implementation: the implementation of the
operation
- specified separately
- can be changed without
affecting the interface
 User application programs can operate on the data
by invoking these operations through their names
and arguments, regardless of how the operations
are implemented. This is termed program-operation
independence.
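The interface/implementation split can be sketched in plain Python. The operation name calculate_gpa and both implementations are hypothetical; the point is only that the calling program depends on the interface and is unaffected when the implementation is swapped.

```python
from abc import ABC, abstractmethod
from statistics import mean
from typing import Sequence

class GpaOperation(ABC):
    """Interface: operation name plus argument and result types."""

    @abstractmethod
    def calculate_gpa(self, grade_points: Sequence[float]) -> float:
        ...

class SumBasedGpa(GpaOperation):
    """One implementation, specified separately from the interface."""

    def calculate_gpa(self, grade_points: Sequence[float]) -> float:
        return sum(grade_points) / len(grade_points)

class LibraryBasedGpa(GpaOperation):
    """A replacement implementation with the same interface."""

    def calculate_gpa(self, grade_points: Sequence[float]) -> float:
        return mean(grade_points)

def report(operation: GpaOperation, grade_points: Sequence[float]) -> str:
    # The application invokes the operation by name and arguments only.
    return f"GPA: {operation.calculate_gpa(grade_points)}"

first = report(SumBasedGpa(), [4.0, 3.0])
second = report(LibraryBasedGpa(), [4.0, 3.0])
print(first, second)
```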

 The characteristic that allows program-data
independence and program-operation
independence is called data abstraction.
 Data Abstraction: A data model is used to hide
storage details and present the users with a
conceptual view of the database.
Ex: Fig 1.2, Fig 1.3, Fig 1.4
3. Support of multiple views of the data:
 Each user may see a different view of the
database, which describes only the data of
interest to that user.
 View: may be a subset of the database, or it may
contain virtual data that is derived from the
database files but is not explicitly stored.
1. One user of the database may be interested only in accessing and printing
the transcript of each student.
2. A second user, who is interested only in checking that students have taken all
the prerequisites of each course for which they register, may require a different view.
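A transcript view of the kind described above could look like the following sketch in SQLite. The table names, column names and sample rows are assumptions; the view stores no data of its own and is derived from the base tables each time it is queried.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE grade_report (student_number INTEGER, course_number TEXT, grade TEXT);
INSERT INTO student VALUES (17, 'Smith'), (8, 'Brown');
INSERT INTO grade_report VALUES
    (17, 'MATH2410', 'B'), (17, 'CS1310', 'C'), (8, 'CS1310', 'A');

-- Virtual data: derived from the stored tables, not stored itself.
CREATE VIEW transcript AS
    SELECT s.name AS student_name, g.course_number, g.grade
    FROM student s
    JOIN grade_report g ON g.student_number = s.student_number;
""")

smith_transcript = conn.execute(
    "SELECT course_number, grade FROM transcript"
    " WHERE student_name = 'Smith' ORDER BY course_number"
).fetchall()
print(smith_transcript)
conn.close()
```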
4. Sharing of data and multiuser transaction
processing:
- Allowing a set of concurrent users to retrieve and
to update the database.
- Concurrency control within the DBMS
guarantees that each transaction is correctly
executed or completely aborted.
- OLTP (Online Transaction Processing) is a
major part of database applications.
Ex: allotment of seats in a reservation system.
 A fundamental role of multiuser DBMS software is to
ensure that concurrent transactions operate
correctly and efficiently.
 A transaction is an executing program or process
that includes one or more database accesses, such
as reading or updating of database records.
 Each transaction is supposed to execute a logically
correct database access if executed in its entirety
without interference from other transactions.
 The DBMS must enforce several transaction
properties. The isolation property ensures that
each transaction appears to execute in isolation
from other transactions, even though hundreds of
transactions may be executing concurrently. The
atomicity property ensures that either all the
database operations in a transaction are executed
or none are.
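The atomicity property can be illustrated with a funds transfer in SQLite. This is a sketch under assumptions: the account table, the balances and the transfer function are invented for the example, and the CHECK constraint is used only to force a failure halfway through the second transfer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, source, target, amount):
    """Run both updates as one transaction: all or nothing."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False

first_ok = transfer(conn, 1, 2, 30)    # both updates applied
second_ok = transfer(conn, 1, 2, 500)  # debit violates the CHECK; credit is undone
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
print(first_ok, second_ok, balances)
conn.close()
```

In the failed transfer the credit to the target account had already been executed, yet it does not survive, which is the "none are" half of the property.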
Characteristics of DBMS summary
 Self-describing nature of a database system
 metadata
 Insulation between programs and data and data
abstraction
 Program-data independence
 Program-operation independence
 Data abstraction
 Support of multiple views of the data
 Views
 Sharing of data and multi-user transaction processing
 Concurrency control software
 OLTP-Online transaction processing
 Transaction – process that includes one or more db
accesses
Database Users
Users may be divided into:
1. Those who actually use and control the
content (called “Actors on the Scene”) and
2. Those who enable the database to be
developed and the DBMS software to be
designed and implemented (called
“Workers Behind the Scene”).
Actors on the scene
1. Database administrators (DBAs):
- Responsible for authorizing access to the
database,
- for coordinating and monitoring its use,
- for acquiring software and hardware resources,
- for controlling its use and monitoring the efficiency of
operations.
- Accountable for problems such as breaches of
security or poor system response time.
- In large organizations, assisted by a staff.
2. Database designers:
 Responsible for identifying the data to be stored in
the database.
 Responsible for choosing appropriate structures and
constraints to represent and store the data.
 They must communicate with the database users,
understand their needs, and create a design
that meets their requirements.
 The final database must be capable of supporting
the requirements of all user groups.
3. End-users:
- They use the data for queries
- and reports, and
- some of them actually update the database
content.

Categories of end users:
 Casual end users
 Naïve or parametric end users
 Sophisticated end users
 Standalone end users
Categories of End-users
 Casual end-user:
 Access the database occasionally, when needed
 May need different information each time
 Use a sophisticated database query language to
specify their requests.
 Naïve or parametric end-user:
 Constantly query and update the database.
 Use standard types of queries and updates
called “canned transactions”, which are carefully
programmed and tested.
Ex: bank tellers or reservation clerks.
 Sophisticated end-user:
 Includes business analysts, scientists, engineers, and
others thoroughly familiar with the system
capabilities.
 Many use tools in the form of software packages
that work closely with the stored database.
 Stand-alone end-user:
 Maintain personal databases
 Use ready-made program packages that provide
easy-to-use menu-based or graphics-based
interfaces.
4. Application programmers and system analysts
(software engineers)
 System analysts:
Determine the requirements of end users (naïve or
parametric).
Develop specifications for canned transactions
that meet these requirements.
 Application programmers:
Implement the specifications as programs.
Then they test, debug, document and maintain
these canned transactions.
Software engineers should be familiar with the capabilities
provided by the DBMS to accomplish their tasks.
Workers behind the Scene
The people who are associated with the design,
development and operation of the DBMS software and
system environment.
These persons are typically not interested in the database
content itself.

Categories:
1. DBMS system designers and implementers
2. Tool developers
3. Operators and Maintenance personnel
1. DBMS system designers and implementers:
 Design and implement the DBMS modules and
interfaces as a software package.
 Modules: for implementing the catalog,
processing the query language, processing the
interface, controlling concurrency, and handling
data recovery and security.
 The DBMS must interface with other system
software such as the operating system and compilers for
various programming languages.
2. Tool developers: design and implement
tools.
Tools: software packages that facilitate database
modeling and design, database system design, and
improved performance.
- Optional packages that are often
purchased separately.
- Include packages for database design,
performance monitoring, graphical interfaces, simulation,
etc.
3. Operators and maintenance personnel
(system administration personnel)
- Are responsible for the actual running and
maintenance of the hardware and software
environment for the database system.
Drawbacks of using file systems to store data:
• Data redundancy and inconsistency
- Multiple file formats, duplication of
information in different files
• Difficulty in accessing data
- Need to write a new program to carry
out each new task
• Data isolation: multiple files and
formats
 Integrity problems
 Integrity constraints (e.g. account balance
> 0) become part of program code
 Hard to add new constraints or change
existing ones
 Atomicity of updates
 Failures may leave the database in an
inconsistent state with partial updates
carried out
 E.g. transfer of funds from one account to
another should either complete or not
happen at all
Drawbacks of using file systems contd..

 Concurrent access by multiple users
 Concurrent access needed for performance
 Uncontrolled concurrent accesses can lead to
inconsistencies
 E.g. two people reading a balance and
updating it at the same time
 Security problems
Database systems offer solutions to all the
above problems
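The balance example above is the classic lost update. The sketch below replays one possible interleaving by hand in plain Python, with no real concurrency; the amounts and the two clerks are assumptions.

```python
# Uncontrolled interleaving: both clerks read the balance before either writes.
stored_balance = 100
clerk_a_read = stored_balance
clerk_b_read = stored_balance
stored_balance = clerk_a_read - 30   # clerk A records a withdrawal of 30
stored_balance = clerk_b_read - 20   # clerk B overwrites A's update
uncontrolled_result = stored_balance

# Controlled (serial) execution: each read sees the previous write.
stored_balance = 100
stored_balance = stored_balance - 30
stored_balance = stored_balance - 20
serial_result = stored_balance

print(uncontrolled_result, serial_result)
```

Concurrency control in a DBMS aims to make concurrent executions behave like the second, serial case.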
DBMS (Database Management System)
 A database management system is software
used to create, maintain and manage databases.
RDBMS (Relational Database Management
System)
 A relational database management system is a
database management system used to manage
relational databases.
 A relational database is one where tables of
data can have relationships based on primary
and foreign keys.
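A small sketch of a relationship based on primary and foreign keys, using SQLite from Python. The department and course tables are assumed examples. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch for enforcement

conn.execute(
    "CREATE TABLE department (dept_code TEXT PRIMARY KEY, dept_name TEXT)"
)
conn.execute(
    "CREATE TABLE course ("
    " course_number TEXT PRIMARY KEY,"
    " dept_code TEXT NOT NULL REFERENCES department (dept_code))"
)

conn.execute("INSERT INTO department VALUES ('CS', 'Computer Science')")
conn.execute("INSERT INTO course VALUES ('CS1310', 'CS')")  # parent row exists

try:
    # No department 'MATH' exists, so this row has no parent.
    conn.execute("INSERT INTO course VALUES ('MATH2410', 'MATH')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
conn.close()
```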
Advantages of Using the Database Approach

 1. Controlling redundancy:
- Solves the problems of
- multiple entry of the same data
- wasted storage space
- inconsistent data
 2. Restricting unauthorized access:
- Users or user groups are given account numbers
protected by passwords.
- The DBMS should provide a security and authorization
subsystem for the DBA to create accounts and specify account
restrictions.
 3. Providing persistent storage for program
objects:
 This is a main reason for object-oriented database systems.
- Programming languages have complex data
structures, such as record types in Pascal or
class definitions in C++ or Java.
 The values of program variables are discarded
once a program terminates, unless the
programmer explicitly stores them in permanent
files.
 This often involves converting these complex
structures into a format suitable for file storage.
- Object-oriented database systems are compatible with programming
languages such as C++ and Java.
- The DBMS software automatically performs any necessary
conversions.
- Hence, a complex object in C++ can be stored
permanently in an OODBMS.
- Traditional database systems often suffered from the
impedance-mismatch problem,
since the data structures provided by the
DBMS were incompatible with the programming
language’s data structures.
- Object-oriented database systems typically offer data structure compatibility
with one or more object-oriented programming languages.
4. Providing Storage structures for efficient query
processing
 Database systems must provide capabilities for efficiently
executing queries and updates.
 They must provide data structures to speed up disk
search for the desired records.
 Indexes: typically based on tree data structures
 Buffering module: maintains parts of the database in main
memory buffers.
 The Query processing and optimization module of
the DBMS is responsible for choosing an efficient
query execution plan for each query based on the
existing storage structure.
5. Providing backup and recovery services
 Must provide facilities for recovering from hardware and
software failures.
 The backup and recovery subsystem of the DBMS is
responsible for recovery.

6. Providing multiple user interfaces
• A DBMS should provide a variety of user interfaces, because
many types of users use a database.
• These include query languages for casual users,
programming language interfaces for application
programmers, forms and command codes for parametric
users, etc.
• Menu-based and form-style interfaces together are commonly known as GUIs; web-based GUIs are also common.
7. Representing complex relationships among data
- A db may include numerous varieties of data that are
interrelated in many ways.
- DBMS must have the capability to represent a variety of
complex relationships among the data, to define new
relationships as they arise, and to retrieve and update
related data easily and efficiently. (Ex: Fig 1)
8. Enforcing integrity constraints
- A DBMS should provide capabilities for defining and enforcing
constraints.
Ex: specifying a data type for each data item, or a constraint
requiring unique data item values
- It is the responsibility of database designers to identify integrity
constraints during database design.
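Declaring constraints in the schema, rather than in program code, might look like the following SQLite sketch. The student table, the email column and the allowed class_year range are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " student_number INTEGER PRIMARY KEY,"
    " email TEXT NOT NULL UNIQUE,"                                 # uniqueness
    " class_year INTEGER NOT NULL CHECK (class_year BETWEEN 1 AND 5))"  # value range
)
conn.execute("INSERT INTO student VALUES (17, 'smith@example.edu', 1)")
conn.commit()

def try_insert(row):
    """Attempt an insert; the DBMS, not this program, decides validity."""
    try:
        with conn:
            conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

outcomes = [
    try_insert((8, "smith@example.edu", 2)),   # duplicate email
    try_insert((9, "brown@example.edu", 9)),   # class_year out of range
    try_insert((8, "brown@example.edu", 2)),   # satisfies every constraint
]
stored_rows = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(outcomes, stored_rows)
conn.close()
```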
9. Permitting inferencing and actions using rules:
 Some database systems provide capabilities for defining deduction
rules for inferencing new information from the stored database
facts. Such systems are called deductive database systems.
 Ex: Eligibility criteria
 Triggers: A trigger is a form of rule activated by updates to a
table, which results in performing some additional
operations on other tables, sending messages, and
so on.
 Stored procedures: Procedures used to enforce rules.
They become a part of the overall database definition and are
invoked appropriately when certain conditions are met.
 Active database systems: Provide more powerful functionality.
They provide active rules that can automatically initiate
actions when certain events and conditions occur.
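A trigger of the kind described above can be sketched in SQLite as follows. The grade_report and grade_audit tables and the trigger name are hypothetical; the trigger reacts to an update of one table by inserting a row into another.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE grade_report (student_number INTEGER, course_number TEXT, grade TEXT);
CREATE TABLE grade_audit (
    student_number INTEGER, course_number TEXT, old_grade TEXT, new_grade TEXT
);

-- Rule: every change of a grade is logged in a second table.
CREATE TRIGGER log_grade_change
AFTER UPDATE OF grade ON grade_report
BEGIN
    INSERT INTO grade_audit
    VALUES (OLD.student_number, OLD.course_number, OLD.grade, NEW.grade);
END;

INSERT INTO grade_report VALUES (17, 'CS1310', 'C');
UPDATE grade_report SET grade = 'B'
    WHERE student_number = 17 AND course_number = 'CS1310';
""")

audit_rows = conn.execute("SELECT * FROM grade_audit").fetchall()
print(audit_rows)
conn.close()
```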
When not to use DBMS:

 Simple, well-defined database applications that are not
expected to change
 No multiple-user access to data
 The budget for the initial investment in hardware, software
and training is small (or the database is small)
Definition of schema, data model and instances
 Data abstraction : Refers to the suppression of
details of data organization and storage and the
highlighting of the essential features for an
improved understanding of data.
 Data Model:
 A set of concepts to describe
 the structure of a database,
 the operations for manipulating these structures,
 and certain constraints that the database should
obey.
Types of Data Models:
1. High-level or conceptual data models.
2. Low-level or physical data models.
3. Representational or implementation data models.

Schemas
 Database Schema
 The description of a database.
 Includes descriptions of the database structure,
data types, and the constraints on the database.
 Schema Diagram
 An illustrative display of (most aspects of) a
database schema.
 Schema Construct
 A component of the schema or an object within the
schema, e.g., STUDENT, COURSE.
Example of a Database Schema
Schemas contd…
 Database State
 The actual data stored in a database at a
particular moment in time.
 This includes the collection of all the data in the
database.
 Also called database instance (or occurrence
or snapshot).
 The term instance is also applied to
individual database components, e.g. record
instance, table instance, entity instance
Example of a database state
 Distinction between database schema &
database state
 The database schema changes very infrequently.
 The database state changes every time the
database is updated.
• The DBMS is partly responsible for ensuring that
every state of the database is a valid state, that
is, a state that satisfies the structure and
constraints specified in the schema.
• Hence, specifying a correct schema to the DBMS
is extremely important and the schema must be
designed with utmost care.
 The DBMS stores the descriptions of the
schema constructs and constraints (also called
the meta-data) in the DBMS catalog so that
DBMS software can refer to the schema
whenever it needs to.
 Schema is also called intension.
 State is also called extension.


Three-Schema Architecture
 Proposed to support DBMS characteristics of:
 Program-data independence.
 Support of multiple views of the data.
 Use of catalog to store the db description
 Its goal is to separate the user applications and
the physical database.
 Not explicitly used in commercial DBMS products,
but has been useful in explaining database
system organization
The Three-schema architecture
 Defines DBMS schemas at three levels:
 Internal schema at the internal level to
describe physical storage structures.
 Lowest level of abstraction.
 Typically uses a physical data model.
 Describes the complete details of data
storage and access paths for the database.
 Conceptual schema at the conceptual level
to describe the structure and constraints for
the whole database for a community of users.
 Hides the details of physical storage
structures. Concentrates on describing
entities, data types, relationships, user
operations and constraints.
 Uses an implementation/representational
data model to describe the conceptual schema.
 External schemas at the external level (or
view level) to describe the various user views.
 Each external schema describes the part
of the database that a particular user group is
interested in and hides the rest of the database
from that user group.
 Uses a representational model to implement
the external schema.
 The three-schema architecture is a convenient
tool with which the user can visualize the
schema levels in a database system.
 Most DBMSs do not separate the three levels
completely and explicitly, but support the three-schema
architecture to some extent.
 In most DBMSs external schemas are specified in
the same data model that describes the conceptual-level
information. Ex: Oracle-SQL
 Some DBMSs allow different data models for the external
and conceptual levels
 Ex: Universal Data Base, IBM's DBMS
 Uses the relational model for the conceptual schema
 May use an object-oriented model for external
schemas
Mappings: The processes of transforming
requests and results between levels are called
mappings.
 Programs refer to an external schema, and are
mapped by the DBMS to the internal schema for
execution.
 Data extracted from the internal DBMS level is
reformatted to match the user’s external view (e.g.
formatting the results of an SQL query for display
in a Web page).
Data Independence
 Data independence can be defined as the
capacity to change the schema at one level
without changing the schema at the next higher
level.
 There are two types of data independence.
They are:
1. Logical data independence.
2. Physical data independence.
Logical data independence:
 Is the capacity to change the conceptual
schema without having to change the external
schemas or application programs.
 Only the view definitions and mappings need to be
changed in a DBMS that supports logical data
independence.
 After the conceptual schema undergoes a
logical reorganization, application programs that
reference the external schema constructs must
work as before.
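Logical data independence can be sketched with an SQLite view standing in for an external schema. Everything named here (student_contact, person, phone) is an assumption: the conceptual schema is reorganized from one table into two, only the view definition is rewritten, and the application query is left unchanged.

```python
import sqlite3

def application_query(conn):
    """Application program: refers only to the external view student_contact."""
    return conn.execute(
        "SELECT name, phone FROM student_contact ORDER BY name"
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT, phone TEXT);
INSERT INTO student VALUES (17, 'Smith', '555-0101');
CREATE VIEW student_contact AS SELECT name, phone FROM student;
""")
result_before = application_query(conn)

# Logical reorganization of the conceptual schema: split one table into two,
# then redefine the view (the mapping) on top of the new tables.
conn.executescript("""
DROP VIEW student_contact;
CREATE TABLE person (student_number INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE phone (student_number INTEGER PRIMARY KEY, phone TEXT);
INSERT INTO person SELECT student_number, name FROM student;
INSERT INTO phone SELECT student_number, phone FROM student;
DROP TABLE student;
CREATE VIEW student_contact AS
    SELECT p.name, ph.phone
    FROM person p JOIN phone ph ON ph.student_number = p.student_number;
""")
result_after = application_query(conn)

print(result_before, result_after)
conn.close()
```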
Physical data independence:
 Is the capacity to change the internal schema
without changing the conceptual schema.
 Hence, the external schemas need not be
changed either.
 Modifications at the physical level are
occasionally necessary to improve
performance.
 Ex: creating additional access structures to
improve the performance of retrieval or update.
 If the same data as before remains in the
database, we should not have to change the
conceptual schema.
 For example, providing an access path to
improve retrieval speed of section records
(Figure 1.2) by semester and year should not
require a query such as "list all sections offered
in fall 2008" to be changed, although the query
would be executed more efficiently by the DBMS
by utilizing the new access path.
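The section example above can be tried out in SQLite. This is a sketch with assumed table contents and an assumed index name, idx_section_term; the query text is identical before and after the index is created, and only the plan reported by EXPLAIN QUERY PLAN changes (its exact wording varies between SQLite versions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE section ("
    " section_id INTEGER PRIMARY KEY,"
    " course_number TEXT, semester TEXT, year INTEGER)"
)
conn.executemany(
    "INSERT INTO section VALUES (?, ?, ?, ?)",
    [
        (85, "MATH2410", "Fall", 2007),
        (112, "MATH2410", "Fall", 2008),
        (119, "CS1310", "Fall", 2008),
    ],
)

# "List all sections offered in fall 2008"; never modified below.
QUERY = "SELECT section_id FROM section WHERE semester = 'Fall' AND year = 2008"

def run_and_plan(conn):
    rows = sorted(row[0] for row in conn.execute(QUERY))
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]
    return rows, plan

rows_before, plan_before = run_and_plan(conn)

# Internal-schema change: add an access path on (semester, year).
conn.execute("CREATE INDEX idx_section_term ON section (semester, year)")
rows_after, plan_after = run_and_plan(conn)

print(rows_before, plan_before)
print(rows_after, plan_after)
conn.close()
```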
 Logical data independence is more difficult
to achieve than physical data independence,
because it allows structural and constraint
changes without affecting application programs.
 When a schema at a lower level is changed,
only the mappings between this schema and
higher-level schemas need to be changed in a
DBMS that fully supports data independence.
DBMS Languages

 Types of DBMS languages:
 Data Definition Language (DDL)
 Data Manipulation Language (DML)
 Data Control Language (DCL)
Data Definition Language (DDL)
 Used by the DBA and database designers to specify
the conceptual schema of a database.
 In many DBMSs, the DDL is also used to define
internal and external schemas (views).
 In some DBMSs, a separate storage definition
language (SDL) and view definition language
(VDL) are used to define internal and external
schemas.
 Ex: Create, Alter, Drop, Truncate


Data Manipulation Language (DML)
 Used to specify database retrievals and updates.
 DML commands (data sublanguage) can be
embedded in a general-purpose programming
language (host language), such as COBOL, C, C++,
or Java.
 Alternatively, stand-alone DML commands can be
applied directly (called a query language).
 Ex: Insert, Delete, Select, Update
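As a sketch of DML embedded in a host language, the example below uses Python as the host (the slide lists COBOL, C, C++ and Java) with SQLite as the DBMS. The course table and the helper functions are assumptions; each helper wraps one of the four DML commands and passes host-language values as parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# DDL: define the structure once.
conn.execute(
    "CREATE TABLE course (course_number TEXT PRIMARY KEY, credit_hours INTEGER)"
)

# DML embedded in the host language; "?" marks a host-variable parameter.
def add_course(number, hours):
    conn.execute("INSERT INTO course VALUES (?, ?)", (number, hours))

def set_credit_hours(number, hours):
    conn.execute(
        "UPDATE course SET credit_hours = ? WHERE course_number = ?",
        (hours, number),
    )

def remove_course(number):
    conn.execute("DELETE FROM course WHERE course_number = ?", (number,))

def list_courses():
    return conn.execute(
        "SELECT course_number, credit_hours FROM course ORDER BY course_number"
    ).fetchall()

add_course("CS1310", 4)
add_course("MATH2410", 3)
set_credit_hours("MATH2410", 4)
remove_course("CS1310")
courses = list_courses()
print(courses)
conn.close()
```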


Data Control Language (DCL)
 The DCL is used for controlling
access to tables and hence securing the
database.
 DCL is used to grant certain privileges to a
particular user. Privileges are access rights
that can be allocated.
 The various privileges that can be granted or
revoked are: Select, Insert, Delete, Update,
References, Execute, All.
Some transaction control commands (often listed with DCL; the core DCL commands are GRANT and REVOKE)
 COMMIT - save work done
 SAVEPOINT - identify a point in a transaction to
which you can later roll back
 ROLLBACK - restore the database to its state as of
the last COMMIT
 SET TRANSACTION - change transaction
options, such as which rollback segment to use
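COMMIT, SAVEPOINT and ROLLBACK can be tried in SQLite, as in the sketch below. The seat table and the savepoint name are assumptions. The connection is opened with isolation_level=None so that the transaction statements are issued explicitly. SQLite has no SET TRANSACTION statement, so that command is not shown.

```python
import sqlite3

# isolation_level=None: the sqlite3 module does not open transactions itself.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE seat (seat_no INTEGER PRIMARY KEY, passenger TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO seat VALUES (1, 'Smith')")
conn.execute("SAVEPOINT after_first_seat")       # mark a point to return to
conn.execute("INSERT INTO seat VALUES (2, 'Brown')")
conn.execute("ROLLBACK TO SAVEPOINT after_first_seat")  # undo only seat 2
conn.execute("COMMIT")                           # save the remaining work

conn.execute("BEGIN")
conn.execute("DELETE FROM seat")
conn.execute("ROLLBACK")                         # back to the last COMMIT

seats = conn.execute("SELECT seat_no, passenger FROM seat").fetchall()
print(seats)
conn.close()
```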
DBMS Interfaces
User-friendly interfaces provided by a DBMS may include
the following:
Menu-based interfaces for web clients or browsing:
 Present the user with lists of options (called menus)
that lead the user through the formulation of a
request.
 No need to memorize the specific commands and
syntax of a query language.
 Pull-down menus are a very popular technique in
web-based user interfaces.
 Used in browsing interfaces, which allow a user to
look through the contents of a database in an exploratory and
unstructured manner.
Form-based interfaces:
 Displays a form to each user.
 User can fill out all of the form entries to insert
data.
 Usually designed and programmed for naive
users as interfaces to canned transactions.
 Many DBMSs have form specification
languages, which are special languages that
help programmers specify such forms.
 Ex: SQL forms, Oracle forms, etc.,
Graphical User Interfaces (GUI):
 Display a schema to the user in diagrammatic
form. The user can then specify a query by
manipulating the diagram.
 In many cases GUIs utilize both menus and forms.
 Most GUIs use a pointing device, such as a
mouse, to pick certain parts of the displayed
schema diagram.
Natural language interface:
 Accepts requests written in English or some other
language and attempts to understand them.
 Usually has its own schema, which is similar to
the database conceptual schema, as well as a
dictionary of important words.
 Refers to the words in its schema, as well as to
the set of standard words in its dictionary, to
interpret the request.
 The capabilities of natural language interfaces
have not advanced rapidly.
(Ex: search engines that accept strings of natural
language)
Speech input & output:

 Uses speech as an input query and speech as an
answer to a question or result of a request.
Ex: inquiries for telephone directory, flight
arrival/departure, bank account information
 Speech input is detected using a library of
predefined words and used to set up the
parameters that are supplied to the queries. For
output, a similar conversion from text or numbers
into speech takes place.
Interfaces for parametric users:

 System analysts and programmers design and
implement a special interface for each class of
naive users. Ex: bank tellers
 Usually a small set of abbreviated commands is
included, with the goal of minimizing the number of
keystrokes required for each request.
 Ex: Function keys in a terminal can be
programmed to initiate various commands.
Interfaces for the DBA:

 Most database systems contain privileged
commands that can be used only by the DBA’s staff.
Ex: Commands for creating accounts,
setting system parameters,
granting account authorization,
changing a schema and
reorganizing the storage structure of a database.
The database system environment:
 A DBMS is a complex software system. In this
section we discuss the types of software components that
constitute a DBMS and the types of computer system
software with which the DBMS interacts.
DBMS Component Modules
The following figure illustrates the typical DBMS
components in a simplified form.
The figure is divided into two halves.
Top half: refers to the various users of the database and their interfaces
Lower half: components responsible for storage of
data and processing of transactions.
