Information Systems Application - Databases: Course Introduction
© 2009, CMU-ISR 2
Course Structure
Foundations
• Introduction to DBMS
• Entity Relationship Conceptual Design
• Relational Model and SQL Data Definition Language
• Relational Algebra and Calculus
• SQL Data Manipulation Language
Applications I
• Database Application Development
Systems
• Overview of Storage and Indexing
• Overview of Query Evaluation
• Overview of Transaction Management
Course Structure cont’d
Applications II
• Schema Refinement, FDs, Normalization
• Physical DB Design and Tuning
• Security and Authorization
Advanced Topics
• Parallel and Distributed DBs
• Data Warehousing and Decision Support
• Data Mining
• Information Retrieval and XML Data
Management
What is a Database Management
System?
First, what is a database?
• A collection of data
• Models the activities of one or more related
organizations
• Composed of:
Entities
Relationships between entities
Files vs. DBMS
Files
• Data read sequentially
• Large amounts of data have to move in and out
of main memory
• Finding data would be difficult
• Data consistency would be a problem
• Security difficult for users needing access to
only a subset of the data
• Recovery very difficult
Files vs. DBMS cont’d
DBMS provides for:
• Storage for large amounts of data that can be
quickly accessed
• Efficient management of data in and out of
main memory
• Crash recovery
• Security and Access control
• Inconsistency Management
• Special language to access and update data
Why Use a DBMS?
Data Independence → Hides the details of data
representation and storage from applications and reduces
development time
Efficient Data Access → The DBMS is optimized for data storage
and retrieval
Data Integrity and Security → The DBMS can enforce data
integrity constraints and provide sophisticated access
control mechanisms
Data Administration → Trained DB administrators can
optimize data retrieval performance
Concurrent Access and Crash Recovery → The DBMS lets multiple
users access the data simultaneously and protects against
system failures
Data Models
Data Model is a collection of concepts for
describing data
A Schema is a description of a particular
collection of data, using a given data
model
The Relational Model of Data, in which a
relation is similar to a set of records, is the
most widely used model in modern DBMSs
The Semantic Model is a more abstract,
high-level data model that makes it easier
for users to think about the data
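As a minimal sketch of the difference between a data model and a schema, the snippet below uses Python's built-in sqlite3 module (a relational DBMS) to declare one relation and query it. The Students table, its columns, and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Students (
        sid   INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        gpa   REAL
    )
    """
)
conn.executemany(
    "INSERT INTO Students (sid, name, gpa) VALUES (?, ?, ?)",
    [(1, "Ada", 3.9), (2, "Grace", 3.7)],
)

# The schema is itself something the DBMS can report back.
columns = [row[1] for row in conn.execute("PRAGMA table_info(Students)")]

# A relation behaves like a set of records that can be filtered declaratively.
rows = conn.execute("SELECT name FROM Students WHERE gpa > 3.8").fetchall()

print(columns)
print(rows)
```

Here the relational model supplies the concepts (relations, attributes, keys), while the CREATE TABLE statement is one particular schema written in that model.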
Models In Use
Relational Model: DB2 (IBM), Informix,
Oracle, Sybase, MS Access, Paradox,
Tandem, etc.
Hierarchical Model: IMS (IBM)
Network Model: IDS and IDMS
Object-oriented Model: Objectstore and
Versant
Object-relational Model: DBMS Products
from IBM, Oracle, Versant, et al.
Levels of Abstraction
Data Definition Language (DDL) is used to
define the external and conceptual
schemas
• Most widely used DDL is SQL
Conceptual Schema
Physical Schema
More on Abstraction
External Schemas or Views
• A collection of one or more views and relations
from the conceptual schema
• Similar to a relation, but records in views are
not stored in the database
• Computed using the definition for the view
Physical Schema
• How the DBMS actually stores relations: as
unordered files
• Auxiliary data structures, called indexes, which
speed up data retrieval
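The two ideas above, a view that is computed rather than stored and an index that belongs to the physical schema, can be sketched with Python's built-in sqlite3 module. The Students table, the HonorRoll view, and the index name are hypothetical names used only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Conceptual schema: one stored relation (hypothetical).
    CREATE TABLE Students (sid INTEGER PRIMARY KEY, name TEXT, gpa REAL);
    INSERT INTO Students VALUES (1, 'Ada', 3.9), (2, 'Grace', 3.2);

    -- External schema: a view defined over the stored relation.
    CREATE VIEW HonorRoll AS
        SELECT sid, name FROM Students WHERE gpa >= 3.5;

    -- Physical schema: an auxiliary index to speed up lookups on gpa.
    CREATE INDEX idx_students_gpa ON Students (gpa);
    """
)

before = conn.execute("SELECT name FROM HonorRoll ORDER BY sid").fetchall()

# Changing the base relation changes the view, because the view's records
# are recomputed from its definition instead of being stored.
conn.execute("UPDATE Students SET gpa = 3.8 WHERE sid = 2")
after = conn.execute("SELECT name FROM HonorRoll ORDER BY sid").fetchall()

index_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
]

print(before, after, index_names)
```

Dropping the index would leave every query result unchanged. Only retrieval speed depends on it, which is the point of keeping physical details out of the conceptual schema.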
Conceptual Schema
Also called the logical schema
Consists of two parts:
• Entities
• Relationships
Concurrency Control
Concurrency is one of the greatest benefits
of a DBMS
DBMS performance can actually improve
under concurrent access
• Disk accesses are frequent and relatively slow,
so the CPU is used more efficiently if it works on
other users’ requests while a disk access is in progress
• Interleaving actions from different users can
lead to inconsistency, so the DBMS must control it
• Users can then feel that they are using a single-user
system
DBMS Transactions
A Transaction is an atomic sequence of
database actions (reads/writes)
The database must always be in a consistent
state after a transaction completes
Users can specify integrity constraints on
data → the DBMS will enforce these
constraints
Since the DBMS doesn’t ‘understand’ the
data, ultimate database consistency is the
user’s responsibility
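A small sketch of these points, using Python's built-in sqlite3 module: a user-declared integrity constraint is enforced by the DBMS, and a transaction that violates it leaves no partial effects. The Accounts table, the transfer function, and the amounts are hypothetical and serve only as an example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK clause is a user-specified integrity constraint.
conn.execute(
    "CREATE TABLE Accounts ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE Accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE Accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return conn.execute("SELECT balance FROM Accounts ORDER BY id").fetchall()


ok_small = transfer(conn, 1, 2, 30)    # within the available balance
after_small = balances(conn)
ok_large = transfer(conn, 1, 2, 500)   # would make account 1 negative
after_large = balances(conn)

print(ok_small, after_small, ok_large, after_large)
```

In the failed transfer the credit to account 2 has already been applied when the debit violates the constraint, and the rollback undoes it. The DBMS can enforce balance >= 0, but it cannot know whether 30 was the right amount to move; that part of consistency remains the user's job.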
Concurrent Transactions
DBMS ensures that execution of {T1…Tn}
is equivalent to some serial execution of
T1’…Tn’
• Before reading/writing an object, a
transaction requests a lock on the object.
• All locks are released at the end of the
transaction
• If Ti holds the lock on an object it is writing and Tj
needs that object, then Tj must wait until Ti completes
its transaction
• Deadlock can occur when interleaved
transactions are each waiting for the other
to finish
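The locking rules above can be illustrated with a toy lock table in plain Python. This is a sketch under simplifying assumptions (exclusive locks only, no real threads, each transaction waits on at most one other), not how any particular DBMS implements locking. LockManager and its methods are hypothetical names.

```python
class LockManager:
    """Toy exclusive-lock table with a waits-for graph (illustrative only)."""

    def __init__(self):
        self.owner = {}      # object -> transaction holding its lock
        self.waits_for = {}  # transaction -> transaction it is blocked on

    def request(self, txn, obj):
        """Return True if txn gets the lock, False if it must wait."""
        holder = self.owner.get(obj)
        if holder is None or holder == txn:
            self.owner[obj] = txn
            return True
        self.waits_for[txn] = holder
        return False

    def release_all(self, txn):
        """Release every lock txn holds, as happens at end of transaction."""
        self.owner = {o: t for o, t in self.owner.items() if t != txn}
        self.waits_for = {
            w: h for w, h in self.waits_for.items() if w != txn and h != txn
        }

    def deadlocked(self, txn):
        """Follow waits-for edges from txn; returning to txn means a cycle."""
        seen = set()
        current = self.waits_for.get(txn)
        while current is not None and current not in seen:
            if current == txn:
                return True
            seen.add(current)
            current = self.waits_for.get(current)
        return False


lm = LockManager()
got_a = lm.request("T1", "A")            # T1 locks A
got_b = lm.request("T2", "B")            # T2 locks B
t1_waits = not lm.request("T1", "B")     # T1 must wait for T2
t2_waits = not lm.request("T2", "A")     # T2 must wait for T1
is_deadlock = lm.deadlocked("T1")        # T1 -> T2 -> T1 is a cycle

lm.release_all("T2")                     # abort T2 to break the cycle
t1_retry = lm.request("T1", "B")         # T1 can now proceed
still_deadlocked = lm.deadlocked("T1")

print(is_deadlock, t1_retry, still_deadlocked)
```

Aborting one of the transactions in the cycle and releasing its locks is one common way to resolve a deadlock; the choice of victim here is arbitrary.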
Incomplete Transactions
Atomicity properties ensure that either the
transaction completes in its entirety or
does not complete any part of the
transaction
Keep a log of all actions carried out by the
DBMS
• Before a change is made to the database, the
corresponding log entry is forced to a safe area
(Write-Ahead Log, or WAL, protocol)
• After a crash, the effects of a partially executed
transaction are undone using the log
The Log
The following actions are recorded in the log:
• Ti writes an object: the old value and the new value
• Ti commits/aborts: a log record indicating this action
• Logs must be forced to disk
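The log records listed above are enough to undo an incomplete transaction. The following is a toy, in-memory sketch of that idea in plain Python: a dictionary stands in for the database and a list stands in for the log on disk. It covers only the undo step; all names (db, log, write, commit, recover) are hypothetical.

```python
db = {"A": 100, "B": 50}   # stands in for the on-disk database
log = []                   # stands in for the log on stable storage


def write(txn, key, new_value):
    """Log the old and new value first, then change the database (WAL)."""
    log.append(("write", txn, key, db[key], new_value))
    db[key] = new_value


def commit(txn):
    """Record that txn finished successfully."""
    log.append(("commit", txn))


def recover():
    """Undo writes of transactions with no commit record, newest first."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _, key, old_value, _ = rec
            db[key] = old_value


write("T1", "A", 70)
write("T1", "B", 80)
commit("T1")
write("T2", "A", 0)        # T2 never commits: assume the crash happens here
recover()

print(db)
```

Because each write record carries the old value, walking the log backwards restores the state left by the last committed transaction. The committed work of T1 survives, and the partial work of T2 disappears.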
Database Architecture
[Figure: DBMS architecture diagram; surviving labels: SQL Commands, DATABASE]
Summary
A DBMS is used to maintain and query large
datasets
Benefits include recovery from system
crashes, concurrent access, quick
application development, data integrity
and security
Levels of abstraction give data
independence
DBMSs typically have layered architectures
It all starts with the data!