
Chapter 1

Introduction

Dr. Vijay Kumar
Assistant Professor
NIT, Hamirpur
Slide 1- 1
Outline

• Types of Databases and Database Applications
• Basic Definitions
• Typical DBMS Functionality
• Example of a Database (UNIVERSITY)
• Main Characteristics of the Database Approach
• Database Users
• Advantages of Using the Database Approach
• When Not to Use Databases

Slide 1- 2
Types of Databases and Database
Applications
• Traditional Applications:
• Numeric and Textual Databases
• More Recent Applications:
• Multimedia Databases
• Geographic Information Systems (GIS)
• Data Warehouses
• Real-time and Active Databases
• Many other applications

Slide 1- 3
Basic Definitions

• Database:
• A collection of related data.
• Data:
• Known facts that can be recorded and have an implicit meaning.
• Mini-world:
• Some part of the real world about which data is stored in a database.
For example, student grades and transcripts at a university.
• Database Management System (DBMS):
• A software package/ system to facilitate the creation and
maintenance of a computerized database.
• Database System:
• The DBMS software together with the data itself. Sometimes, the
applications are also included.

Slide 1- 4
Simplified database system environment

Slide 1- 5
Typical DBMS Functionality

• Define a particular database in terms of its data types,
structures, and constraints
• Construct or Load the initial database contents on a
secondary storage medium
• Manipulate the database:
• Retrieval: Querying, generating reports
• Modification: Insertions, deletions and updates to its content
• Accessing the database through Web applications
• Processing and Sharing by a set of concurrent users and
application programs – yet, keeping all data valid and
consistent
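
The define, construct, and manipulate steps above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not part of the original slides: the STUDENT table layout, the column names, and the sample rows are assumptions.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# Define: data types, structure, and constraints (hypothetical table layout).
conn.execute(
    """
    CREATE TABLE student (
        student_number INTEGER PRIMARY KEY,
        name           TEXT NOT NULL,
        class_year     INTEGER CHECK (class_year BETWEEN 1 AND 5)
    )
    """
)

# Construct: load the initial contents (made-up sample rows).
conn.executemany(
    "INSERT INTO student (student_number, name, class_year) VALUES (?, ?, ?)",
    [(17, "Smith", 1), (8, "Brown", 2)],
)

# Manipulate, retrieval: a query.
names = [row[0] for row in conn.execute("SELECT name FROM student ORDER BY name")]

# Manipulate, modification: an update.
conn.execute("UPDATE student SET class_year = 2 WHERE name = 'Smith'")
smith_year = conn.execute(
    "SELECT class_year FROM student WHERE name = 'Smith'"
).fetchone()[0]
conn.commit()
```

SQLite is used here only because it ships with Python; a server DBMS would follow the same three steps.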

Slide 1- 6
Typical DBMS Functionality

• Other features:
• Protection or Security measures to prevent
unauthorized access
• “Active” processing to take internal actions on
data
• Presentation and Visualization of data
• Maintaining the database and associated
programs over the lifetime of the database
application
• Called database, software, and system maintenance

Slide 1- 7
Example of a Database
(with a Conceptual Data Model)
•Mini-world for the example:
•Part of a UNIVERSITY environment.
•Some mini-world entities:
•STUDENTs
•COURSEs
•SECTIONs (of COURSEs)
•(academic) DEPARTMENTs
•INSTRUCTORs

Slide 1- 8
Example of a Database
(with a Conceptual Data Model)

•Some mini-world relationships:
•SECTIONs are of specific COURSEs
•STUDENTs take SECTIONs
•COURSEs have prerequisite COURSEs
•INSTRUCTORs teach SECTIONs
•COURSEs are offered by
DEPARTMENTs
•STUDENTs major in DEPARTMENTs
Slide 1- 9
Example of a simple database

Slide 1- 10
Main Characteristics of the Database
Approach
• Self-describing nature of a database system:
• A DBMS catalog stores the description of a particular
database (e.g. data structures, types, and constraints)
• The description is called meta-data.
• This allows the DBMS software to work with different
database applications.
• Insulation between programs and data:
• Called program-data independence.
• Allows changing data structures and storage organization
without having to change the DBMS access programs.
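
As a rough illustration of a self-describing database, the sketch below reads meta-data back from SQLite's own catalog. The COURSE table and its columns are assumed for the example; `sqlite_master` and `PRAGMA table_info` are SQLite-specific catalog facilities, and other DBMSs expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE course (
        course_number TEXT PRIMARY KEY,
        course_name   TEXT NOT NULL,
        credit_hours  INTEGER
    )
    """
)

# sqlite_master is SQLite's catalog: it stores the description of each table.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk). Keep just the name and type.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(course)")]
```

A generic program can use this meta-data to work with any table without knowing its structure in advance.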

Slide 1- 11
Example of a simplified database catalog

Slide 1- 12
Main Characteristics of the Database
Approach (continued)
• Data Abstraction:
• A data model is used to hide storage details and
present the users with a conceptual view of the
database.
• Programs refer to the data model constructs
rather than data storage details
• Support of multiple views of the data:
• Each user may see a different view of the
database, which describes only the data of
interest to that user.
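
Multiple views can be sketched with SQL views over one stored table. The GRADE_REPORT table, the view names, and the rows below are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE grade_report (student_number INTEGER, section_id INTEGER, grade TEXT)"
)
conn.executemany(
    "INSERT INTO grade_report VALUES (?, ?, ?)",
    [(17, 112, "B"), (17, 119, "C"), (8, 85, "A")],
)

# View for one student: only that student's sections and grades.
conn.execute(
    """
    CREATE VIEW transcript_17 AS
    SELECT section_id, grade FROM grade_report WHERE student_number = 17
    """
)

# View for a scheduler: enrollment counts per section, with no grades exposed.
conn.execute(
    """
    CREATE VIEW section_enrollment AS
    SELECT section_id, COUNT(*) AS enrolled FROM grade_report GROUP BY section_id
    """
)

transcript = conn.execute(
    "SELECT section_id, grade FROM transcript_17 ORDER BY section_id"
).fetchall()
enrollment = conn.execute(
    "SELECT section_id, enrolled FROM section_enrollment ORDER BY section_id"
).fetchall()
```

Both views are derived from the same stored data, so neither user sees a separate copy.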

Slide 1- 13
Main Characteristics of the Database
Approach (continued)
• Sharing of data and multi-user transaction processing:
• Allowing a set of concurrent users to retrieve from and to
update the database.
• Concurrency control within the DBMS guarantees that
each transaction is correctly executed or aborted
• Recovery subsystem ensures each completed transaction
has its effect permanently recorded in the database
• OLTP (Online Transaction Processing) is a major part of
database applications. This allows hundreds of concurrent
transactions to execute per second.
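
The "correctly executed or aborted" property can be sketched with a single-connection transfer. This is an assumed example (the ACCOUNT table and the `transfer` helper are hypothetical) and it does not demonstrate concurrent users or crash recovery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True if committed."""
    try:
        # 'with conn' commits on success and rolls back if an exception is raised.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit was rolled back too.
        return False


ok = transfer(conn, 1, 2, 30)       # fits within the balance, so it commits
failed = transfer(conn, 1, 2, 500)  # would overdraw account 1, so it aborts
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

After the aborted transfer, neither account shows a partial effect.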

Slide 1- 14
Database Users

• Users may be divided into
• Those who actually use and control the database
content, and those who design, develop and
maintain database applications (called “Actors on
the Scene”), and
• Those who design and develop the DBMS
software and related tools, and the computer
systems operators (called “Workers Behind the
Scene”).

Slide 1- 15
Database Users

• Actors on the scene
• Database administrators:
• Responsible for authorizing access to the database, for
coordinating and monitoring its use, acquiring software
and hardware resources, controlling its use and
monitoring efficiency of operations.
• Database Designers:
• Responsible for defining the content, the structure, the
constraints, and functions or transactions against the
database. They must communicate with the end-users
and understand their needs.

Slide 1- 16
Categories of End-users

• Actors on the scene (continued)
• End-users: They use the data for queries, reports and
some of them update the database content. End-users
can be categorized into:
• Casual: access database occasionally when needed
• Naïve or Parametric: they make up a large section of
the end-user population.
• They use previously well-defined functions in the
form of “canned transactions” against the database.
• Examples are bank-tellers or reservation clerks who
do this activity for an entire shift of operations.

Slide 1- 17
Categories of End-users (continued)

• Sophisticated:
• These include business analysts, scientists,
engineers, others thoroughly familiar with the
system capabilities.
• Many use tools in the form of software packages
that work closely with the stored database.
• Stand-alone:
• Mostly maintain personal databases using ready-to-
use packaged applications.
• An example is a tax program user that creates its
own internal database.
• Another example is a user that maintains an address
book

Slide 1- 18
Advantages of Using the Database Approach

• Controlling redundancy in data storage and in
development and maintenance efforts.
• Sharing of data among multiple users.
• Restricting unauthorized access to data.
• Providing persistent storage for program Objects
• In Object-oriented DBMSs
• Providing Storage Structures (e.g. indexes) for
efficient Query Processing
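
How an index changes query processing can be sketched by comparing SQLite's query plan before and after creating one. The table, index name, and data are assumptions, and the exact wording of the plan text varies between SQLite versions, so the sketch only checks whether the index is mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)", [(i, f"name{i}") for i in range(1000)]
)

query = "SELECT student_number FROM student WHERE name = ?"


def plan(conn):
    """Return SQLite's plan description for the query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + query, ("name42",)).fetchall()
    # The last column of each plan row is the human-readable detail.
    return " ".join(row[-1] for row in rows)


plan_before = plan(conn)  # no index on name yet
conn.execute("CREATE INDEX idx_student_name ON student (name)")
plan_after = plan(conn)   # the optimizer can now use the index

result = conn.execute(query, ("name42",)).fetchall()
```

The query text is identical in both cases; only the access path chosen by the DBMS changes.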

Slide 1- 19
Advantages of Using the Database Approach
(continued)
• Providing backup and recovery services.
• Providing multiple interfaces to different classes of
users.
• Representing complex relationships among data.
• Enforcing integrity constraints on the database.
• Drawing inferences and actions from the stored
data using deductive and active rules
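
Integrity constraint enforcement can be sketched as follows. The DEPARTMENT and COURSE tables, the constraints, and the sample rows are assumptions; note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.execute("CREATE TABLE department (dept_code TEXT PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE course (
        course_number TEXT PRIMARY KEY,
        credit_hours  INTEGER NOT NULL CHECK (credit_hours > 0),
        dept_code     TEXT NOT NULL REFERENCES department (dept_code)
    )
    """
)
conn.execute("INSERT INTO department VALUES ('CS')")
conn.commit()


def try_insert(row):
    """Attempt to insert one course row; report whether the DBMS accepted it."""
    try:
        with conn:
            conn.execute("INSERT INTO course VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


outcomes = [
    try_insert(("CS1310", 4, "CS")),      # satisfies every constraint
    try_insert(("CS3320", 0, "CS")),      # violates CHECK (credit_hours > 0)
    try_insert(("MATH2410", 3, "MATH")),  # no such department: foreign key fails
]
```

The rules live in the database definition, so every application that inserts courses is held to them.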

Slide 1- 20
Additional Implications of Using the Database
Approach
• Potential for enforcing standards:
• This is very crucial for the success of database
applications in large organizations. Standards
refer to data item names, display formats,
screens, report structures, meta-data (description
of data), Web page layouts, etc.
• Reduced application development time:
• Incremental time to add each new application is
reduced.

Slide 1- 21
Additional Implications of Using the Database
Approach (continued)
• Flexibility to change data structures:
• Database structure may evolve as new requirements are
defined.
• Availability of current information:
• Extremely important for on-line transaction systems such
as airline, hotel, car reservations.
• Economies of scale:
• Wasteful overlap of resources and personnel can be
avoided by consolidating data and applications across
departments.

Slide 1- 22
Historical Development of Database
Technology
• Early Database Applications:
• The Hierarchical and Network Models were introduced in
the mid-1960s and dominated during the 1970s.
• A bulk of the worldwide database processing still occurs
using these models, particularly, the hierarchical model.
• Relational Model based Systems:
• The relational model was introduced in 1970 and was
heavily researched and experimented with at IBM
Research and several universities.
• Relational DBMS Products emerged in the early 1980s.

Slide 1- 23
Historical Development of Database
Technology (continued)
• Object-oriented and emerging applications:
• Object-Oriented Database Management Systems (OODBMSs)
were introduced in the late 1980s and early 1990s to meet the
need for complex data processing in CAD and other
applications.
• Their use has not taken off much.
• Many relational DBMSs have incorporated object database
concepts, leading to a new category called object-relational
DBMSs (ORDBMSs)
• Extended relational systems add further capabilities (e.g. for
multimedia data, XML, and other data types)

Slide 1- 24
Historical Development of Database
Technology (continued)
• Data on the Web and E-commerce Applications:
• The Web contains data in HTML (Hypertext Markup Language)
with links among pages.
• This has given rise to a new set of applications, and
E-commerce is using new standards like XML (eXtensible
Markup Language).
• Script programming languages such as PHP and JavaScript
allow generation of dynamic Web pages that are partially
generated from a database
• Also allow database updates through Web pages

Slide 1- 25
Extending Database Capabilities

• New functionality is being added to DBMSs in the following areas:
• Scientific Applications
• XML (eXtensible Markup Language)
• Image Storage and Management
• Audio and Video Data Management
• Data Warehousing and Data Mining
• Spatial Data Management
• Time Series and Historical Data Management

• The above gives rise to new research and development in incorporating
new data types, complex data structures, new operations and storage
and indexing schemes in database systems.

Slide 1- 26
When not to use a DBMS

• Main inhibitors (costs) of using a DBMS:
• High initial investment and possible need for additional
hardware.
• Overhead for providing generality, security, concurrency
control, recovery, and integrity functions.
• When a DBMS may be unnecessary:
• If the database and applications are simple, well defined, and
not expected to change.
• If there are stringent real-time requirements that may not be
met because of DBMS overhead.
• If access to data by multiple users is not required.

Slide 1- 27
When not to use a DBMS

• When no DBMS may suffice:
• If the database system is not able to handle
the complexity of data because of
modeling limitations
• If the database users need special
operations not supported by the DBMS.

Slide 1- 28
Data Models

• Data Model: A set of concepts to describe the
structure of a database, and certain constraints that
the database should obey.

• Data Model Operations: Operations for specifying
database retrievals and updates by referring to the
concepts of the data model. Operations on the data
model may include basic operations and user-defined
operations.

Slide 2-29
Categories of data models

• Conceptual (high-level, semantic) data models: Provide
concepts that are close to the way many users perceive
data. (Also called entity-based or object-based data
models.)

• Physical (low-level, internal) data models: Provide
concepts that describe details of how data is stored in
the computer.

• Implementation (representational) data models:
Provide concepts that fall between the above two,
balancing user views with some computer storage
details.
Slide 2-30
The Entity Relationship Model

• The ER model is the most commonly used conceptual model
• In this model, the real world consists of a collection of basic
objects called entities and the relationships among these
objects
• An entity is an object that is distinguishable from other
objects by a specific set of attributes
• An entity set is the set of all entities of the same type
• A relationship is an association among entities
• The set of all relationships of the same type is a relationship
set
• One nice thing about this model is that you can represent the
logical structure of a DB graphically, using an ER diagram.
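
The ER vocabulary above (entity, entity set, relationship, relationship set) can be mirrored loosely in plain Python. This is only an analogy under assumed names: the `Student` and `Section` types, the TAKES relationship set, and the sample values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    """An entity type; each instance is one entity described by its attributes."""
    student_number: int
    name: str


@dataclass(frozen=True)
class Section:
    section_id: int
    course_number: str


# Entity sets: all entities of the same type.
students = {Student(17, "Smith"), Student(8, "Brown")}
sections = {Section(85, "MATH2410"), Section(112, "MATH2410")}

# Relationship set TAKES: each pair associates one student with one section,
# identified here by (student_number, section_id).
takes = {(17, 85), (17, 112), (8, 85)}


def sections_taken_by(student_number):
    """Return the section ids related to the given student through TAKES."""
    return sorted(sec for (stu, sec) in takes if stu == student_number)


smith_sections = sections_taken_by(17)
```

An ER diagram would draw the two entity sets as boxes and TAKES as the connection between them.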

Slide 2-31
Example ER Diagram


Slide 2-32
The Object Oriented Model
• The OO model is a representational data model that is still at a fairly
high level
• It’s similar to the ER model in that it’s based on a collection of objects,
but the objects are designed differently
• The real world is modeled as a collection of objects, which
store both data values and code for operating on these values
• The values themselves may be objects, and so we can get nesting of
objects
• We can also have two or more objects containing all the same values
that are nevertheless distinct. Physical address identifiers are used to
distinguish them. In the ER model, entities must be distinguished by
some unique value.
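
Python objects offer a convenient, if loose, analogy for the points above: two objects can hold equal values yet remain distinct, and objects can nest. The `Point` and `Segment` classes are hypothetical; Python's `is` operator compares object identity, which stands in here for the identifiers an OO DBMS would assign.

```python
from dataclasses import dataclass


@dataclass
class Point:
    """An object bundling data values with code that operates on them."""
    x: int
    y: int

    def moved(self, dx, dy):
        return Point(self.x + dx, self.y + dy)


@dataclass
class Segment:
    """Nesting: the values of this object are themselves objects."""
    start: Point
    end: Point


a = Point(1, 2)
b = Point(1, 2)

same_values = a == b   # value comparison generated by @dataclass
same_object = a is b   # identity comparison: are these one and the same object?

segment = Segment(a, a.moved(3, 4))
```

In the ER model, by contrast, `a` and `b` would need some attribute value that tells them apart.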

Slide 2-33
More Representational Models
• Most representational models are record-based. In a record-based
model, data is structured in fixed-format records.
• Each record has a fixed number of fields, and each field usually has
a fixed length.
• Two older record-based models, the network model and
hierarchical model, are no longer used to build new systems. They
use pointers, or hard-coded links, to connect the records of a DB.
• The representational model supported by Oracle is the relational
model.
• In the relational model, you view data as being arranged in tables,
with rows and columns. Each column has a unique name. Each
row is a record. The examples given for the University mini-world
were shown in the relational model.

Slide 2-34
Schemas versus Instances

• Database Schema: The description of a database.
Includes descriptions of the database structure and
the constraints that should hold on the database.
• Schema Diagram: A diagrammatic display of (some
aspects of) a database schema.
• Schema Construct: A component of the schema or
an object within the schema, e.g., STUDENT,
COURSE.
• Database Instance: The actual data stored in a
database at a particular moment in time. Also
called database state (or occurrence).

Slide 2-35
FIGURE 2.1
Schema diagram for the database in Figure 1.2.

Slide 2-36
Database Schema Vs. Database State
• Database State: Refers to the content of a database at
a moment in time.
• Initial Database State: Refers to the database when it
is loaded
• Valid State: A state that satisfies the structure and
constraints of the database.
• Distinction
• The database schema changes very infrequently. The
database state changes every time the database is updated.
• Schema is also called intension, whereas state is called
extension.
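
The schema/state distinction can be sketched by reading both from a small SQLite database before and after an update. The STUDENT table, the helper names, and the inserted row are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)


def schema(conn):
    """Return the stored description (intension) of the student table."""
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'student'"
    ).fetchone()[0]


def state(conn):
    """Return the current contents (extension) of the student table."""
    return conn.execute(
        "SELECT student_number, name FROM student ORDER BY student_number"
    ).fetchall()


schema_before = schema(conn)
state_empty = state(conn)   # defined but not yet loaded

conn.execute("INSERT INTO student VALUES (17, 'Smith')")

schema_after = schema(conn)
state_after = state(conn)   # a new state after the update
```

The update produces a new database state while the schema text stays exactly the same.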

Slide 2-37
Three-Schema Architecture

• Proposed to support DBMS characteristics of:
• Program-data independence.
• Support of multiple views of the data.

Slide 2-38
FIGURE 2.2
The three-schema architecture.

Slide 2-39
Three-Schema Architecture

• Defines DBMS schemas at three levels:
• Internal schema at the internal level to describe physical
storage structures and access paths. Typically uses a
physical data model.
• Conceptual schema at the conceptual level to describe
the structure and constraints for the whole database for a
community of users. Uses a conceptual or an
implementation data model.
• External schemas at the external level to describe the
various user views. Usually uses the same data model as
the conceptual level.

Slide 2-40
Three-Schema Architecture

Mappings among schema levels are needed to transform requests and
data. Programs refer to an external schema, and are mapped by the
DBMS to the internal schema for execution.

Slide 2-41
Data Independence

• Logical Data Independence: The capacity to change the conceptual
schema without having to change the external schemas and their
application programs.
• Physical Data Independence: The capacity to change the internal
schema without having to change the conceptual schema.

Slide 2-42
Data Independence

When a schema at a lower level is changed, only the mappings
between this schema and higher-level schemas need to be changed
in a DBMS that fully supports data independence. The higher-level
schemas themselves are unchanged. Hence, the application
programs need not be changed since they refer to the external
schemas.
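
Both kinds of data independence can be approximated in a small SQLite sketch, treating a view as the external schema, the table as the conceptual schema, and an index as part of the internal schema. All names and data below are assumptions, and SQLite only approximates the three-schema architecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (17, 'Smith')")

# External schema: the application only ever refers to this view.
conn.execute("CREATE VIEW student_names AS SELECT name FROM student")


def application_query(conn):
    """The application program; it is never edited in this sketch."""
    return conn.execute("SELECT name FROM student_names").fetchall()


before = application_query(conn)

# Internal-level change: add an access path (physical data independence).
conn.execute("CREATE INDEX idx_student_name ON student (name)")
after_physical = application_query(conn)

# Conceptual-level change: add an attribute (logical data independence).
conn.execute("ALTER TABLE student ADD COLUMN major TEXT")
after_logical = application_query(conn)
```

The application query returns the same answer after each change because only the lower levels were altered.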

Slide 2-43
DBMS Languages

• Data Definition Language (DDL): Used by the DBA
and database designers to specify the conceptual
schema of a database. In many DBMSs, the DDL is
also used to define internal and external schemas
(views). In some DBMSs, separate storage
definition language (SDL) and view definition
language (VDL) are used to define internal and
external schemas.

Slide 2-44
DBMS Languages

• Data Manipulation Language (DML): Used to specify database
retrievals and updates.
• DML commands (data sublanguage) can be embedded in a general-purpose
programming language (host language), such as COBOL, C or an Assembly
Language.
• Alternatively, stand-alone DML commands can be applied directly (query
language).

Slide 2-45
DBMS Languages

• High Level or Non-procedural Languages: e.g., SQL, are set-oriented
and specify what data to retrieve rather than how to retrieve it. Also
called declarative languages.
• Low Level or Procedural Languages: record-at-a-time; they specify
how to retrieve data and include constructs such as looping.
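
The contrast between the two styles can be sketched by answering one question both ways. The GRADE_REPORT table and its rows are assumptions; the loop stands in for a record-at-a-time procedural DML, which in practice would be a language of its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grade_report (student_number INTEGER, grade TEXT)")
conn.executemany(
    "INSERT INTO grade_report VALUES (?, ?)",
    [(17, "B"), (17, "C"), (8, "A"), (8, "A")],
)

# Declarative, set-oriented: state what is wanted and let the DBMS decide how.
declarative_count = conn.execute(
    "SELECT COUNT(*) FROM grade_report WHERE grade = 'A'"
).fetchone()[0]

# Procedural, record-at-a-time: fetch each record and spell out the steps.
procedural_count = 0
for student_number, grade in conn.execute(
    "SELECT student_number, grade FROM grade_report"
):
    if grade == "A":
        procedural_count += 1
```

Both approaches give the same count, but only the declarative form leaves the choice of access strategy to the DBMS.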

Slide 2-46
DBMS Interfaces

• Stand-alone query language interfaces.
• Programmer interfaces for embedding DML in
programming languages:
• Pre-compiler Approach
• Procedure (Subroutine) Call Approach
• User-friendly interfaces:
• Menu-based, popular for browsing on the web
• Forms-based, designed for naïve users
• Graphics-based (Point and Click, Drag and Drop etc.)
• Natural language: requests in written English
• Combinations of the above

Slide 2-47
Other DBMS Interfaces

• Speech as Input (?) and Output
• Web Browser as an interface
• Parametric interfaces (e.g., bank tellers) using function keys.
• Interfaces for the DBA:
• Creating accounts, granting authorizations
• Setting system parameters
• Changing schemas or access path

Slide 2-48
FIGURE 2.3
Component modules of a DBMS and their
interactions.

Slide 2-49
Database System Utilities

• To perform certain functions such as:
• Loading data stored in files into a database. Includes data
conversion tools.
• Backing up the database periodically on tape.
• Reorganizing database file structures.
• Report generation utilities.
• Performance monitoring utilities.
• Other functions, such as sorting, user monitoring, data
compression, etc.

Slide 2-50
Centralized and Client-Server Architectures

• Centralized DBMS: Combines everything into a single system, including
DBMS software, hardware, application programs, and user interface
processing software.

Slide 2-51
FIGURE 2.4
A physical centralized architecture.

Slide 2-52
Specialized Servers with Specialized
functions:
• File Servers
• Printer Servers
• Web Servers
• E-mail Servers

Slide 2-53
Clients:

• Provide appropriate interfaces and a client-version
of the system to access and utilize the server
resources.
• Clients may be diskless machines or PCs or
Workstations with disks with only the client
software installed.
• Connected to the servers via some form of a
network.
(LAN: local area network, wireless network, etc.)

Slide 2-54
Two Tier Client-Server Architecture

• User Interface Programs and Application Programs run on the client
side
• An interface called ODBC (Open Database Connectivity – see Ch 9)
provides an application program interface (API) that allows client-side
programs to call the DBMS. Most DBMS vendors provide ODBC drivers.

Slide 2-55
Two Tier Client-Server Architecture

• A client program may connect to several DBMSs.
• Other variations of clients are possible: e.g., in
some DBMSs, more functionality is transferred to
clients including data dictionary functions,
optimization and recovery across multiple servers,
etc. In such situations the server may be called the
Data Server.

Slide 2-56
FIGURE 2.5
Logical two-tier client/server architecture.

Slide 2-57
