Database Management System


A DBMS is a set of software programs that controls the organization, storage, management, and retrieval of data in a database. DBMSs are categorized according to their data structures or types. The DBMS accepts requests for data from an
application program and instructs the operating system to transfer the appropriate data. The queries and responses must be
submitted and received according to a format that conforms to one or more applicable protocols. When a DBMS is
used, information systems can be changed much more easily as the organization's information requirements change. New
categories of data can be added to the database without disruption to the existing system.
Database servers are computers that hold the actual databases and run only the DBMS and related software. Database
servers are usually multiprocessor computers, with generous memory and RAID disk arrays used for stable storage.
Hardware database accelerators, connected to one or more servers via a high-speed channel, are also used in large volume
transaction processing environments. DBMSs are found at the heart of most database applications. DBMSs may be built
around a custom multitasking kernel with built-in networking support, but modern DBMSs typically rely on a
standard operating system to provide these functions.

Relational Model
Its central idea was to describe a database as a collection of predicates over a finite set of predicate variables,
describing constraints on the possible values and combinations of values. The content of the database at any given time is a
finite (logical) model of the database, i.e. a set of relations, one per predicate variable, such that all predicates are satisfied.
A request for information from the database (a database query) is also a predicate.
The purpose of the relational model is to provide a declarative method for specifying data and queries: we directly state what
information the database contains and what information we want from it, and let the database management system software
take care of describing data structures for storing the data and retrieval procedures for getting queries answered.
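As a minimal sketch (the table and column names here are invented for illustration), a relation can be declared and queried declaratively in SQL; we state what we want, not how to find it:

CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,
    name   VARCHAR(100) NOT NULL,
    dept   VARCHAR(50)
);

-- The query is itself a predicate: "the names of employees whose dept is 'Sales'"
SELECT name
FROM employee
WHERE dept = 'Sales';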

Major strengths of the relational model:


• The data model and access to it are simple to understand and use, even for those who are not experienced programmers.
• The model of data represented in tables is remarkably simple.
• Access to data via the model does not require navigation (roughly, following pointers), as do the CODASYL and network models.
• It admits a simple (in principle), declarative query language.
• There are straightforward database design procedures.
• The data model admits a solid and well-understood mathematical foundation (first-order predicate logic). This has facilitated the development of a sophisticated theoretical underpinning, which has contributed greatly to the features of practical systems.
• Efficient implementation techniques are well known and widely used.
• Standards exist both for query languages (SQL) and for interfaces via programming languages (embedded SQL and ODBC/CLI).
Limitations of the Relational Model
It is convenient to divide the limitations into two categories.
First, there are special types of data which require special forms of representation and/or inference. Some examples are the following.
Limitations regarding special forms of data:
Temporal data, Spatial data, Multimedia data, Unstructured data (warehousing/mining), Document libraries (digital libraries)
Limitations regarding SQL as the query language:
Recursive queries (e.g., computing the ancestor relation from the parent relation). Although part of the SQL:1999 standard, recursive queries were for a long time not supported by many systems (PostgreSQL, for example, only added support in version 8.4). Support for recursive queries in SQL:1999 is also weak: only so-called linear recursive queries are supported.
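As a sketch of the SQL:1999 mechanism (assuming a hypothetical parent_of table with columns parent and child), the ancestor relation can be computed with a recursive common table expression:

WITH RECURSIVE ancestor (anc, descendant) AS (
    SELECT parent, child FROM parent_of            -- base case: direct parents
    UNION ALL
    SELECT a.anc, p.child                          -- recursive step: ancestors of ancestors
    FROM ancestor a
    JOIN parent_of p ON p.parent = a.descendant
)
SELECT * FROM ancestor;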

Differentiate between hierarchical and network data model


The network model is a database model conceived as a flexible way of representing objects and their relationships. Its
original inventor was Charles Bachman, and it was developed into a standard specification published in 1969 by the
CODASYL Consortium. Where the hierarchical model structures data as a tree of records, with each record having one parent
record and many children, the network model allows each record to have multiple parent and child records, forming
a lattice structure.
The chief argument in favour of the network model, in comparison to the hierarchic model, was that it allowed a more
natural modeling of relationships between entities. Although the model was widely implemented and used, it failed to
become dominant for two main reasons. Firstly, IBM chose to stick to the hierarchical model with semi-network extensions in
their established products such as IMS and DL/I. Secondly, it was eventually displaced by the relational model, which offered
a higher-level, more declarative interface. Until the early 1980s the performance benefits of the low-level navigational
interfaces offered by hierarchical and network databases were persuasive for many large-scale applications, but as hardware
became faster, the extra productivity and flexibility of the relational model led to the gradual obsolescence of the network
model in corporate enterprise usage. 
Advantages of DBMS (Database Management Systems)
A true DBMS offers several advantages over file processing. The principal advantages of a DBMS are the following:
Database concurrency
Database concurrency controls ensure that transactions occur in an ordered fashion. The main job of these controls is to
protect transactions issued by different users/applications from the effects of each other. They must preserve the four
characteristics of database transactions: atomicity, consistency, isolation and durability.
What are the advantages of a database management system over a conventional file processing system?
• Flexibility: Because programs and data are independent, programs do not have to be modified when types of unrelated
data are added to or deleted from the database, or when physical storage changes. 
• Fast response to information requests: Because data are integrated into a single database, complex requests can be
handled much more rapidly than if the data were located in separate, non-integrated files. In many businesses, faster
response means better customer service. 
• Multiple access: Database software allows data to be accessed in a variety of ways (such as through various key fields) and
often, by using several programming languages (both 3GL and nonprocedural 4GL programs). 
• Lower user training costs: Users often find it easier to learn such systems and training costs may be reduced. Also, the
total time taken to process requests may be shorter, which would increase user productivity. 
• Less storage: Theoretically, all occurrences of data items need be stored only once, thereby eliminating the storage of
redundant data. System developers and database designers often use data normalization to minimize data redundancy. 
Here are some disadvantages:
1. A DBMS exposes a business to the risk of losing critical data: in electronic format it can be more readily stolen without proper security.
2. The cost of a DBMS can be prohibitive for small enterprises, which may struggle to justify the investment in the infrastructure.
3. Improper use of the DBMS can lead to incorrect decision making, as people take for granted that the data presented is accurate.
4. Data can be stolen through careless password security.
What are the differences between DDL, DML, DCL and TCL commands?
DDL
Data Definition Language (DDL) statements are used to define the database structure or schema. Some examples:
o CREATE - to create objects in the database
o ALTER - alters the structure of the database
o DROP - delete objects from the database
o TRUNCATE - remove all records from a table, including all space allocated for the records
o COMMENT - add comments to the data dictionary
o RENAME - rename an object
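For example, a minimal DDL sketch (the student table is invented for illustration; syntax details vary slightly between systems):

CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       VARCHAR(100) NOT NULL
);
ALTER TABLE student ADD email VARCHAR(255);   -- alter the structure
TRUNCATE TABLE student;                       -- remove all rows and reclaim their space
DROP TABLE student;                           -- remove the table itself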
DML
Data Manipulation Language (DML) statements are used for managing data within schema objects. Some examples:
o SELECT - retrieve data from a database
o INSERT - insert data into a table
o UPDATE - updates existing data within a table
o DELETE - deletes records from a table; the space allocated for the records remains
o MERGE - UPSERT operation (insert or update)
o CALL - call a PL/SQL or Java subprogram
o EXPLAIN PLAN - explain access path to data
o LOCK TABLE - control concurrency
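A brief DML sketch, reusing the hypothetical student table from the DDL example above:

INSERT INTO student (student_id, name) VALUES (1, 'Peter Jones');
UPDATE student SET name = 'Peter J. Jones' WHERE student_id = 1;
SELECT name FROM student WHERE student_id = 1;
DELETE FROM student WHERE student_id = 1;   -- the space allocated to the table remains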
DCL
Data Control Language (DCL) statements. Some examples:
o GRANT - gives users access privileges to the database
o REVOKE - withdraw access privileges given with the GRANT command
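For example (the user/role name clerk is hypothetical):

GRANT SELECT, INSERT ON student TO clerk;   -- give clerk limited access to the table
REVOKE INSERT ON student FROM clerk;        -- later withdraw only the INSERT privilege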
TCL
Transaction Control (TCL) statements are used to manage the changes made by DML statements. It allows statements to be
grouped together into logical transactions.
o COMMIT - save work done
o SAVEPOINT - identify a point in a transaction to which you can later roll back
o ROLLBACK - restore the database to its state as of the last COMMIT
o SET TRANSACTION - Change transaction options like isolation level and what rollback segment to use
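A short transaction sketch using the same hypothetical table (exact SAVEPOINT/ROLLBACK syntax varies a little between systems):

INSERT INTO student (student_id, name) VALUES (2, 'Anna Smith');
SAVEPOINT after_insert;
UPDATE student SET name = 'A. Smith' WHERE student_id = 2;
ROLLBACK TO after_insert;   -- undo only the UPDATE
COMMIT;                     -- make the INSERT permanent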
Logical Modeling
Logical modeling deals with gathering business requirements and converting those requirements into a model. The logical
model revolves around the needs of the business, not the database, although the needs of the business are used to establish
the needs of the database. Logical modeling involves gathering information about business processes, business entities
(categories of data), and organizational units. After this information is gathered, diagrams and reports are produced
including entity relationship diagrams, business process diagrams, and eventually process flow diagrams. The diagrams
produced should show the processes and data that exist, as well as the relationships between business processes and data.
Logical modeling should accurately render a visual representation of the activities and data relevant to a particular business.

The diagrams and documentation generated during logical modeling are used to determine whether the requirements of the
business have been completely gathered. Management, developers, and end users alike review these diagrams and
documentation to determine if more work is required before physical modeling commences.
Typical deliverables of logical modeling include
Entity relationship diagrams: An Entity Relationship Diagram is also referred to as an analysis ERD. The point of the
initial ERD is to provide the development team with a picture of the different categories of data for the business, as well
as how these categories of data are related to one another.
Business process diagrams: The process model illustrates all the parent and child processes that are performed by
individuals within a company. The process model gives the development team an idea of how data moves within the
organization. Because process models illustrate the activities of individuals in the company, the process model can be
used to determine how a database application interface is designed.
User feedback documentation
Physical Modeling
Physical modeling involves the actual design of a database according to the requirements that were established during
logical modeling. Logical modeling mainly involves gathering the requirements of the business, with the latter part of logical
modeling directed toward the goals and requirements of the database. Physical modeling deals with the conversion of the
logical, or business model, into a relational database model. When physical modeling occurs, objects are being defined at the
schema level. A schema is a group of related objects in a database. A database design effort is normally associated with one
schema.
During physical modeling, objects such as tables and columns are created based on entities and attributes that were defined
during logical modeling. Constraints are also defined, including primary keys, foreign keys, other unique keys, and check
constraints. Views can be created from database tables to summarize data or to simply provide the user with another
perspective of certain data. Other objects such as indexes and snapshots can also be defined during physical modeling.
Physical modeling is when all the pieces come together to complete the process of defining a database for a business.
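As a hedged sketch of the kinds of objects defined at this stage (all names are invented for illustration):

CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    VARCHAR(50) UNIQUE NOT NULL
);

CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,                        -- primary key constraint
    dept_id INTEGER REFERENCES department (dept_id),    -- foreign key constraint
    salary  NUMERIC(10,2) CHECK (salary >= 0)           -- check constraint
);

-- A view giving another perspective on the data
CREATE VIEW department_headcount AS
    SELECT dept_id, COUNT(*) AS headcount
    FROM employee
    GROUP BY dept_id;

-- An index to speed up lookups by department
CREATE INDEX employee_dept_idx ON employee (dept_id);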

Physical modeling is database software specific, meaning that the objects defined during physical modeling can vary
depending on the relational database software being used. For example, most relational database systems have variations
with the way data types are represented and the way data is stored, although basic data types are conceptually the same
among different implementations. Additionally, some database systems have objects that are not available in other database
systems.
Typical deliverables of physical modeling include the following:
 Server model diagrams
The server model diagram shows tables, columns, and relationships within a database.
 User feedback documentation
 Database design documentation

Implementation of the Physical Model

The implementation of the physical model is dependent on the hardware and software being used by the company. The
hardware can determine what type of software can be used because software is normally developed according to common
hardware and operating system platforms. Some database software might only be available for Windows NT systems,
whereas other software products such as Oracle are available on a wider range of operating system platforms, such as UNIX.
The available hardware is also important during the implementation of the physical model because data is physically
distributed onto one or more physical disk drives. Normally, the more physical drives available, the better the performance of
the database after the implementation. Some software products now are Java-based and can run on virtually any platform.
Typically, the decisions to use particular hardware, operating system platforms, and database software are made in
conjunction with one another.

UNIVERSE OF DISCOURSE
A database is a model of some aspect of the reality of an organization (Kent, 1978). It is conventional to call this reality a
universe of discourse (UoD), or sometimes a domain of discourse. A UoD will be made up of classes and relationships
between classes. The classes in a UoD will be defined in terms of their properties or attributes.
A database of whatever form, electronic or otherwise, must be designed. The process of database design is the activity of representing classes,
attributes and relationships in a database.
PERSISTENCE
Data in a database is described as being persistent. By persistent we mean that the data is held for some duration. The
duration may not actually be very long. The term persistence is used to distinguish more permanent data from data which is
more transient in nature. Hence, product data, account data, patient data and student data would all normally be regarded
as examples of persistent data. In contrast, data input at a personal computer, held for manipulation within a program, or
printed out on a report, would not be regarded as persistent, as once it has been used it is no longer required.
INTENSIONAL AND EXTENSIONAL PARTS
A database is made up of two parts: an intensional part and an extensional part. The intension of a database is a set of
definitions which describe the structure or organization of a given database. The extension of a database is the total set of
data in the database. The intension of a database is also referred to as its schema. The activity of developing a schema for a
database system is referred to as database design.
INTEGRITY
When we say that a database displays integrity we mean that it is an accurate reflection of its UoD. The process of ensuring
integrity is a major feature of modern information systems. The process of designing for integrity is a much neglected aspect
of database development.
Integrity is an important issue because most databases are designed, once in use, to change. In other words, the data in a
database will change over a period of time. If a database does not change, i.e. it is only used for reference purposes, then
integrity is not an issue of concern.
INTEGRITY CONSTRAINTS
Database integrity is ensured through integrity constraints. An integrity constraint is a rule which establishes how a database is to remain an
accurate reflection of its UoD.
Constraints may be divided into two major types: static constraints and transition constraints. A static constraint, sometimes known as a 'state invariant', is a restriction defined on database states; it is used to check that an incoming transaction will not move the database into an invalid state. A transition constraint, in contrast, restricts the ways in which the database may legitimately move from one state to another.
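A hedged illustration of a static constraint, expressed as a SQL check constraint on a hypothetical table:

-- Static constraint (state invariant): an enrolment count can never be negative
ALTER TABLE module
    ADD CONSTRAINT chk_enrolment_nonnegative CHECK (enrolment_count >= 0);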
TRANSACTIONS
The events that cause a change of state are called transactions in database terms. A transaction changes a database from
one state to another. A new state is brought into being by asserting the facts that become true and/or denying the facts that
cease to be true. Hence, in our example we might want to enrol Peter Jones in the module relational database design. This is
an example of a transaction. Transactions of a similar form are called transaction types.
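In SQL this enrolment transaction might be sketched as follows (the table and column names are invented; the statement that starts a transaction varies by DBMS):

BEGIN;                                   -- start the transaction
INSERT INTO enrolment (student_name, module_name)
VALUES ('Peter Jones', 'Relational Database Design');
COMMIT;                                  -- assert the new fact; the database moves to a new, valid state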
The Object-Oriented Data Model
1. A data model is a logical organization of real-world objects (entities), constraints on them, and the relationships among
objects. A DB language is a concrete syntax for a data model. A DB system implements a data model.
2. A core object-oriented data model consists of the following basic object-oriented concepts:
(1) object and object identifier: Any real world entity is uniformly modeled as an object (associated with a unique id:
used to pinpoint an object to retrieve).
(2) attributes and methods: every object has a state (the set of values for the attributes of the object) and a behavior
(the set of methods - program code - which operate on the state of the object). The state and behavior encapsulated in an
object are accessed or invoked from outside the object only through explicit message passing.
[ An attribute is an instance variable, whose domain may be any class: user-defined or primitive. A class composition
hierarchy (aggregation relationship) is orthogonal to the concept of a class hierarchy. The link in a class composition
hierarchy may form cycles. ]
(3) class: a means of grouping all the objects which share the same set of attributes and methods. An object must belong to
only one class as an instance of that class (instance-of relationship). A class is similar to an abstract data type. A class may
also be primitive (no attributes), e.g., integer, string, Boolean.
(4) Class hierarchy and inheritance: derive a new class (subclass) from an existing class (superclass). The subclass
inherits all the attributes and methods of the existing class and may have additional attributes and methods. single
inheritance (class hierarchy) vs. multiple inheritance (class lattice).
What are the disadvantages of object oriented data model?
Unfamiliarity (causing added training costs for developers); inability to work with existing systems (a major benefit of C++); data and operations are separated; no data abstraction or information hiding; not responsive to changes in the problem space; inadequate for concurrent problems.

Post-relational database models


Products offering a more general data model than the relational model are sometimes classified as post-relational. Alternate
terms include "hybrid database", "Object-enhanced RDBMS" and others. The data model in such products
incorporates relations but is not constrained by E.F. Codd's Information Principle, which requires that
all information in the database must be cast explicitly in terms of values in relations and in no other way.
Some of these extensions to the relational model integrate concepts from technologies that pre-date the relational model.
For example, they allow representation of a directed graph with trees on the nodes.
Some post-relational products extend relational systems with non-relational features. Others arrived in much the same place
by adding relational features to pre-relational systems. Paradoxically, this allows products that are historically pre-relational,
such as PICK and MUMPS, to make a plausible claim to be post-relational.
Indexing
Indexing is a technique for improving database performance. The many types of index share the common property that they
eliminate the need to examine every entry when running a query. In large databases, this can reduce query time/cost by
orders of magnitude. The simplest form of index is a sorted list of values that can be searched using a binary search with an
adjacent reference to the location of the entry, analogous to the index in the back of a book. The same data can have
multiple indexes (an employee database could be indexed by last name and hire date.)
Indexes affect performance, but not results. Database designers can add or remove indexes without changing application
logic, reducing maintenance costs as the database grows and database usage evolves.
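A hedged sketch, assuming an employee table with last_name and hire_date columns; adding or dropping the indexes changes the access path but never the rows a query returns:

CREATE INDEX employee_last_name_idx ON employee (last_name);
CREATE INDEX employee_hire_date_idx ON employee (hire_date);

-- Same results with or without the indexes; only the optimizer's strategy changes
SELECT * FROM employee WHERE last_name = 'Jones';

DROP INDEX employee_last_name_idx;   -- exact DROP INDEX syntax varies by DBMS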
Given a particular query, the DBMS' query optimizer is responsible for devising the most efficient strategy for finding
matching data. The optimizer decides which index or indexes to use, how to combine data from different parts of the
database, how to provide data in the order requested, etc.
Indexes can speed up data access, but they consume space in the database, and must be updated each time the data is
altered. Indexes therefore can speed data access but slow data maintenance. These two properties determine whether a
given index is worth the cost.
Replication
Database replication involves maintaining multiple copies of a database on different computers, to allow more users to
access it, or to allow a secondary site to take over immediately if the primary site stops working. Some DBMSs piggyback
replication on top of their transaction logging facility, applying the primary's log to the secondary in near real-time. Database
clustering is a related concept for handling larger databases and user communities by employing a cluster of multiple
computers to host a single database, and it can use replication as part of its approach.
Security
Database security denotes the system, processes, and procedures that protect a database from unauthorized activity.
DBMSs usually enforce security through access control, auditing, and encryption:
 Access control manages who can connect to the database via authentication and what they can do via authorization.
 Auditing records information about database activity: who, what, when, and possibly where.
 Encryption protects data at the lowest possible level by storing and possibly transmitting data in an unreadable form.
The DBMS encrypts data when it is added to the database and decrypts it when returning query results. This process can
occur on the client side of a network connection to prevent unauthorized access at the point of use.
Isolation
Isolation refers to the degree to which one transaction can see the results of other, concurrent transactions. Greater isolation typically reduces
performance and/or concurrency, leading DBMSs to provide administrative options to reduce isolation. For example, in a
database that analyzes trends rather than looking at low-level detail, increased performance might justify allowing readers to
see uncommitted changes ("dirty reads".)
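A hedged sketch of lowering isolation for such an analytical session (READ UNCOMMITTED is standard SQL, but support and exact syntax vary by DBMS; the sales table is hypothetical):

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;   -- permit dirty reads for this transaction
SELECT region, SUM(amount) AS total_sales
FROM sales
GROUP BY region;
COMMIT;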
Locking
When a transaction modifies a resource, the DBMS stops other transactions from also modifying it, typically by locking it.
Locks also provide one way of ensuring that data does not change while a transaction is reading it, or even that it does not change until a transaction that has read it completes.
Granularity
Locks can be coarse, covering an entire database; fine-grained, covering a single data item; or intermediate, covering a collection of data such as all the rows in an RDBMS table.
Lock types
Locks can be shared or exclusive, and can lock out readers and/or writers. Locks can be created implicitly by the DBMS when
a transaction performs an operation, or explicitly at the transaction's request.
Shared locks allow multiple transactions to lock the same resource. The lock persists until all such transactions complete.
Exclusive locks are held by a single transaction and prevent other transactions from locking the same resource.
Read locks are usually shared, and prevent other transactions from modifying the resource. Write locks are exclusive, and
prevent other transactions from modifying the resource. On some systems, write locks also prevent other transactions from
reading the resource.
The DBMS implicitly locks data when it is updated, and may also do so when it is read. Transactions explicitly lock data to
ensure that they can complete without a deadlock or other complication. Explicit locks may be useful for some administrative
tasks.
Locking can significantly affect database performance, especially with large and complex transactions in highly concurrent
environments.
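As a hedged example of explicit locking (the account table is hypothetical; LOCK TABLE and SELECT ... FOR UPDATE syntax differs between systems):

BEGIN;
LOCK TABLE account IN EXCLUSIVE MODE;   -- coarse-grained explicit lock on the whole table
-- or, finer grained: lock only the row being read before updating it
SELECT balance FROM account WHERE account_id = 42 FOR UPDATE;
UPDATE account SET balance = balance - 100 WHERE account_id = 42;
COMMIT;                                 -- committing releases the locks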
Deadlocks
Deadlocks occur when two transactions each require data that the other has already locked exclusively. Deadlock detection
is performed by the DBMS, which then aborts one of the transactions and allows the other to complete.

Types of database systems


Operational database
These databases store detailed data about the operations of an organization. They are typically organized by subject matter
and process relatively high volumes of updates using transactions. Essentially every major organization on earth uses such
databases. Examples include customer databases that record contact, credit, and demographic information about a business'
customers, personnel databases that hold information such as salary, benefits, skills data about employees, manufacturing
databases that record details about product components, parts inventory, and financial databases that keep track of the
organization's money, accounting and financial dealings.
Data warehouse
Data warehouses archive data from operational databases and often from external sources such as market research
firms. Often operational data undergoes transformation on its way into the warehouse, getting summarized, anonymized,
reclassified, etc. The warehouse becomes the central source of data for use by managers and other end-users who may not
have access to operational data. For example, sales data might be aggregated to weekly totals and converted from internal
product codes to use UPC codes so that it can be compared with ACNielsen data. Some basic and essential components of
data warehousing include retrieving and analyzing data, and transforming, loading and managing data so as to make it
available for further use.
Analytical database
Analysts may do their work directly against a data warehouse, or create a separate analytic database for Online Analytical
Processing. For example, a company might extract sales records for analyzing the effectiveness of advertising and other
sales promotions at an aggregate level.
Distributed database
These are databases of local work-groups and departments at regional offices, branch offices, manufacturing plants and
other work sites. These databases can include segments of both common operational and common user databases, as well
as data generated and used only at a user’s own site.
End-user database
These databases consist of data developed by individual end-users. Examples of these are collections of documents in
spreadsheets, word processing and downloaded files, or even a database used to manage a personal baseball card collection.
External database
These databases contain data collected for use across multiple organizations, either freely or via subscription. The Internet
Movie Database is one example.
Hypermedia databases
The World Wide Web can be thought of as a database, albeit one spread across millions of independent computing
systems. Web browsers "process" this data one page at a time, while web crawlers and other software provide the equivalent
of database indexes to support search and other activities.
