
Unit 1

Introduction to Database
A database is a collection of related data, and data is a collection of facts and
figures that can be processed to produce information.

Mostly, data represents recordable facts. Data aids in producing information,
which is based on facts. For example, if we have data about the marks obtained
by all students, we can draw conclusions about toppers and average marks.

A database management system stores data in such a way that it
becomes easier to retrieve, manipulate, and produce information.

Characteristics
Traditionally, data was organized in file formats. DBMS was a new concept
then, and research was done to make it overcome the deficiencies of the
traditional style of data management. A modern DBMS has the following
characteristics −

 Real-world entity − A modern DBMS is more realistic and uses real-world
entities to design its architecture. It uses their behavior and attributes too. For
example, a school database may use students as an entity and their age as an
attribute.

 Relation-based tables − DBMS allows entities and relations among them to
form tables. A user can understand the architecture of a database just by looking
at the table names.

 Isolation of data and application − A database system is entirely different
from its data. A database is an active entity, whereas data is said to be passive, on
which the database works and organizes. DBMS also stores metadata, which is
data about data, to ease its own processes.

 Less redundancy − DBMS follows the rules of normalization, which split a
relation when any of its attributes has redundant values. Normalization
is a mathematically rich and scientific process that reduces data redundancy.

 Consistency − Consistency is a state where every relation in a database remains
consistent. There exist methods and techniques that can detect attempts to
leave the database in an inconsistent state. A DBMS can provide greater consistency
than earlier forms of data-storing applications like file-processing
systems.

 Query Language − DBMS is equipped with a query language, which makes it more
efficient to retrieve and manipulate data. A user can apply as many different
filtering options as required to retrieve a set of data. Traditionally, this was
not possible where a file-processing system was used.

 ACID Properties − DBMS follows the concepts
of Atomicity, Consistency, Isolation, and Durability (normally shortened to
ACID). These concepts are applied to transactions, which manipulate data in a
database. ACID properties help the database stay healthy in multi-transactional
environments and in case of failure.

 Multiuser and Concurrent Access − DBMS supports a multi-user environment
and allows users to access and manipulate data in parallel. Though there are
restrictions on transactions when users attempt to handle the same data item,
users are always unaware of them.

 Multiple views − DBMS offers multiple views for different users. A user who is
in the Sales department will have a different view of the database than a person
working in the Production department. This feature enables users to have a
concentrated view of the database according to their requirements.

 Security − Features like multiple views offer security to some extent, as users
are unable to access data of other users and departments. DBMS offers methods
to impose constraints while entering data into the database and retrieving it
at a later stage. DBMS offers many different levels of security features,
which enable multiple users to have different views with different features. For
example, a user in the Sales department cannot see the data that belongs to the
Purchase department. Additionally, it can be managed how much data of the
Sales department should be displayed to the user. Since a DBMS is not saved on
disk in the same way as traditional file systems, it is very hard for miscreants to
break into it.

Users
A typical DBMS has users with different rights and permissions who use it for
different purposes. Some users retrieve data and some back it up. The users
of a DBMS can be broadly categorized as follows −
 Administrators − Administrators maintain the DBMS and are responsible for
administrating the database. They look after its usage and decide who should
use it. They create access profiles for users and apply
limitations to maintain isolation and enforce security. Administrators also look after
DBMS resources like the system license, required tools, and other software- and
hardware-related maintenance.

 Designers − Designers are the group of people who actually work on the
designing part of the database. They keep a close watch on what data should be
kept and in what format. They identify and design the whole set of entities,
relations, constraints, and views.

 End Users − End users are those who actually reap the benefits of having a
DBMS. End users can range from simple viewers who pay attention to the logs or
market rates to sophisticated users such as business analysts.

Three-layered Architecture or Three-tier Architecture
The design of a DBMS depends on its architecture. It can be centralized or
decentralized or hierarchical. The architecture of a DBMS can be seen as
either single tier or multi-tier. An n-tier architecture divides the whole system
into related but independent n modules, which can be independently
modified, altered, changed, or replaced.

In 1-tier architecture, the user works directly on the DBMS and uses it. Any
changes done here are applied directly to the DBMS itself. It does not provide
handy tools for end users. Database designers and programmers normally prefer
to use single-tier architecture.
If the architecture of DBMS is 2-tier, then it must have an application through
which the DBMS can be accessed. Programmers use 2-tier architecture where
they access the DBMS by means of an application. Here the application tier
is entirely independent of the database in terms of operation, design, and
programming.

3-tier Architecture
A 3-tier architecture separates its tiers from each other based on the
complexity of the users and how they use the data present in the database.
It is the most widely used architecture to design a DBMS.

 Database (Data) Tier − At this tier, the database resides along with its query
processing languages. We also have the relations that define the data and their
constraints at this level.

 Application (Middle) Tier − At this tier reside the application server and the
programs that access the database. For a user, this application tier presents an
abstracted view of the database. End-users are unaware of any existence of the
database beyond the application. At the other end, the database tier is not aware
of any other user beyond the application tier. Hence, the application layer sits in
the middle and acts as a mediator between the end-user and the database.

 User (Presentation) Tier − End-users operate on this tier and they know
nothing about any existence of the database beyond this layer. At this layer,
multiple views of the database can be provided by the application. All views are
generated by applications that reside in the application tier.

Multiple-tier database architecture is highly modifiable, as almost all its
components are independent and can be changed independently.

Data Models
Data models define how the logical structure of a database is modeled. Data
Models are fundamental entities to introduce abstraction in a DBMS. Data
models define how data is connected to each other and how they are
processed and stored inside the system.

The very first data models may have been flat data models, where all the data
was kept in the same plane. Earlier data models were not very scientific;
hence they were prone to introducing a lot of duplication and update anomalies.

Entity-Relationship Model
Entity-Relationship (ER) Model is based on the notion of real-world entities
and relationships among them. While formulating real-world scenario into the
database model, the ER Model creates entity set, relationship set, general
attributes and constraints.

ER Model is best used for the conceptual design of a database.

ER Model is based on −

 Entities and their attributes.

 Relationships among entities.

These concepts are explained below.


 Entity − An entity in an ER Model is a real-world entity having properties
called attributes. Every attribute is defined by its set of values called domain.
For example, in a school database, a student is considered as an entity. Student
has various attributes like name, age, class, etc.

 Relationship − The logical association among entities is called a relationship.
Relationships are mapped with entities in various ways. Mapping cardinalities
define the number of associations between two entities.

Mapping cardinalities −

o one to one

o one to many

o many to one

o many to many

Relational Model
The most popular data model in DBMS is the Relational Model. It is a more
scientific model than the others. This model is based on first-order predicate
logic and defines a table as an n-ary relation.
The main highlights of this model are −

 Data is stored in tables called relations.

 Relations can be normalized.

 In normalized relations, values saved are atomic values.

 Each row in a relation contains a unique value.

 Each column in a relation contains values from the same domain.

Concepts of Relational Model


Relational data model is the primary data model, which is used widely around
the world for data storage and processing. This model is simple and it has all
the properties and capabilities required to process data with storage
efficiency.

Concepts
Tables − In the relational data model, relations are saved in the format of
tables. This format stores the relation among entities. A table has rows and
columns, where rows represent records and columns represent the
attributes.
Tuple − A single row of a table, which contains a single record for that
relation is called a tuple.

Relation instance − A finite set of tuples in the relational database system
represents a relation instance. Relation instances do not have duplicate tuples.

Relation schema − A relation schema describes the relation name (table
name), attributes, and their names.

Relation key − Each row has one or more attributes, known as relation key,
which can identify the row in the relation (table) uniquely.

Attribute domain − Every attribute has some pre-defined value scope,
known as the attribute domain.

Constraints
Every relation has some conditions that must hold for it to be a valid relation.
These conditions are called Relational Integrity Constraints. There are
three main integrity constraints −

 Key constraints

 Domain constraints

 Referential integrity constraints

Key Constraints
There must be at least one minimal subset of attributes in the relation which
can identify a tuple uniquely. This minimal subset of attributes is
called the key for that relation. If there is more than one such minimal subset,
they are called candidate keys.

Key constraints enforce that −

 in a relation with a key attribute, no two tuples can have identical values for the
key attributes.

 a key attribute cannot have NULL values.

Key constraints are also referred to as Entity Constraints.

Domain Constraints
Attributes have specific values in real-world scenario. For example, age can
only be a positive integer. The same constraints have been tried to employ
on the attributes of a relation. Every attribute is bound to have a specific
range of values. For example, age cannot be less than zero and telephone
numbers cannot contain a digit outside 0-9.

Referential integrity Constraints

Referential integrity constraints work on the concept of Foreign Keys. A
foreign key is a key attribute of a relation that can be referred to in another
relation.

The referential integrity constraint states that if a relation refers to a key attribute
of a different or the same relation, then that key element must exist.
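The three constraints above can be sketched as checks over toy relations in Python; the student and enrolment data below are hypothetical, and relations are modeled as lists of dicts purely for illustration.

```python
# Toy relations (hypothetical sample data).
students = [
    {"stu_id": 1, "name": "Aman", "age": 20},
    {"stu_id": 2, "name": "Kapil", "age": 21},
]
enrolments = [
    {"stu_id": 1, "course": "DBMS"},
    {"stu_id": 2, "course": "Maths"},
]

def key_constraint_holds(relation, key):
    """Key constraint: no two tuples may share the same key value."""
    values = [t[key] for t in relation]
    return len(values) == len(set(values))

def domain_constraint_holds(relation, attr, predicate):
    """Domain constraint: every value of attr must lie in its domain."""
    return all(predicate(t[attr]) for t in relation)

def referential_integrity_holds(referencing, referenced, fk, pk):
    """Referential integrity: every foreign-key value must exist as a key."""
    pk_values = {t[pk] for t in referenced}
    return all(t[fk] in pk_values for t in referencing)

print(key_constraint_holds(students, "stu_id"))                    # True
print(domain_constraint_holds(students, "age", lambda a: a >= 0))  # True
print(referential_integrity_holds(enrolments, students,
                                  "stu_id", "stu_id"))             # True
```

A real DBMS performs exactly these checks, but declaratively, on every insert, update, and delete.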

Relational Algebra
Relational algebra is a procedural query language, which takes instances of
relations as input and yields instances of relations as output. It uses operators
to perform queries. An operator can be either unary or binary. They accept
relations as their input and yield relations as their output. Relational algebra
is performed recursively on a relation and intermediate results are also
considered relations.

The fundamental operations of relational algebra are as follows −

 Select

 Project

 Union

 Set difference

 Cartesian product

 Rename

We will discuss all these operations in the following sections.

Select Operation (σ)


It selects tuples that satisfy the given predicate from a relation.

Notation − σp(r)

Where σ stands for the selection predicate and r stands for the relation. p is a
propositional logic formula which may use connectors like and, or, and not.
These terms may use relational operators like − =, ≠, ≥, <, >, ≤.
For example −

σsubject = "database"(Books)

Output − Selects tuples from books where subject is 'database'.

σsubject = "database" and price = "450"(Books)

Output − Selects tuples from books where subject is 'database' and 'price'
is 450.

σsubject = "database" and price = "450" or year > "2010"(Books)

Output − Selects tuples from books where subject is 'database' and 'price'
is 450 or those books published after 2010.
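The select operation can be sketched in Python over a relation modeled as a list of dicts; the Books relation and its rows below are hypothetical sample data, and the predicate plays the role of p.

```python
# Hypothetical Books relation.
Books = [
    {"title": "DB Concepts", "subject": "database", "price": 450, "year": 2008},
    {"title": "Algebra 101", "subject": "maths",    "price": 300, "year": 2012},
    {"title": "SQL Primer",  "subject": "database", "price": 450, "year": 2015},
]

def select(relation, predicate):
    """sigma_p(r): keep exactly the tuples for which the predicate holds."""
    return [t for t in relation if predicate(t)]

result = select(Books, lambda t: t["subject"] == "database" and t["price"] == 450)
print([t["title"] for t in result])  # ['DB Concepts', 'SQL Primer']
```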

Project Operation (∏)


It projects column(s) that satisfy a given predicate.

Notation − ∏A1, A2, ..., An (r)

Where A1, A2, ..., An are attribute names of relation r.

Duplicate rows are automatically eliminated, as relation is a set.

For example −

∏subject, author (Books)

Selects and projects columns named as subject and author from the relation
Books.
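A project operation can be sketched the same way; building the result as a set of attribute tuples makes the automatic duplicate elimination visible. The Books rows are hypothetical.

```python
def project(relation, *attrs):
    """pi_{A1..An}(r): keep only the named columns; duplicates vanish
    because the result is built as a set of attribute tuples."""
    return {tuple(t[a] for a in attrs) for t in relation}

# Hypothetical Books relation, with a deliberate duplicate row.
Books = [
    {"subject": "database", "author": "Korth"},
    {"subject": "database", "author": "Korth"},
    {"subject": "maths",    "author": "Strang"},
]

print(sorted(project(Books, "subject", "author")))
# [('database', 'Korth'), ('maths', 'Strang')]
```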

Union Operation (∪)


It performs binary union between two given relations and is defined as −

r ∪ s = { t | t ∈ r or t ∈ s}

Notation − r U s

Where r and s are either database relations or relation result set (temporary
relation).

For a union operation to be valid, the following conditions must hold −

 r and s must have the same number of attributes.

 Attribute domains must be compatible.


 Duplicate tuples are automatically eliminated.

∏ author (Books) ∪ ∏ author (Articles)

Output − Projects the names of the authors who have either written a book
or an article or both.

Set Difference (−)


The result of the set difference operation is the set of tuples that are present in
one relation but not in the second relation.

Notation − r − s

Finds all the tuples that are present in r but not in s.

∏ author (Books) − ∏ author (Articles)

Output − Provides the name of authors who have written books but not
articles.
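Since relations are sets, union and set difference map directly onto Python set operators; the author names below are hypothetical stand-ins for ∏ author (Books) and ∏ author (Articles).

```python
# Hypothetical projected author sets.
book_authors    = {"Korth", "Date", "Elmasri"}
article_authors = {"Date", "Navathe"}

# r ∪ s: authors who have written a book or an article or both.
print(sorted(book_authors | article_authors))
# ['Date', 'Elmasri', 'Korth', 'Navathe']

# r − s: authors who have written books but not articles.
print(sorted(book_authors - article_authors))
# ['Elmasri', 'Korth']
```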

Cartesian Product (Χ)


Combines information of two different relations into one.

Notation − r Χ s

Where r and s are relations and their output will be defined as −

r Χ s = { q t | q ∈ r and t ∈ s}

σauthor = 'tutorialspoint'(Books Χ Articles)

Output − Yields a relation, which shows all the books and articles written by
tutorialspoint.

Rename Operation (ρ)


The results of relational algebra are also relations, but without any name. The
rename operation allows us to rename the output relation. The 'rename' operation
is denoted by the lowercase Greek letter rho (ρ).

Notation − ρ x (E)

Where the result of expression E is saved under the name x.

Additional operations are −

 Set intersection
 Assignment

 Natural join

Relational Calculus
In contrast to Relational Algebra, Relational Calculus is a non-procedural
query language; that is, it tells what to do but never explains how to do it.

Relational calculus exists in two forms −

Tuple Relational Calculus (TRC)


The filtering variable ranges over tuples.

Notation − {T | Condition}

Returns all tuples T that satisfy a condition.

For example −
{ T.name | Author(T) AND T.article = 'database' }

Output − Returns tuples with 'name' from Author who has written article on
'database'.

TRC can be quantified. We can use Existential (∃) and Universal Quantifiers
(∀).

For example −
{ R| ∃T ∈ Authors(T.article='database' AND R.name=T.name)}

Output − The above query will yield the same result as the previous one.
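A TRC expression reads much like a Python set comprehension: the variable ranges over tuples and the condition filters them. As a sketch, the query { T.name | Author(T) AND T.article = 'database' } could be written as below, with Author as a hypothetical relation.

```python
# Hypothetical Author relation.
Author = [
    {"name": "Ullman", "article": "database"},
    {"name": "Knuth",  "article": "algorithms"},
]

# { T.name | Author(T) AND T.article = 'database' }
result = {T["name"] for T in Author if T["article"] == "database"}
print(result)  # {'Ullman'}
```

The comprehension only *resembles* TRC, of course; it is an evaluation strategy, whereas the calculus merely states the condition.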

Domain Relational Calculus (DRC)


In DRC, the filtering variable uses the domain of attributes instead of entire
tuple values (as done in TRC, mentioned above).

Notation −

{ a1, a2, a3, ..., an | P (a1, a2, a3, ... ,an)}

Where a1, a2 are attributes and P stands for formulae built by inner
attributes.

For example −
{< article, page, subject > | < article, page, subject > ∈ TutorialsPoint ∧ subject = 'database'}
Output − Yields Article, Page, and Subject from the relation TutorialsPoint,
where subject is database.

Just like TRC, DRC can also be written using existential and universal
quantifiers. DRC also involves relational operators.

The expression power of Tuple Relational Calculus and Domain Relational
Calculus is equivalent to Relational Algebra.

Unit 2
Introduction to SQL Queries
SQL is Structured Query Language, which is a computer language for storing,
manipulating and retrieving data stored in a relational database.

SQL is the standard language for relational database systems. All
Relational Database Management Systems (RDBMS) like MySQL, MS Access,
Oracle, Sybase, Informix, Postgres and SQL Server use SQL as their standard
database language.

Also, they use different dialects, such as −

 MS SQL Server uses T-SQL,

 Oracle uses PL/SQL,

 MS Access's version of SQL is called JET SQL (native format), etc.

SQL is widely popular because it offers the following advantages −

 Allows users to access data in the relational database management systems.

 Allows users to describe the data.

 Allows users to define the data in a database and manipulate that data.

 Allows SQL to be embedded within other languages using SQL modules, libraries & pre-
compilers.

 Allows users to create and drop databases and tables.

 Allows users to create view, stored procedure, functions in a database.

 Allows users to set permissions on tables, procedures and views.


SQL Process
When you are executing an SQL command for any RDBMS, the system
determines the best way to carry out your request and SQL engine figures
out how to interpret the task.

There are various components included in this process.

These components are −

 Query Dispatcher

 Optimization Engines

 Classic Query Engine

 SQL Query Engine, etc.

A classic query engine handles all the non-SQL queries, but a SQL query
engine won't handle logical files.

SQL commands are mainly categorized into four categories, as discussed below:
1. DDL (Data Definition Language): DDL or Data Definition Language consists
of the SQL commands that can be used to define the database schema. It
deals with descriptions of the database schema and is used to create and modify
the structure of database objects in the database.

Examples of DDL commands:

 CREATE – is used to create the database or its objects (like tables, indexes, functions,
views, stored procedures and triggers).

Syntax:
CREATE TABLE table_name
(
column1 data_type(size),

column2 data_type(size),

column3 data_type(size),

....

);

table_name: name of the table.

column1: name of the first column.

data_type: type of data we want to store in the particular column.

For example, int for integer data.

size: size of the data we can store in a particular column. For example, if for

a column we specify the data_type as int and the size as 10, then this column can store an integer

number of at most 10 digits.

Example Query:
This query will create a table named Students with three columns, ROLL_NO, NAME
and SUBJECT.
CREATE TABLE Students
(
ROLL_NO int(3),
NAME varchar(20),

SUBJECT varchar(20)

);
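If you want to try this statement, a quick way is Python's built-in sqlite3 module; SQLite accepts the syntax (though it ignores declared column sizes), and the inserted row below is made up for illustration.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Students (
        ROLL_NO int(3),
        NAME    varchar(20),
        SUBJECT varchar(20)
    )
""")

# Insert a sample row and read it back to confirm the table exists.
conn.execute("INSERT INTO Students VALUES (1, 'Ram', 'DBMS')")
for row in conn.execute("SELECT * FROM Students"):
    print(row)  # (1, 'Ram', 'DBMS')
```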

 DROP – is used to delete objects from the database.

Syntax:
DROP object object_name

Examples:

DROP TABLE table_name;

table_name: Name of the table to be deleted.

DROP DATABASE database_name;

database_name: Name of the database to be deleted.

 ALTER-is used to alter the structure of the database.


 TRUNCATE – is used to remove all records from a table, including all space
allocated for the records.

Syntax:

TRUNCATE TABLE table_name;

table_name: Name of the table to be truncated.

 COMMENT –is used to add comments to the data dictionary.


 Syntax:

 SELECT * FROM Customers; /* fetch all customers */

 RENAME –is used to rename an object existing in the database.


 Syntax(Oracle):
 ALTER TABLE table_name

 RENAME COLUMN old_name TO new_name;


 DML (Data Manipulation Language): The SQL commands that deal with the
manipulation of data present in the database belong to DML or Data Manipulation
Language, and this includes most of the SQL statements.

Examples of DML:

o SELECT – is used to retrieve data from a database.


o INSERT – is used to insert data into a table.
o UPDATE – is used to update existing data within a table.
o DELETE – is used to delete records from a database table.
 DCL (Data Control Language): DCL includes commands such as GRANT and REVOKE,
which mainly deal with the rights, permissions and other controls of the database
system.

Examples of DCL commands:

o GRANT – gives users access privileges to the database.


o REVOKE – withdraws users' access privileges given by using the GRANT command.
 TCL (Transaction Control Language): TCL commands deal with transactions
within the database.

Examples of TCL commands:

o COMMIT – commits a transaction.


o ROLLBACK – rolls back a transaction in case an error occurs.
o SAVEPOINT – sets a savepoint within a transaction.
o SET TRANSACTION – specifies characteristics for the transaction.
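The effect of COMMIT and ROLLBACK can be demonstrated with Python's sqlite3 module; the accounts table and the amounts below are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('A', 100), ('B', 50)")
conn.commit()  # COMMIT: the inserts are now permanent

try:
    # Start changing data inside a transaction...
    conn.execute("UPDATE accounts SET balance = balance - 500 WHERE name = 'A'")
    raise RuntimeError("transfer failed")  # simulate an error mid-transaction
except RuntimeError:
    conn.rollback()  # ROLLBACK: undo the uncommitted update

balance = conn.execute(
    "SELECT balance FROM accounts WHERE name = 'A'").fetchone()[0]
print(balance)  # 100 -- the failed transaction left no trace
```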
INTEGRITY CONSTRAINTS OVER RELATION
 Database integrity refers to the validity and consistency of stored data. Integrity is
usually expressed in terms of constraints, which are consistency rules that the
database is not permitted to violate. Constraints may apply to each attribute or
they may apply to relationships between tables.
 Integrity constraints ensure that changes (update deletion, insertion) made to the
database by authorized users do not result in a loss of data consistency. Thus,
integrity constraints guard against accidental damage to the database.
EXAMPLE- A blood group must be ‘A’ or ‘B’ or ‘AB’ or ‘O’ only (no other
values are allowed).

TYPES OF INTEGRITY CONSTRAINTS


Various types of integrity constraints are-
1. Domain Integrity
2. Entity Integrity Constraint
3. Referential Integrity Constraint
4. Key Constraints

1. Domain Integrity-
Domain integrity means defining a valid set of values for an attribute. You define the
data type, the length or size, whether a null value is allowed, whether the value must be
unique, the default value, the range (values in between) and/or specific values for the attribute.
2. Entity Integrity Constraint-
This rule states that in any database relation, the value of a primary key attribute can't be
null.
EXAMPLE- Consider a relation "STUDENT" where "Stu_id" is a primary key; it
must not contain any null value, whereas other attributes, e.g.
"Branch", may contain null values.

Stu_id Name Branch

11255234 Aman CSE

11255369 Kapil ECE

11255324 Ajay ME

11255237 Raman CSE

11255678 Aastha ECE

3.Referential Integrity Constraint-


It states that if a foreign key exists in a relation then either the foreign key value must
match a primary key value of some tuple in its home relation or the foreign key value
must be null.
The rules are:

1. You can't delete a record from a primary table if matching records exist in a related table.
2. You can't change a primary key value in the primary table if that record has related
records.
3. You can't enter a value in the foreign key field of the related table that doesn't exist in the
primary key of the primary table.
4. However, you can enter a Null value in the foreign key, specifying that the records are
unrelated.

EXAMPLE-
Consider 2 relations "stu" and "stu_1" Where "Stu_id " is the primary key in the "stu"
relation and foreign key in the "stu_1" relation.
Relation "stu"

Stu_id Name Branch

11255234 Aman CSE

11255369 Kapil ECE

11255324 Ajay ME

11255237 Raman CSE

11255678 Aastha ECE

Relation "stu_1"

Stu_id Course Duration

11255234 B TECH 4 years

11255369 B TECH 4 years

11255324 B TECH 4 years

11255237 B TECH 4 years

11255678 B TECH 4 years

Examples
Rule 1. You can't delete any of the rows in the "stu" relation that are visible, since all the
"Stu_id" values are in use in the "stu_1" relation.
Rule 2. You can't change any of the "Stu_id" values in the "stu" relation, since all the "Stu_id"
values are in use in the "stu_1" relation.
Rule 3. The values that you can enter in the "Stu_id" field of the "stu_1" relation must be
in the "Stu_id" field of the "stu" relation.
Rule 4. You can enter a null value in the "stu_1" relation if the records are unrelated.
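These rules can be observed in practice with SQLite via Python's sqlite3 module. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`; the data mirrors the stu/stu_1 example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.execute("CREATE TABLE stu (Stu_id INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("""
    CREATE TABLE stu_1 (
        Stu_id INTEGER REFERENCES stu(Stu_id),
        Course TEXT
    )
""")
conn.execute("INSERT INTO stu VALUES (11255234, 'Aman')")

# Rule 3: the referenced key value exists, so this insert is accepted.
conn.execute("INSERT INTO stu_1 VALUES (11255234, 'B TECH')")
# Rule 4: a NULL foreign key marks the record as unrelated.
conn.execute("INSERT INTO stu_1 VALUES (NULL, 'B TECH')")

# Violating rule 3: 99999999 does not exist in stu, so the DBMS rejects it.
try:
    conn.execute("INSERT INTO stu_1 VALUES (99999999, 'B TECH')")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```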
4.Key Constraints-
A Key Constraint is a statement that a certain minimal subset of the fields of a relation is
a unique identifier for a tuple. The types of key constraints-

1. Primary key constraints


2. Unique key constraints
3. Foreign Key constraints
4. NOT NULL constraints
5. Check constraints

1. Primary key constraints


Primary key is the term used to identify one or more columns in a table that make a row
of data unique. Although the primary key typically consists of one column in a table,
more than one column can comprise the primary key.
For example, either the employee's Social Security number or an assigned employee
identification number is the logical primary key for an employee table. The objective is
for every record to have a unique primary key or value for the employee's identification
number. Because there is probably no need to have more than one record for each
employee in an employee table, the employee identification number makes a logical
primary key. The primary key is assigned at table creation.
The following example identifies the EMP_ID column as the PRIMARY KEY for the
EMPLOYEES table:

CREATE TABLE EMPLOYEE_TBL


(EMP_ID CHAR(9) NOT NULL PRIMARY KEY,
EMP_NAME VARCHAR (40) NOT NULL,
EMP_ST_ADDR VARCHAR (20) NOT NULL,
EMP_CITY VARCHAR (15) NOT NULL,
EMP_ST CHAR(2) NOT NULL,
EMP_ZIP INTEGER(5) NOT NULL,
EMP_PHONE INTEGER(10) NULL,
EMP_PAGER INTEGER(10) NULL);

2. Unique Constraints
A unique column constraint in a table is similar to a primary key in that the value in that
column for every row of data in the table must have a unique value. Although a primary
key constraint is placed on one column, you can place a unique constraint on another
column even though it is not actually for use as the primary key.

CREATE TABLE EMPLOYEE_TBL


(EMP_ID CHAR(9) NOT NULL PRIMARY KEY,
EMP_NAME VARCHAR (40) NOT NULL,
EMP_ST_ADDR VARCHAR (20) NOT NULL,
EMP_CITY VARCHAR (15) NOT NULL,
EMP_ST CHAR(2) NOT NULL,
EMP_ZIP INTEGER(5) NOT NULL,
EMP_PHONE INTEGER(10) NULL UNIQUE,
EMP_PAGER INTEGER(10) NULL)

3. Foreign Key Constraints


A foreign key is a column in a child table that references a primary key in the parent
table. A foreign key constraint is the main mechanism used to enforce referential
integrity between tables in a relational database. A column defined as a foreign key is
used to reference a column defined as a primary key in another table.

CREATE TABLE EMPLOYEE_PAY_TBL


(EMP_ID CHAR(9) NOT NULL,
POSITION VARCHAR2(15) NOT NULL,
DATE_HIRE DATE NULL,
PAY_RATE NUMBER(4,2) NOT NULL,
DATE_LAST_RAISE DATE NULL,
CONSTRAINT EMP_ID_FK FOREIGN KEY (EMP_ID)
REFERENCES EMPLOYEE_TBL (EMP_ID));

4. NOT NULL Constraints


Previous examples use the keywords NULL and NOT NULL listed on the same line as
each column and after the data type. NOT NULL is a constraint that you can place on a
table's column. This constraint disallows the entrance of NULL values into a column; in
other words, data is required in a NOT NULL column for each row of data in the table.
NULL is generally the default for a column if NOT NULL is not specified, allowing NULL
values in a column.
5. Check Constraints
Check (CHK) constraints can be utilized to check the validity of data entered into
particular table columns. Check constraints are used to provide back-end database
edits, although edits are commonly found in the front-end application as well. General
edits restrict values that can be entered into columns or objects, whether within the
database itself or on a front-end application. The check constraint is a way of providing
another protective layer for the data.

CREATE TABLE EMPLOYEE_TBL


(EMP_ID CHAR(9) NOT NULL,
EMP_NAME VARCHAR2(40) NOT NULL,
EMP_ST_ADDR VARCHAR2(20) NOT NULL,
EMP_CITY VARCHAR2(15) NOT NULL,
EMP_ST CHAR(2) NOT NULL,
EMP_ZIP NUMBER(5) NOT NULL,
EMP_PHONE NUMBER(10) NULL,
EMP_PAGER NUMBER(10) NULL,
PRIMARY KEY (EMP_ID),
CONSTRAINT CHK_EMP_ZIP CHECK (EMP_ZIP = '46234'));
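A check constraint like CHK_EMP_ZIP can be exercised with Python's sqlite3 module; the schema below is trimmed to the relevant columns and the sample values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EMPLOYEE_TBL (
        EMP_ID  TEXT PRIMARY KEY,
        EMP_ZIP TEXT NOT NULL,
        CONSTRAINT CHK_EMP_ZIP CHECK (EMP_ZIP = '46234')
    )
""")

# This row satisfies the check and is accepted.
conn.execute("INSERT INTO EMPLOYEE_TBL VALUES ('E1', '46234')")

# This row violates the check, so the back end rejects it.
try:
    conn.execute("INSERT INTO EMPLOYEE_TBL VALUES ('E2', '90210')")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```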

JOINS

Join is a combination of a Cartesian product followed by a selection process.
A Join operation pairs two tuples from different relations, if and only if a given
join condition is satisfied.

We will briefly describe various join types in the following sections.

Theta (θ) Join


Theta join combines tuples from different relations provided they satisfy the
theta condition. The join condition is denoted by the symbol θ.

Notation
R1 ⋈θ R2
R1 and R2 are relations having attributes (A1, A2, .., An) and (B1, B2,.. ,Bn)
such that the attributes don’t have anything in common, that is R1 ∩ R2 =
Φ.

Theta join can use all kinds of comparison operators.

Student

SID Name Std

101 Alex 10

102 Maria 11

Subjects

Class Subject

10 Math

10 English

11 Music

11 Sports

Student_Detail −

STUDENT ⋈Student.Std = Subject.Class SUBJECT

Student_detail

SID Name Std Class Subject


101 Alex 10 10 Math

101 Alex 10 10 English

102 Maria 11 11 Music

102 Maria 11 11 Sports
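The Student_Detail result above can be reproduced by treating the theta join exactly as this section defines it: a Cartesian product followed by a selection on the join condition. The dict-of-tuples representation is a sketch, not a DBMS implementation.

```python
# Relations from the example above.
Student  = [{"SID": 101, "Name": "Alex",  "Std": 10},
            {"SID": 102, "Name": "Maria", "Std": 11}]
Subjects = [{"Class": 10, "Subject": "Math"},
            {"Class": 10, "Subject": "English"},
            {"Class": 11, "Subject": "Music"},
            {"Class": 11, "Subject": "Sports"}]

def theta_join(r, s, condition):
    """Cartesian product of r and s, then select pairs satisfying theta."""
    return [{**t1, **t2} for t1 in r for t2 in s if condition(t1, t2)]

detail = theta_join(Student, Subjects,
                    lambda t1, t2: t1["Std"] == t2["Class"])
for row in detail:
    print(row["Name"], row["Subject"])
# Alex Math / Alex English / Maria Music / Maria Sports
```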

Equijoin
When a theta join uses only the equality comparison operator, it is said to be an
equijoin. The above example corresponds to an equijoin.

Natural Join (⋈)


Natural join does not use any comparison operator. It does not concatenate
the way a Cartesian product does. We can perform a Natural Join only if there
is at least one common attribute that exists between two relations. In
addition, the attributes must have the same name and domain.

Natural join acts on those matching attributes where the values of attributes
in both the relations are same.

Courses

CID Course Dept

CS01 Database CS

ME01 Mechanics ME

EE01 Electronics EE

HoD
Dept Head

CS Alex

ME Maya

EE Mira

Courses ⋈ HoD

Dept CID Course Head

CS CS01 Database Alex

ME ME01 Mechanics Maya

EE EE01 Electronics Mira

Outer Joins
Theta Join, Equijoin, and Natural Join are called inner joins. An inner join
includes only those tuples with matching attributes and the rest are discarded
in the resulting relation. Therefore, we need to use outer joins to include all
the tuples from the participating relations in the resulting relation. There are
three kinds of outer joins − left outer join, right outer join, and full outer join.

Left Outer Join (R ⟕ S)


All the tuples from the Left relation, R, are included in the resulting relation.
If there are tuples in R without any matching tuple in the Right relation S,
then the S-attributes of the resulting relation are made NULL.

Left
A B

100 Database

101 Mechanics

102 Electronics

Right

A B

100 Alex

102 Maya

104 Mira

Left ⟕ Right

A B C D

100 Database 100 Alex

101 Mechanics --- ---

102 Electronics 102 Maya


Right Outer Join: (R ⟖ S)
All the tuples from the Right relation, S, are included in the resulting relation.
If there are tuples in S without any matching tuple in R, then the R-attributes
of resulting relation are made NULL.

Left ⟖ Right

A B C D

100 Database 100 Alex

102 Electronics 102 Maya

--- --- 104 Mira

Full Outer Join: (R ⟗ S)


All the tuples from both participating relations are included in the resulting
relation. If there are no matching tuples for both relations, their respective
unmatched attributes are made NULL.

Left ⟗ Right

A B C D

100 Database 100 Alex

101 Mechanics --- ---

102 Electronics 102 Maya

--- --- 104 Mira
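The left outer join above can be reproduced the same way. Note that LEFT and RIGHT are SQL keywords, so this sketch names the tables Left_ and Right_ (a naming choice of the illustration, not of the original text):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Left_  (A INTEGER, B TEXT);
CREATE TABLE Right_ (A INTEGER, B TEXT);
INSERT INTO Left_  VALUES (100,'Database'), (101,'Mechanics'), (102,'Electronics');
INSERT INTO Right_ VALUES (100,'Alex'), (102,'Maya'), (104,'Mira');
""")

# Left outer join: every Left_ tuple appears in the result; where there is
# no matching Right_ tuple, the Right_ attributes come back as NULL (None).
left_join = con.execute("""
    SELECT l.A, l.B, r.A, r.B
    FROM Left_ l LEFT OUTER JOIN Right_ r ON l.A = r.A
    ORDER BY l.A
""").fetchall()
for row in left_join:
    print(row)
```

Row 101 ('Mechanics') has no match in Right_, so its right-hand attributes are NULL, exactly as in the table above.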

VIEWS
A view is nothing more than a SQL statement that is stored in the database
with an associated name. A view is a predefined SQL query presented in the
form of a table.

A view can contain all rows of a table or only selected rows from a table. A
view can be created from one or many tables, depending on the SQL query
written to create it.

Views, which are a type of virtual table, allow users to do the following −

 Structure data in a way that users or classes of users find natural or intuitive.

 Restrict access to the data in such a way that a user can see and (sometimes)
modify exactly what they need and no more.

 Summarize data from various tables which can be used to generate reports.

Creating Views
Database views are created using the CREATE VIEW statement. Views can
be created from a single table, multiple tables or another view.

To create a view, a user must have the appropriate system privilege
according to the specific implementation.

The basic CREATE VIEW syntax is as follows −


CREATE VIEW view_name AS
SELECT column1, column2.....
FROM table_name
WHERE [condition];

You can include multiple tables in your SELECT statement in a similar way as
you use them in a normal SQL SELECT query.

Example
Consider the CUSTOMERS table having the following records −

+----+----------+-----+-----------+----------+

| ID | NAME | AGE | ADDRESS | SALARY |

+----+----------+-----+-----------+----------+

| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |

| 2 | Khilan | 25 | Delhi | 1500.00 |


| 3 | kaushik | 23 | Kota | 2000.00 |

| 4 | Chaitali | 25 | Mumbai | 6500.00 |

| 5 | Hardik | 27 | Bhopal | 8500.00 |

| 6 | Komal | 22 | MP | 4500.00 |

| 7 | Muffy | 24 | Indore | 10000.00 |

+----+----------+-----+-----------+----------+

Following is an example to create a view from the CUSTOMERS table. This
view selects the customer name and age from the CUSTOMERS table.

SQL > CREATE VIEW CUSTOMERS_VIEW AS

SELECT name, age

FROM CUSTOMERS;

Now, you can query CUSTOMERS_VIEW in a similar way as you query an
actual table. Following is an example for the same.

SQL > SELECT * FROM CUSTOMERS_VIEW;

This would produce the following result.


+----------+-----+
| name | age |
+----------+-----+
| Ramesh | 32 |
| Khilan | 25 |
| kaushik | 23 |
| Chaitali | 25 |
| Hardik | 27 |
| Komal | 22 |
| Muffy | 24 |
+----------+-----+
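A minimal, runnable sketch of creating and querying a view, using SQLite from Python with a trimmed-down CUSTOMERS table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE CUSTOMERS (ID INTEGER, NAME TEXT, AGE INTEGER, ADDRESS TEXT, SALARY REAL);
INSERT INTO CUSTOMERS VALUES
  (1,'Ramesh',32,'Ahmedabad',2000.00),
  (2,'Khilan',25,'Delhi',1500.00),
  (3,'kaushik',23,'Kota',2000.00);
-- The view is just a stored, named SELECT over the base table
CREATE VIEW CUSTOMERS_VIEW AS SELECT name, age FROM CUSTOMERS;
""")

# Query the view exactly as if it were a table
result = con.execute("SELECT * FROM CUSTOMERS_VIEW").fetchall()
print(result)
```

The view exposes only the name and age columns; the other base-table columns are hidden from anyone querying CUSTOMERS_VIEW.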

The WITH CHECK OPTION


The WITH CHECK OPTION is a CREATE VIEW statement option. The purpose
of the WITH CHECK OPTION is to ensure that all UPDATE and INSERTs satisfy
the condition(s) in the view definition.

If they do not satisfy the condition(s), the UPDATE or INSERT returns an
error.
The following code block has an example of creating same view
CUSTOMERS_VIEW with the WITH CHECK OPTION.

CREATE VIEW CUSTOMERS_VIEW AS

SELECT name, age

FROM CUSTOMERS

WHERE age IS NOT NULL

WITH CHECK OPTION;

The WITH CHECK OPTION in this case denies the entry of any NULL values
into the view's AGE column, because the view is defined by data that does
not have a NULL value in the AGE column.

Updating a View
A view can be updated under certain conditions which are given below −

 The SELECT clause may not contain the keyword DISTINCT.

 The SELECT clause may not contain summary functions.

 The SELECT clause may not contain set functions.

 The SELECT clause may not contain set operators.

 The SELECT clause may not contain an ORDER BY clause.

 The FROM clause may not contain multiple tables.

 The WHERE clause may not contain subqueries.

 The query may not contain GROUP BY or HAVING.

 Calculated columns may not be updated.

 All NOT NULL columns from the base table must be included in the view in order
for the INSERT query to function.

So, if a view satisfies all the above-mentioned rules then you can update that
view. The following code block has an example to update the age of Ramesh.

SQL > UPDATE CUSTOMERS_VIEW

SET AGE = 35
WHERE name = 'Ramesh';

This would ultimately update the base table CUSTOMERS and the same would
reflect in the view itself. Now, try to query the base table and the SELECT
statement would produce the following result.
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 35 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+

Inserting Rows into a View


Rows of data can be inserted into a view. The same rules that apply to the
UPDATE command also apply to the INSERT command.

Here, we cannot insert rows into CUSTOMERS_VIEW because we have not
included all the NOT NULL columns in this view. Otherwise, you can insert
rows into a view in the same way as you insert them into a table.

Deleting Rows from a View


Rows of data can be deleted from a view. The same rules that apply to the
UPDATE and INSERT commands apply to the DELETE command.

Following is an example to delete a record having AGE = 22.

SQL > DELETE FROM CUSTOMERS_VIEW

WHERE age = 22;

This would ultimately delete a row from the base table CUSTOMERS and the
same would reflect in the view itself. Now, try to query the base table and
the SELECT statement would produce the following result.
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 35 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+

Dropping Views
Obviously, where you have a view, you need a way to drop the view if it is
no longer needed. The syntax is very simple and is given below −
DROP VIEW view_name;

Following is an example to drop the CUSTOMERS_VIEW created on the
CUSTOMERS table.

DROP VIEW CUSTOMERS_VIEW;

Types of Function
1. System Defined Function
These functions are defined by SQL Server for different purposes. There are two
types of system defined functions in SQL Server.

1. Scalar Function
Scalar functions operate on a single value and return a single value. Below is a
list of some useful SQL Server scalar functions.

System Scalar Functions

Scalar Function        Description

abs(-10.67)            Returns the absolute value of the given number, i.e. 10.67.

rand(10)               Returns a pseudo-random float value between 0 and 1,
                       using 10 as the seed.

round(17.56719, 3)     Rounds the given number to 3 decimal places, i.e. 17.567.

upper('dotnet')        Returns the upper case of the given string, i.e. 'DOTNET'.

lower('DOTNET')        Returns the lower case of the given string, i.e. 'dotnet'.

ltrim(' dotnet')       Removes the spaces from the left-hand side of the
                       ' dotnet' string.

convert(int, 15.56)    Converts the given float value to an integer, i.e. 15.
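Most of these scalar functions can be tried directly. The sketch below uses SQLite from Python; SQLite's scalar functions largely overlap with SQL Server's, but it spells the conversion as CAST rather than convert, and it has no seeded rand():

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Each scalar function maps a single input value to a single output value
row = con.execute("""
    SELECT abs(-10.67), round(17.56719, 3), upper('dotnet'),
           lower('DOTNET'), ltrim('   dotnet'), CAST(15.56 AS INTEGER)
""").fetchone()
print(row)
```

This returns the tuple of the six scalar results: the absolute value, the rounded number, the upper- and lower-cased strings, the left-trimmed string, and the truncated integer.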

2. Aggregate Function
Aggregate functions operate on a collection of values and return a single
value. Below is a list of some useful SQL Server aggregate functions.

System Aggregate Functions

Aggregate Function     Description

max()                  Returns the maximum value from a collection of values.

min()                  Returns the minimum value from a collection of values.

avg()                  Returns the average of all values in a collection.

count()                Returns the number of values in a collection.
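The aggregate functions can be sketched the same way, here over a small hypothetical Salaries table (SQLite from Python):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Salaries (amount REAL);
INSERT INTO Salaries VALUES (2000), (1500), (6500), (10000);
""")

# Each aggregate collapses the whole column into a single value
agg = con.execute(
    "SELECT max(amount), min(amount), avg(amount), count(amount) FROM Salaries"
).fetchone()
print(agg)
```

Unlike a scalar function, which runs once per value, each aggregate here consumes all four rows and produces one result.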

TRIGGERS
Triggers in PL/SQL are stored programs that are automatically executed, or
fired, when certain events occur. Triggers are, in fact, written to be executed
in response to any of the following events −

 A database manipulation (DML) statement (DELETE, INSERT, or UPDATE).

 A database definition (DDL) statement (CREATE, ALTER, or DROP).

 A database operation (SERVERERROR, LOGON, LOGOFF, STARTUP, or
SHUTDOWN).

Triggers can be defined on the table, view, schema, or database with which
the event is associated.

Benefits of Triggers
Triggers can be written for the following purposes −

 Generating some derived column values automatically

 Enforcing referential integrity

 Event logging and storing information on table access

 Auditing

 Synchronous replication of tables

 Imposing security authorizations

 Preventing invalid transactions


Creating Triggers
The syntax for creating a trigger is −

CREATE [OR REPLACE ] TRIGGER trigger_name

{BEFORE | AFTER | INSTEAD OF }

{INSERT [OR] | UPDATE [OR] | DELETE}

[OF col_name]

ON table_name

[REFERENCING OLD AS o NEW AS n]

[FOR EACH ROW]

WHEN (condition)

DECLARE

Declaration-statements

BEGIN

Executable-statements

EXCEPTION

Exception-handling-statements

END;

Where,

 CREATE [OR REPLACE] TRIGGER trigger_name − Creates or replaces an existing
trigger with the trigger_name.

 {BEFORE | AFTER | INSTEAD OF} − This specifies when the trigger will be
executed. The INSTEAD OF clause is used for creating trigger on a view.

 {INSERT [OR] | UPDATE [OR] | DELETE} − This specifies the DML operation.

 [OF col_name] − This specifies the column name that will be updated.

 [ON table_name] − This specifies the name of the table associated with the
trigger.

 [REFERENCING OLD AS o NEW AS n] − This allows you to refer to new and old
values for various DML statements, such as INSERT, UPDATE, and DELETE.
 [FOR EACH ROW] − This specifies a row-level trigger, i.e., the trigger will be
executed for each row being affected. Otherwise the trigger will execute just once
when the SQL statement is executed, which is called a table level trigger.

 WHEN (condition) − This provides a condition for rows for which the trigger would
fire. This clause is valid only for row-level triggers.

Example
To start with, we will be using the CUSTOMERS table we had created and
used in the previous chapters −
Select * from customers;

+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
+----+----------+-----+-----------+----------+

The following program creates a row-level trigger for the customers table
that would fire for INSERT or UPDATE or DELETE operations performed on
the CUSTOMERS table. This trigger will display the salary difference between
the old values and new values −

CREATE OR REPLACE TRIGGER display_salary_changes
BEFORE DELETE OR INSERT OR UPDATE ON customers
FOR EACH ROW
WHEN (NEW.ID > 0)
DECLARE
   sal_diff number;
BEGIN
   sal_diff := :NEW.salary - :OLD.salary;
   dbms_output.put_line('Old salary: ' || :OLD.salary);
   dbms_output.put_line('New salary: ' || :NEW.salary);
   dbms_output.put_line('Salary difference: ' || sal_diff);
END;
/

When the above code is executed at the SQL prompt, it produces the following
result −
Trigger created.

The following points need to be considered here −

 OLD and NEW references are not available for table-level triggers, rather you can
use them for record-level triggers.

 If you want to query the table in the same trigger, then you should use the AFTER
keyword, because triggers can query the table or change it again only after the
initial changes are applied and the table is back in a consistent state.

 The above trigger has been written in such a way that it will fire before any DELETE
or INSERT or UPDATE operation on the table, but you can write your trigger on a
single or multiple operations, for example BEFORE DELETE, which will fire
whenever a record will be deleted using the DELETE operation on the table.

Triggering a Trigger
Let us perform some DML operations on the CUSTOMERS table. Here is one
INSERT statement, which will create a new record in the table −

INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY)

VALUES (7, 'Kriti', 22, 'HP', 7500.00 );

When a record is created in the CUSTOMERS table, the above trigger,
display_salary_changes, will be fired and it will display the following
result −
Old salary:
New salary: 7500
Salary difference:

Because this is a new record, old salary is not available and the above result
comes as null. Let us now perform one more DML operation on the
CUSTOMERS table. The UPDATE statement will update an existing record in
the table −

UPDATE customers
SET salary = salary + 500

WHERE id = 2;

When a record is updated in the CUSTOMERS table, the above trigger,
display_salary_changes, will be fired and it will display the following
result −
Old salary: 1500
New salary: 2000
Salary difference: 500
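The trigger above is Oracle PL/SQL, but the same idea can be sketched in SQLite from Python: a row-level AFTER UPDATE trigger that records the old salary, the new salary, and their difference in a hypothetical salary_log table (the audit table is an assumption of this sketch, since SQLite has no dbms_output):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customers (id INTEGER, name TEXT, salary REAL);
CREATE TABLE salary_log (id INTEGER, old_salary REAL, new_salary REAL, diff REAL);

-- Row-level trigger: fires once per updated row, with access to OLD and NEW values
CREATE TRIGGER display_salary_changes
AFTER UPDATE OF salary ON customers
FOR EACH ROW
BEGIN
    INSERT INTO salary_log
    VALUES (NEW.id, OLD.salary, NEW.salary, NEW.salary - OLD.salary);
END;

INSERT INTO customers VALUES (2, 'Khilan', 1500.00);
UPDATE customers SET salary = salary + 500 WHERE id = 2;
""")

log = con.execute("SELECT * FROM salary_log").fetchall()
print(log)
```

The UPDATE fires the trigger automatically; the log row captures the same old/new/difference values the PL/SQL example prints.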

UNIT 3

ER Model - Basic Concepts


The ER model defines the conceptual view of a database. It works around
real-world entities and the associations among them. At view level, the ER
model is considered a good option for designing databases.

Entity
An entity can be a real-world object, either animate or inanimate, that can
be easily identifiable. For example, in a school database, students, teachers,
classes, and courses offered can be considered as entities. All these entities
have some attributes or properties that give them their identity.

An entity set is a collection of similar types of entities. An entity set may
contain entities with attributes sharing similar values. For example, a Students
set may contain all the students of a school; likewise a Teachers set may
contain all the teachers of a school from all faculties. Entity sets need not be
disjoint.

Attributes
Entities are represented by means of their properties, called attributes. All
attributes have values. For example, a student entity may have name, class,
and age as attributes.

There exists a domain or range of values that can be assigned to attributes.
For example, a student's name cannot be a numeric value. It has to be
alphabetic. A student's age cannot be negative, etc.
Types of Attributes
 Simple attribute − Simple attributes are atomic values, which cannot be divided
further. For example, a student's phone number is an atomic value of 10 digits.

 Composite attribute − Composite attributes are made of more than one simple
attribute. For example, a student's complete name may have first_name and
last_name.

 Derived attribute − Derived attributes are the attributes that do not exist in the
physical database, but their values are derived from other attributes present in
the database. For example, average_salary in a department should not be saved
directly in the database; instead, it can be derived. As another example, age can
be derived from date_of_birth.

 Single-value attribute − Single-value attributes contain a single value. For
example − Social_Security_Number.

 Multi-value attribute − Multi-value attributes may contain more than one
value. For example, a person can have more than one phone number,
email_address, etc.

These attribute types can combine in the following ways −

 simple single-valued attributes

 simple multi-valued attributes

 composite single-valued attributes

 composite multi-valued attributes

Entity-Set and Keys
A key is an attribute or a collection of attributes that uniquely identifies an
entity within an entity set.

For example, the roll_number of a student makes him/her identifiable among
students.

 Super Key − A set of attributes (one or more) that collectively identifies an entity
in an entity set.

 Candidate Key − A minimal super key is called a candidate key. An entity set
may have more than one candidate key.
 Primary Key − A primary key is one of the candidate keys chosen by the
database designer to uniquely identify the entity set.

Relationship
The association among entities is called a relationship. For example, an
employee works_at a department, a student enrolls in a course. Here,
Works_at and Enrolls are called relationships.

Relationship Set
A set of relationships of similar type is called a relationship set. Like entities,
a relationship too can have attributes. These attributes are
called descriptive attributes.

Degree of Relationship
The number of participating entities in a relationship defines the degree of
the relationship.

 Binary = degree 2

 Ternary = degree 3

 n-ary = degree n

Mapping Cardinalities
Cardinality defines the number of entities in one entity set that can be
associated with entities of the other set via a relationship set.

 One-to-one − One entity from entity set A can be associated with at most one
entity of entity set B and vice versa.

 One-to-many − One entity from entity set A can be associated with more than
one entity of entity set B; however, an entity from entity set B can be associated
with at most one entity of entity set A.
 Many-to-one − More than one entity from entity set A can be associated with
at most one entity of entity set B; however, an entity from entity set B can be
associated with more than one entity from entity set A.

 Many-to-many − One entity from A can be associated with more than one entity
from B and vice versa.

Dependencies

A dependency in DBMS is a relation between two or more attributes. Dependencies are of the
following types in DBMS:

1. Functional Dependency
2. Fully-Functional Dependency
3. Transitive Dependency
4. Multivalued Dependency
5. Partial Dependency

Let us start with Functional Dependency:

Functional Dependency

If one piece of information stored in a table uniquely determines another piece of information in
the same table, it is called a Functional Dependency. Consider it as an association between two
attributes of the same relation.

If P functionally determines Q, then

P -> Q

Let us see an example:

<Employee>

EmpID EmpName EmpAge

E01 Amit 28

E02 Rohit 31

In the above table, EmpName is functionally dependent on EmpID because EmpName can take
only one value for the given value of EmpID:

EmpID -> EmpName
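One way to make the definition concrete: a small hypothetical helper (not part of the text) that checks whether P -> Q holds over a set of rows, applied to the <Employee> table above:

```python
# A functional dependency P -> Q holds when no value of P is ever
# associated with two different values of Q in the relation.
def fd_holds(rows, determinant, dependent):
    seen = {}
    for row in rows:
        key = row[determinant]
        if key in seen and seen[key] != row[dependent]:
            return False  # same P value, two different Q values: FD violated
        seen[key] = row[dependent]
    return True

employees = [
    {"EmpID": "E01", "EmpName": "Amit", "EmpAge": 28},
    {"EmpID": "E02", "EmpName": "Rohit", "EmpAge": 31},
]
print(fd_holds(employees, "EmpID", "EmpName"))  # EmpID -> EmpName holds
```

Adding a second row with EmpID 'E01' but a different EmpName would make the check fail, because one EmpID would then map to two names.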

Fully-Functional Dependency

An attribute is fully functionally dependent on another attribute if it is functionally dependent on
that attribute and not on any of its proper subsets.

For example, an attribute Q is fully functionally dependent on another attribute P, if it is
functionally dependent on P and not on any proper subset of P.

Let us see an example:

<ProjectCost>

ProjectID ProjectCost

001 1000

002 5000

<EmployeeProject>

EmpID ProjectID Days (spent on the project)

E099 001 320

E056 002 190

The above relations state:

EmpID, ProjectID, ProjectCost -> Days

However, this is not a fully functional dependency, since the proper subset {EmpID, ProjectID}
alone can determine the {Days} spent on the project by the employee.

This gives our fully functional dependency:

{EmpID, ProjectID} -> {Days}

Transitive Dependency

When an indirect relationship causes functional dependency it is called Transitive Dependency.

If P -> Q and Q -> R is true, then P-> R is a transitive dependency.

Multivalued Dependency

When the existence of one or more rows in a table implies one or more other rows in the same
table, a multi-valued dependency occurs.

If a table has attributes P, Q and R, then Q and R are multi-valued facts of P.

It is represented by double arrow:

->->

For our example:

P->->Q
Q->->R

In the above case, Multivalued Dependency exists only if Q and R are independent attributes.

Partial Dependency

Partial Dependency occurs when a nonprime attribute is functionally dependent on part of a
candidate key.

The 2nd Normal Form (2NF) eliminates the Partial Dependency. Let us see an example:

<StudentProject>

StudentID ProjectNo StudentName ProjectName

S01 199 Katie Geo Location

S02 120 Ollie Cluster Exploration

In the above table, we have partial dependency; let us see how:

The prime key attributes are StudentID and ProjectNo.

As stated, the non-prime attributes i.e. StudentName and ProjectName should be functionally
dependent on part of a candidate key, to be Partial Dependent.

The StudentName can be determined by StudentID, which makes the relation partially dependent.

The ProjectName can be determined by ProjectNo, which also makes the relation partially
dependent.

Normalization
If a database design is not perfect, it may contain anomalies, which are like a bad
dream for any database administrator. Managing a database with anomalies is next
to impossible.

 Update anomalies − If data items are scattered and are not linked to each other
properly, then it could lead to strange situations. For example, when we try to
update one data item having its copies scattered over several places, a few
instances get updated properly while a few others are left with old values. Such
instances leave the database in an inconsistent state.

 Deletion anomalies − We tried to delete a record, but parts of it were left
undeleted because, unknowingly, the data was also saved somewhere else.

 Insert anomalies − We tried to insert data in a record that does not exist at all.

Normalization is a method to remove all these anomalies and bring the database to
a consistent state.
First Normal Form
First Normal Form is defined in the definition of relations (tables) itself. This rule
defines that all the attributes in a relation must have atomic domains. The values in
an atomic domain are indivisible units.

To convert a relation (table) to First Normal Form, we re-arrange it so that each
attribute contains only a single value from its pre-defined domain.

Second Normal Form


Before we learn about the second normal form, we need to understand the following

 Prime attribute − An attribute, which is a part of the candidate key, is known
as a prime attribute.

 Non-prime attribute − An attribute, which is not a part of the candidate key,
is said to be a non-prime attribute.

If we follow second normal form, then every non-prime attribute should be fully
functionally dependent on the prime key attributes. That is, if X → A holds, then there
should not be any proper subset Y of X for which Y → A also holds true.

We see here in the Student_Project relation that the prime key attributes are Stu_ID
and Proj_ID. According to the rule, the non-key attributes, i.e. Stu_Name and
Proj_Name, must be dependent upon both and not on any of the prime key attributes
individually. But we find that Stu_Name can be identified by Stu_ID and Proj_Name
can be identified by Proj_ID independently. This is called partial dependency,
which is not allowed in Second Normal Form.

We break the relation into two relations, so that no partial dependency remains.

Third Normal Form


For a relation to be in Third Normal Form, it must be in Second Normal form and the
following must satisfy −

 No non-prime attribute is transitively dependent on prime key attribute.

 For any non-trivial functional dependency, X → A, then either −

o X is a superkey or,
o A is prime attribute.
We find that in the above Student_detail relation, Stu_ID is the key and only prime
key attribute. We find that City can be identified by Stu_ID as well as Zip itself.
Neither Zip is a superkey nor is City a prime attribute. Additionally, Stu_ID → Zip →
City, so there exists transitive dependency.

To bring this relation into third normal form, we break the relation into two relations
as follows −

Boyce-Codd Normal Form


Boyce-Codd Normal Form (BCNF) is an extension of Third Normal Form on strict
terms. BCNF states that −

 For any non-trivial functional dependency, X → A, X must be a super-key.

In the above decomposition, Stu_ID is the super-key in the relation Student_Detail
and Zip is the super-key in the relation ZipCodes. So,

Stu_ID → Stu_Name, Zip

and

Zip → City

This confirms that both the relations are in BCNF.


FOURTH NORMAL FORM

The 4NF comes after 1NF, 2NF, 3NF, and Boyce-Codd Normal Form. It was introduced by Ronald
Fagin in 1977.

To be in 4NF, a relation should be in Boyce-Codd Normal Form and should not contain more than
one independent multi-valued dependency.

Example

Let us see an example:

<Movie>

Movie_Name Shooting_Location Listing

MovieOne UK Comedy

MovieOne UK Thriller

MovieTwo Australia Action

MovieTwo Australia Crime

MovieThree India Drama

The above is not in 4NF, since it stores two independent multi-valued facts about a movie −

 One movie can have more than one listing

 One movie can have more than one shooting location

and the listings and the shooting locations are independent of each other.

Let us convert the above table in 4NF:

<Movie_Shooting>

Movie_Name Shooting_Location

MovieOne UK

MovieTwo Australia

MovieThree India

<Movie_Listing>

Movie_Name Listing

MovieOne Comedy

MovieOne Thriller

MovieTwo Action

MovieTwo Crime

MovieThree Drama

Now the violation is removed and the tables are in 4NF.

Fifth Normal Form

The 5NF (Fifth Normal Form) is also known as project-join normal form. A relation is in Fifth
Normal Form (5NF) if it is in 4NF and it cannot be non-trivially decomposed into smaller tables
without loss.

You can also consider that a relation is in 5NF if every join dependency in it is implied by its
candidate keys.

Example

The below relation violates the Fifth Normal Form (5NF) of Normalization:

<Employee>
EmpName EmpSkills EmpJob (Assigned Work)

David Java E145

John JavaScript E146

Jamie jQuery E146

Emma Java E147

The above relation can be decomposed into the following three tables; therefore, it is not in 5NF:

<EmployeeSkills>

EmpName EmpSkills

David Java

John JavaScript

Jamie jQuery

Emma Java

The following is the <EmployeeJob> relation that displays the jobs assigned to each employee:

<EmployeeJob>

EmpName EmpJob

David E145

John E146

Jamie E146

Emma E147

Here are the skills that are related to the assigned jobs:

<JobSkills>
EmpSkills EmpJob

Java E145

JavaScript E146

jQuery E146

Java E147

Our Join Dependency:

{(EmpName, EmpSkills ), (EmpName, EmpJob), (EmpSkills, EmpJob)}

The original relation has this join dependency, so it is not in 5NF: the join of the above three
relations is equal to our original relation <Employee>.
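The claim that joining the three projections reproduces the original <Employee> relation can be checked directly (SQLite from Python):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE EmployeeSkills (EmpName TEXT, EmpSkills TEXT);
CREATE TABLE EmployeeJob    (EmpName TEXT, EmpJob TEXT);
CREATE TABLE JobSkills      (EmpSkills TEXT, EmpJob TEXT);
INSERT INTO EmployeeSkills VALUES ('David','Java'), ('John','JavaScript'),
                                  ('Jamie','jQuery'), ('Emma','Java');
INSERT INTO EmployeeJob VALUES ('David','E145'), ('John','E146'),
                               ('Jamie','E146'), ('Emma','E147');
INSERT INTO JobSkills VALUES ('Java','E145'), ('JavaScript','E146'),
                             ('jQuery','E146'), ('Java','E147');
""")

# Join the three projections back together on their shared attributes;
# the result should be exactly the original <Employee> relation.
rejoined = con.execute("""
    SELECT DISTINCT s.EmpName, s.EmpSkills, j.EmpJob
    FROM EmployeeSkills s
    JOIN EmployeeJob j ON s.EmpName  = j.EmpName
    JOIN JobSkills   k ON k.EmpSkills = s.EmpSkills AND k.EmpJob = j.EmpJob
    ORDER BY s.EmpName
""").fetchall()
for row in rejoined:
    print(row)
```

The rejoin yields the same four (EmpName, EmpSkills, EmpJob) tuples as <Employee>, with no spurious rows, which is exactly what the join dependency asserts.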

EMBEDDED SQL *
We have looked at a wide range of SQL query constructs, treating SQL as an independent
language in its own right. A relational DBMS supports an interactive SQL interface, and users
can directly enter SQL commands. This simple approach is fine as long as the task at hand can
be accomplished entirely with SQL commands. In practice we often encounter situations in
which we need the greater flexibility of a general-purpose programming language, in addition
to the data manipulation facilities provided by SQL. For example, we may want to integrate a
database application with a nice graphical user interface, or we may want to ask a query that
cannot be expressed in SQL. To deal with such situations, the SQL standard defines how SQL
commands can be executed from within a program in a host language such as C or Java. The
use of SQL commands within a host language program is called embedded SQL. Details of
embedded SQL also depend on the host language. Although similar capabilities are supported
for a variety of host languages, the syntax sometimes varies.

Conceptually, embedding SQL commands in a host language program is straightforward. SQL
statements (i.e., not declarations) can be used wherever a statement in the host language is
allowed (with a few restrictions). Of course, SQL statements must be clearly marked so that a
preprocessor can deal with them before invoking the compiler for the host language. Also, any
host language variables used to pass arguments into an SQL command must be declared in
SQL. In particular, some special host language variables must be declared in SQL (so that, for
example, any error conditions arising during SQL execution can be communicated back to the
main application program in the host language).

There are, however, two complications to bear in mind. First, the data types recognized by
SQL may not be recognized by the host language, and vice versa. This mismatch is typically
addressed by casting data values appropriately before passing them to or from SQL
commands. (SQL, like C and other programming languages, provides an operator to cast
values of one type into values of another type.) The second complication has to do with the
fact that SQL is set-oriented; commands operate on and produce tables, which are sets (or
multisets) of rows. Programming languages do not typically have a data type that corresponds
to sets or multisets of rows. Thus, although SQL commands deal with tables, the interface to
the host language is constrained to be one row at a time. The cursor mechanism is introduced
to deal with this problem; we discuss cursors in Section 5.8.

In our discussion of embedded SQL, we assume that the host language is C for concreteness,
because minor differences exist in how SQL statements are embedded in different host
languages.

Declaring Variables and Exceptions


SQL statements can refer to variables defined in the host program. Such host-language
variables must be prefixed by a colon (:) in SQL statements and must be declared between the
commands EXEC SQL BEGIN DECLARE SECTION and EXEC SQL END DECLARE SECTION. The
declarations are similar to how they would look in a C program and, as usual in C, are
separated by semicolons. For example, we can declare variables c_sname, c_sid, c_rating, and
c_age (with the initial c used as a naming convention to emphasize that these are host
language variables) as follows:

EXEC SQL BEGIN DECLARE SECTION
char c_sname[20];
long c_sid;
short c_rating;
float c_age;
EXEC SQL END DECLARE SECTION

The first question that arises is which SQL types correspond to the various C types, since we
have just declared a collection of C variables whose values are intended to be read (and
possibly set) in an SQL run-time environment when an SQL statement that refers to them is
executed. The SQL-92 standard defines such a correspondence between the host language
types and SQL types for a number of host languages. In our example, c_sname has the type
CHARACTER(20) when referred to in an SQL statement, c_sid has the type INTEGER, c_rating
has the type SMALLINT, and c_age has the type REAL.

An important point to consider is that SQL needs some way to report what went wrong if an
error condition arises when executing an SQL statement. The SQL-92 standard recognizes two
special variables for reporting errors, SQLCODE and SQLSTATE. SQLCODE is the older of the
two and is defined to return some negative value when an error condition arises, without
specifying further just what error a particular negative integer denotes. SQLSTATE, introduced
in the SQL-92 standard for the first time, associates predefined values with several common
error conditions, thereby introducing some uniformity to how errors are reported. One of
these two variables must be declared. The appropriate C type for SQLCODE is long and the
appropriate C type for SQLSTATE is char[6], that is, a character string that is five characters
long. (Recall the null-terminator in C strings!) In this chapter, we will assume that SQLSTATE is
declared.


Embedding SQL Statements
All SQL statements that are embedded within a host program must be clearly marked,

with the details dependent on the host language; in C, SQL statements must be pre-

fixed by EXEC SQL. An SQL statement can essentially appear in any place in the host

language program where a host language statement can appear.

As a simple example, the following embedded SQL statement inserts a row, whose

column values are based on the values of the host language variables contained in it,

into the Sailors relation:

EXEC SQL INSERT INTO Sailors VALUES (:c_sname, :c_sid, :c_rating, :c_age);

Observe that a semicolon terminates the command, as per the convention for termi-

nating statements in C.

The SQLSTATE variable should be checked for errors and exceptions after each embedded

SQL statement. SQL provides the WHENEVER command to simplify this tedious task:

EXEC SQL WHENEVER [ SQLERROR | NOT FOUND ] [ CONTINUE | GOTO stmt ]

The intent is that after each embedded SQL statement is executed, the value of

SQLSTATE should be checked. If SQLERROR is specified and the value of SQLSTATE

indicates an exception, control is transferred to stmt, which is presumably responsi-

ble for error/exception handling. Control is also transferred to stmt if NOT FOUND is

specified and the value of SQLSTATE is 02000, which denotes NO DATA.
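The check-after-every-statement discipline that WHENEVER automates can be sketched in Python with the sqlite3 module, where the driver raises an exception instead of setting SQLSTATE. The table and data below are invented for illustration; this is an analogy, not embedded SQL.

```python
import sqlite3

# Toy analogue of WHENEVER SQLERROR GOTO stmt: in Python's DB-API the
# driver raises an exception rather than setting SQLSTATE, so the
# per-statement check becomes a try/except around each execute call.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")

def handle_error(exc):
    """Counterpart of the 'stmt' that WHENEVER transfers control to."""
    print("SQL error:", exc)

def run(sql, params=()):
    """Execute one statement, routing any error to the handler."""
    try:
        return conn.execute(sql, params)
    except sqlite3.Error as exc:
        handle_error(exc)          # control transfers here on SQLERROR
        return None

run("INSERT INTO Sailors VALUES (?, ?, ?, ?)", (22, "Dustin", 7, 45.0))
bad = run("INSERT INTO NoSuchTable VALUES (1)")   # triggers the handler
```

Here `handle_error` plays the role of the `GOTO stmt` target, invoked after any statement that fails.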

CURSORS *
A major problem in embedding SQL statements in a host language like C is that an

impedance mismatch occurs because SQL operates on sets of records, whereas languages

like C do not cleanly support a set-of-records abstraction. The solution is to essentially

provide a mechanism that allows us to retrieve rows one at a time from a relation.

This mechanism is called a cursor. We can declare a cursor on any relation or on any

SQL query (because every query returns a set of rows). Once a cursor is declared, we

can open it (which positions the cursor just before the first row); fetch the next row;

move the cursor (to the next row, to the row after the next n, to the first row, or to
the previous row, etc., by specifying additional parameters for the FETCH command);

or close the cursor. Thus, a cursor essentially allows us to retrieve the rows in a table

by positioning the cursor at a particular row and reading its contents.

Basic Cursor Definition and Usage


Cursors enable us to examine in the host language program a collection of rows com-

puted by an embedded SQL statement:

We usually need to open a cursor if the embedded statement is a SELECT (i.e., a

query). However, we can avoid opening a cursor if the answer contains a single

row, as we will see shortly.

INSERT, DELETE, and UPDATE statements typically don’t require a cursor, although

some variants of DELETE and UPDATE do use a cursor.

As an example, we can find the name and age of a sailor, specified by assigning a value

to the host variable c_sid, declared earlier, as follows:

EXEC SQL SELECT S.sname, S.age

INTO :c_sname, :c_age

FROM Sailors S

WHERE S.sid = :c_sid;

The INTO clause allows us to assign the columns of the single answer row to the host

variables c_sname and c_age. Thus, we do not need a cursor to embed this query in

a host language program. But what about the following query, which computes the

names and ages of all sailors with a rating greater than the current value of the host

variable c_minrating?

SELECT S.sname, S.age

FROM Sailors S

WHERE S.rating > :c_minrating;

This query returns a collection of rows, not just one row. When executed interactively,

the answers are printed on the screen. If we embed this query in a C program by

prefixing the command with EXEC SQL, how can the answers be bound to host language
variables? The INTO clause is not adequate because we must deal with several rows.

The solution is to use a cursor:

DECLARE sinfo CURSOR FOR

SELECT S.sname, S.age

FROM Sailors S

WHERE S.rating > :c_minrating;

This code can be included in a C program, and once it is executed, the cursor sinfo is

defined. Subsequently, we can open the cursor:

OPEN sinfo;

The value of c_minrating in the SQL query associated with the cursor is the value of

this variable when we open the cursor. (The cursor declaration is processed at compile

time, and the OPEN command is executed at run-time.)

A cursor can be thought of as ‘pointing’ to a row in the collection of answers to the

query associated with it. When a cursor is opened, it is positioned just before the first

row. We can use the FETCH command to read the first row of cursor sinfo into host

language variables:

FETCH sinfo INTO :c_sname, :c_age;

When the FETCH statement is executed, the cursor is positioned to point at the next

row (which is the first row in the table when FETCH is executed for the first time after

opening the cursor) and the column values in the row are copied into the corresponding

host variables. By repeatedly executing this FETCH statement (say, in a while-loop in

the C program), we can read all the rows computed by the query, one row at a time.

Additional parameters to the FETCH command allow us to position a cursor in very

flexible ways, but we will not discuss them.

How do we know when we have looked at all the rows associated with the cursor?

By looking at the special variables SQLCODE or SQLSTATE, of course. SQLSTATE, for

example, is set to the value 02000, which denotes NO DATA, to indicate that there are

no more rows if the FETCH statement positions the cursor after the last row.

When we are done with a cursor, we can close it:


CLOSE sinfo;

It can be opened again if needed, and the value of :c_minrating in the SQL query
associated with the cursor would be the value of the host variable c_minrating at that
time.
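As a rough analogue, the DECLARE/OPEN/FETCH/CLOSE cycle maps onto the cursor objects of Python's DB-API. The following sketch uses sqlite3 with made-up Sailors rows; `fetchone()` returning `None` plays the role of SQLSTATE 02000 (NO DATA).

```python
import sqlite3

# Sketch of the DECLARE/OPEN/FETCH/CLOSE cycle with a DB-API cursor.
# Table contents are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)",
                 [(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5), (58, "Rusty", 10, 35.0)])

c_minrating = 7        # host variable, read when the cursor is "opened"

# DECLARE + OPEN: executing the query positions us just before the first row.
sinfo = conn.execute("SELECT S.sname, S.age FROM Sailors S WHERE S.rating > ?",
                     (c_minrating,))

# FETCH in a loop: fetchone() yields None when no rows remain,
# the counterpart of SQLSTATE 02000 (NO DATA).
names = []
while True:
    row = sinfo.fetchone()
    if row is None:
        break
    c_sname, c_age = row          # INTO :c_sname, :c_age
    names.append(c_sname)

sinfo.close()                     # CLOSE sinfo
```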

Properties of Cursors
The general form of a cursor declaration is:

DECLARE cursorname [INSENSITIVE] [SCROLL] CURSOR FOR

some query

[ ORDER BY order-item-list ]

[ FOR READ ONLY | FOR UPDATE ]

A cursor can be declared to be a read-only cursor (FOR READ ONLY) or, if it is a cursor

on a base relation or an updatable view, to be an updatable cursor (FOR UPDATE).

If it is updatable, simple variants of the UPDATE and DELETE commands allow us to

update or delete the row on which the cursor is positioned. For example, if sinfo is an

updatable cursor and is open, we can execute the following statement:

UPDATE Sailors S

SET S.rating = S.rating - 1

WHERE CURRENT of sinfo;

This embedded SQL statement modifies the rating value of the row currently pointed

to by cursor sinfo; similarly, we can delete this row by executing the next statement:

DELETE Sailors S

WHERE CURRENT of sinfo;

A cursor is updatable by default unless it is a scrollable or insensitive cursor (see

below), in which case it is read-only by default.

If the keyword SCROLL is specified, the cursor is scrollable, which means that vari-

ants of the FETCH command can be used to position the cursor in very flexible ways;

otherwise, only the basic FETCH command, which retrieves the next row, is allowed.

If the keyword INSENSITIVE is specified, the cursor behaves as if it is ranging over a


private copy of the collection of answer rows. Otherwise, and by default, other actions

of some transaction could modify these rows, creating unpredictable behavior. For

example, while we are fetching rows using the sinfo cursor, we might modify rating

values in Sailor rows by concurrently executing the command:

UPDATE Sailors S

SET S.rating = S.rating - 1

Consider a Sailor row such that: (1) it has not yet been fetched, and (2) its original

rating value would have met the condition in the WHERE clause of the query associated

with sinfo, but the new rating value does not. Do we fetch such a Sailor row? If

INSENSITIVE is specified, the behavior is as if all answers were computed and stored

when sinfo was opened; thus, the update command has no effect on the rows fetched

by sinfo if it is executed after sinfo is opened. If INSENSITIVE is not specified, the

behavior is implementation dependent in this situation.

Finally, in what order do FETCH commands retrieve rows? In general this order is

unspecified, but the optional ORDER BY clause can be used to specify a sort order.

Note that columns mentioned in the ORDER BY clause cannot be updated through the

cursor!

The order-item-list is a list of order-items; an order-item is a column name, op-

tionally followed by one of the keywords ASC or DESC. Every column mentioned in the

ORDER BY clause must also appear in the select-list of the query associated with the

cursor; otherwise it is not clear what columns we should sort on. The keywords ASC or

DESC that follow a column control whether the result should be sorted—with respect

to that column—in ascending or descending order; the default is ASC. This clause is

applied as the last step in evaluating the query.

Consider the query discussed in Section 5.5.1, and the answer shown in Figure 5.13.

Suppose that a cursor is opened on this query, with the clause:

ORDER BY minage ASC, rating DESC

The answer is sorted first in ascending order by minage, and if several rows have the

same minage value, these rows are sorted further in descending order by rating. The
cursor would fetch the rows in the order shown in Figure 5.18.

rating minage

8 25.5

3 25.5

7 35.0

Figure 5.18 Order in which Tuples Are Fetched
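The fetch order of Figure 5.18 can be reproduced with any SQL engine. This sketch uses Python's sqlite3 with the three (rating, minage) rows taken from the figure.

```python
import sqlite3

# Reproducing the fetch order of Figure 5.18: rows are sorted first by
# minage ascending, then by rating descending within ties. The three
# (rating, minage) rows come from the figure itself.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Answers (rating INTEGER, minage REAL)")
conn.executemany("INSERT INTO Answers VALUES (?, ?)",
                 [(3, 25.5), (7, 35.0), (8, 25.5)])

cur = conn.execute("SELECT rating, minage FROM Answers "
                   "ORDER BY minage ASC, rating DESC")
rows = cur.fetchall()
# rows matches the order shown in Figure 5.18
```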

DYNAMIC SQL *
Consider an application such as a spreadsheet or a graphical front-end that needs to

access data from a DBMS. Such an application must accept commands from a user and, based on what
the user needs, generate appropriate SQL statements to retrieve

the necessary data. In such situations, we may not be able to predict in advance just

what SQL statements need to be executed, even though there is (presumably) some

algorithm by which the application can construct the necessary SQL statements once

a user’s command is issued.

SQL provides some facilities to deal with such situations; these are referred to as

dynamic SQL. There are two main commands, PREPARE and EXECUTE, which we

illustrate through a simple example:

char c_sqlstring[] = {"DELETE FROM Sailors WHERE rating>5"};

EXEC SQL PREPARE readytogo FROM :c_sqlstring;

EXEC SQL EXECUTE readytogo;

The first statement declares the C variable c_sqlstring and initializes its value to the

string representation of an SQL command. The second statement results in this string

being parsed and compiled as an SQL command, with the resulting executable bound

to the SQL variable readytogo. (Since readytogo is an SQL variable, just like a cursor

name, it is not prefixed by a colon.) The third statement executes the command.

Many situations require the use of dynamic SQL. However, note that the preparation of

a dynamic SQL command occurs at run-time and is a run-time overhead. Interactive

and embedded SQL commands can be prepared once at compile time and then re-

executed as often as desired. Consequently you should limit the use of dynamic SQL
to situations in which it is essential.
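The PREPARE/EXECUTE pair has a rough counterpart in most client APIs, where the SQL text is only a string until run-time. This Python sqlite3 sketch, with an invented Sailors table, mirrors the DELETE example above.

```python
import sqlite3

# Analogue of PREPARE/EXECUTE: the SQL text is just a string until
# run-time, when it is parsed, compiled, and executed. Table and data
# are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)",
                 [(22, "Dustin", 7, 45.0), (58, "Rusty", 10, 35.0)])

# The statement is constructed dynamically, e.g. from user input.
c_sqlstring = "DELETE FROM Sailors WHERE rating > 5"
conn.execute(c_sqlstring)        # parsed and executed here, at run-time

remaining = conn.execute("SELECT COUNT(*) FROM Sailors").fetchone()[0]
# both sample rows have rating > 5, so none remain
```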

There are many more things to know about dynamic SQL—how can we pass parameters

from the host language program to the SQL statement being prepared, for example?—

but we will not discuss it further; readers interested in using dynamic SQL should

consult one of the many good books devoted to SQL.

ODBC AND JDBC *


Embedded SQL enables the integration of SQL with a general-purpose programming

language. As described in Section 5.7, a DBMS-specific preprocessor transforms the

embedded SQL statements into function calls in the host language. The details of

this translation vary across DBMSs, and therefore even though the source code can

be compiled to work with different DBMSs, the final executable works only with one

specific DBMS.

ODBC and JDBC, short for Open DataBase Connectivity and Java DataBase Con-

nectivity, also enable the integration of SQL with a general-purpose programming

language. Both ODBC and JDBC expose database capabilities in a standardized way to the application
programmer through an application programming interface

(API). In contrast to embedded SQL, ODBC and JDBC allow a single executable to

access different DBMSs without recompilation. Thus, while embedded SQL is DBMS-

independent only at the source code level, applications using ODBC or JDBC are

DBMS-independent at the source code level and at the level of the executable. In

addition, using ODBC or JDBC an application can access not only one DBMS, but

several different DBMSs simultaneously.

ODBC and JDBC achieve portability at the level of the executable by introducing

an extra level of indirection. All direct interaction with a specific DBMS happens

through a DBMS specific driver. A driver is a software program that translates the

ODBC or JDBC calls into DBMS-specific calls. Since it is only known at run-time

which DBMSs the application is going to access, drivers are loaded dynamically on

demand. Existing drivers are registered with a driver manager, which manages the

set of existing drivers.


One interesting point to note is that a driver does not necessarily need to interact with

a DBMS that understands SQL. It is sufficient that the driver translates the SQL com-

mands from the application into equivalent commands that the DBMS understands.

Therefore, we will refer in the remainder of this section to a data storage subsystem

with which a driver interacts as a data source.

An application that interacts with a data source through ODBC or JDBC performs

the following steps. A data source is selected, the corresponding driver is dynamically

loaded, and a connection with the data source is established. There is no limit on the

number of open connections and an application can have several open connections to

different data sources. Each connection has transaction semantics; that is, changes

from one connection are only visible to other connections after the connection has

committed its changes. While a connection is open, transactions are executed by

submitting SQL statements, retrieving results, processing errors and finally committing

or rolling back. The application disconnects from the data source to terminate the

interaction.
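The interaction steps just described (select a data source, connect, submit statements, commit, disconnect) can be sketched with Python's DB-API, which hides the driver manager and driver behind a single module. The table here is invented for illustration.

```python
import sqlite3

# The ODBC/JDBC interaction steps sketched with Python's DB-API:
# establish a connection, submit statements, commit, and disconnect.
# (The DB-API module stands in for driver manager + driver.)
conn = sqlite3.connect(":memory:")        # select data source, connect

conn.execute("CREATE TABLE Boats (bid INTEGER, bname TEXT)")
conn.execute("INSERT INTO Boats VALUES (?, ?)", (101, "Interlake"))
conn.commit()                             # commit the transaction

count = conn.execute("SELECT COUNT(*) FROM Boats").fetchone()[0]
conn.close()                              # terminate the interaction
```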

Architecture
The architecture of ODBC/JDBC has four main components: the application, the

driver manager, several data source specific drivers, and the corresponding data sources.

Each component has different roles, as explained in the next paragraph.

The application initiates and terminates the connection with the data source. It sets

transaction boundaries, submits SQL statements, and retrieves the results—all through

a well-defined interface as specified by the ODBC/JDBC API. The primary goal of the

driver manager is to load ODBC/JDBC drivers and to pass ODBC/JDBC function calls from the application
to the correct driver. The driver manager also handles

ODBC/JDBC initialization and information calls from the applications and can log

all function calls. In addition, the driver manager performs some rudimentary error

checking. The driver establishes the connection with the data source. In addition
to submitting requests and returning request results, the driver translates data, error

formats, and error codes from a form that is specific to the data source into the

ODBC/JDBC standard. The data source processes commands from the driver and

returns the results.

Depending on the relative location of the data source and the application, several

architectural scenarios are possible. For example, drivers in JDBC are classified into

four types depending on the architectural relationship between the application and the

data source:

1. Type I (bridges) This type of driver translates JDBC function calls into function

calls of another API that is not native to the DBMS. An example is an ODBC-

JDBC bridge. In this case the application loads only one driver, namely the

bridge.

2. Type II (direct translation to the native API) This driver translates JDBC

function calls directly into method invocations of the API of one specific data

source. The driver is dynamically linked, and is specific to the data source.

3. Type III (network bridges) The driver talks over a network to a middle-ware

server that translates the JDBC requests into DBMS-specific method invocations.

In this case, the driver on the client site (i.e., the network bridge) is not DBMS-

specific.

4. Type IV (direct translation over sockets) Instead of calling the DBMS API

directly, the driver communicates with the DBMS through Java sockets. In this

case the driver on the client side is DBMS-specific.

An Example Using JDBC


JDBC is a collection of Java classes and interfaces that enables database access from

programs written in the Java programming language. The classes and interfaces are

part of the java.sql package. In this section, we illustrate the individual steps that

are required to submit a database query to a data source and to retrieve the results.

In JDBC, data source drivers are managed by the DriverManager class, which main-
tains a list of all currently loaded drivers. The DriverManager class has methods
registerDriver, deregisterDriver, and getDrivers to enable dynamic addition and
deletion of drivers.

The first step in connecting to a data source is to load the corresponding JDBC driver.

This is accomplished by using the Java mechanism for dynamically loading classes.

The static method forName in the Class class returns the Java class as specified in

the argument string and executes its static constructor. The static constructor of

the dynamically loaded class loads an instance of the Driver class, and this Driver

object registers itself with the DriverManager class.

A session with a DBMS is started through creation of a Connection object. A connec-

tion can specify the granularity of transactions. If autocommit is set for a connection,

then each SQL statement is considered to be its own transaction. If autocommit is off,

then a series of statements that compose a transaction can be committed using the

commit method of the Connection class. The Connection class has methods to set

the autocommit mode (setAutoCommit) and to retrieve the current autocommit mode

(getAutoCommit). A transaction can be aborted using the rollback method.
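The commit/rollback behavior described above can be sketched with Python's sqlite3 module, whose default transaction control likewise groups statements into a transaction until commit() or rollback() is called. The Accounts table is invented for illustration.

```python
import sqlite3

# Analogue of JDBC's setAutoCommit(false) / commit / rollback: with
# sqlite3's default transaction control, DML statements run inside an
# implicit transaction until commit() or rollback().
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Accounts (id INTEGER, balance INTEGER)")
conn.execute("INSERT INTO Accounts VALUES (1, 100)")
conn.commit()                              # first transaction committed

conn.execute("UPDATE Accounts SET balance = 0 WHERE id = 1")
conn.rollback()                            # abort: the update is undone

balance = conn.execute("SELECT balance FROM Accounts WHERE id = 1").fetchone()[0]
# balance is unchanged because the update was rolled back
```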

The following Java example code dynamically loads a data source driver and establishes

a connection:

Class.forName("oracle.jdbc.driver.OracleDriver");

Connection connection = DriverManager.getConnection(url,uid,password);

In considering the interaction of an application with a data source, the issues that

we encountered in the context of embedded SQL—e.g., passing information between

the application and the data source through shared variables—arise again. To deal

with such issues, JDBC provides special data types and specifies their relationship to

corresponding SQL data types. JDBC allows the creation of SQL statements that

refer to variables in the Java host program. Similar to the SQLSTATE variable, JDBC

throws an SQLException if an error occurs. The information includes SQLState, a

string describing the error. As in embedded SQL, JDBC provides the concept of a

cursor through the ResultSet class.

While a complete discussion of the actual implementation of these concepts is beyond


the scope of the discussion here, we complete this section by considering two illustrative

JDBC code fragments.

In our first example, we show how JDBC refers to Java variables inside an SQL state-

ment. During a session, all interactions with a data source are encapsulated into objects

that are created by the Connection object. SQL statements that refer to variables in

the host program are objects of the class PreparedStatement. Whereas in embedded

SQL the actual names of the host language variables appear in the SQL query text,

JDBC replaces each parameter with a “?” and then sets values of each parameter at

run-time through settype methods, where type is the type of the parameter. These

points are illustrated in the following Java program fragment, which inserts one row into the Sailors table:

connection.setAutoCommit(false);

PreparedStatement pstmt =

connection.prepareStatement("INSERT INTO Sailors VALUES (?,?,?,?)");

pstmt.setString(1, j_name); pstmt.setInt(2, j_id);

pstmt.setInt(3, j_rating); pstmt.setInt(4, j_age);

pstmt.execute();

pstmt.close();

connection.commit();
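The same ?-placeholder idea appears in Python's DB-API, where parameter values are bound at execution time rather than via settype methods. The host-variable values below are invented for illustration.

```python
import sqlite3

# The PreparedStatement idea in Python's DB-API: each parameter is a
# "?" in the SQL text and values are bound at run-time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sname TEXT, sid INTEGER, rating INTEGER, age INTEGER)")

j_name, j_id, j_rating, j_age = "Horatio", 74, 9, 35   # invented host values
conn.execute("INSERT INTO Sailors VALUES (?, ?, ?, ?)",
             (j_name, j_id, j_rating, j_age))          # setString/setInt analogue
conn.commit()

row = conn.execute("SELECT * FROM Sailors").fetchone()
```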

Our second example shows how the ResultSet class provides the functionality of a
cursor. After the query is executed through the Statement object stmt, the result set
res is positioned right before the first row. The method next fetches the next row and
enables reading of its values through gettype methods, where type is the type of the field.

Statement stmt = connection.createStatement();

ResultSet res = stmt.executeQuery("SELECT S.sname, S.age FROM Sailors S");

while (res.next()) {

String name = res.getString(1);

int age = res.getInt(2);

// process result row

}
stmt.close();

DBMS Interfaces
A database management system (DBMS) interface is a user interface that allows users
to input queries to a database without using the query language itself.

User-friendly interfaces provided by a DBMS may include the following:

1. Menu-Based Interfaces for Web Clients or Browsing –


These interfaces present the user with lists of options (called menus) that lead the user through
the formulation of a request. The basic advantage of using menus is that they remove the burden of
remembering the specific commands and syntax of a query language; instead, the query is
composed step by step by picking options from menus displayed by the system.
Pull-down menus are a very popular technique in Web-based interfaces.
They are also often used in browsing interfaces, which allow a user to look through the contents of
a database in an exploratory and unstructured manner.

2. Forms-Based Interfaces –
A forms-based interface displays a form to each user. Users can fill out all of
the form entries to insert new data, or they can fill out only certain entries,
in which case the DBMS will retrieve matching data for the remaining
entries. Such forms are usually designed and programmed for users who
have little technical expertise. Many DBMSs have forms specification
languages, which are special languages that help programmers specify such forms.
Example: SQL*Forms is a form-based language that specifies queries using a
form designed in conjunction with the relational database schema.
3. Graphical User Interface –
A GUI typically displays a schema to the user in diagrammatic form. The user
can then specify a query by manipulating the diagram. In many cases, GUIs
utilize both menus and forms. Most GUIs use a pointing device, such as a mouse,
to pick a certain part of the displayed schema diagram.
4. Natural Language Interfaces –
These interfaces accept requests written in English or some other language and
attempt to understand them. A natural language interface has its own
schema, which is similar to the database conceptual schema, as well as a
dictionary of important words.

The natural language interface refers to the words in its schema, as well as to the set
of standard words in its dictionary, to interpret the request. If the interpretation is
successful, the interface generates a high-level query corresponding to the natural
language request and submits it to the DBMS for processing; otherwise, a dialogue is started
with the user to clarify the request. The main disadvantage of these interfaces
is that their capabilities are still rather limited.

5. Speech Input and Output –


Limited use of speech, as an input query or as an answer to a question or
result of a request, is becoming commonplace. Applications with
limited vocabularies, such as inquiries for telephone directories, flight
arrival/departure, and bank account information, allow speech for input
and output so that ordinary people can access this information.

Speech input is detected using a predefined set of words and used to set up the
parameters that are supplied to the queries. For output, a similar conversion from
text or numbers into speech takes place.

6. Interfaces for DBA –


Most database systems contain privileged commands that can be used only by
the DBA's staff. These include commands for creating accounts, setting system
parameters, granting account authorization, changing a schema, and reorganizing
the storage structures of a database.

Security and Authentication in SQL


 A DBMS always has a separate security system which is responsible for
protecting the database against accidental or intentional loss, destruction, or misuse.
 Security Levels:
o Database level:- The DBMS should ensure that appropriate authorization
restrictions are enforced on users.
o Operating system Level:- Operating system should not allow unauthorized users
to enter in system.
o Network Level:- Database is at some remote place and it is accessed by users
through the network so security is required.
 Security Mechanisms:
o Access Control(Authorization)
 Identifies valid users who may have access to the data in the
database, and may restrict the operations that each user may
perform.
 For example, the movie database might designate two roles: "users"
(who may only query the data) and "designers" (who may add new data). A user must be
assigned to a role to have the access privileges given to that role.
 Each application is associated with a specified role. Each role has a list
of authorized users who may execute, design, or administer the application.
o Authenticate the User:
 Identify valid users who may have access to the data in the
database.
 Restrict each user's view of the data in the database.
 This may be done with the help of the concept of views in relational databases.
o Cryptographic control/Data Encryption:
 Encode data in a cryptic (coded) form so that even if the data is captured by
an unintended user, it cannot be decoded.
 Used for sensitive data, usually when transmitted over communication
links, but also to prevent bypassing the system to gain
access to the data.
o Inference control:
 Ensure that confidential information can’t be retrieved even by deduction.
 Prevent disclosure of data through statistical summaries of confidential
data.
o Flow control or Physical Protection:
 Prevents the copying of information by unauthorized persons.
 Computer systems must be physically secured against any unauthorized
entry.
o Virus control:
 Authorization should be enforced at the user level to avoid intruder attacks
carried out through human users.
 There should be a mechanism for providing protection against data viruses.
o User defined control:
 Define additional constraints or limitations on the use of database.
 These allow developers or programmers to incorporate their own security
procedures in addition to the above security mechanisms.
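A toy sketch of the role-based access control idea from the "users"/"designers" example above, in Python. All names and rules are invented, and a real DBMS enforces this inside the engine rather than in application code.

```python
# Toy model of role-based access control: "users" may only query,
# "designers" may also add data. Entirely illustrative.
ROLE_PRIVILEGES = {
    "users": {"SELECT"},
    "designers": {"SELECT", "INSERT"},
}
USER_ROLES = {"alice": "users", "bob": "designers"}   # invented accounts

def may_perform(user, operation):
    """Return True if the user's role grants the given operation."""
    role = USER_ROLES.get(user)
    return role is not None and operation in ROLE_PRIVILEGES[role]

# alice can query but not insert; bob can do both
```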

Authorization

 Authorization is finding out if the person, once identified, is permitted to have the
resource.
 Authorization determines what you can do, and is handled through the DBMS unless
external security procedures are available.
 Database management system allows DBA to give different access rights to the
users as per their requirements.
 Basic Authorization we can use any one form or combination of the following
basic forms of authorizations
o Resource authorization:-Authorization to access any system resource. e.g.
sharing of database, printer etc.
o Alteration Authorization:- Authorization to add attributes to or delete attributes from
relations.
o Drop Authorization:-Authorization to drop a relation.
 Granting of privileges:
o A system privilege is the right to perform a particular action, or to perform an
action on any schema object of a particular type.
o An authorized user may pass on this authorization to other users. This process is
called granting of privileges.
o Syntax:

o GRANT <privilege list>


o ON<relation name or view name>
o TO<user/role list>

o Example:
The following grant statement grants users U1, U2, and U3 the select
privilege on the Emp_Salary relation:

GRANT select
ON Emp_Salary
TO U1, U2, U3;

 Revoking of privileges:
o We can withdraw the privileges given to a particular user with the help of the
revoke statement.
o To revoke an authorization, we use the revoke statement.
o Syntax:

o REVOKE <privilege list>


o ON<relation name or view name>
o FROM <user/role list>[restrict/cascade]

 Example:
The revocation of privileges from a user or role may cause other users or roles
to lose those privileges as well. This behavior is called cascading of the revoke.

Revoke select
ON Emp_Salary
FROM U1, U2, U3;
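The cascading behavior of revocation can be sketched with a toy grant graph in Python. This is purely illustrative; a real DBMS tracks grants per privilege and per object.

```python
# Toy model of GRANT / REVOKE with cascading: each grant records who
# gave the privilege, and revoking from a user also revokes from
# everyone that user granted to. Entirely illustrative.
grants = {}          # grantee -> grantor who gave the privilege

def grant(grantor, grantee):
    grants[grantee] = grantor

def revoke_cascade(grantee):
    """Revoke from grantee, then from everyone grantee granted to."""
    dependents = [g for g, by in grants.items() if by == grantee]
    grants.pop(grantee, None)
    for d in dependents:
        revoke_cascade(d)

grant("DBA", "U1")
grant("U1", "U2")     # U2's privilege depends on U1's
revoke_cascade("U1")  # U2 loses the privilege too
```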

 Some other types of Privileges:


o Reference privileges:
SQL permits a user to declare foreign keys while creating relations.
Example: Allow user U1 to create relations that reference the key 'Eid' of the
Emp_Salary relation.

GRANT REFERENCES (Eid)


ON Emp_Salary
TO U1;

o Execute privileges:
This privilege authorizes a user to execute a function or procedure.
Thus, only a user who has the execute privilege on a function Create_Acc() can
call the function.

GRANT EXECUTE
ON Create_Acc
TO U1;

Stored procedures
A stored procedure is a set of Structured Query Language (SQL) statements with an
assigned name, which are stored in a relational database management system as a group,
so it can be reused and shared by multiple programs.

Stored procedures can access or modify data in a database, but a stored procedure is not
tied to a specific database or object, which offers a number of advantages.

Benefits of using stored procedures

A stored procedure provides an important layer of security between the user interface and the
database. It supports security through data access controls because end users may enter or change
data, but do not write procedures. A stored procedure preserves data integrity because information is
entered in a consistent manner. It improves productivity because statements in a stored procedure
only must be written once.

Creating SQL stored procedures

Stored procedures offer advantages over embedding queries in a graphical user interface (GUI).
Since stored procedures are modular, it is easier to troubleshoot when a problem arises in an
application. Stored procedures are also tunable, which eliminates the need to modify the GUI source
code to improve its performance. It's easier to code stored procedures than to build a query through a
GUI.
Use of stored procedures can reduce network traffic between clients and servers, because the
commands are executed as a single batch of code. This means only the call to execute the procedure
is sent over a network, instead of every single line of code being sent individually.
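The batching benefit can be sketched with Python's sqlite3, where executescript sends a group of statements in one call, standing in for a single procedure invocation. The procedure body is invented for illustration.

```python
import sqlite3

# Sketch of the batching benefit: the statements that would make up a
# stored procedure are submitted and executed as one unit rather than
# one call per statement. The procedure body is invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE AuditLog (msg TEXT)")

PROCEDURE_BODY = """
    INSERT INTO AuditLog VALUES ('step 1');
    INSERT INTO AuditLog VALUES ('step 2');
    INSERT INTO AuditLog VALUES ('step 3');
"""

# One call runs the whole batch, like one EXEC of a stored procedure.
conn.executescript(PROCEDURE_BODY)

count = conn.execute("SELECT COUNT(*) FROM AuditLog").fetchone()[0]
```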

Stored procedure in SQL

Stored procedures in SQL Server can accept input parameters and return multiple values in output
parameters; in SQL Server, a stored procedure's programmed statements perform operations in the
database and return a status value to a calling procedure or batch.

User-defined procedures are created in a user-defined database or in all system databases, except
the read-only Resource database. They are developed in Transact-SQL (T-SQL) or as a
reference to a Microsoft .NET Framework common language runtime (CLR) method. Temporary
procedures are stored in tempdb, and there are two types of
temporary procedures: local and global. Local procedures are only visible to the current user
connection, while global procedures are visible to any user after they are created. System procedures
arrive with SQL Server and are physically stored in an internal, hidden Resource database. They
appear in the sys schema of every system and user-defined database.

UNIT 5
Query Planning and Optimization
Once the query has been rewritten, it is subject to the planning
and optimization phasee, each query block is treated in
isolation and a plan is generated for it.This planning begins
bottom-up from the rewritten query’s innermost subquery,
proceeding to its outermost query block. The optimizer in
PostgreSQL is, for the most part, cost based. The idea is to
generate an access plan whose estimated cost is minimal. The
cost model includes as parameters the I/O cost of sequential
and random page fetches, as well as the CPU costs of
processing heap tuples, index tuples, and simple predicates.
The actual process of optimization is based on one of the
following two forms:
• Standard planner. The standard planner uses the bottom-up
dynamic-programming algorithm for join-order optimization
originally used in System R, the pioneering relational system
developed by IBM Research in the 1970s. The algorithm is
applied to a single query block at a time.
• Genetic query optimizer. When the number of tables in a
query block is very
large, System R’s dynamic programming algorithm becomes
very expensive.
Unlike other commercial systems that default to greedy or rule-
based techniques, PostgreSQL uses a more radical approach: a
genetic algorithm that was developed initially to solve
traveling-salesman problems. There exists anecdotal evidence
of the successful use of genetic query optimization in
production systems for queries with around 45 tables.

Since the planner operates in a bottom-up fashion on query
blocks, it is able to perform certain transformations on the
query plan as it is being built. One example is the common
subquery-to-join transformation that is present in many
commercial systems (usually implemented in the rewrite phase).
When PostgreSQL encounters a noncorrelated subquery (such as
one caused by a query on a view), it is generally possible to
“pull up” the planned subquery and merge it into the
upper-level query block. However, transformations that push
duplicate elimination into lower-level query blocks are
generally not possible in PostgreSQL.

The query-optimization phase results in
a query plan that is a tree of relational operators. Each
operator represents a specific operation on one or more sets of
tuples. The operators can be unary (for example, sort,
aggregation), binary (for example, nested-loop join), or n-ary
(for example, set union).

Crucial to the cost model is an accurate estimate of the total
number of tuples that will be processed at each operator in the
plan. This is inferred by the optimizer on the basis of
statistics that are maintained on each relation in the system.
These indicate the total number of tuples for each relation and
specific information on each column of a relation, such as the
column cardinality, a list of the most common values in the
table and the number of their occurrences, and a histogram that
divides the column's values into groups of equal population. In
addition, PostgreSQL maintains a statistical correlation between
the physical and logical row orderings of a column's values;
this indicates the cost of an index scan to retrieve tuples that
pass predicates on the column. The DBA must ensure that these
statistics are current by running the analyze command
periodically.
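The standard planner's bottom-up dynamic programming over join orders, driven by cardinality estimates, can be sketched roughly as follows. The relation names, cardinalities, and cost formula here are invented for illustration; a real planner costs physical operators (scans, join methods) rather than just intermediate result sizes:

```python
from itertools import combinations

# Hypothetical per-relation tuple counts (the "statistics").
card = {"A": 1000, "B": 100, "C": 10}
# Hypothetical join selectivity, assumed identical for every relation pair.
sel = 0.01

def join_size(rels):
    """Estimated cardinality of joining a set of relations."""
    size = 1.0
    for r in rels:
        size *= card[r]
    pairs = len(rels) * (len(rels) - 1) // 2
    return size * (sel ** pairs)

def best_plan(relations):
    """System R-style DP: best plan for every subset, built bottom-up."""
    # Base case: single-relation scans, costed at 0 for simplicity.
    best = {frozenset([r]): (0.0, r) for r in relations}
    for k in range(2, len(relations) + 1):
        for subset in combinations(relations, k):
            s = frozenset(subset)
            # Try every way to split the subset into two sub-plans.
            for left_size in range(1, k):
                for left in combinations(subset, left_size):
                    l = frozenset(left)
                    r = s - l
                    # Cost = cost of sub-plans + size of this join's result.
                    cost = best[l][0] + best[r][0] + join_size(s)
                    if s not in best or cost < best[s][0]:
                        best[s] = (cost, (best[l][1], best[r][1]))
    return best[frozenset(relations)]

cost, plan = best_plan(["A", "B", "C"])
print(cost, plan)
```

With these numbers the planner joins the two small relations first (B with C) and then brings in A, because that order minimizes intermediate result sizes, which is exactly the effect the statistics-based cost model aims for.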
Query Executor
The executor module is responsible for processing a query plan
produced by the
optimizer. The executor follows the iterator model with a set of
four functions
implemented for each operator (open, next, rescan, and close).
PostgreSQL's iterators have an extra function, rescan, which is
used to reset a subplan (say, for the inner loop of a join) with
new parameters such as index key ranges.
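The demand-driven iterator interface described above can be sketched in Python. The operator classes here are invented for illustration; the point is that each operator implements the same four functions and the root pulls tuples one at a time, with rescan resetting a subplan so it can be re-run:

```python
class SeqScan:
    """Leaf operator: iterates over an in-memory list of tuples."""
    def __init__(self, tuples):
        self.tuples = tuples

    def open(self):
        self.pos = 0

    def next(self):                 # returns a tuple, or None when exhausted
        if self.pos < len(self.tuples):
            t = self.tuples[self.pos]
            self.pos += 1
            return t
        return None

    def rescan(self):               # reset so the subplan can be re-run
        self.pos = 0

    def close(self):
        self.tuples = None


class Filter:
    """Unary operator: pulls from its child and applies a predicate."""
    def __init__(self, child, pred):
        self.child, self.pred = child, pred

    def open(self):
        self.child.open()

    def next(self):
        while (t := self.child.next()) is not None:
            if self.pred(t):
                return t
        return None

    def rescan(self):
        self.child.rescan()

    def close(self):
        self.child.close()


# Demand-driven execution: the root pulls tuples one at a time.
plan = Filter(SeqScan([1, 2, 3, 4, 5]), lambda t: t % 2 == 1)
plan.open()
out = []
while (t := plan.next()) is not None:
    out.append(t)
plan.rescan()                       # e.g. an inner loop of a join restarting
first_again = plan.next()
plan.close()
print(out, first_again)             # [1, 3, 5] 1
```

Because each operator only asks its child for the next tuple on demand, no intermediate result needs to be materialized, which is what makes this model attractive for pipelined plans.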
Centralized and Client–Server Architectures:
Centralized database systems are those that run on a single
computer system
and do not interact with other computer systems. Such
database systems span
a range from single-user database systems running on personal
computers to
high-performance database systems running on high-end
server systems. Client–server systems, on the other hand, have
functionality split between a server system and multiple client
systems.
Centralized Systems
A modern, general-purpose computer system consists of one to
a few processors
and a number of device controllers that are connected through
a common bus that
provides access to shared memory (Figure 17.1). The
processors have local cache
memories that store local copies of parts of the memory, to
speed up access to data.
Each processor may have several independent cores, each of
which can execute
a separate instruction stream. Each device controller is in
charge of a specific
type of device (for example, a disk drive, an audio device, or a
video display).
The processors and the device controllers can execute
concurrently, competing
for memory access. Cache memory reduces the contention for
memory access,
since it reduces the number of times that the processor needs
to access the shared
memory.
We distinguish two ways in which computers are used: as
single-user systems
and as multiuser systems. Personal computers and
workstations fall into the first
category. A typical single-user system is a desktop unit used by
a single person,
usually with only one processor and one or two hard disks, and
usually only one
person using the machine at a time. A typical multiuser system,
on the other
hand, has more disks and more memory and may have multiple
processors. It
serves a large number of users who are connected to the
system remotely.
Database systems designed for use by single users usually do
not provide
many of the facilities that a multiuser database provides. In
particular, they may
not support concurrency control, which is not required when
only a single user
can generate updates. Provisions for crash recovery in such
systems are either
absent or primitive— for example, they may consist of simply
making a backup
of the database before any update. Some such systems do not
support SQL, and
they provide a simpler query language, such as a variant of
QBE. In contrast, database systems designed for multiuser use
support the full transactional features discussed earlier.
Figure 17.1 A centralized computer system.
