Unit 1

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 46

Introduction of DBMS

Database: Database is a collection of inter-related data which helps in


efficient retrieval, insertion and deletion of data from database and organizes
the data in the form of tables, views, schemas, reports etc. For Example,
university database organizes the data about students, faculty, and admin
staff etc. which helps in efficient retrieval, insertion and deletion of data from
it. DDL is short name of Data Definition Language, which deals with
database schemas and descriptions, of how the data should reside in the
database.
• CREATE: to create a database and its objects like (table, index, views, store procedure,
function, and triggers)
• ALTER: alters the structure of the existing database
• DROP: delete objects from the database
• TRUNCATE: remove all records from a table, including all spaces allocated for the
records are removed
• COMMENT: add comments to the data dictionary
• RENAME: rename an object
DML is short name of Data Manipulation Language which deals with data
manipulation and includes most common SQL statements such SELECT, INSERT,
UPDATE, DELETE, etc., and it is used to store, modify, retrieve, delete and update
data in a database.
• SELECT: retrieve data from a database
• INSERT: insert data into a table
• UPDATE: updates existing data within a table
• DELETE: Delete all records from a database table
• MERGE: UPSERT operation (insert or update)
• CALL: call a PL/SQL or Java subprogram
• EXPLAIN PLAN: interpretation of the data access path
• LOCK TABLE: concurrency Control
Database Management System: The software which is used to manage database is
called Database Management System (DBMS). For Example, MySQL, Oracle etc. are
popular commercial DBMS used in different applications. DBMS allows users the
following tasks:
Data Definition: It helps in creation, modification and removal of definitions that
define the organization of data in database.
Data Updation: It helps in insertion, modification and deletion of the actual data in
the database.
Data Retrieval: It helps in retrieval of data from the database which can be used by
applications for various purposes.
User Administration: It helps in registering and monitoring users, enforcing data
security, monitoring performance, maintaining data integrity, dealing with
concurrency control and recovering information corrupted by unexpected failure.
Paradigm Shift from File System to DBMS
File System manages data using files in hard disk. Users are allowed to create, delete,
and update the files according to their requirement. Let us consider the example of file
based University Management System. Data of students is available to their respective
Departments, Academics Section, Result Section, Accounts Section, Hostel Office etc.
Some of the data is common for all sections like Roll No, Name, Father Name,
Address and Phone number of students but some data is available to a particular
section only like Hostel allotment number which is a part of hostel office. Let us
discuss the issues with this system:
• Redundancy of data: Data is said to be redundant if same data is copied at
many places. If a student wants to change Phone number, he has to get it
updated at various sections. Similarly, old records must be deleted from all
sections representing that student.
• Inconsistency of Data: Data is said to be inconsistent if multiple copies of
same data does not match with each other. If Phone number is different in
Accounts Section and Academics Section, it will be inconsistent.
Inconsistency may be because of typing errors or not updating all copies of
same data.
• Difficult Data Access: A user should know the exact location of file to
access data, so the process is very cumbersome and tedious. If user wants to
search student hostel allotment number of a student from 10000 unsorted
students’ records, how difficult it can be.
• Unauthorized Access: File System may lead to unauthorized access to
data. If a student gets access to file having his marks, he can change it in
unauthorized way.
• No Concurrent Access: The access of same data by multiple users at same
time is known as concurrency. File system does not allow concurrency as
data can be accessed by only one user at a time.
• No Backup and Recovery: File system does not incorporate any backup
and recovery of data if a file is lost or corrupted.
These are the main reasons which made a shift from file system to DBMS

Introduction of 3-Tier Architecture in


DBMS
DBMS 3-tier architecture divides the complete system into three inter-related
but independent modules as shown below:
1. Physical Level: At the physical level, the information about the
location of database objects in the data store is kept. Various users
of DBMS are unaware of the locations of these objects.In simple
terms,physical level of a database describes how the data is being
stored in secondary storage devices like disks and tapes and also
gives insights on additional storage details.
2. Conceptual Level: At conceptual level, data is represented in the
form of various database tables. For Example, STUDENT database
may contain STUDENT and COURSE tables which will be visible to
users but users are unaware of their storage.Also referred as
logical schema,it describes what kind of data is to be stored in the
database.
3. External Level: An external level specifies a view of the data in
terms of conceptual level tables. Each external level view is used
to cater to the needs of a particular category of users. For Example,
FACULTY of a university is interested in looking course details of
students, STUDENTS are interested in looking at all details related
to academics, accounts, courses and hostel details as well. So,
different views can be generated for different users. The main focus
of external level is data abstraction.
Data Independence
Data independence means a change of data at one level should not affect
another level. Two types of data independence are present in this
architecture:
1. Physical Data Independence: Any change in the physical location
of tables and indexes should not affect the conceptual level or
external view of data. This data independence is easy to achieve
and implemented by most of the DBMS.
2. Conceptual Data Independence: The data at conceptual level
schema and external level schema must be independent. This
means a change in conceptual schema should not affect external
schema. e.g.; Adding or deleting attributes of a table should not
affect the user’s view of the table. But this type of independence is
difficult to achieve as compared to physical data independence
because the changes in conceptual schema are reflected in the
user’s view.
Phases of database design
Database designing for a real-world application starts from capturing the
requirements to physical implementation using DBMS software which
consists of following steps shown below:

Conceptual Design: The requirements of database are captured using high


level conceptual data model. For Example, the ER model is used for the
conceptual design of the database.
Logical Design: Logical Design represents data in the form of relational
model. ER diagram produced in the conceptual design phase is used to
convert the data into the Relational Model.
Physical Design: In physical design, data in relational model is implemented
using commercial DBMS like Oracle, DB2.
Advantages of DBMS
DBMS helps in efficient organization of data in database which has following
advantages over typical file system:
• Minimized redundancy and data inconsistency: Data is
normalized in DBMS to minimize the redundancy which helps in
keeping data consistent. For Example, student information can be
kept at one place in DBMS and accessed by different users.This
minimized redundancy is due to primary key and foreign keys
• Simplified Data Access: A user need only name of the relation not
exact location to access data, so the process is very simple.
• Multiple data views: Different views of same data can be created
to cater the needs of different users. For Example, faculty salary
information can be hidden from student view of data but shown in
admin view.
• Data Security: Only authorized users are allowed to access the
data in DBMS. Also, data can be encrypted by DBMS which makes
it secure.
• Concurrent access to data: Data can be accessed concurrently
by different users at same time in DBMS.
• Backup and Recovery mechanism: DBMS backup and recovery
mechanism helps to avoid data loss and data inconsistency in case
of catastrophic failures.
DBMS Architecture 2-Level, 3-Level
Two-tier architecture:
The two-tier architecture is similar to a basic client-server model. The
application at the client end directly communicates with the database at the
server-side. APIs like ODBC, JDBC are used for this interaction. The server
side is responsible for providing query processing and transaction
management functionalities. On the client-side, the user interfaces and
application programs are run. The application on the client-side establishes a
connection with the server-side in order to communicate with the DBMS.
An advantage of this type is that maintenance and understanding are easier,
compatible with existing systems. However, this model gives poor
performance when there are a large number of users.

Three Tier architecture:


In this type, there is another layer between the client and the server. The
client does not directly communicate with the server. Instead, it interacts with
an application server which further communicates with the database system
and then the query processing and transaction management takes place.
This intermediate layer acts as a medium for the exchange of partially
processed data between server and client. This type of architecture is used
in the case of large web applications.
Advantages:

• Enhanced scalability due to distributed deployment of application


servers. Now, individual connections need not be made between
client and server.
• Data Integrity is maintained. Since there is a middle layer between
client and server, data corruption can be avoided/removed.
• Security is improved. This type of model prevents direct interaction
of the client with the server thereby reducing access to
unauthorized data.
Disadvantages:
Increased complexity of implementation and communication. It becomes
difficult for this sort of interaction to take place due to the presence of middle
layers.
Three-TierArchtecture

Need for DBMS


A Data Base Management System is a system software for easy, efficient
and reliable data processing and management. It can be used for:
• Creation of a database.
• Retrieval of information from the database.
• Updating the database.
• Managing a database.
It provides us with the many functionalities and is more
advantageous than the traditional file system in many ways listed
below:
1) Processing Queries and Object Management:
In traditional file systems, we cannot store data in the form of objects. In
practical-world applications, data is stored in objects and not files. So in a file
system, some application software maps the data stored in files to objects so
that can be used further.
We can directly store data in the form of objects in a database management
system. Application level code needs to be written to handle, store and scan
through the data in a file system whereas a DBMS gives us the ability to
query the database.

2) Controlling redundancy and inconsistency:


Redundancy refers to repeated instances of the same data. A database
system provides redundancy control whereas in a file system, same data
may be stored multiple times. For example, if a student is studying two
different educational programs in the same college, say ,Engineering and
History, then his information such as the phone number and address may be
stored multiple times, once in Engineering dept and the other in History dept.
Therefore, it increases time taken to access and store data. This may also
lead to inconsistent data states in both places. A DBMS uses data
normalization to avoid redundancy and duplicates.
3) Efficient memory management and indexing:
DBMS makes complex memory management easy to handle. In file systems,
files are indexed in place of objects so query operations require entire file
scans whereas in a DBMS , object indexing takes place efficiently through
database schema based on any attribute of the data or a data-property. This
helps in fast retrieval of data based on the indexed attribute.
4) Concurrency control and transaction management:
Several applications allow user to simultaneously access data. This may lead
to inconsistency in data in case files are used. Consider two withdrawal
transactions X and Y in which an amount of 100 and 200 is withdrawn from
an account A initially containing 1000. Now since these transactions are
taking place simultaneously, different transactions may update the account
differently. X reads 1000, debits 100, updates the account A to 900, whereas
Y also reads 1000, debits 200, updates A to 800. In both cases account A
has wrong information. This results in data inconsistency. A DBMS provides
mechanisms to deal with this kind of data inconsistency while allowing users
to access data concurrently. A DBMS implements ACID(atomicity, durability,
isolation,consistency) properties to ensure efficient transaction management
without data corruption.
5) Access Control and ease in accessing data:
A DBMS can grant access to various users and determine which part and
how much of the data can they access from the database thus removing
redundancy. Otherwise in file system, separate files have to be created for
each user containing the amount of data that they can access. Moreover, if a
user has to extract specific data, then he needs a code/application to
process that task in case of file system, e.g. Suppose a manager needs a list
of all employees having salary greater than X. Then we need to write
business logic for the same in case data is stored in files. In case of DBMS, it
provides easy access of data through queries, (e.g., SELECT queries) and
whole logic need not be rewritten. Users can specify exactly what they want
to extract out of the data.
6) Integrity constraints: Data stored in databases must satisfy integrity
constraints. For example, Consider a database schema consisting of the
various educational programs offered by a university such
as(B.Tech/M.Tech/B.Sc/M.Sc/BCA/MCA) etc. Then we have a schema of
students enrolled in these programs. A DBMS ensures that it is only out of
one of the programs offered schema , that the student is enrolled in, i.e. Not
anything out of the blue. Hence, database integrity is preserved.
Apart from the above mentioned features a database management also
provides the following:
• Multiple User Interface
• Data scalability, expandability and flexibility: We can change
schema of the database, all schema will be updated according to it.
• Overall the time for developing an application is reduced.
• Security: Simplifies data storage as it is possible to assign security
permissions allowing restricted access to data.
Data Models in DBMS
A Data Model in Database Management System (DBMS), is the concept of
tools that are developed to summarize the description of the database.
It is classified into 3 types:

1. Conceptual Data Model :


Conceptual data model, describes the database at a very high level and is
useful to understand the needs or requirements of the database. It is this
model, that is used in the requirement gathering process i.e., before the
Database Designers start making a particular database. One such popular
model is the entity/relationship model (ER model). The E/R model
specializes in entities, relationships and even attributes which are used by
the database designers. In terms of this concept, a discussion can be made
even with non-computer science(non-technical) users and stakeholders, and
their requirements can be understood.
2. Representational Data Model :
This type of data model is used to represent only the logical part of the
database and does not represent the physical structure of the databases.
The representational data model allows us to focus primarily, on the design
part of the database. A popular representational model is Relational model.
3. Physical Data Model :
Ultimately, all data in a database is stored physically on a secondary storage
device such as discs and tapes. This is stored in the form of files, records
and certain other data structures. It has all the information of the format in
which the files are present and the structure of the databases, presence of
external data structures and their relation to each other.
Difference between Hierarchical,
Network and Relational Data Model
1. Hierarchical Data Model :
Hierarchical data model is the oldest type of the data model. It was
developed by IBM in 1968. It organizes data in the tree-like structure.
Hierarchical model consists of the the following :
• It contains nodes which are connected by branches.
• The topmost node is called the root node.
• If there are multiple nodes appear at the top level, then these can
be called as root segments.
• Each node has exactly one parent.
• One parent may have many child.
Figure – Hierarchical Data Model
n the above figure, Electronics is the root node which has two children i.e.
Televisions and Portable Electronics. These two has further children for
which they act as parent. For example: Television has children as Tube, LCD
and Plasma, for these three Television act as parent. It follows one to many
relationship.

2. Network Data Model :


It is the advance version of the hierarchical data model. To organize data it
uses directed graphs instead of the tree-structure. In this child can have
more than one parent. It uses the concept of the two data structures i.e.
Records and Sets.

Figure – Network Data Model


In the above figure, Project is the root node which has two children i.e.
Project 1 and Project 2. Project 1 has 3 children and Project 2 has 2 children.
Total there are 5 children i.e Department A, Department B and Department
C, they are network related children as we said that this model can have
more than one parent. So, for the Department B and Department C have two
parents i.e. Project 1 and Project 2.
3. Relational Data Model :
The relational data model was developed by E.F. Codd in 1970. Their are no
physical links as they are in the hierarchical data model. Following are the
properties of the relational data model :
• Data is represented in the form of table only.
• It deals only with the data not with the physical structure.
• It provides information regarding metadata.
• At the intersection of row and column there will be only one value
for the tuple.
• It provides a way to handle the queries with ease.

Figure – Relational Data Model

Difference between Hierarchical, Network and Relational Data Model :


Hierarchical Data Model Network Data Model Relational Data Model

In this model, to store data It organizes records in the


hierarchy method is used. It It organizes records to one form of table and relationship
is the oldest method and not another through links or between tables are set using
in use today. pointers. common fields.
Hierarchical Data Model Network Data Model Relational Data Model

To organize records, it uses It organizes records in the It organizes records in the


tree structure. form of directed graphs. form of tables.

In addition to 1:1 and 1:n it In addition to 1:1 and 1:n it


It implements 1:1 and 1:n also implements many to also implements many to
relations. many relationships. many relationships.

Insertion anomaly exits in


this model i.e. child node
cannot be inserted without There is no insertion
the parent node. anomaly. There is no insertion anomaly.

Deletion anomaly exists in


this model i.e. it is difficult to There is no deletion
delete the parent node. anomaly. There is no deletion anomaly.

This model lacks data There is partial data This model provides data
independence. independence in this model. independence.

It is used to access the data It is used to access the data It is used to access the data
which is complex and which is complex and which is complex and
asymmetric. symmetric. symmetric.

VAX-DBMS, DMS-1100
of UNIVAC and
&XML and XAML use this SUPRADBMS’s use this It is mostly used in real world
model. model. applications. Oracle, SQL.

Data Abstraction and Data


Independence
Database systems comprise complex data-structures. In order to make the
system efficient in terms of retrieval of data, and reduce complexity in terms
of usability of users, developers use abstraction i.e. hide irrelevant details
from the users. This approach simplifies database design.
There are mainly 3 levels of data abstraction:
Physical: This is the lowest level of data abstraction. It tells us how the data
is actually stored in memory. The access methods like sequential or random
access and file organization methods like B+ trees, hashing used for the
same. Usability, size of memory, and the number of times the records are
factors that we need to know while designing the database.
Suppose we need to store the details of an employee. Blocks of storage and
the amount of memory used for these purposes are kept hidden from the
user.

Logical: This level comprises the information that is actually stored in the
database in the form of tables. It also stores the relationship among the data
entities in relatively simple structures. At this level, the information available
to the user at the view level is unknown.
We can store the various attributes of an employee and relationships, e.g.
with the manager can also be stored.
View: This is the highest level of abstraction. Only a part of the actual
database is viewed by the users. This level exists to ease the accessibility of
the database by an individual user. Users view data in the form of rows and
columns. Tables and relations are used to store data. Multiple views of the
same database may exist. Users can just view the data and interact with the
database, storage and implementation details are hidden from them.

The main purpose of data abstraction is to achieve data independence in


order to save time and cost required when the database is modified or
altered.
We have namely two levels of data independence arising from these levels
of abstraction :
Physical level data independence: It refers to the characteristic of being
able to modify the physical schema without any alterations to the conceptual
or logical schema, done for optimization purposes, e.g., Conceptual structure
of the database would not be affected by any change in storage size of the
database system server. Changing from sequential to random access files is
one such example. These alterations or modifications to the physical
structure may include:

• Utilizing new storage devices.


• Modifying data structures used for storage.
• Altering indexes or using alternative file organization techniques
etc.
Logical level data independence: It refers characteristic of being able to
modify the logical schema without affecting the external schema or
application program. The user view of the data would not be affected by any
changes to the conceptual view of the data. These changes may include
insertion or deletion of attributes, altering table structures entities or
relationships to the logical schema, etc.
DBA
A Database Administrator (DBA) is individual or person responsible for
controlling, maintenance, coordinating, and operation of database
management system. Managing, securing, and taking care of database
system is prime responsibility.
They are responsible and in charge for authorizing access to database,
coordinating, capacity, planning, installation, and monitoring uses and for
acquiring and gathering software and hardware resources as and when
needed. Their role also varies from configuration, database design,
migration, security, troubleshooting, backup, and data recovery. Database
administration is major and key function in any firm or organization that is
relying on one or more databases. They are overall commander of Database
system.
Types of Database Administrator (DBA) :
• Administrative DBA –
Their job is to maintain server and keep it functional. They are
concerned with data backups, security, trouble shooting, replication,
migration etc.
• Data Warehouse DBA –
Assigned earlier roles, but held accountable for merging data from
various sources into data warehouse. They also design warehouse,
with cleaning and scrubs data prior to loading.
• Development DBA –
They build and develop queries, stores procedure, etc. that meets
firm or organization needs. They are par at programmer.
• Application DBA –
They particularly manages all requirements of application
components that interact with database and accomplish activities
such as application installation and coordinating, application
upgrades, database cloning, data load process management, etc.
• Architect –
They are held responsible for designing schemas like building
tables. They work to build structure that meets organisation needs.
The design is further used by developers and development DBAs to
design and implement real application.
• OLAP DBA –
They design and builds multi-dimensional cubes for determination
support or OLAP systems.
Importance of Database Administrator (DBA) :
• Database Administrator manages and controls three levels of
database like internal level, conceptual level, and external level of
Database management system architecture and in discussion with
comprehensive user community, gives definition of world view of
database. It then provides external view of different users and
applications.
• Database Administrator ensures held responsible to maintain
integrity and security of database restricting from unauthorized
users. It grants permission to users of database and contains profile
of each and every user in database.
• Database Administrator also held accountable that database is
protected and secured and that any chance of data loss keeps at
minimum.
Role and Duties of Database Administrator (DBA) :
• Decides hardware –
They decides economical hardware, based upon cost, performance
and efficiency of hardware, and best suits organisation. It is
hardware which is interface between end users and database.
• Manages data integrity and security –
Data integrity need to be checked and managed accurately as it
protects and restricts data from unauthorized use. DBA eyes on
relationship within data to maintain data integrity.
• Database design –
DBA is held responsible and accountable for logical, physical
design, external model design, and integrity and security control.
• Database implementation –
DBA implements DBMS and checks database loading at time of its
implementation.
• Query processing performance –
DBA enhances query processing by improving their speed,
performance and accuracy.
• Tuning Database Performance –
If user is not able to get data speedily and accurately then it may
loss organization business. So by tuning SQL commands DBA can
enhance performance of database.
Different types of Database Users
Database users are categorized based up on their interaction with the data
base.
These are seven types of data base users in DBMS.
1. Database Administrator (DBA) :
Database Administrator (DBA) is a person/team who defines the
schema and also controls the 3 levels of database.
The DBA will then create a new account id and password for the
user if he/she need to access the data base.
DBA is also responsible for providing security to the data base and
he allows only the authorized users to access/modify the data base.

• DBA also monitors the recovery and back up and provide


technical support.
• The DBA has a DBA account in the DBMS which called a
system or superuser account.
• DBA repairs damage caused due to hardware and/or
software failures.

2. Naive / Parametric End Users :


Parametric End Users are the unsophisticated who don’t have any
DBMS knowledge but they frequently use the data base
applications in their daily life to get the desired results.
For examples, Railway’s ticket booking users are naive users.
Clerks in any bank is a naive user because they don’t have any
DBMS knowledge but they still use the database and perform their
given task.

3. System Analyst :
System Analyst is a user who analyzes the requirements of
parametric end users. They check whether all the requirements of
end users are satisfied.

4. Sophisticated Users :
Sophisticated users can be engineers, scientists, business analyst,
who are familiar with the database. They can develop their own
data base applications according to their requirement. They don’t
write the program code but they interact the data base by writing
SQL queries directly through the query processor.
5. Data Base Designers :
Data Base Designers are the users who design the structure of
data base which includes tables, indexes, views, constraints,
triggers, stored procedures. He/she controls what data must be
stored and how the data items to be related.

6. Application Program :
Application Program are the back end programmers who writes the
code for the application programs.They are the computer
professionals. These programs could be written in Programming
languages such as Visual Basic, Developer, C, FORTRAN, COBOL
etc.

7. Casual Users / Temporary Users :


Casual Users are the users who occasionally use/access the data
base but each time when they access the data base they require
the new information, for example, Middle or higher level manager.
Advantages of Database Management
System
Database Management System (DBMS) is a set of program that allows
access, retrieval and use of that data by considering appropriate security
measures. The database Management system (DBMS) is really useful for
better data integration and its security.
Advantage of Database Management System (DBMS):
Some of them are given as following below.
1. Better Data Transferring:
Database management creates a place where users have an
advantage of more and better managed data. Thus making it
possible for end-users to have a quick look and to respond fast to
any changes made in their environment.

2. Better Data Security:


As number of users increases data transferring or data sharing rate
also increases thus increasing the risk of data security. It is widely
used in corporation world where companies invest money, time and
effort in large amount to ensure data is secure and is used properly.
A Database Management System (DBMS) provide a better platform
for data privacy and security policies thus, helping companies to
improve Data Security.
3. Better data integration:
Due to Database Management System we have an access to well
managed and synchronized form of data thus it makes data
handling very easy and gives integrated view of how a particular
organization is working and also helps to keep a track on how one
segment of the company affects other segment.

4. Minimized Data Inconsistency:


Data inconsistency occurs between files when different versions of
the same data appear in different places.
For Example, data inconsistency occurs when a student name is
saved as “John Wayne” on a main computer of school but on
teacher registered system same student name is “William J.
Wayne”, or when the price of a product is $86.95 in local system of
company and its National sales office system shows the same
product price as $84.95.
So if a database is properly designed then Data inconsistency can
be greatly reduced hence minimizing data inconsistency.

5. Faster data Access:


The Data base management system (DBMS) helps to produce
quick answers to database queries thus making data accessing
faster and more accurate. For example, to read or update the data.
For example, end users, when dealing with large amounts of sale
data, will have enhanced access to the data, enabling faster sales
cycle.
Some queries may be like:
• What is the increase of the sale in last three months?
• What is the bonus given to each of the salespeople in last
five months?
• How many customers have credit score of 850 or more?

6. Better decision making:


Due to DBMS now we have Better managed data and Improved
data accessing because of which we can generate better quality
information hence on this basis better decisions can be made.
Better Data quality improves accuracy, validity and time it takes to
read data.
DBMS does not guarantee data quality, it provides a framework to
make it is easy to improve data quality .

7. Increased end-user productivity:


The data which is available with the help of combination of tools
which transform data into useful information, helps end user to
make quick, informative and better decisions that can make
difference between success and failure in the global economy.

8. Simple:
Data base management system (DBMS) gives simple and clear
logical view of data. Many operations like insertion, deletion or
creation of file or data are easy to implement.
Disadvantages of DBMS
There are many advantages and disadvantages of DBMS (Database
Management System). The disadvantages of DBMS are explained below.
1. Increased Cost:
These are different types of costs:
1. Cost of Hardware and Software –
This is the first disadvantage of the database management system.
This is because, for DBMS, it is mandatory to have a high-speed
processor and also a large memory size. After all, nowadays there
is a large amount of data in every field which needs to be store
safely and with security.
The requirement of this large amount of space and a high-speed
processor needs expensive hardware and expensive software too.
That is, there is a requirement for sophisticated hardware and
software which means that we need to upgrade the hardware which
is used for the file-based system. Hardware and Software, both
require maintenance which costs very high. All the operating,
Training (all levels including programming, application
development, and database administration), licensing, and
regulatory compliance costs very high.

2. Cost of Staff Training –


Educated staff (database administrator, application programmers,
data entry operations) who maintains the database management
system also requires a good amount. We need the database
system designers to be hired along with application programmers.
Alternatively, the services of some software houses need to be
taken. So there is a lot of money which needs to be spent on
developing software.

3. Cost of Data Conversion –


We need to convert our data into a database management system,
there is a requirement of a lot of money as it adds to the cost of the
database management system. This is because for this conversion
we need to hire database system designers whom we have to pay a
lot of money and also services of some software house will be
required. All this shows that a high initial investment for hardware,
software, and trained staff is required by DBMS. So, altogether
Database Management System results in a costlier system.
4. 2. Complexity:
As we all know that nowadays all companies are using the database
management system as it fulfills lots of requirements and also solves
the problem. But a problem arises, that is all this functionality has
made the database management system an extremely complex
software. For the proper requirement of DBMS, it is very important to
have a good knowledge of it by the developers, DBA, designers, and
also the end-users. This is because if any one of them does not
acquire proper and complete skills then this may lead to data loss or
database failure.
5. These failures may lead to bad design decisions due to which there
may be serious and bad consequences for the organization. So this
complex system needs to be understood by everyone using it. As it
cannot be managed very easily. All this shows that a database
management system is not a child’s game as it cannot be managed
very easily. It requires a lot of management. A good staff is needed to
manage this database at times when it becomes very complicated to
decide where to pick data from and where to save it.
6. 3. Currency Maintenance:
This is very necessary to keep your system current because efficiency
which is one of the biggest factors and needs to be overlooked must
be maximized. That is we need to maximize the efficiency of the
database system to keep our system current. For this, frequent
updation must be performed on all the components as new threats
come daily. DBMS should be updated according to the current
scenario. Also, security measures must be implemented. Due to
advancement in database technology, training cost tends to be
significant.
7. 4. Performance:
The traditional file system is written for small organizations and for
some specific applications due to which performance is generally very
good. But for the small-scale firms, DBMS does not give a good
performance as its speed is very slow. As a result, some applications
will not run as fast as they could. Hence it is not good to use DBMS for
small firms. Because performance is a factor that is overlooked by
everyone. If performance is good then everyone (developers,
designers, end-users) will use it easily and it will be user-friendly too.
As the speed of the system totally depends on the performance so
performance needs to be good.
8. 5. Frequency Upgrade/Replacement Cycles:
Nowadays in this world, we need to stay up-to-date about the latest
technologies, developments arriving in the market. Frequent upgrade
of the products is done by the DBMS vendors to add new functionality
to the systems. New upgrade versions of the software often come
bundled. Sometimes these updates also need hardware upgrades.
Sometimes these changes and updates are so fast that the users find
it difficult to work with that system because it is not easy to learn new
commands and understand them again when the new upgrades are
done. All these upgrades also cost money to train users, designers,
etc. to use the new features.
Difference between File System and
DBMS
1. File System :
File system is basically a way of arranging the files in a storage medium like
hard disk. File system organizes the files and helps in retrieval of files when
they are required. File systems consists of different files which are grouped
into directories. The directories further contain other folders and files. File
system performs basic operations like management, file naming, giving
access rules etc.
Example:
NTFS(New Technology File System), EXT(Extended File System).

2. DBMS(Database Management System) :


Database Management System is basically a software that manages the
collection of related data. It is used for storing data and retrieving the data
effectively when it is needed. It also provides proper security measures for
protecting the data from unauthorized access. In Database Management
System the data can be fetched by SQL queries and relational algebra. It
also provides mechanisms for data recovery and data backup.
Example:

Oracle, MySQL, MS SQL server.

Difference between File System and DBMS :

Basis File System DBMS

File system is a software that


manages and organizes the files in a DBMS is a software for
1. Structure storage medium within a computer. managing the database.

2. Data Redundant data can be present in a In DBMS there is no


Redundancy file system. redundant data.

It provides backup and


3.Backup and It doesn’t provide backup and recovery of data even if it is
Recovery recovery of data if it is lost. lost.

4. Query There is no efficient query Efficient query processing is


processing processing in file system. there in DBMS.
Basis File System DBMS

There is more data


There is less data consistency in file consistency because of the
5.Consistency system. process of normalization.

It has more complexity in


It is less complex as compared to handling as compared to file
6. Complexity DBMS. system.

DBMS has more security


7.Security File systems provide less security in mechanisms as compared to
Constraints comparison to DBMS. file system.

It has a comparatively higher


8.Cost It is less expensive than DBMS. cost than a file system.

9. Data In DBMS data independence


Independence There is no data independence. exists.

Introduction of ER Model
ER Model is used to model the logical view of the system from data
perspective which consists of these components:
Entity, Entity Type, Entity Set –
An Entity may be an object with a physical existence – a particular person,
car, house, or employee – or it may be an object with a conceptual existence
– a company, a job, or a university course.

An Entity is an object of Entity Type and set of all entities is called as entity
set. e.g.; E1 is an entity having Entity Type Student and set of all students is
called Entity Set. In ER diagram, Entity Type is represented as:
Attribute(s):
Attributes are the properties which define the entity type. For example,
Roll_No, Name, DOB, Age, Address, Mobile_No are the attributes which
defines entity type Student. In ER diagram, attribute is represented by an
oval.

1. Key Attribute –
The attribute which uniquely identifies each entity in the entity set is called
key attribute.For example, Roll_No will be unique for each student. In ER
diagram, key attribute is represented by an oval with underlying lines.

2. Composite Attribute –
An attribute composed of many other attribute is called as composite
attribute. For example, Address attribute of student Entity type consists of
Street, City, State, and Country. In ER diagram, composite attribute is
represented by an oval comprising of ovals.

3. Multivalued Attribute –
An attribute consisting more than one value for a given entity. For example,
Phone_No (can be more than one for a given student). In ER diagram,
multivalued attribute is represented by double oval.

4. Derived Attribute –
An attribute which can be derived from other attributes of the entity type is
known as derived attribute. e.g.; Age (can be derived from DOB). In ER
diagram, derived attribute is represented by dashed oval.

The complete entity type Student with its attributes can be represented as:
Relationship Type and Relationship Set:
A relationship type represents the association between entity types. For
example,‘Enrolled in’ is a relationship type that exists between entity type
Student and Course. In ER diagram, relationship type is represented by a
diamond and connecting the entities with lines.

A set of relationships of same type is known as relationship set. The


following relationship set depicts S1 is enrolled in C2, S2 is enrolled in C1
and S3 is enrolled in C3.
Degree of a relationship set:
The number of different entity sets participating in a relationship set is
called as degree of a relationship set.
1. Unary Relationship –
When there is only ONE entity set participating in a relation, the
relationship is called as unary relationship. For example, one person is
married to only one person.

2. Binary Relationship –
When there are TWO entities set participating in a relation, the
relationship is called as binary relationship. For example, Student is enrolled
in Course.
3. n-ary Relationship –
When there are n entities set participating in a relation, the relationship is
called as n-ary relationship.
Cardinality:
The number of times an entity of an entity set participates in a
relationship set is known as cardinality. Cardinality can be of different
types:
1. One to one – When each entity in each entity set can take part only once
in the relationship, the cardinality is one to one. Let us assume that a male
can marry to one female and a female can marry to one male. So the
relationship will be one to one.

Using Sets, it can be represented as:

2. Many to one – When entities in one entity set can take part only once in
the relationship set and entities in other entity set can take part more
than once in the relationship set, cardinality is many to one. Let us
assume that a student can take only one course but one course can be taken
by many students. So the cardinality will be n to 1. It means that for one
course there can be n students but for one student, there will be only one
course.
Using Sets, it can be represented as:

In this case, each student is taking only 1 course but 1 course has been
taken by many students.
3. Many to many – When entities in all entity sets can take part more than
once in the relationship cardinality is many to many. Let us assume that a
student can take more than one course and one course can be taken by
many students. So the relationship will be many to many.

Using sets, it can be represented as:


In this example, student S1 is enrolled in C1 and C3 and Course C3 is
enrolled by S1, S3 and S4. So it is many to many relationships.
Participation Constraint:
Participation Constraint is applied on the entity participating in the
relationship set.
1. Total Participation – Each entity in the entity set must participate in the
relationship. If each student must enroll in a course, the participation of
student will be total. Total participation is shown by double line in ER
diagram.
2. Partial Participation – The entity in the entity set may or may NOT
participate in the relationship. If some courses are not enrolled by any of the
student, the participation of course will be partial.
The diagram depicts the ‘Enrolled in’ relationship set with Student Entity set
having total participation and Course Entity set having partial participation.

Using set, it can be represented as,


Every student in Student Entity set is participating in relationship but there
exists a course C4 which is not taking part in the relationship.

Weak Entity Type and Identifying Relationship:


As discussed before, an entity type has a key attribute which uniquely
identifies each entity in the entity set. But there exists some entity type for
which key attribute can’t be defined. These are called Weak Entity type.
For example, A company may store the information of dependents (Parents,
Children, Spouse) of an Employee. But the dependents don’t have existence
without the employee. So Dependent will be weak entity type and Employee
will be Identifying Entity type for Dependent.
A weak entity type is represented by a double rectangle. The participation of
weak entity type is always total. The relationship between weak entity type
and its identifying strong entity type is called identifying relationship and it is
represented by double diamond.

Difference between Strong and Weak


Entity
Strong Entity:
A strong entity is not dependent on any other entity in the schema. A strong
entity will always have a primary key. Strong entities are represented by a
single rectangle. The relationship of two strong entities is represented by a
single diamond.
Various strong entities, when combined together, create a strong entity set.
Weak Entity:
A weak entity is dependent on a strong entity to ensure its existence. Unlike
a strong entity, a weak entity does not have any primary key. It instead has a
partial discriminator key. A weak entity is represented by a double rectangle.
The relation between one strong and one weak entity is represented by a
double diamond. This relationship is also known as identifying relationship.

Difference between Strong and Weak Entity:

S.NOStrong Entity Weak Entity


Strong entity always has a primary While a weak entity has a partial
1. key. discriminator key.
Strong entity is not dependent on
2. any other entity. Weak entity depends on strong entity.
Strong entity is represented by a Weak entity is represented by a double
3. single rectangle. rectangle.
While the relation between one strong and
Two strong entity’s relationship is one weak entity is represented by a double
4. represented by a single diamond. diamond.
Strong entities have either total While weak entity always has total
5. participation or not. participation.

Mapping Constraints
o A mapping constraint is a data constraint that expresses the number of entities to
which another entity can be related via a relationship set.
o It is most useful in describing the relationship sets that involve more than two entity
sets.
o For binary relationship set R on an entity set A and B, there are four possible mapping
cardinalities. These are as follows:
1. One to one (1:1)
2. One to many (1:M)
3. Many to one (M:1)
4. Many to many (M:M)

One-to-one
In one-to-one mapping, an entity in E1 is associated with at most one entity in E2, and
an entity in E2 is associated with at most one entity in E1.

One-to-many
In one-to-many mapping, an entity in E1 is associated with any number of entities in
E2, and an entity in E2 is associated with at most one entity in E1.

Many-to-one
In one-to-many mapping, an entity in E1 is associated with at most one entity in E2,
and an entity in E2 is associated with any number of entities in E1.
Many-to-many
In many-to-many mapping, an entity in E1 is associated with any number of entities in
E2, and an entity in E2 is associated with any number of entities in E1.

Keys
o Keys play an important role in the relational database.
o It is used to uniquely identify any record or row of data from the table. It is also used
to establish and identify relationships between tables.

For example: In Student table, ID is used as a key because it is unique for each student.
In PERSON table, passport_number, license_number, SSN are keys since they are
unique for each person.
Types of key:

1. Primary key
o It is the first key which is used to identify one and only one instance of an entity
uniquely. An entity can contain multiple keys as we saw in PERSON table. The key which
is most suitable from those lists become a primary key.
o In the EMPLOYEE table, ID can be primary key since it is unique for each employee. In
the EMPLOYEE table, we can even select License_Number and Passport_Number as
primary key since they are also unique.
o For each entity, selection of the primary key is based on requirement and developers.
2. Candidate key
o A candidate key is an attribute or set of an attribute which can uniquely identify a tuple.
o The remaining attributes except for primary key are considered as a candidate key. The
candidate keys are as strong as the primary key.

For example: In the EMPLOYEE table, id is best suited for the primary key. Rest of the
attributes like SSN, Passport_Number, and License_Number, etc. are considered as a
candidate key.
3. Super Key
Super key is a set of an attribute which can uniquely identify a tuple. Super key is a
superset of a candidate key.

For example: In the above EMPLOYEE table, for(EMPLOEE_ID, EMPLOYEE_NAME) the


name of two employees can be the same, but their EMPLYEE_ID can't be the same.
Hence, this combination can also be a key.

The super key would be EMPLOYEE-ID, (EMPLOYEE_ID, EMPLOYEE-NAME), etc.

4. Foreign key
o Foreign keys are the column of the table which is used to point to the primary key of
another table.
o In a company, every employee works in a specific department, and employee and
department are two different entities. So we can't store the information of the
department in the employee table. That's why we link these two tables through the
primary key of one table.
o We add the primary key of the DEPARTMENT table, Department_Id as a new attribute
in the EMPLOYEE table.
o Now in the EMPLOYEE table, Department_Id is the foreign key, and both the tables are
related.

ER Diagram of a Company
ER Diagram is known as Entity-Relationship Diagram, it is used to analyze to
structure of the Database. It shows relationships between entities and their
attributes. An ER Model provides a means of communication.
ER diagram of Company has the following description :
•Company has several departments.
•Each department may have several Location.
• Departments are identified by a name, D_no, Location.
• A Manager control a particular department.
• Each department is associated with number of projects.
• Employees are identified by name, id, address, dob, dat
e_of_joining.
• An employee works in only one department but can work on several
project.
• We also keep track of number of hours worked by an employee on
a single project.
• Each employee has dependent
• Dependent has D_name, Gender and relationship.
ER Diagram of Company :
This Company ER diagram illustrates key information about Company,
including entities such as employee, department, project and dependent. It
allows to understand the relationships between entities.
Entities and their Attributes are
• Employee Entity : Attributes of Employee Entity are Name, Id,
Address, Gender, Dob and Doj.
Id is Primary Key for Employee Entity.
• Department Entity : Attributes of Department Entity are D_no,
Name and Location.
D_no is Primary Key for Department Entity.
• Project Entity : Attributes of Project Entity are P_No, Name and
Location.
P_No is Primary Key for Project Entity.
• Dependent Entity : Attributes of Dependent Entity are D_no,
Gender and relationship.
Relationships are :
• Employees works in Departments –
Many employee works in one Department but one employee can
not work in many departments.
• Manager controls a Department –
employee works under the manager of the Department and the
manager records the date of joining of employee in the department.
• Department has many Projects –
One department has many projects but one project can not come
under many departments.
• Employee works on project –
One employee works on several projects and the number of hours
worked by the employee on a single project is recorded.
• Employee has dependents –
Each Employee has dependents. Each dependent is dependent of
only one employee.
Enhanced ER Model
Today the complexity of the data is increasing so it becomes more and more
difficult to use the traditional ER model for database modeling. To reduce this
complexity of modeling we have to make improvements or enhancements were
made to the existing ER model to make it able to handle the complex application
in a better way.
Enhanced entity-relationship diagrams are advanced database diagrams very
similar to regular ER diagrams which represent requirements and complexities
of complex databases.
It is a diagrammatic technique for displaying the Sub Class and Super Class;
Specialization and Generalization; Union or Category; Aggregation etc.
Generalization and Specialization –
These are very common relationships found in real entities. However, this kind of
relationship was added later as an enhanced extension to the classical ER
model. Specialized classes are often called subclass while a generalized class is
called a superclass, probably inspired by object-oriented programming. A sub-
class is best understood by “IS-A analysis”. Following statements hopefully
makes some sense to your mind “Technician IS-A Employee”, “Laptop IS-A
Computer”.
An entity is a specialized type/class of another entity. For example, a Technician
is a special Employee in a university system Faculty is a special class of
Employee. We call this phenomenon generalization/specialization. In the
example here Employee is a generalized entity class while the Technician and
Faculty are specialized classes of Employee.

Example – This example instance of “sub-class” relationships. Here we have


four sets of employees: Secretary, Technician, and Engineer. The employee is
super-class of the rest three sets of individual sub-class is a subset of Employee
set.
•An entity belonging to a sub-class is related to some super-class entity.
For instance emp, no 1001 is a secretary, and his typing speed is 68.
Emp no 1009 is an engineer (sub-class) and her trade is “Electrical”, so
forth.
• Sub-class entity “inherits” all attributes of super-class; for example,
employee 1001 will have attributes eno, name, salary, and typing
speed.
Enhanced ER model of above example –
Constraints – There are two types of constraints on the “Sub-class” relationship.

1. Total or Partial – A sub-classing relationship is total if every super-


class entity is to be associated with some sub-class entity, otherwise
partial. Sub-class “job type based employee category” is partial sub-
classing – not necessary every employee is one of (secretary, engineer,
and technician), i.e. union of these three types is a proper subset of all
employees. Whereas other sub-classing “Salaried Employee AND
Hourly Employee” is total; the union of entities from sub-classes is
equal to the total employee set, i.e. every employee necessarily has to
be one of them.
2. Overlapped or Disjoint – If an entity from super-set can be related
(can occur) in multiple sub-class sets, then it is overlapped sub-
classing, otherwise disjoint. Both the examples: job-type based and
salaries/hourly employee sub-classing are disjoint.
Note – These constraints are independent of each other: can be “overlapped and
total or partial” or “disjoint and total or partial”. Also, sub-classing has transitive
property.

Multiple Inheritance (sub-class of multiple superclasses) –


An entity can be a sub-class of multiple entity types; such entities are sub-class of
multiple entities and have multiple super-classes; Teaching Assistant can
subclass of Employee and Student both. A faculty in a university system can be a
subclass of Employee and Alumnus. In multiple inheritances, attributes of sub-
class are the union of attributes of all super-classes.
Union –

• Set of Library Members is UNION of Faculty, Student, and Staff. A union


relationship indicates either type; for example, a library member is
either Faculty or Staff or Student.
• Below are two examples show how UNION can be depicted in ERD –
Vehicle Owner is UNION of PERSON and Company, and RTO Registered
Vehicle is UNION of Car and Truck.

You might see some confusion in Sub-class and UNION; consider an example in
above figure Vehicle is super-class of CAR and Truck; this is very much the
correct example of the subclass as well but here use it differently we are saying
RTO Registered vehicle is UNION of Car and Vehicle, they do not inherit any
attribute of Vehicle, attributes of car and truck are altogether independent set,
where is in sub-classing situation car and truck would be inheriting the attribute
of vehicle class.

Generalization, Specialization and


Aggregation in ER Model
Generalization, Specialization and Aggregation in ER model are used for
data abstraction in which abstraction mechanism is used to hide details of a
set of objects.
Generalization –
Generalization is the process of extracting common properties from a set of
entities and create a generalized entity from it. It is a bottom-up approach in
which two or more entities can be generalized to a higher level entity if they
have some attributes in common. For Example, STUDENT and FACULTY
can be generalized to a higher level entity called PERSON as shown in
Figure 1. In this case, common attributes like P_NAME, P_ADD become part
of higher entity (PERSON) and specialized attributes like S_FEE become
part of specialized entity (STUDENT).

Specialization –
In specialization, an entity is divided into sub-entities based on their
characteristics. It is a top-down approach where higher level entity is
specialized into two or more lower level entities. For Example, EMPLOYEE
entity in an Employee management system can be specialized into
DEVELOPER, TESTER etc. as shown in Figure 2. In this case, common
attributes like E_NAME, E_SAL etc. become part of higher entity
(EMPLOYEE) and specialized attributes like TES_TYPE become part of
specialized entity (TESTER).

Aggregation –
An ER diagram is not capable of representing relationship between an entity
and a relationship which may be required in some scenarios. In those cases,
a relationship with its corresponding entities is aggregated into a higher level
entity. Aggregation is an abstraction through which we can represent
relationships as higher level entity sets.
For Example, Employee working for a project may require some machinery.
So, REQUIRE relationship is needed between relationship WORKS_FOR
and entity MACHINERY. Using aggregation, WORKS_FOR relationship with
its entities EMPLOYEE and PROJECT is aggregated into single entity and
relationship REQUIRE is created between aggregated entity and
MACHINERY.
Representing aggregation via schema –
To represent aggregation, create a schema containing:
1. primary key of the aggregated relationship
2. primary key of the associated entity set
3. descriptive attribute, if exists.

You might also like