
Practical File

On
Database Management System
(Subject Code – BBAN404)
Branch - BBA-B / 2nd-Year (4th Semester)

Submitted To: Ms. Deepika
Submitted By: Prateek Chaurasia
Question 1
How do you define Data and Information? Differentiate with the help of
Examples.
Answer 1 :-
DATA
Data is a collection of raw, unorganized plain facts, observations, statistics,
characters, symbols, images, numbers, and more that are collected and can be
used for analysis.
Data is collected manually or through automated means from both primary and
secondary sources. Data acquired first-hand by researchers, such as interviews,
observations, and case studies, are examples of primary sources. Web material,
reports, and other previously published material are examples of secondary
sources.
In computers, data is represented as patterns of 0s and 1s that may be
interpreted to indicate a value or fact. Bit, Nibble, Byte, KB (Kilobytes), MB
(Megabytes), GB (Gigabytes), TB (Terabytes), and so on are data measurement
units.
INFORMATION
When data is processed, evaluated, organized, structured, or presented in such a
way that it becomes meaningful or helpful, it is referred to as information.
Information gives context to data.
Information is data that has been structured or categorized and has some
meaningful values for the recipient. The processed data on which judgments and
actions are based is referred to as information. In a nutshell, Information is data
with meaning.
It is critical for decision-makers in any firm to have access to relevant and
trustworthy information. This is dependent on gathering high-quality data that
can be processed, evaluated, and formatted in a consistent and reliable manner
to yield meaningful information.

Difference between data and information

Examples of Data and Information


• A book is made up of multiple pages that contain many words, which
constitute data. When we read these pages, examine the material, and
process it in our minds, it becomes information.
• Consider product reviews on Amazon or any other website; they are data
for Amazon. Average ratings and breakdowns by category (gender, for
example) are the information obtained when Amazon analyzes that data. This
information is very useful when deciding whether or not to purchase a product.

Question 2
Write Short Note on:-
• Data Dictionary & Its Types
• Database Administrator
• Traditional File System Approach
• Data Definition Language (DDL)
• Data Manipulation Language (DML)
Answer 2 :-
(i) Data Dictionary Definition
A Data Dictionary is a collection of names, definitions, and attributes about data
elements that are being used or captured in a database, information system, or
part of a research project.
There are two types of data dictionary.
Active data dictionaries are created within the databases they describe and
automatically reflect any updates or changes in their host databases. This avoids
any discrepancies between the data dictionaries and their database structures.
Passive data dictionaries are created separately from the databases they describe
to act as a repository for data information. Passive data dictionaries require
additional work to stay in sync with the databases they describe.

(ii) Database Administrator


A database administrator (DBA) is a person responsible for the design and
management of an organization's databases as well as the evaluation, selection
and implementation of the database management system (DBMS) software.
Depending on the organization, the position may also be called a "data
architect," "data officer," "database architect," "IT architect" or
"information architect."

(iii) Traditional File System Approach
File-based systems were an early attempt to computerize the manual filing
system. This is also called the traditional approach. It was decentralized: each
department stored and controlled its own data with the help of a data processing
specialist. The main role of a data processing specialist was to create the
necessary computer file structures, manage the data within those structures, and
design application programs that created reports based on the file data.

(iv) Data Definition Language (DDL)


The DDL commands define the structure of the database, that is, its schema.
In many DBMSs a DDL statement takes effect immediately because it is
auto-committed, so the changes it makes are saved permanently. The following
commands come under DDL:
o CREATE: It is used to create a new database or database object, such as a table or function.
o DROP: It is used to delete a database or one of its objects.
o ALTER: It is used to change the structure of an existing database object.
o TRUNCATE: It is used to remove all rows from a table while keeping its structure.
o RENAME: It is used to rename a database object.
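The DDL commands can be tried out with a minimal sketch such as the one below, which uses Python's built-in sqlite3 module and an in-memory database. The table and column names (student, roll_no, and so on) are made up for illustration. SQLite has no TRUNCATE statement, so that command is left out, and the exact syntax of the others varies slightly between DBMS products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()


def table_names():
    """Return the names of all user tables, sorted."""
    rows = cur.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


# CREATE: define a new table (hypothetical example schema)
cur.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

# ALTER: change the structure by adding a column
cur.execute("ALTER TABLE student ADD COLUMN course TEXT")
columns = [row[1] for row in cur.execute("PRAGMA table_info(student)")]

# RENAME: give the table a new name
cur.execute("ALTER TABLE student RENAME TO learner")
after_rename = table_names()

# DROP: remove the table together with its data
cur.execute("DROP TABLE learner")
after_drop = table_names()

print(columns)       # column names after ALTER
print(after_rename)  # table list after RENAME
print(after_drop)    # table list after DROP
```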

(v) Data Manipulation Language (DML)


The DML commands deal with the manipulation of the records stored in a database.
They are responsible for changes to the data rather than to its structure.
Changes made with these commands are not saved permanently until they are
committed, because DML commands are typically executed inside a transaction
rather than auto-committed; until then they can be rolled back. The following
commands come under DML:
o SELECT: This command is used to retrieve information from a table.
o INSERT: This command is used to add new rows of data to a table.
o UPDATE: This command is used to modify existing rows in a table.
o DELETE: This command is used to delete rows from a table.
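The four DML commands, and the fact that their effects are permanent only after a commit, can be illustrated with a small sketch. It again assumes Python's sqlite3 module with its default transaction handling; the student table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, course TEXT)"
)

# INSERT: add rows to the table
cur.execute("INSERT INTO student VALUES (1, 'Asha', 'BBA')")
cur.execute("INSERT INTO student VALUES (2, 'Ravi', 'BBA')")

# UPDATE: modify an existing row
cur.execute("UPDATE student SET course = 'BCA' WHERE roll_no = 2")

# COMMIT makes the changes so far permanent
conn.commit()

# DELETE: remove a row, but do not commit it
cur.execute("DELETE FROM student WHERE roll_no = 1")
count_before_rollback = cur.execute("SELECT COUNT(*) FROM student").fetchone()[0]

# ROLLBACK undoes the uncommitted DELETE
conn.rollback()

# SELECT: read the rows back
rows = cur.execute(
    "SELECT roll_no, name, course FROM student ORDER BY roll_no"
).fetchall()

print(count_before_rollback)
print(rows)
```

After the rollback the deleted row is back, while the committed INSERT and UPDATE statements remain in effect.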

Question 3
Explain the three-level database architecture with the help of a diagram.
Answer 3 :-
Database Architecture
A database architecture is a representation of the DBMS design. It helps to
design, develop, implement, and maintain the database management system. The
architecture divides the database system into individual components that can be
modified or replaced independently. It also helps in understanding the
components of a database.
A database stores critical information and helps users access data quickly and
securely. Therefore, selecting the right DBMS architecture makes data management
easier and more efficient.
DBMS Three Level Architecture Diagram

This architecture has three levels:
1. External level
2. Conceptual level
3. Internal level
1. External level
It is also called the view level. It is called the “view” level because several
users can view their desired data at this level; the data is fetched internally
from the database with the help of the conceptual and internal level mappings.
The user doesn’t need to know database schema details such as data structures
or table definitions. The user is only concerned with the data that is
returned to the view level after it has been fetched from the database (present
at the internal level).
The external level is the “top level” of the three-level DBMS architecture.
2. Conceptual level
It is also called the logical level. The whole design of the database, such as
the relationships among data and the schema of the data, is described at this
level. Database constraints and security are also implemented at this level of
the architecture. This level is maintained by the DBA (database administrator).
3. Internal level
This level is also known as the physical level. It describes how the data is
actually stored on the storage devices and is also responsible for allocating
space to the data. This is the lowest level of the architecture.
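As a rough illustration of the external and conceptual levels, the sketch below defines a table (conceptual level) and a view over it (external level) using Python's sqlite3 module. The employee table, the employee_directory view and the sample rows are assumptions made for the example. The internal level, that is, how SQLite lays the rows out in storage, is handled by the engine and never appears in the code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Conceptual level: the full logical definition of the data
cur.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
cur.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Sales", 52000.0), (2, "Ravi", "HR", 48000.0)],
)

# External level: a view that exposes only what one group of users needs
cur.execute("CREATE VIEW employee_directory AS SELECT name, dept FROM employee")

# A user of the view never sees emp_id or salary
directory = cur.execute(
    "SELECT * FROM employee_directory ORDER BY name"
).fetchall()
print(directory)
```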

Question 4
What are the basic features of data in a database? What are the common
operations on a database?
Answer 4 :-
DATA
Data, in the context of databases, refers to all the single items that are stored in a
database, either individually or as a set. Data in a database is primarily stored in
database tables, which are organized into columns that dictate the data types
stored therein.
Features of data in a database
The data in a database has several characteristics: it is consistent, has
integrity, is non-redundant, secure, centrally managed, and shared among
multiple applications. There are several advantages of using the database
approach, such as the following:
• Single repository of data is maintained
• All users access the data from the same resource
• Quick retrieval of data
• Reduce application development time
• Flexibility in change of database structure
• Enforce standardization
• Up-to-date information availability
• Authorized access security of data
• Enforce integrity constraints and business rules
• Provide backup and recovery procedure
Common database operations
The common operations on a database are as follows:
Data processing
Data processing tasks include automating data collection, replication, cleanup,
and migration, which help make the data more meaningful, secure,
dependable, and available for immediate processing when necessary.

Disaster recovery & protection
Losing critical information assets can compromise the ability of an organization to
operate within regulatory guidelines and customer expectations. Therefore,
organizations must introduce risk mitigation strategies such as:
• Database redundancy
• Distribution across geographically disparate server regions
• Automatically triggered defense systems against compromise and cyber-attacks
Backup & restoration
The systems should be programmed to back up automatically at specific times and
to support restoration, in order to reduce the risk of data loss, especially in
the event of a network intrusion that compromises the integrity of sensitive and
mission-critical databases.
Security improvements
Deploy strong authentication and access management processes. Enrolling
specific database systems to comply with specific organization policies for access
privilege can be challenging and can lead to security issues unless appropriate
security policies are enforced.
Audits & reporting
Use automation to monitor and track how databases and the information
contained within them change. Auditing databases can help organizations
understand how operations, systems, and data assets comply with organizational
and regulatory policies.

Question 5
Explain the concept of field, records, and files with the help of an example.
FIELD
Fields are the components that provide structure for a table. You can't have a
table without fields, although you can create an empty table that has fields
defined but no rows (records). In databases, fields are also used to maintain
relationships between tables. This is done by creating matching fields in two or
more tables.
For example, if you stored a table named toy store in a database and you also
stored a staff table to track the employees in each store, you would create a
common field between the two tables that would be populated with, for instance,
a store identifier. The store ID value for a specific toy store would be the same in
both tables.
RECORDS
In relational databases, a record is a group of related data held within the same
structure. More specifically, a record is a grouping of fields within a table that
reference one particular object. The term record is frequently used synonymously
with row. A record is also known as a tuple.
For example, a customer record may include items, such as first name, physical
address, email address, date of birth and gender.
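The toy store example above can be sketched with Python's sqlite3 module. The column names (store_id, city, staff_id) and the sample values are assumptions added for illustration; the point is that each column is a field, each row is a record, and the shared store_id field links the two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Each column is a field; store_id is the common field shared by both tables
cur.execute("CREATE TABLE toy_store (store_id INTEGER PRIMARY KEY, city TEXT)")
cur.execute(
    "CREATE TABLE staff ("
    "staff_id INTEGER PRIMARY KEY, name TEXT, "
    "store_id INTEGER REFERENCES toy_store(store_id))"
)

# Each inserted row is one record (tuple)
cur.execute("INSERT INTO toy_store VALUES (10, 'Delhi')")
cur.execute("INSERT INTO staff VALUES (1, 'Meena', 10)")

# One record and the names of its fields
record = cur.execute("SELECT * FROM staff WHERE staff_id = 1").fetchone()
fields = [column[0] for column in cur.description]

# The matching store_id values relate a staff record to its store record
joined = cur.execute(
    "SELECT staff.name, toy_store.city "
    "FROM staff JOIN toy_store ON staff.store_id = toy_store.store_id"
).fetchall()

print(fields)
print(record)
print(joined)
```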
FILES
A file is a container in a computer system for storing information. Files used in
computers are similar in purpose to the paper documents kept in library and
office files. There are different types of files, such as text files, data files,
directory files, binary files and graphic files, and these store different
types of information. In a computer operating system, files can be stored on
optical drives, hard drives or other types of storage devices.
Most modern computer systems provide security or protection measures against
file corruption or damage. The data contained in files can range from
system-generated information to user-specified information. File management is
done with the help of the operating system or third-party tools, or at times
manually by the user.

The basic operations that can be performed on a file are:


• Creation of a new file
• Modification of data or file attributes
• Reading of data from the file
• Opening the file in order to make the contents available to other programs
• Writing data to the file
• Closing or terminating a file operation
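The operations listed above map directly onto ordinary file handling in most programming languages. A minimal Python sketch follows; the file name notes.txt and its contents are made up, and a temporary folder is used so that nothing is left behind.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "notes.txt")  # hypothetical file name

# Creation and writing: mode "w" creates the file and writes to it
with open(path, "w", encoding="utf-8") as f:
    f.write("first line\n")

# Modification: mode "a" appends to the existing contents
with open(path, "a", encoding="utf-8") as f:
    f.write("second line\n")

# Opening and reading: mode "r" makes the contents available to the program
with open(path, "r", encoding="utf-8") as f:
    lines = f.read().splitlines()
    closed_inside = f.closed  # still open inside the with-block

# Closing: leaving the with-block closes the file automatically
closed_after = f.closed

# Clean up the temporary file and folder
os.remove(path)
os.rmdir(folder)

print(lines)
print(closed_inside, closed_after)
```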

Question 6
What do you mean by the file system approach? Discuss its merits and demerits.
FILE SYSTEM APPROACH
File-based systems were an early attempt to computerize the manual filing
system. This is also called the traditional approach. It was decentralized: each
department stored and controlled its own data with the help of a data processing
specialist. The main role of a data processing specialist was to create the
necessary computer file structures, manage the data within those structures, and
design application programs that created reports based on the file data.

Consider an example of a student's file system. The student file will contain
information regarding the student (i.e., roll no, student name, course etc.).
Similarly, we have a subject file that contains information about the subject and
the result file which contains the information regarding the result. Some fields are
duplicated in more than one file, which leads to data redundancy.
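The redundancy described above, and the inconsistency it can cause, can be shown with a short Python sketch. Two lists of dictionaries stand in for the student file and the result file; the field names and values are invented for the example.

```python
# Two separate "files", each keeping its own copy of the student's name
student_file = [{"roll_no": 1, "name": "Asha Verma", "course": "BBA"}]
result_file = [
    {"roll_no": 1, "name": "Asha Verma", "subject": "DBMS", "marks": 82}
]

# The name is corrected in the student file only; nothing forces the
# result file to be updated as well
student_file[0]["name"] = "Asha Sharma"


def inconsistent_roll_numbers(students, results):
    """Return roll numbers whose name differs between the two files."""
    names = {s["roll_no"]: s["name"] for s in students}
    return sorted(
        {r["roll_no"] for r in results if names.get(r["roll_no"]) != r["name"]}
    )


mismatches = inconsistent_roll_numbers(student_file, result_file)
print(mismatches)  # roll numbers whose two copies of the name disagree
```

In a database the name would be stored once and referenced by roll number, so this kind of mismatch could not arise.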
Merits of file system approach
1. Backup:
• It is possible to take faster and automatic backups of data stored in
files of computer-based systems.
• Computer systems provide functionality for this purpose. It is also
possible to develop a specific application program for it.
2. Compactness:
• It is possible to store data compactly.
3. Data Retrieval:
• Computer-based systems provide enhanced data retrieval techniques to
retrieve data stored in files in an easy and efficient way.
4. Editing:
• It is easy to edit any information stored in computers in the form of files.
• Specific application programs or editing software can be used for this
purpose.
5. Remote Access:
• In computer-based systems, it is possible to access data remotely.
• So, to access data, it is not necessary for a user to be present at the
location where the data is kept.
6. Sharing:
• Data stored in files of computer-based systems can be shared among
multiple users at the same time.
Demerits of file system approach
1. Data Redundancy:
• The same information may be duplicated in different files.
This data redundancy results in wasted storage.
2. Data Inconsistency:
• Because of data redundancy, copies of the same data may not agree, leaving
the data in an inconsistent state.
3. Difficulty in Accessing Data:
• Accessing data is not convenient or efficient in a file processing system.
4. Limited Data Sharing:
• Data is scattered across various files. Different files may also have
different formats, and they may be stored in different folders belonging to
different departments.
5. Integrity Problems:
• Data integrity means that the data is both correct and consistent. For this
purpose, the stored data must satisfy certain consistency constraints, which
are hard to enforce when they are scattered across application programs.
6. Atomicity Problems:
• Any operation on the data must be atomic.
• This means it must happen in its entirety or not at all, which is difficult
to guarantee in a file processing system.
7. Concurrent Access Anomalies:
• Multiple users are allowed to access data simultaneously for the
sake of better performance and faster response. Without supervision,
concurrent updates can leave the data in an inconsistent state.
8. Security Problems:
• Data should be accessible to users only in a limited way.
• Each user should be allowed to access only the data relevant to his or her
requirements. This is hard to enforce in a file processing system.
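The atomicity requirement from point 6 is what a DBMS transaction provides. The sketch below, written with Python's sqlite3 module, moves money between two hypothetical accounts; the account table, the CHECK constraint and the transfer function are assumptions made for the example. When the second step of a transfer fails, the first step is rolled back as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(src, dst, amount):
    """Move amount from src to dst; either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would make the balance negative, so the credit is undone too
        return False


ok = transfer("A", "B", 30)       # both updates succeed
failed = transfer("A", "B", 500)  # credit happens first, then the debit fails
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

A plain file system offers no equivalent of the rollback, which is why a crash halfway through such an update can leave files in an inconsistent state.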

Question 7
Write short note on
a. Data Independence and its types
b. Mapping and Its Types
c. Schema and sub-schema
Answer 7
a. Data Independence and its types
Data independence is the type of data transparency that matters for a centralized
DBMS. It refers to the immunity of user applications to changes made in the
definition and organization of data. A database is viewed through three levels
of abstraction. Without data independence, a change at one level may affect the
schema (structure of the database) at the other levels. As databases keep
growing, changes at some level are to be expected. However, such changes should
never force the database to be redesigned and re-implemented.
There are two types of data independence: physical and logical data
independence.
Physical data independence is the ability to modify the physical schema without
causing application programs to be rewritten. Modifications at the physical level
are occasionally necessary to improve performance. It means we change the
physical storage/level without affecting the conceptual or external view of the
data. The new changes are absorbed by mapping techniques.
Logical data independence is the ability to modify the logical schema without
causing application programs to be rewritten. Modifications at the logical level
are necessary whenever the logical structure of the database is altered (for
example, when money-market accounts are added to a banking system). Logical
data independence means that if we add new columns to a table, or remove columns
that a program does not use, the user views and programs should not have to
change.
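Both kinds of independence can be observed in a small experiment. The sketch below assumes Python's sqlite3 module and a made-up account table. The function application_query stands in for an application program: it keeps returning the same result after a physical change (a new index) and after a logical change (a new column).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE account (acc_no INTEGER PRIMARY KEY, holder TEXT, balance INTEGER)"
)
cur.executemany(
    "INSERT INTO account VALUES (?, ?, ?)", [(1, "Asha", 500), (2, "Ravi", 300)]
)


def application_query():
    """Stands in for an application program; it names its columns explicitly."""
    return cur.execute(
        "SELECT holder, balance FROM account ORDER BY acc_no"
    ).fetchall()


before = application_query()

# Physical change: add an index (affects storage and access paths only)
cur.execute("CREATE INDEX idx_account_holder ON account (holder)")
after_physical = application_query()

# Logical change: add a column that the program does not use
cur.execute("ALTER TABLE account ADD COLUMN account_type TEXT DEFAULT 'savings'")
after_logical = application_query()

print(before == after_physical == after_logical)
```

A program that used SELECT * instead of naming its columns would see the new column, which is one reason applications are usually written against named columns or views.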
b. Mapping and Its Types
The process of transforming requests and results between the three levels is
called mapping. The user’s request is specified at the external level and must
be transformed into a request at the conceptual level. Finally, this request is
transformed into one at the internal level for the final processing of data in
the stored database. Hence mapping in a DBMS is used to interconnect the three
levels of the database.
There are two types of mapping in DBMS as shown below:
External/Conceptual Mapping
The process of transforming requests and results between the external and
conceptual levels is called external/conceptual mapping. External views are
connected to the conceptual view with the help of external/conceptual mapping.
Conceptual/Internal Mapping
The conceptual schema is connected to the internal schema with the help of
conceptual/internal mapping. This enables the DBMS to find the actual record in
physical storage. The process of transforming requests and results between the
conceptual and internal levels is called conceptual/internal mapping.
c. Schema and Subschema
Schema
The term schema refers to the overall structure (organization) of all data items,
including their record types, stored in a database. To set up a database, we must
first define its structure through schema definitions. This is done by
identifying the characteristics of each field contained in the database.
A good schema definition should also consider the possible future needs of all
types of users. That is, all fields that may be needed in the near future should
be included in the database structure at the time of defining it.
Subschema
A subschema is a subset of the schema: a set of data elements drawn from the
tables of the database. The use of subschemas provides a partial view of the
data. The term subschema refers to an application programmer’s (or user’s) view
of the data items and record types that he or she uses. Many subschemas can be
derived from one schema. The subschema is also referred to as the logical view.

Question 8
Explain the following with the help of suitable Diagram:-
1. Hierarchical Data model
2. Network Data model
3. Relational Data model
4. Entity-Relationship (E-R) Data model
Answer 8
1. Hierarchical Data Model:
The hierarchical data model is the oldest type of data model. It was developed by
IBM in 1968. It organizes data in a tree-like structure. The hierarchical model
contains nodes which are connected by branches. The topmost node is called the
root node. If multiple nodes appear at the top level, they are called root
segments. Each node other than the root has exactly one parent. One parent may
have many children.

In the above figure, Electronics is the root node, which has two children, i.e.
Televisions and Portable Electronics. These two have further children, for which
they act as parents. For example, Televisions has the children Tube, LCD and
Plasma, and acts as the parent of these three. The model follows a one-to-many
relationship.
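The tree described above can be written down as a small Python sketch in which every node records its single parent. Only the nodes named in the text are included; the helper functions children and path_to_root are illustrative names, not part of any DBMS.

```python
# Each node stores exactly one parent; the root has None
parent = {
    "Electronics": None,
    "Televisions": "Electronics",
    "Portable Electronics": "Electronics",
    "Tube": "Televisions",
    "LCD": "Televisions",
    "Plasma": "Televisions",
}


def children(node):
    """Return the children of a node (one parent, many children)."""
    return [child for child, p in parent.items() if p == node]


def path_to_root(node):
    """Follow the single parent link upwards until the root is reached."""
    path = [node]
    while parent[node] is not None:
        node = parent[node]
        path.append(node)
    return path


tv_children = children("Televisions")
lcd_path = path_to_root("LCD")
print(tv_children)
print(lcd_path)
```

Because every node has only one parent, there is exactly one path from any node to the root.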
2. Network Data Model:
It is an advanced version of the hierarchical data model. To organize data, it
uses directed graphs instead of a tree structure. In this model a child can have
more than one parent. It uses two data structures, i.e. Records and Sets.

In the above figure, Project is the root node, which has two children, i.e.
Project 1 and Project 2. Project 1 has 3 children and Project 2 has 2 children.
In total there are 5 parent-child links but only three distinct children, i.e.
Department A, Department B and Department C, because in this model a child can
have more than one parent. So Department B and Department C have two parents,
i.e. Project 1 and Project 2.
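The same figure can be sketched in Python by letting each node keep a list of parents instead of a single one. The assignment of departments to projects follows the description above; the helper name children is illustrative.

```python
# Each node lists all of its parents, so a child may have more than one
parents = {
    "Project 1": ["Project"],
    "Project 2": ["Project"],
    "Department A": ["Project 1"],
    "Department B": ["Project 1", "Project 2"],
    "Department C": ["Project 1", "Project 2"],
}


def children(node):
    """Return every node that lists the given node as one of its parents."""
    return [child for child, ps in parents.items() if node in ps]


project1_children = children("Project 1")
project2_children = children("Project 2")

# Five parent-child links lead to only three distinct departments
link_count = sum(
    len(ps) for node, ps in parents.items() if node.startswith("Department")
)
shared = [node for node, ps in parents.items() if len(ps) > 1]

print(project1_children)
print(project2_children)
print(link_count, shared)
```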
3. Relational Data Model:
The relational data model was developed by E.F. Codd in 1970. There are no
physical links as there are in the hierarchical data model. Data is represented
in the form of tables only. It deals only with the data, not with the physical
structure. It provides information regarding metadata. At the intersection of a
row and a column there is only one value for the tuple. It provides a way to
handle queries with ease.

4. ER (Entity Relationship):
ER model stands for an Entity-Relationship model. It is a high-level data model.
This model is used to define the data elements and relationship for a specified
system. It develops a conceptual design for the database. It also develops a very
simple and easy to design view of data. In ER modelling, the database structure is
portrayed as a diagram called an entity-relationship diagram.

For example, suppose we design a school database. In this database, the student
will be an entity with attributes like address, name, id, age, etc. The address
can be another entity with attributes like city, street name, pin code, etc.,
and there will be a relationship between them.
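One way to picture the school example is as two Python dataclasses, one per entity, with the relationship expressed as a reference from Student to Address. The class and attribute names follow the text; the sample values are made up, and a real ER design would be drawn as a diagram rather than written as code.

```python
from dataclasses import dataclass


@dataclass
class Address:
    """Entity with the attributes city, street name and pin code."""
    city: str
    street_name: str
    pin_code: str


@dataclass
class Student:
    """Entity with the attributes id, name and age."""
    student_id: int
    name: str
    age: int
    address: Address  # relationship: a student lives at an address


home = Address(city="Delhi", street_name="MG Road", pin_code="110001")
student = Student(student_id=1, name="Asha", age=19, address=home)

# Following the relationship from one entity to the other
print(student.name, "lives in", student.address.city)
```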
Question 9
How is database system classified according to:-
1. Number of Users
2. Type of users
3. Database Site location
1. Depending On Number of Users
The number of users determines whether the database is classified as single-user
or multiuser.

Single User Database: A single-user database supports only one user at a time. In
other words, if user A is using the database, users B and C must wait until user A is
done. A single-user database that runs on a personal computer is called a desktop
database.
Multi User Database: A multiuser database supports multiple users at the same
time. When the multiuser database supports a relatively small number of users
(usually fewer than 50) or a specific department within an organization, it is called
a workgroup database. When the database is used by the entire organization and
supports many users (more than 50, usually hundreds) across many departments,
the database is known as an enterprise database.
2. Depending on Database location:
Location might also be used to classify the database. For example, a database that
supports data located at a single site is called a centralized database. A database
that supports data distributed across several different sites is called a distributed
database. The following types of databases are commonly distinguished:
• Centralised database
• Distributed database
• Personal database
• End-user database
3. Based on type of use
The most popular way of classifying databases today, however, is based on how
they will be used and on the time sensitivity of the information gathered from
them. For example, transactions such as product or service sales, payments, and
supply purchases reflect critical day-to-day operations. Such transactions must be
recorded accurately and immediately. A database that is designed primarily to
support a company’s day-to-day operations is classified as an operational
database (sometimes referred to as a transactional or production database).
In contrast, a data warehouse focuses primarily on storing data used to generate
information required to make tactical or strategic decisions. Such decisions
typically require extensive “data massaging” (data manipulation) to extract
information to formulate pricing decisions, sales forecasts, market positioning,
and so on.
Question 10
Define the term Database security. What are the various threats to database
security? Explain in detail.
Answer 10
Database security refers to the range of tools, controls, and measures designed to
establish and preserve database confidentiality, integrity, and availability. Of
these, confidentiality is the element that is compromised in most data
breaches.
Database security must address and protect the following:
• The data in the database
• The database management system (DBMS)
• Any associated applications
• The physical database server and/or the virtual database server and the
underlying hardware
• The computing and/or network infrastructure used to access the
database
Database security is a complex and challenging endeavor that involves all aspects
of information security technologies and practices. It’s also naturally at odds with
database usability. The more accessible and usable the database, the more
vulnerable it is to security threats.
Common threats and challenges
The following are among the most common types of database security attacks and
their causes.
Insider threats
An insider threat is a security threat from any one of three sources with privileged
access to the database:
• A malicious insider who intends to do harm
• A negligent insider who makes errors that make the database vulnerable to
attack
• An infiltrator, that is, an outsider who somehow obtains credentials via a
scheme such as phishing or by gaining access to the credential database itself
Human error
Accidents, weak passwords, password sharing, and other unwise or uninformed
user behaviors continue to be the cause of nearly half (49%) of all reported data
breaches.
Exploitation of database software vulnerabilities
Hackers make their living by finding and targeting vulnerabilities in all kinds of
software, including database management software. All major commercial
database software vendors and open source database management platforms
issue regular security patches to address these vulnerabilities, but failure to apply
these patches in a timely fashion can increase your exposure.
SQL/NoSQL injection attacks
A database-specific threat, these involve the insertion of arbitrary SQL or
non-SQL attack strings into database queries served by web applications or HTTP
headers. Organizations that don’t follow secure web application coding practices
or don’t perform regular vulnerability testing are open to these attacks.
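A small demonstration makes the injection threat concrete. The sketch uses Python's sqlite3 module with a made-up users table. The attack string is a textbook example; the unsafe query is built by string concatenation, while the safe one passes the same input as a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)", [("asha", "s1"), ("ravi", "s2")]
)

# Input supplied by an attacker instead of a real user name
attack = "nobody' OR '1'='1"

# Unsafe: the input becomes part of the SQL text and changes its meaning
unsafe_sql = "SELECT name FROM users WHERE name = '" + attack + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the input is passed as a parameter and treated purely as data
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (attack,)
).fetchall()

print(len(unsafe_rows))  # the injected OR condition matches every row
print(len(safe_rows))    # no user has that literal name
```

Parameterized queries are one of the secure coding practices referred to above; they do not replace regular vulnerability testing.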
Buffer overflow exploitations
Buffer overflow occurs when a process attempts to write more data to a fixed-
length block of memory than it is allowed to hold. Attackers may use the excess
data, stored in adjacent memory addresses, as a foundation from which to launch
attacks.
Malware
Malware is software written specifically to exploit vulnerabilities or otherwise
cause damage to the database. Malware may arrive via any endpoint device
connecting to the database’s network.
Attacks on backups
Organizations that fail to protect backup data with the same stringent controls
used to protect the database itself can be vulnerable to attacks on backups.

Question 11
What is Data encryption? Discuss its process and types with the help of a
suitable diagram.
Answer 11
Encryption is a technique for transforming information on a computer so it
becomes unreadable. So, even if someone is able to gain access to a computer
with personal data on it, they likely won’t be able to do anything with the data
unless they have complicated, expensive software or the original data key.
The basic function of encryption essentially translates normal text into ciphertext.
Encryption methods can help ensure that data doesn’t get read by the wrong
people, but can also ensure that data isn’t altered in transit, and verify the
identity of the sender.

Data encryption converts data into a different form (code) that can only be
accessed by people who have a secret key (formally known as a decryption key) or
password. Data that has not been encrypted is referred to as plaintext, and data
that has been encrypted is referred to as ciphertext. Encryption is one of the most
widely used and successful data protection technologies in today’s corporate
world.
Types of encryption methods
According to Wisegeek, different encryption methods exist, each with their own
advantages.

1. Symmetric encryption methods, also known as private-key cryptography,
earned their name because the same key is used to encrypt and decrypt the
message, and that key must remain secure. Anyone with access to the key can
decrypt the data. Using this method, a sender encrypts the data with the key,
sends the data (the ciphertext), and then the receiver uses the same key to
decrypt the data.

2. Asymmetric encryption methods, or public-key cryptography, differ from
the previous method because they use two keys, one for encryption and one for
decryption (giving them the potential to be more secure). With this method, a
public key freely available to everyone is used to encrypt messages, and a
different, private key is used by the recipient to decrypt messages.
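The difference between the two methods can be seen in a toy Python sketch. It is for illustration only and is not secure: the symmetric part uses a simple XOR with a random key of the same length as the message, and the asymmetric part uses RSA with very small textbook primes (61 and 53) and no padding. Real systems rely on vetted algorithms and libraries instead. The call pow(e, -1, phi) needs Python 3.8 or later.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte."""
    return bytes(d ^ k for d, k in zip(data, key))


# Symmetric: the SAME shared key encrypts and decrypts
message = b"exam results"
shared_key = secrets.token_bytes(len(message))
ciphertext = xor_bytes(message, shared_key)
recovered = xor_bytes(ciphertext, shared_key)

# Asymmetric (toy RSA): the public key encrypts, the private key decrypts
p, q = 61, 53              # tiny primes, for illustration only
n = p * q                  # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent (modular inverse of e)
public_key, private_key = (e, n), (d, n)

m = 65                     # a message encoded as a number smaller than n
c = pow(m, public_key[0], public_key[1])        # anyone can encrypt
m_back = pow(c, private_key[0], private_key[1])  # only the key owner can decrypt

print(recovered == message)
print(m_back == m)
```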

Question 12
Write Short Note on following:-
a. Firewalls

b. Data Security requirements (Confidentiality/Integrity/Availability)
c. Privilege
d. Authorisation in Database security
Answer 12
a. Firewalls
A firewall is a network security device that monitors and filters incoming and
outgoing network traffic based on an organization’s previously established
security policies. At its most basic, a firewall is the barrier that sits
between a private internal network and the public Internet. A firewall’s main
purpose is to allow non-threatening traffic in and to keep dangerous traffic out.
Firewalls create 'choke points' to funnel web traffic, at which the traffic is
reviewed against a set of programmed parameters and acted upon accordingly.
Some firewalls also track the traffic and connections in audit logs to record
what has been allowed or blocked.
Firewalls are typically used to gate the borders of a private network or its host
devices. As such, firewalls are one security tool in the broader category of user
access control. These barriers are typically set up in two locations: on
dedicated computers on the network, or on the user computers and other
endpoints themselves (hosts).
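The idea of reviewing traffic against programmed parameters can be sketched as a simple rule table in Python. This is a simplified model, not how any particular firewall product is configured: the rules, networks and port numbers are hypothetical, the first matching rule decides, and every decision is written to an audit log.

```python
from ipaddress import ip_address, ip_network

# Hypothetical policy: (action, source network, destination port or None for any)
RULES = [
    ("allow", ip_network("10.0.0.0/8"), 5432),  # internal hosts may reach the database
    ("allow", ip_network("0.0.0.0/0"), 443),    # anyone may reach the HTTPS port
    ("deny", ip_network("0.0.0.0/0"), None),    # everything else is blocked
]

audit_log = []


def filter_packet(source: str, port: int) -> str:
    """Return the action of the first rule that matches; deny if none does."""
    for action, network, rule_port in RULES:
        if ip_address(source) in network and rule_port in (None, port):
            audit_log.append((source, port, action))
            return action
    audit_log.append((source, port, "deny"))
    return "deny"


decisions = [
    filter_packet("10.1.2.3", 5432),     # internal host to the database port
    filter_packet("203.0.113.9", 5432),  # outside host to the database port
    filter_packet("203.0.113.9", 443),   # outside host to the HTTPS port
]
print(decisions)
```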
b. Data Security requirements (Confidentiality/Integrity/Availability)
The CIA Triad—Confidentiality, Integrity, and Availability—is a guiding model in
information security. A comprehensive information security strategy includes
policies and security controls that minimize threats to these three crucial
components.
The CIA triad guides information security in a broad sense and is also
useful for managing the products and data of research.
Confidentiality - Confidentiality refers to protecting information from
unauthorized access.
Integrity - Integrity means data are trustworthy, complete, and have not been
accidentally altered or modified by an unauthorized user.
Availability - Availability means data are accessible when you need them.
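As one small illustration of the integrity requirement, a stored fingerprint of the data can reveal whether it has been altered. The sketch below uses SHA-256 from Python's hashlib module; the data values are invented. A plain hash like this detects accidental changes, while protection against a deliberate attacker additionally needs a keyed scheme such as an HMAC or a digital signature.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


original = b"marks=82"
stored_digest = fingerprint(original)  # kept alongside or apart from the data

# Later: recompute the digest and compare it with the stored one
unchanged_ok = fingerprint(b"marks=82") == stored_digest
tampered_ok = fingerprint(b"marks=92") == stored_digest

print(unchanged_ok, tampered_ok)
```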
c. Privilege
A privilege is a right to execute a particular type of SQL statement or to access
another user's object. Some examples of privileges include the right to:
 Connect to the database (create a session)
 Create a table
 Select rows from another user's table
 Execute another user's stored procedure
You grant privileges to users so these users can accomplish tasks required for
their job. You should grant a privilege only to a user who absolutely requires
the privilege to accomplish necessary work. Excessive granting of unnecessary
privileges can compromise security. A user can receive a privilege in two
different ways:
 You can grant privileges to users explicitly. For example, you can
explicitly grant the privilege to insert records into the employees table
to the user SCOTT.
 You can also grant privileges to a role (a named group of privileges), and
then grant the role to one or more users. For example, you can grant the
privileges to select, insert, update, and delete records from
the employees table to the role named clerk, which in turn you can
grant to the users scott and brian.
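The two ways of receiving a privilege can be modelled with a short Python sketch. It is not how any particular DBMS stores grants; the dictionaries and function names are assumptions, while the users `scott` and `brian`, the role `clerk`, and the `employees` table come from the examples above.

```python
from collections import defaultdict

direct_grants = defaultdict(set)   # user -> {(privilege, object)}
role_grants = defaultdict(set)     # role -> {(privilege, object)}
user_roles = defaultdict(set)      # user -> {role}


def grant_to_user(user, privilege, obj):
    """Explicit grant of a privilege to a user."""
    direct_grants[user].add((privilege, obj))


def grant_to_role(role, privilege, obj):
    """Grant of a privilege to a role (a named group of privileges)."""
    role_grants[role].add((privilege, obj))


def grant_role(role, user):
    """Grant a role, with all of its privileges, to a user."""
    user_roles[user].add(role)


def has_privilege(user, privilege, obj):
    """A user holds a privilege directly or through any granted role."""
    wanted = (privilege, obj)
    if wanted in direct_grants.get(user, set()):
        return True
    return any(wanted in role_grants[role] for role in user_roles.get(user, set()))


# The two examples from the text.
grant_to_user("scott", "insert", "employees")
for privilege in ("select", "insert", "update", "delete"):
    grant_to_role("clerk", privilege, "employees")
grant_role("clerk", "scott")
grant_role("clerk", "brian")
```

Granting through a role keeps the set of privileges in one place, so changing what a clerk may do does not require touching each user.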
d. Authorisation in Database security
Authorization is a process by which a server determines if the client has
permission to use a resource or access a file. Authorization is usually coupled
with authentication so that the server has some concept of who the client is
that is requesting access.
The type of authentication required for authorization may vary; passwords
may be required in some cases but not in others.
In some cases, there is no authorization; any user may use a resource or
access a file simply by asking for it. Most of the web pages on the Internet
require no authentication or authorization.
Question 13
Explain Database recovery. Explain its needs. Discuss the various techniques of
database recovery.
Answer 13
Database recovery is the process of restoring the data in a database after data
is lost or deleted because of a system crash, hacking, transaction errors,
accidental damage, viruses, catastrophic failure, incorrectly executed commands,
and so on. Failures and data loss happen in databases as in other systems, but the
data stored in the database must be available whenever it is required. For fast
restoration of data, the database must provide tools that recover the data
efficiently. Recovery must also preserve atomicity: either all effects of a
successfully completed transaction are recorded permanently in the database, or
the transaction leaves no trace in the database at all.
Need for recovery - Failures can arise from many circumstances, some deliberate
and some accidental, so a database needs ways both to back up data and to recover
it. Recovery techniques based on deferred update and immediate update, together
with data backups, can be used to prevent data loss in the database.
Recovery Techniques
The recovery techniques for a database are as follows:
Log Based Recovery
Logs are sequences of records that keep track of the operations performed during
a transaction. Log records are written before the actual modification and are
stored on stable storage media.
Log-based recovery works in three different ways:
 Deferred Update
 Immediate Update
 Checkpoint

Deferred Update Method
In this technique, the database is not physically updated on disk until after
a transaction reaches its commit point. After that, the updates are recorded
persistently in the log and then written to the database.
Before the commit point, the transaction's updates are kept in the local
transaction workspace, such as buffers. If a transaction fails before reaching
its commit point, it will not have changed the database. Consequently, there is
no need to UNDO. It may, however, be necessary to REDO the effects of the
operations of a committed transaction from the log, because those effects may
not yet have been recorded in the database.
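The redo-only recovery that follows from this method can be sketched as below. The log record layout (`("WRITE", transaction, item, new_value)` and `("COMMIT", transaction)`) and the function name are assumptions chosen for the illustration, not a format used by any specific DBMS.

```python
def redo_from_log(log, database):
    """Reapply the writes of committed transactions, in log order.

    Writes of transactions without a COMMIT record are ignored: under
    deferred update they never reached the database, so nothing is undone.
    """
    committed = {record[1] for record in log if record[0] == "COMMIT"}
    for record in log:
        if record[0] == "WRITE" and record[1] in committed:
            _, _, item, new_value = record
            database[item] = new_value
    return database
```

For example, if T1 logged writes to A and B and committed, while T2 logged a write but crashed before its commit point, replaying the log applies only T1's values.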
Immediate Update Method
In this technique, the database may be updated by some operations of a
transaction before the transaction reaches its commit point. These operations are
recorded in the log on disk, by force-writing, before they are applied to the
database.
If a transaction aborts after recording some changes in the database but before
reaching its commit point, the effect of its operations on the database must be
undone.
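Undoing an uncommitted transaction requires the log to hold the old value of each item. The sketch below assumes log records of the form `("WRITE", transaction, item, old_value, new_value)` and `("COMMIT", transaction)`; both the layout and the function name are hypothetical. It shows only the UNDO step; depending on the variant of the method, committed transactions may also need a REDO pass.

```python
def undo_uncommitted(log, database):
    """Roll back writes of transactions that never committed.

    The log is scanned backwards so that, if an item was written several
    times, the oldest before-image is the one finally restored.
    """
    committed = {record[1] for record in log if record[0] == "COMMIT"}
    for record in reversed(log):
        if record[0] == "WRITE" and record[1] not in committed:
            _, _, item, old_value, _new_value = record
            database[item] = old_value
    return database
```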
Caching/Buffering
In this technique, one or more disk pages that contain the data items to be
updated are cached in main memory buffers and then updated in memory before
being written back to disk.
A collection of in-memory buffers called the DBMS cache is kept under the control
of the DBMS for holding these buffers. A directory is used to keep track of which
database items are in the buffers.
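A toy version of such a cache is sketched below. The `BufferCache` class, its method names, and the use of a "dirty" set to remember which pages were modified are assumptions made for the illustration; the disk is simulated with a dictionary.

```python
class BufferCache:
    """Minimal in-memory page cache in front of a simulated disk."""

    def __init__(self, disk):
        self.disk = disk      # page_id -> {item: value}, stands in for disk pages
        self.buffers = {}     # directory: page_id -> in-memory copy of the page
        self.dirty = set()    # pages changed in memory but not yet written back

    def fetch(self, page_id):
        """Bring a page into memory if it is not already cached."""
        if page_id not in self.buffers:
            self.buffers[page_id] = dict(self.disk[page_id])
        return self.buffers[page_id]

    def update(self, page_id, item, value):
        """Update an item in the cached page only; disk is unchanged."""
        self.fetch(page_id)[item] = value
        self.dirty.add(page_id)

    def flush(self, page_id):
        """Write a modified page back to disk."""
        if page_id in self.dirty:
            self.disk[page_id] = dict(self.buffers[page_id])
            self.dirty.discard(page_id)
```

Until `flush` is called, the change exists only in main memory, which is why a crash at that moment makes log-based recovery necessary.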

Question 14
Elaborate distributed database. What are its functions and components? Also
discuss its merits and demerits.
Answer 14

Distributed Database Definition
A distributed database represents multiple interconnected databases spread out
across several sites connected by a network. Since the databases are all
connected, they appear as a single database to the users.
Distributed databases utilize multiple nodes. They scale horizontally and develop
a distributed system. More nodes in the system provide more computing power,
offer greater availability, and resolve the single point of failure issue.
Different parts of the distributed database are stored in several physical
locations, and the processing requirements are distributed among processors on
multiple database nodes.
A centralized distributed database management system (DDBMS) manages the
distributed data as if it were stored in one physical location. DDBMS synchronizes
all data operations among databases and ensures that the updates in one
database automatically reflect on databases in other sites.
Distributed Database Features
Some general features of distributed databases are:
 Location independence - Data is physically stored at multiple sites and
managed by an independent DDBMS.
 Distributed query processing - Distributed databases answer queries in a
distributed environment that manages data at multiple sites. High-level
queries are transformed into a query execution plan for simpler
management.
 Distributed transaction management - Provides a consistent distributed
database through commit protocols, distributed concurrency control
techniques, and distributed recovery methods in case of many transactions
and failures.
 Seamless integration - Databases in a collection usually represent a single
logical database, and they are interconnected.
 Network linking - All databases in a collection are linked by a network and
communicate with each other.

 Transaction processing - Distributed databases incorporate transaction
processing, where a transaction is a program unit comprising one or more
database operations.
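The commit protocols mentioned under distributed transaction management are not named in the text. One widely used example is two-phase commit, sketched below under that assumption; the `Site` class and its methods are hypothetical and ignore timeouts, logging, and network failures.

```python
class Site:
    """A participant database site in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        """Phase 1: vote on whether this site is able to commit."""
        self.state = "ready" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(sites):
    """Commit at every site only if every site votes yes."""
    votes = [site.prepare() for site in sites]   # phase 1: collect all votes
    if all(votes):
        for site in sites:                       # phase 2: global commit
            site.commit()
        return "committed"
    for site in sites:                           # phase 2: global abort
        site.abort()
    return "aborted"
```

Because the outcome is the same at every site, the distributed database stays consistent even when one site cannot complete the transaction.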
Merits
 Modular Development. Modular development of a distributed database
implies that a system can be expanded to new locations or units by adding
new servers and data to the existing setup and connecting them to the
distributed system without interruption.
 Reliability. Distributed databases offer greater reliability in contrast to
centralized databases. In case of a database failure in a centralized
database, the system comes to a complete stop. In a distributed database,
the system functions even when failures occur, only delivering reduced
performance until the issue is resolved.
 Lower Communication Cost. Locally storing data reduces communication
costs for data manipulation in distributed databases. Local data storage is
not possible in centralized databases.
 Better Response. Efficient data distribution in a distributed database
system provides a faster response when user requests are met locally.
Demerits
 Costly Software. Ensuring data transparency and coordination across
multiple sites often requires using expensive software in a distributed
database system.
 Large Overhead. Many operations on multiple sites require numerous
calculations and constant synchronization when database replication is
used, causing a lot of processing overhead.
 Data Integrity. A possible issue when using database replication is data
integrity, which can be compromised when the same data is updated at multiple sites.
 Improper Data Distribution. Responsiveness to user requests largely
depends on proper data distribution. That means responsiveness can be
reduced if data is not correctly distributed across multiple sites.

Question 15
Define the following:-
a) Data fragmentation
b) Data Replication
c) Types of Distributed Database
Answer 15
a) Data fragmentation
Data fragmentation refers to data that is stored in multiple locations, creating
large stores of secondary data that are not essential to business operations and
that consume storage capacity. Examples of where such data accumulates are:
 Backups
 Archives
 File shares
 Object stores
 Test systems
 Development systems
 Analytics
This data may include duplicated data or versions that were created for specific
circumstances. Because it can be stored in a variety of locations, it takes up
space across storage systems. The variety of systems and uses for each data
point often means duplicating data or separating it from its context, leaving it
stored in multiple locations that are not connected. If companies do not address
their data fragmentation, it can become difficult to find relevant data in the
mass of data stored in their systems.
In a distributed database specifically, fragmentation also refers to dividing a
relation into horizontal (row-wise) or vertical (column-wise) fragments that are
stored at different sites.
b) Data Replication
Data Replication is the process of storing data in more than one site or node. It is
useful for improving the availability of data. It simply means copying data from a
database on one server to another server so that all users can share the
same data without any inconsistency. The result is a distributed database in which
users can access data relevant to their tasks without interfering with the work of
others. Data replication encompasses duplication of transactions on an ongoing
basis, so that the replicate is in a consistently updated state and synchronized
with the source. With replication, copies of the data are available at different
locations, whereas without it a particular relation resides at only one location.
There can be full replication, in which the whole database is stored at every
site. There can also be partial replication, in which some frequently used
fragments of the database are replicated and others are not.
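The difference between replicated and non-replicated fragments can be shown with a small sketch. The site names, the fragment names, and the `placement` table are all hypothetical; a real system would also have to handle failures and concurrent updates.

```python
def write(sites, placement, fragment, key, value):
    """Apply a write at every site that holds a copy of the fragment."""
    for site in placement[fragment]:
        sites[site].setdefault(fragment, {})[key] = value


def read(sites, placement, fragment, key, local_site):
    """Read from the local copy when one exists, else from another holder."""
    holders = placement[fragment]
    site = local_site if local_site in holders else holders[0]
    return sites[site][fragment][key], site


sites = {"delhi": {}, "mumbai": {}, "pune": {}}

# Partial replication: a frequently used fragment is copied to every site,
# while a rarely used one is kept at a single site.
placement = {
    "customers": ["delhi", "mumbai", "pune"],
    "audit_log": ["delhi"],
}

write(sites, placement, "customers", 1, "Asha")
write(sites, placement, "audit_log", 1, "login")
```

Under full replication every fragment would list all three sites in `placement`, so every read could be served locally at the cost of more work on each write.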
c) Distributed Database Types
There are two types of distributed databases:
Homogeneous
A homogeneous distributed database is a network of identical databases stored on
multiple sites. The sites have the same operating system, DDBMS, and data
structure, making them easily manageable.
Homogeneous databases allow users to access data from each of the databases
seamlessly.

Heterogeneous
A heterogeneous distributed database uses different schemas, operating systems,
DDBMS, and different data models.
In the case of a heterogeneous distributed database, a particular site can be
completely unaware of other sites, causing limited cooperation in processing user
requests. The limitation is why translations are required to establish
communication between sites.
The following diagram shows an example of a heterogeneous database:

Question 16
Define the following:-
a) Data mining, phases, models and application
b) Data warehousing, phases, models and application
Answer 16

a) Data Mining
Data mining is a process used by companies to turn raw data into useful
information. By using software to look for patterns in large batches of data,
businesses can learn more about their customers to develop more effective
marketing strategies, increase sales and decrease costs. Data mining depends
on effective data collection, warehousing, and computer processing.
The Data Mining Process
To be most effective, data analysts generally follow a certain flow of tasks along
the data mining process.
Step 1: Understand the Business
Before any data is touched, extracted, cleaned, or analyzed, it is important to
understand the underlying entity and the project at hand. What are the goals the
company is trying to achieve by mining data? Before looking at any data, the
mining process starts by understanding what will define success at the end of the
process.

Step 2: Understand the Data
Once the business problem has been clearly defined, it's time to start thinking
about data. This includes what sources are available, how the data will be secured
and stored, how information will be gathered, and what the final outcome or analysis
may look like.
Step 3: Prepare the Data
Data is gathered, uploaded, extracted, or calculated. It is then cleaned,
standardized, scrubbed for outliers, assessed for mistakes, and checked for
reasonableness. During this stage of data mining, the data may also be checked
for size, as an overly large collection of information may unnecessarily slow
computations and analysis.
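The cleaning, outlier-scrubbing, and standardizing steps can be illustrated with Python's standard `statistics` module. The function name, the two-standard-deviation cut-off, and the sample values are assumptions for the example; real projects choose these rules to suit the data.

```python
from statistics import mean, stdev


def prepare(values, z_limit=2.0):
    """Drop missing values and outliers, then standardize what is left."""
    cleaned = [v for v in values if v is not None]          # remove gaps
    mu, sigma = mean(cleaned), stdev(cleaned)
    kept = [v for v in cleaned if abs(v - mu) <= z_limit * sigma]
    kept_mu, kept_sigma = mean(kept), stdev(kept)
    standardized = [(v - kept_mu) / kept_sigma for v in kept]
    return kept, standardized


# Hypothetical daily order counts with one gap and one entry error.
raw = [10, 12, None, 11, 13, 12, 11, 10, 12, 13, 500]
kept, standardized = prepare(raw)
```

Here the missing entry and the value 500 are removed, and the nine remaining values are rescaled to have mean zero.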
Step 4: Build the Model
With a clean data set in hand, it's time to crunch the numbers. Data scientists
use data mining techniques to search for relationships, trends,
associations, or sequential patterns.
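As one small example of searching for associations, the sketch below counts how often pairs of items appear together in the same purchase. The function name, the `min_support` threshold, and the sample baskets are hypothetical, and pair counting is only one of many possible modelling techniques.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Return item pairs bought together in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so each pair is counted once per basket.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "jam"],
    ["bread", "milk", "jam"],
]
```

With these baskets, bread and milk appear together three times, which is the kind of relationship a retailer might act on.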
Step 5: Evaluate the Results
The data-centered aspect of data mining concludes by assessing the findings of
the data model(s). The outcomes from the analysis may be aggregated,
interpreted, and presented to decision-makers who have largely been excluded from
the data mining process to this point.
Step 6: Implement Change and Monitor
The data mining process concludes with management taking steps in response to
the findings of the analysis. The company may decide that the information was not
strong enough, or the findings not relevant enough, to change course. Alternatively,
the company may strategically pivot based on findings.
Applications of Data Mining
Sales: The ultimate goal of a company is to make money, and data mining
encourages smarter, more efficient use of capital to drive revenue growth.
Marketing: To make its marketing efforts more effective, a company can
use data mining to understand where its clients see ads, what demographics to
target, where to place digital ads, and what marketing strategies most resonate
with customers.
Manufacturing: For companies that produce their own goods, data mining plays
an integral part in analyzing how much each raw material costs, what materials
are being used most efficiently, how time is spent along the manufacturing
process, and what bottlenecks negatively impact the process.
Fraud Detection: The heart of data mining is finding patterns, trends, and
correlations that link data points together. Therefore, a company can use data
mining to identify outliers or correlations that should not exist.
Customer Service: Customer satisfaction may be caused (or destroyed) for a
variety of reasons. Imagine a company that ships goods. A customer may become
unhappy with ship time, shipping quality, or communication on shipment
expectations. That same customer may become frustrated with long telephone
wait times or slow e-mail responses. Data mining can summarize records of such
interactions to show where service falls short.

b) Data Warehousing
Data warehousing (DW) is the process of collecting and managing data from
varied sources to provide meaningful business insights. A data warehouse is
typically used to connect and analyze business data from heterogeneous sources.
The data warehouse is the core of the business intelligence (BI) system, which is
built for data analysis and reporting.
It is a blend of technologies and components that aids the strategic use of data.
It is the electronic storage of a large amount of information by a business,
designed for query and analysis instead of transaction processing. It is a process
of transforming data into information and making it available to users in a timely
manner.
Working of Data warehouse
A Data Warehouse works as a central repository where information arrives from
one or more data sources. Data flows into a data warehouse from the
transactional system and other relational databases.

Data may be:
1. Structured
2. Semi-structured
3. Unstructured
The data is processed, transformed, and ingested so that users can access the
processed data in the Data Warehouse through Business Intelligence tools, SQL
clients, and spreadsheets. A data warehouse merges information coming from
different sources into one comprehensive database.
By merging all of this information in one place, an organization can analyze its
customers more holistically. This helps to ensure that it has considered all the
information available. Data warehousing makes data mining possible. Data mining
looks for patterns in the data that may lead to higher sales and profits.
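The flow of transforming data from several sources, loading it into one repository, and then querying it can be sketched with Python's built-in `sqlite3` module. The two source lists, the table and column names, and the lower-casing used to reconcile customer names are all assumptions; an in-memory SQLite database merely stands in for a real warehouse.

```python
import sqlite3

# Two hypothetical sources that name the same customers inconsistently.
crm_rows = [{"customer": "Asha", "city": "Delhi"},
            {"customer": "Ravi", "city": "Pune"}]
sales_rows = [("asha", 1200.0), ("ravi", 300.0), ("ASHA", 800.0)]


def load_warehouse(crm_rows, sales_rows):
    """Transform both sources to a common key and load them into one store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (name TEXT PRIMARY KEY, city TEXT)")
    conn.execute("CREATE TABLE sale (name TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?)",
        [(row["customer"].lower(), row["city"]) for row in crm_rows],
    )
    conn.executemany(
        "INSERT INTO sale VALUES (?, ?)",
        [(name.lower(), amount) for name, amount in sales_rows],
    )
    conn.commit()
    return conn


def sales_by_city(conn):
    """An analytical query that needs data from both sources."""
    return conn.execute(
        "SELECT c.city, SUM(s.amount) FROM sale s "
        "JOIN customer c ON c.name = s.name "
        "GROUP BY c.city ORDER BY c.city"
    ).fetchall()
```

The query answers a question that neither source could answer alone, which is the point of merging the information in one place.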
Components of Data warehouse
Load manager: The load manager, also called the front component, performs all
the operations associated with the extraction and loading of data into the
warehouse. These operations include transformations to prepare the data for
entry into the data warehouse.
Warehouse Manager: The warehouse manager performs operations associated with
the management of the data in the warehouse, such as analysis of data to ensure
consistency, creation of indexes and views, generation of denormalizations and
aggregations, transformation and merging of source data, and archiving and
backing up data.
Query Manager: The query manager, also known as the backend component,
performs all the operations related to the management of user queries. It directs
queries to the appropriate tables and schedules the execution of queries.
Application
Airline: In airline systems, it is used for operational purposes such as crew
assignment, analysis of route profitability, frequent flyer program promotions,
etc.

Banking: It is widely used in the banking sector to manage the resources at hand
effectively. Some banks also use it for market research and for performance
analysis of products and operations.
Healthcare: The healthcare sector also uses data warehouses to strategize and
predict outcomes, generate patients' treatment reports, and share data with tie-in
insurance companies, medical aid services, etc.
Public sector: In the public sector, data warehouses are used for intelligence
gathering. They help government agencies maintain and analyze tax records and
health policy records for every individual.
Investment and Insurance sector: In this sector, warehouses are primarily
used to analyze data patterns and customer trends and to track market movements.
Retail chain: In retail chains, data warehouses are widely used for distribution
and marketing. They also help to track items, customer buying patterns, and
promotions, and are used for determining pricing policy.
Question 17
Write Short note on:-
a) Internet Database
b) World Wide Web (WWW)
c) Digital Library
d) Multimedia Database
e) Spatial Database
f) Mobile database
Answer 17
a) Internet Database
An Internet database is a database that can be accessed via the Internet. For an
enterprise, an online database is a method of keeping critical data accessible to
employees—or even clients and vendors. An online database may drive an
ecommerce site or a customer relationship management suite.
Internet databases are usually hosted by a website or a service provider, and they
allow users to search for and retrieve data from the database with a web

browser. IMDb.com (Internet Movie Database), for instance, is a tremendously
large online library database that documents practically every movie ever made.
b) World Wide Web (WWW)
The World Wide Web is abbreviated as WWW and is commonly known as the
web. The WWW was initiated at CERN (the European Organization for Nuclear
Research) in 1989.
WWW can be defined as the collection of different websites around the world,
containing different information shared via local servers (or computers). The
World Wide Web is based on several different technologies: Web browsers,
Hypertext Markup Language (HTML) and Hypertext Transfer Protocol (HTTP).
A Web browser is used to access web pages. Web browsers can be defined as
programs which display text, data, pictures, animation and video on the Internet.
Hyperlinked resources on the World Wide Web can be accessed using software
interfaces provided by Web browsers.
c) Digital Library
A digital library is a collection of digital objects, such as books, magazines, audio
recordings, video recordings and other documents that are accessible
electronically.
Digital libraries provide users with online access to a wide range of resources.
They are often used by students for research or by professionals seeking to stay
current on the latest developments in their field.
Digital libraries can provide users with access to rare and out-of-print materials
that might be difficult or impossible to locate in physical libraries. Digital libraries
also offer a variety of search and sorting features, as well as social media-like
features that can connect users with others to discuss topics.
d) Multimedia Database
A multimedia database is a collection of interrelated multimedia data that
includes text, graphics (sketches, drawings), images, animations, video, audio, etc.,
and holds vast amounts of multisource multimedia data. The framework that
manages different types of multimedia data which can be stored, delivered and

utilized in different ways is known as a multimedia database management system.
There are three classes of multimedia database: static media,
dynamic media and dimensional media.
Content of a multimedia database management system:
Media data – The actual data representing an object.
Media format data – Information such as sampling rate, resolution, encoding
scheme etc. about the format of the media data after it goes through the
acquisition, processing and encoding phase.
Media keyword data – Keyword descriptions relating to the generation of the data. It
is also known as content descriptive data. Example: date, time and place of
recording.
Media feature data – Content dependent data such as the distribution of colors,
kinds of texture and different shapes present in data.
e) Spatial Database
A spatial database stores data about objects located in space, such as points,
lines, and regions. A spatial database management system (SDBMS) manages the
database structure and controls access to data stored in a spatial database. The
SDBMS plays a prominent role in managing queries over spatial data. Spatial data
management is of use in many disciplines, including geography, remote sensing,
urban planning, and natural resource management.
Spatial indexes (e.g., R-trees) are designed to provide efficient access to
spatial databases, and the filter–refine strategy can be applied to process spatial
queries efficiently.
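The filter–refine idea can be shown with a short sketch: a cheap test on bounding rectangles first selects candidate objects, and the exact (more expensive) geometric test is run only on those candidates. The region names and coordinates are hypothetical, and the filter step here is a simple scan over bounding boxes rather than a real R-tree.

```python
def bounding_box(polygon):
    """Smallest axis-aligned rectangle enclosing the polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_polygon(point, polygon):
    """Exact test using ray casting: count edge crossings to the right."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def regions_containing(point, regions):
    """Return (candidates after filter step, hits after refine step)."""
    x, y = point
    candidates = []
    for name, polygon in regions.items():
        min_x, min_y, max_x, max_y = bounding_box(polygon)
        if min_x <= x <= max_x and min_y <= y <= max_y:   # filter
            candidates.append(name)
    hits = [name for name in candidates
            if point_in_polygon(point, regions[name])]     # refine
    return candidates, hits


regions = {
    "triangle": [(0, 0), (4, 0), (0, 4)],
    "square": [(5, 5), (7, 5), (7, 7), (5, 7)],
}
```

The point (3, 3) passes the filter for the triangle because it lies inside the bounding box, but the refine step rejects it; the square is never examined exactly.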
f) Mobile database
A mobile database is a database that can be connected to a mobile computing
device over a mobile network (or wireless network). Here the client and the
server have wireless connections. Mobile computing is growing very rapidly, and it
has huge potential in the field of databases. Mobile databases are available on
a variety of platforms, such as Android-based and iOS-based mobile databases.
Common examples of mobile databases are Couchbase Lite, ObjectBox, etc.

