Unit 4 Rdbms
Several types of storage are available for holding data. They differ from
one another in speed and accessibility. The following types of storage
devices are used for storing data:
o Primary Storage
o Secondary Storage
o Tertiary Storage
Primary Storage
It is the storage area that offers the quickest access to the stored data.
Primary storage is also known as volatile storage because it does not store
data permanently: as soon as the system loses power or crashes, the data is
lost. Main memory and cache are the types of primary storage.
o Main Memory: It is the memory the processor operates on directly; data
from the storage medium must be brought into it before it can be
processed. The main memory handles each instruction of a computer
machine. It can store gigabytes of data on a system but is usually too
small to hold the entire database. The main memory loses its whole
content if the system shuts down because of a power failure or other
reasons.
Secondary Storage
Secondary storage is also called online storage. It is the storage area that
allows the user to save and store data permanently. This type of memory
does not lose data on a power failure or system crash, which is why it is
also called non-volatile storage.
There are some commonly described secondary storage media which are
available in almost every type of computer system:
Tertiary Storage
It is the storage type that is external to the computer system. It has the
slowest speed, but it can store a large amount of data. It is also
known as offline storage. Tertiary storage is generally used for data backup.
The following tertiary storage devices are available:
Storage Hierarchy
Besides the above, various other storage devices reside in the computer
system. These storage media are organized on the basis of data access
speed, the cost per unit of data of the medium, and the medium's reliability.
Thus, we can arrange the storage media in a hierarchy according to their
cost and speed.
What is RAID?
RAID (redundant array of independent disks) is a way of storing the same data
in different places on multiple hard disks or solid-state drives (SSDs) to protect
data in the case of a drive failure. There are different RAID levels, however, and
not all have the goal of providing redundancy.
RAID arrays appear to the operating system (OS) as a single logical drive.
RAID employs the techniques of disk mirroring or disk striping. Mirroring will
copy identical data onto more than one drive. Striping partitions help spread
data over multiple disk drives. Each drive's storage space is divided into units
ranging from a sector of 512 bytes up to several megabytes. The stripes of all
the disks are interleaved and addressed in order. Disk mirroring and disk
striping can also be combined in a RAID array.
In a single-user system where large records are stored, the stripes are typically
set up to be small (512 bytes, for example) so that a single record spans all the
disks and can be accessed quickly by reading all the disks at the same time.
With software-based RAID, the controller uses the resources of the hardware
system, such as the central processor and memory. While it performs the same
functions as a hardware-based RAID controller, software-based RAID
controllers may not enable as much of a performance boost and can affect the
performance of other applications on the server.
Firmware-based RAID controller chips are located on the motherboard, and all
operations are performed by the central processing unit (CPU), similar to
software-based RAID. However, with firmware, the RAID system is only
implemented at the beginning of the boot process. Once the OS has loaded, the
controller driver takes over RAID functionality. A firmware RAID controller is
not as pricey as a hardware option, but it puts more strain on the computer's
CPU. Firmware-based RAID is also called hardware-assisted software RAID,
hybrid model RAID and fake RAID.
Why data redundancy?
RAID 0
In this level, a striped array of disks is implemented. The data is broken down
into blocks and the blocks are distributed among disks. Each disk receives a
block of data to write/read in parallel. It enhances the speed and performance
of the storage device. Level 0 keeps no parity and no redundant copy, so it
offers no protection against a disk failure.
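The block distribution described for RAID 0 can be sketched as a round-robin assignment of blocks to disks. This is only an illustration, not a real RAID implementation: the function names `stripe_blocks` and `read_striped`, the in-memory lists standing in for disks, and the 4-byte block size are all assumptions made for the example.

```python
def stripe_blocks(data: bytes, num_disks: int, block_size: int = 4) -> list[list[bytes]]:
    """Split data into fixed-size blocks and deal them out to disks in round-robin order."""
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        # Block i goes to disk (i mod number of disks).
        disks[block_index % num_disks].append(data[offset:offset + block_size])
    return disks


def read_striped(disks: list[list[bytes]]) -> bytes:
    """Reassemble the original data by interleaving the blocks in the same order."""
    out = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


disks = stripe_blocks(b"ABCDEFGHIJKLMNOPQRST", num_disks=3)
restored = read_striped(disks)
```

Because no block is stored twice, losing any one of the lists in `disks` makes the original data unrecoverable, which is the "no parity and backup" property of this level.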
RAID 1
RAID 1 uses mirroring techniques. When data is sent to a RAID controller, it
sends a copy of data to all the disks in the array. RAID level 1 is also
called mirroring and provides 100% redundancy in case of a failure.
RAID 2
RAID 2 records an error-correcting code (ECC), a Hamming code, for its data,
which is striped across different disks. The data is striped as in level 0, but
here each data bit in a word is recorded on a separate disk, and the ECC codes
of the data words are stored on a different set of disks. Due to its complex
structure and high cost, RAID 2 is not commercially available.
RAID 3
RAID 3 stripes the data onto multiple disks. The parity bit generated for each
data word is stored on a separate disk. This technique allows the array to
survive a single disk failure.
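How a single parity disk lets the array recover from one failed disk can be shown with byte-wise XOR, the usual way such parity is computed. The sketch below is a simplified model under stated assumptions: `xor_parity` is a hypothetical helper, and the three short byte strings stand in for the contents of three data disks.

```python
from functools import reduce


def xor_parity(chunks: list[bytes]) -> bytes:
    """Return the byte-wise XOR of equally sized chunks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


# Three data disks (made-up contents) and one dedicated parity disk.
data_disks = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]
parity_disk = xor_parity(data_disks)

# Simulate the loss of disk 1: XOR the surviving disks with the parity
# to rebuild the missing contents.
survivors = [data_disks[0], data_disks[2], parity_disk]
rebuilt = xor_parity(survivors)
```

XOR-ing the parity with the surviving data cancels out everything except the missing chunk, so one failed disk can be rebuilt, but two simultaneous failures cannot.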
RAID 4
In this level, an entire block of data is written onto data disks and then the
parity is generated and stored on a different disk. Note that level 3 uses byte-
level striping, whereas level 4 uses block-level striping. Both level 3 and level 4
require at least three disks to implement RAID.
RAID 5
RAID 5 writes whole data blocks onto different disks, but the parity
generated for each stripe of data blocks is distributed among all the disks
rather than stored on a dedicated parity disk.
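The idea of distributing parity can be illustrated by printing which disk holds the parity block for each stripe. The rotation rule used here (parity moves back by one disk per stripe) is an assumption chosen for simplicity; real controllers use a variety of rotation patterns. The function names are hypothetical.

```python
def parity_disk_for_stripe(stripe: int, num_disks: int) -> int:
    """Assumed rotation: parity starts on the last disk and moves back one disk per stripe."""
    return (num_disks - 1 - stripe) % num_disks


def layout(num_stripes: int, num_disks: int) -> list[list[str]]:
    """Build a table of block labels: Dn for data blocks, Pn for the parity of stripe n."""
    rows = []
    block = 0
    for stripe in range(num_stripes):
        parity_disk = parity_disk_for_stripe(stripe, num_disks)
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append(f"P{stripe}")
            else:
                row.append(f"D{block}")
                block += 1
        rows.append(row)
    return rows


# One row per stripe, one column per disk.
for row in layout(num_stripes=4, num_disks=4):
    print(" ".join(f"{cell:>3}" for cell in row))
```

Since every disk holds parity for some stripes, parity writes are spread over all disks instead of concentrating on one dedicated disk as in levels 3 and 4.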
RAID 6
RAID 6 is an extension of level 5. In this level, two independent parities are
generated and stored in distributed fashion among multiple disks. Two
parities provide additional fault tolerance. This level requires at least four disk
drives to implement RAID.
Storage Access
ii. Sorted File Method – In this method, as the name itself
suggests, whenever a new record has to be inserted, it is
always inserted in sorted (ascending or descending)
order. Records may be sorted on the primary key
or on any other key.
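Sorted insertion can be sketched with Python's standard `bisect` module, which finds the insertion point by binary search. The list `sorted_file` and the `(key, payload)` record shape are assumptions for illustration; a real DBMS would also have to shift or chain records across disk blocks.

```python
import bisect

# Each record is a (key, payload) tuple; the list is kept in ascending key order.
sorted_file = [(1, "R1"), (3, "R3"), (5, "R5")]


def insert_sorted(records: list[tuple[int, str]], record: tuple[int, str]) -> None:
    """Insert the record at the position that keeps the list sorted."""
    bisect.insort(records, record)


insert_sorted(sorted_file, (2, "R2"))
insert_sorted(sorted_file, (4, "R4"))
```

Keeping the file ordered makes every insert more expensive than a plain append, but it allows later lookups by key to use binary search instead of a full scan.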
Suppose we have five records in the heap, R1, R5, R6, R4 and R3,
and a new record R2 has to be inserted. Since
the last data block, i.e. data block 3, is full, R2 will be inserted in any of the
data blocks selected by the DBMS, let's say data block 1.
If we want to search, delete or update data in heap file organization, then
we must traverse the data from the beginning of the file until we reach the
requested record. Thus, if the database is very large, searching, deleting
or updating a record will take a lot of time.
Pros –
• Fetching and retrieving records is faster than in sequential file
organization, but only for small databases.
• When a huge amount of data needs to be loaded into the
database at one time, this method of file organization is best
suited.
Cons –
• Problem of unused memory blocks.
• Inefficient for larger databases.
In this method, no effort is spent on searching and sorting the entire file;
each record is stored at an effectively random location in memory.
4. B+ File Organization
o When records of two or more tables are stored in the same file, the file
is known as a cluster. These files have two or more tables in the same data
block, and the key attributes which are used to map these tables together
are stored only once.
o This method reduces the cost of searching for various records in
different files.
o The cluster file organization is used when there is a frequent need for
joining the tables on the same condition. These joins return only a
few records from both tables. In the given example, we are retrieving
the records for only particular departments. This method can't be used
to retrieve the records for the entire department.
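The idea of storing the shared key only once can be sketched with a dictionary keyed by the join attribute. Everything in this example is hypothetical: the department and employee tables, the names `add_department`, `add_employee` and `employees_in`, and the sample values. An in-memory dictionary only imitates what a DBMS would do with data blocks on disk.

```python
# One entry per cluster key; the department row and its employee rows sit together.
cluster: dict[int, dict] = {}


def add_department(dept_id: int, dept_name: str) -> None:
    """Create the cluster entry; the key dept_id is stored once."""
    cluster.setdefault(dept_id, {"dept_name": dept_name, "employees": []})


def add_employee(dept_id: int, emp_name: str) -> None:
    """Store the employee next to its department instead of in a separate file."""
    cluster[dept_id]["employees"].append(emp_name)


def employees_in(dept_id: int) -> list[tuple[str, str]]:
    """Answer the join for one department with a single lookup."""
    entry = cluster[dept_id]
    return [(entry["dept_name"], emp) for emp in entry["employees"]]


add_department(10, "Sales")
add_department(20, "Accounts")
add_employee(10, "Asha")
add_employee(10, "Ravi")
add_employee(20, "Meena")
```

A join restricted to one department touches only that department's entry, which matches the case the method is suited for: frequent joins on the same condition that return few records.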
File Operations
Operations on database files can be broadly classified into two categories −
• Update Operations
• Retrieval Operations
Update operations change the data values by insertion, deletion, or update.
Retrieval operations, on the other hand, do not alter the data but retrieve
them after optional conditional filtering. In both types of operations,
selection plays a significant role. Other than creation and deletion of a file,
several other operations can be performed on files.
• Open − A file can be opened in one of the two modes, read
mode or write mode. In read mode, the operating system does not
allow anyone to alter data. In other words, data is read only. Files
opened in read mode can be shared among several entities. Write
mode allows data modification. Files opened in write mode can be
read but cannot be shared.
• Locate − Every file has a file pointer, which tells the current position
where the data is to be read or written. This pointer can be adjusted
accordingly. Using the find (seek) operation, it can be moved forward or
backward.
• Read − By default, when files are opened in read mode, the file
pointer points to the beginning of the file. There are options where
the user can tell the operating system where to locate the file pointer
at the time of opening a file. The data immediately following the file
pointer is read.
• Write − The user can choose to open a file in write mode, which enables
them to edit its contents. The edit can be a deletion, insertion, or
modification. The file pointer can be located at the time of opening or
can be dynamically changed if the operating system allows it.
• Close − This is the most important operation from the operating
system’s point of view. When a request to close a file is generated, the
operating system
o removes all the locks (if in shared mode),
o saves the data (if altered) to the secondary storage media, and
o releases all the buffers and file handlers associated with the file.
The organization of data inside a file plays a major role here. The process of
moving the file pointer to a desired record inside a file varies based on
whether the records are arranged sequentially or clustered.
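The open, locate, read, write and close operations map directly onto Python's built-in file API. The sketch below uses a temporary file and a made-up layout of 2-byte records; the file name `records.dat` and the record contents are assumptions for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "records.dat")

# Open in write mode and store four 2-byte records.
with open(path, "wb") as f:
    f.write(b"R1R2R3R4")

# Open in read mode: the file pointer starts at the beginning of the file.
with open(path, "rb") as f:
    start = f.tell()
    f.seek(4)                 # locate: move the pointer forward to the third record
    third = f.read(2)         # read: returns the data right after the pointer
    f.seek(-2, os.SEEK_CUR)   # locate: move the pointer backward by one record
    third_again = f.read(2)

# Open for update ("r+b") and overwrite the second record in place.
with open(path, "r+b") as f:
    f.seek(2)
    f.write(b"RX")

with open(path, "rb") as f:
    contents = f.read()
```

Leaving each `with` block closes the file, at which point buffered changes are flushed and the file handle is released, corresponding to the close operation listed above.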
A data dictionary is like the A-Z dictionary of the relational database system,
holding all information about each relation in the database. It is also known
as the system catalog or metadata.
Example
<StudentPersonalDetails>
Student_ID Student_Name Student_Address Student_City
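Most database systems expose their data dictionary through queryable catalog tables. As one concrete illustration, SQLite keeps table definitions in `sqlite_master` and reports column metadata through `PRAGMA table_info`. The sketch below creates the example table and reads its metadata back; the column data types are assumptions, since the example above lists only the column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column types are assumed for the example; only the names come from the text.
conn.execute(
    "CREATE TABLE StudentPersonalDetails ("
    "Student_ID INTEGER PRIMARY KEY, "
    "Student_Name TEXT, "
    "Student_Address TEXT, "
    "Student_City TEXT)"
)

# The catalog lists every table defined in the database.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# table_info returns one row per column: (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute(
    "PRAGMA table_info(StudentPersonalDetails)")]

conn.close()
```

The application never wrote this metadata explicitly; the DBMS recorded it as a side effect of `CREATE TABLE`.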
a. Active
In a DBMS, an active data dictionary is updated automatically by the DBMS
whenever a database access occurs, so its access information is always
up to date.
b. Passive
In a DBMS, a passive data dictionary is not updated automatically and
often needs a batch process to be run to bring it up to date.
The access information in the data dictionary is mainly used by the DBMS for
query optimization. The main function of the data dictionary is
to store a record of all database objects. In a DBMS, an integrated data
dictionary tends to keep its metadata bound to the data it describes.
Data Dictionary doesn't have any standard format to store the information.
But there are some features that are common.
• Data Elements:
Data Dictionary stores the definition of all the data elements. It stores
name, data types, display formats, internal storage formats, and validation
rules. It also explains the use of data, where an element gets used, who has
used it and so on.
• Tables:
Data Dictionary stores the name of the user who created the table, the number
of rows and columns, the date on which the table was created, the authorized
access, and so on.
• Index:
Data Dictionary stores the indexes that are defined for database tables. For
every index, the DBMS stores the index name, the attributes it uses, its
location, the characteristics of the index and the date of creation.
• Programs:
Data Dictionary stores the programs that are created to access the database,
including report, application and screen formats, SQL queries and so on.
• Relationship between data elements:
Data Dictionary stores whether the relationship is compulsory or optional,
its cardinality and connectivity, and so on.
• Administrations and End Users:
Data Dictionary stores the information of all administrators and end
users as well.
2 Marks
1. Define File
2. Define Storage
3. Types of storage
4. Define Disk and types
5. What is meant by RAID
6. Define Data Dictionary
7. Types of Data dictionary
5 Marks
1. Short notes on File organization
2. Short notes on Data dictionary
3. Explain types of storage.
10 Marks
1. Discuss the types of file organizations