Introduction To RDBMS
Data
Facts or ideas recorded onto some media suitable for future processing
Database
A collection of related data that are given a relational or structural foundation for
efficient storage and control, authorized and easy accessibility, purposive searching
and retrieval, convenient use and transmission of the data.
DBMS
A collection of interrelated data and a set of programs that provides a data
management environment for efficient storage and control, authorized and easy
accessibility, purposive searching and retrieval, convenient use and transmission of
the data.
The relational model was formally introduced by Dr. E. F. Codd in 1970 and has
evolved since then through a series of writings. The relational model represents
data in the form of two-dimensional tables.
o Derived attribute: An attribute that does not physically exist within the
entity and is calculated via an algorithm from the values of other
attribute(s). Example for Student: semester_fee = (tuition fee + session charge +
exam fee - discount)
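The semester_fee formula above can be sketched as a computed property, so the value is never stored with the entity. This is only an illustration: the class name, field names and sample amounts are assumptions, not part of the original notes.

```python
from dataclasses import dataclass


@dataclass
class Student:
    # Stored attributes (names and values are illustrative assumptions).
    tuition_fee: float
    session_charge: float
    exam_fee: float
    discount: float

    @property
    def semester_fee(self) -> float:
        # Derived attribute: calculated from other attributes on demand,
        # not physically stored in the entity.
        return (self.tuition_fee + self.session_charge
                + self.exam_fee - self.discount)


student = Student(tuition_fee=5000, session_charge=800, exam_fee=400, discount=200)
print(student.semester_fee)
```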
Domain constraint: Each attribute takes values from a domain. The domain
constraint refers to the definition of domain which means specifying the
type, width, whether null values are allowed for that domain or not and
values of compatible domains.
i. Data type: Specify the supported data type for the column of the relation.
ii. Length: Specify the length of data.
iii. Unique: Prevents duplicates in non-primary keys and ensures that an
index is created to enhance performance. Null values are allowed.
iv. Not null: Prevents the user from skipping data entry for a required
attribute.
v. Check: Specifies a validity rule for the data values in a column.
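The constraints listed above can be tried out with Python's built-in sqlite3 module. This is a minimal sketch: the student table, its columns and the try_insert helper are hypothetical. Note that SQLite accepts a declared length such as VARCHAR(40) but does not enforce it, so the length constraint is shown only as a declaration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        email      VARCHAR(50) UNIQUE,      -- unique; NULL values are allowed
        name       VARCHAR(40) NOT NULL,    -- required attribute
        semester   INTEGER CHECK (semester BETWEEN 1 AND 8)  -- validity rule
    )
""")
conn.execute("INSERT INTO student VALUES (1, 'a@example.com', 'Asha', 3)")


def try_insert(row):
    """Return None if the row is accepted, else the constraint error message."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?, ?)", row)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


print(try_insert((2, 'a@example.com', 'Ben', 2)))  # duplicate email is rejected
print(try_insert((3, None, None, 2)))              # missing name is rejected
print(try_insert((4, None, 'Cy', 9)))              # semester outside 1..8 is rejected
```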
DATA STORAGE FORMAT ON DISK
A disk is divided into concentric circles called tracks where data is stored. A
block is a collection of contiguous bytes of a single track. A block contains a
number of records depending on the block size and record size and a file is a
collection of records, which are mapped into disk blocks.
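The relationship between block size, record size and the number of blocks a file occupies can be worked out with a short calculation. The sketch below assumes fixed-length records that never span two blocks; the function names and the sample sizes are illustrative only.

```python
def records_per_block(block_size: int, record_size: int) -> int:
    # Whole records that fit in one block (unspanned records assumed).
    return block_size // record_size


def blocks_needed(num_records: int, block_size: int, record_size: int) -> int:
    per_block = records_per_block(block_size, record_size)
    # Ceiling division: a partly filled last block still occupies a block.
    return -(-num_records // per_block)


print(records_per_block(4096, 100))
print(blocks_needed(1000, 4096, 100))
```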
Track format
Each track has an index point, which is a special mark to identify the beginning
of each track. Since the track is circular, it also identifies the end of the track.
The home address (HA) identifies the cylinder and the number of the read/write
head that services the track, as well as the condition of the track (flag): whether
it is operative or defective. If the track is defective, an alternative track to be
used is indicated. A two-byte cyclic check is included as a means of error
detection in input/output operations.
Gaps (G) separate the different areas on the track. The length of the gap may
vary with the device, the location of the gap, and the length of the preceding
area. The gap that follows the index point is different in length from the gap that
follows the home address, and the length of the gap that follows a record
depends on the length of that record. The reason for this is to provide adequate
time for required equipment functions that are necessary as the gap rotates past
the read/write head. These functions may vary with the type of area that has just
preceded the gap.
The address marker (A) is a two-byte segment supplied by the control unit (the
hardware that controls the disk drive) as the record is written. It enables the
control unit to locate the beginning of the record at a later time.
The count area is detailed in Figure 10.5. The flag field repeats the information
about the track condition and adds information used by the control unit. The
cylinder number, head number, and record number fields collectively provide a
unique identification for the record. The key-length field is a one-byte field. It
always contains a 0 for a record of the count-data format. The data-length field
supplies two bytes, which specify the number of bytes in the data area of the
record, excluding the cyclic check. The cyclic check provides two bytes for
error detection.
Record Format
Physical records, or blocks, can be stored on tracks in any of the four formats.
Fixed length unblocked: There will be one logical record for each
physical record, that is, the data that are actually stored in the record area
of the track. To reuse the space of deleted records we need a header-based
structure with additional space in each record for a pointer to the next
vacant record. In this scheme the first record of a file is known as the
file header; it contains information about the file, including the
address of the first available (free) record. That first free record
stores the address of the next available record in its pointer field,
and so on.
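The file-header scheme for free records can be sketched as a chain of free slots. This is a simplified in-memory model, not an on-disk layout: the class name, the "FREE"/"DATA" tags and the use of list positions as record addresses are all assumptions made for illustration.

```python
class FixedLengthFile:
    def __init__(self):
        self.slots = []        # one entry per fixed-length record slot
        self.free_head = None  # file header: address of the first free slot

    def delete(self, addr: int) -> None:
        # The freed slot's pointer field stores the previous head of the chain.
        self.slots[addr] = ("FREE", self.free_head)
        self.free_head = addr

    def insert(self, record) -> int:
        if self.free_head is None:
            # No free slot to reuse: append at the end of the file.
            self.slots.append(("DATA", record))
            return len(self.slots) - 1
        addr = self.free_head
        _, next_free = self.slots[addr]
        self.free_head = next_free  # header now points at the next free slot
        self.slots[addr] = ("DATA", record)
        return addr


f = FixedLengthFile()
for name in ("r0", "r1", "r2"):
    f.insert(name)
f.delete(0)
f.delete(2)
print(f.free_head, f.slots)
```

In this sketch the most recently freed slot is reused first, which keeps both delete and insert at constant cost.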
Fixed length blocked: More than one logical record will comprise
each physical record. In this case the key area is typically assigned
the key of the highest record of the block. Suppose that we have
two succeeding blocks containing records 10,12,14, and 15,19,24,
respectively. If the operating system is seeking logical record 15,
the key for the first block will read 14, so record 15 cannot be in
that block. The key for the next block will read 24. Since 24 is
greater than 15, record 15 must be in that block. The entire block is
then read into main memory where it is searched for record 15.
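The block search described above can be sketched with the two example blocks. The key area of each block is modelled as the last (highest) key in the block; the function name and the list-of-lists representation are assumptions for illustration.

```python
# The two succeeding blocks from the example, each sorted by record key.
blocks = [[10, 12, 14], [15, 19, 24]]


def find_record(blocks, key):
    """Return (block number, position in block) for key, or None."""
    for block_no, block in enumerate(blocks):
        block_key = block[-1]  # key area: highest record key in the block
        if block_key >= key:
            # Only this block can hold the record; "read" it and search it.
            if key in block:
                return block_no, block.index(key)
            return None
    return None  # key is larger than every block key


print(find_record(blocks, 15))
```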
ii) Record types that allow variable lengths for one or more
fields
Because the record length is not uniform, a method of indicating where the
record ends is required. This information is provided by the BL (block-length)
and RL (record-length) areas. There are several techniques for implementing
variable length records. Of these, the byte string representation and the slotted
page structure are prominent:
i) Byte string representation: A simple method of implementing variable
length records is to attach a special end-of-record symbol (⊥) to the end of each
record. An alternative version of the byte string representation stores the record
length at the beginning of each record instead of an end-of-record symbol. It is
easy but has some disadvantages:
ii) Slotted page structure: There is a header at the beginning of each block
containing the following information:
iii. An array whose entries contain the location and size of each record.
The actual records are allocated contiguously in the block, starting from the end
of the block. The free space in the block is contiguous, between the final entry
in the header array and the first record. If a record is inserted, space is
allocated for it at the end of the free space and an entry containing its size and
location is added to the header.
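The insertion rule for a slotted page can be sketched as follows. This is a simplified model under stated assumptions: the class and method names are hypothetical, each header entry is assumed to cost 4 bytes, and the other header fields are left out.

```python
class SlottedPage:
    ENTRY_BYTES = 4  # assumed size of one (location, size) header entry

    def __init__(self, size: int = 64):
        self.block = bytearray(size)
        self.entries = []     # header array: (location, size) of each record
        self.free_end = size  # records are placed from the end of the block

    def free_space(self) -> int:
        # Free space lies between the last header entry and the first record.
        return self.free_end - len(self.entries) * self.ENTRY_BYTES

    def insert(self, record: bytes) -> int:
        if len(record) + self.ENTRY_BYTES > self.free_space():
            raise ValueError("not enough free space in block")
        # Allocate at the end of the free space, then add the header entry.
        self.free_end -= len(record)
        self.block[self.free_end:self.free_end + len(record)] = record
        self.entries.append((self.free_end, len(record)))
        return len(self.entries) - 1

    def read(self, slot: int) -> bytes:
        location, size = self.entries[slot]
        return bytes(self.block[location:location + size])


page = SlottedPage(32)
first = page.insert(b"alpha")
second = page.insert(b"be")
print(page.entries, page.free_space())
```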
Heap or Pile
Sequential
Indexed Sequential
B-Tree Indexing
Heap or Pile
Sequential file
Indexed-sequential file
Sparse index - Here the index file contains entries for only some of the records
instead of all the records. To find a record, we locate the index entry with the
largest search key value that is less than or equal to the search key value we are
looking for. We start at the record that entry points to and continue sequentially
in the data file until we find the desired record.
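The sparse index lookup can be sketched with a binary search over the index followed by a sequential scan of the data file. The sample keys, the one-entry-per-three-records spacing and the names used are assumptions for illustration.

```python
from bisect import bisect_right

# Data file sorted on the search key: (key, payload).
data_file = [(5, "a"), (12, "b"), (20, "c"), (27, "d"), (33, "e"), (41, "f")]
# Sparse index: one entry for every third record, (key, position in data file).
sparse_index = [(5, 0), (27, 3)]


def lookup(key):
    index_keys = [k for k, _ in sparse_index]
    i = bisect_right(index_keys, key) - 1  # largest index key <= key
    if i < 0:
        return None                        # key is below the first index entry
    pos = sparse_index[i][1]
    # Continue sequentially in the data file from the indexed position.
    while pos < len(data_file) and data_file[pos][0] <= key:
        if data_file[pos][0] == key:
            return data_file[pos]
        pos += 1
    return None


print(lookup(20))
```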
Multilevel Index
The index file itself may be too large even if we use sparse indexing. We can
use another level of indexing on the index file to increase the searching speed.
This technique is known as multilevel sparse indexing. Fig. 11.13 shows the
multilevel sparse indexing structure.
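A two-level version of the idea can be sketched as an outer index over an inner sparse index. The fan-out values, the sample keys and the helper names below are assumptions chosen to keep the example small; they do not reproduce the structure in the figure.

```python
from bisect import bisect_right

INNER_STEP = 2  # assumed: one inner index entry per 2 data records
OUTER_STEP = 4  # assumed: one outer index entry per 4 inner entries

data_file = list(range(0, 160, 10))  # sorted keys 0, 10, ..., 150
inner = [(data_file[p], p) for p in range(0, len(data_file), INNER_STEP)]
outer = [(inner[p][0], p) for p in range(0, len(inner), OUTER_STEP)]


def floor_entry(index, key, lo=0, hi=None):
    """Position in index[lo:hi] of the largest key <= key, or None."""
    hi = len(index) if hi is None else hi
    keys = [k for k, _ in index[lo:hi]]
    i = bisect_right(keys, key) - 1
    return None if i < 0 else lo + i


def lookup(key):
    o = floor_entry(outer, key)
    if o is None:
        return None
    start = outer[o][1]  # the outer entry selects a section of the inner index
    i = floor_entry(inner, key, start, start + OUTER_STEP)
    pos = inner[i][1]    # the inner entry gives the starting data position
    while pos < len(data_file) and data_file[pos] <= key:
        if data_file[pos] == key:
            return pos
        pos += 1
    return None


print(lookup(110))
```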
Secondary Index
We often want to search records based on attributes other than the primary key.
For this we create an index structure on those attributes. An index on non-prime
attributes is known as a secondary index. Assume that in the book file acc_no is
the primary key and we wish to retrieve records using the attribute title. In the
index file records are arranged by title, but in the data file records are
arranged by acc_no. To link records in the data block we have to use another set of
pointers as shown in Fig. 11.14.
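A secondary index on title can be sketched as a sorted list of titles, each carrying pointers into a data file that is ordered by acc_no. The book titles and the extra acc_no values are made-up sample data, and list positions stand in for record pointers.

```python
from bisect import bisect_left
from collections import defaultdict

# Data file ordered on the primary key acc_no: (acc_no, title).
data_file = [
    (5101, "Database Systems"),
    (5272, "Operating Systems"),
    (5384, "Compilers"),
    (5410, "Database Systems"),
]

# Build the secondary index: title -> pointers (positions) into the data file.
pointers = defaultdict(list)
for pos, (_, title) in enumerate(data_file):
    pointers[title].append(pos)
secondary_index = sorted(pointers.items())  # index entries ordered by title


def find_by_title(title):
    titles = [t for t, _ in secondary_index]
    i = bisect_left(titles, title)
    if i == len(titles) or titles[i] != title:
        return []
    # Follow every pointer, since several records may share a title.
    return [data_file[p] for p in secondary_index[i][1]]


print(find_by_title("Database Systems"))
```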
p1, p2 and p3 are three pointers. p1 points to the record with acc_no 5272, p2 points
to nodes with acc_no > 5272 and up to 5384, and p3 points to nodes with acc_no > 5384.
Records in a node are stored in sorted order. The topmost level of a tree is
known as the root and nodes of the lowest level are known as leaf nodes. The pointers
of a leaf node point directly to actual data records in the data file. An index tree
structure is shown in Fig. 11.17.
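Searching such an index tree can be sketched as a descent from the root to a leaf. The routing convention here is an assumption: keys up to and including 5272 are reached through the first pointer, keys above 5272 and up to 5384 through the second, and larger keys through the third. The Node class, the leaf contents and the "rec@n" record pointers are hypothetical.

```python
from bisect import bisect_left


class Node:
    def __init__(self, keys, children=None, records=None):
        self.keys = keys          # sorted keys stored in the node
        self.children = children  # internal node: pointers to child nodes
        self.records = records    # leaf node: pointers to data records


def search(node, key):
    # Descend from the root until a leaf node is reached.
    while node.children is not None:
        # bisect_left picks the first pointer whose bounding key is >= key,
        # or the last pointer when key exceeds every key in the node.
        node = node.children[bisect_left(node.keys, key)]
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node.records[i]    # leaf pointer to the actual data record
    return None


leaf1 = Node([5101, 5272], records=["rec@0", "rec@1"])
leaf2 = Node([5300, 5384], records=["rec@2", "rec@3"])
leaf3 = Node([5410, 5500], records=["rec@4", "rec@5"])
root = Node([5272, 5384], children=[leaf1, leaf2, leaf3])

print(search(root, 5384))
```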