Numerical Based On Indexing: Problem 1.2

Numerical Based on Indexing
Problem 1.1
Consider a disk with a sector size of 512 bytes, 1,000 tracks per surface, 100 sectors per track, 5
double-sided platters and a block size of 2,048 bytes. Suppose that the average seek time is 5 msec,
the average rotational delay is 5 msec, and the transfer rate is 100 MB per second. Suppose that a file
containing 1,000,000 records of 100 bytes each is to be stored on such a disk and that no record is
allowed to span two blocks.
a) How many records fit onto a block?
2048/100 = 20. We can have at most 20 records in a block.
b) How many blocks are required to store the entire file? If the file is arranged sequentially
(according to the ‘next block concept’) on disk, how many cylinders are needed?
1,000,000/20=50,000 blocks are required to store the entire file.
A track has 25 blocks, a cylinder has 25*10=250 blocks. Therefore, we need 50,000/250=200
cylinders to store the file sequentially.
c) How many records of 100 bytes each can be stored using this disk?
The disk has 1000 cylinders with 250 blocks each, i.e. it has 250,000 blocks. A block contains
20 records. Thus, the disk can store 5,000,000 records.
d) If blocks of the file are stored on disk according to the next block concept, with the first block
on block 1 of track 1, what is the number of the block stored on block 1 of track 1 on the next
disk surface?
There are 25 blocks in each track. It is block 26 on block 1 of track 1 on the next disk surface.
e) What is the time required to read the file sequentially?

We need to read 200 cylinders. In order to read one cylinder, we have 1 seek, no rotational
delays and the transfer of 250 blocks = 500KB. Therefore, the time to read one cylinder is
5msec + 500K/100M sec = 5msec + 5 msec = 10 msec. The time to read the entire file is
200*10msec =2sec.
f) What is the time required to read the file in random order? Note that in order to read a record,
the block containing the record has to be fetched from disk.
In random access, the read of every block requires an average seek time of 5 msec, an average
rotational delay of 5 msec and a transfer time of 2K/100M sec = 0.02 msec, i.e. 10.02 msec.
Therefore, the time to read the entire file is 50,000*10.02msec ~ 500sec.
Problem 1.2
Suppose that a file has r = 100,000 STUDENT records with the following fields:
NAME (30 bytes), SSN (9 bytes), ADDRESS (40 bytes), PHONE (9 bytes), BIRTHDATE (8 bytes),
SEX (1 byte), MAJORDEPTCODE (4 bytes), MINORDEPTCODE (4 bytes), CLASSCODE (4 bytes,
integer), and DEGREEPROGRAM (3 bytes).
The fields are of fixed-length.

Suppose only 75% of the STUDENT records have a value for PHONE, 80% for MAJORDEPTCODE,
15% for MINORDEPTCODE, and 95% for DEGREEPROGRAM, and 100% of the STUDENT records
have a value for the other fields. We use a variable-length record format. Each record has a 2-byte field
type for each field occurring in the record, plus the 1-byte deletion marker and a 1-byte end-of-record
marker. Suppose we use a spanned record organization, where each block has a 5-byte pointer to the next
block (this space is not used for record storage). Each block contains 1,024 bytes.
a) Calculate the average record length R in bytes.

Assuming that every field has a 2-byte field type, and that the fields not mentioned above (NAME,
SSN, ADDRESS, BIRTHDATE, SEX, CLASSCODE) have values in every record, we need the
following number of bytes for these fields in each record, plus 1 byte for the deletion marker, and 1
byte for the end-of-record marker:
R fixed = (30+2) + (9+2) + (40+2) + (8+2) + (1+2) + (4+2) +1+1 = 106 bytes
For the other fields (PHONE, MAJORDEPTCODE, MINORDEPTCODE DEGREEPROGRAM),

the average number of bytes per record is:
R variable = ((9+2)*0.75)+((4+2)*0.80)+((4+2)*0.15)+((3+2)*0.95) = 8.25+4.8+0.9+4.75 = 18.7
bytes
The average record size R = R fixed + R variable = 106 + 18.7 = 124.7 bytes
The total number of bytes needed for the whole file is r * R = 100,000 * 124.7 = 12,470,000 bytes.
b) Calculate the number of blocks needed for the file.

Using a spanned record organization with a 5-byte pointer at the end of each block, the number of
bytes available in each block is B = 1024 - 5 = 1019 bytes.
The number of blocks b needed for the file is:

b = ceiling((r * R) / B) = ceiling(12470000 / 1019) = 12,238 blocks
Problem 1.3
Consider a disk with block size B = 512 bytes. A block pointer is P = 6 bytes long, and a record
pointer is PR = 7 bytes long. A file has r = 30,000 EMPLOYEE records of fixed length. Each record
has the following fields: Name (30 bytes),Ssn (9 bytes), Department_code (9 bytes), Address (40
bytes), Phone (10 bytes), Birth_date (8 bytes), Sex (1 byte), Job_code (4 bytes), and Salary (4 bytes,
real number). An additional byte is used as a deletion marker.
a. Calculate the record size R in bytes.

Record length R = (30 + 9 + 9 + 40 + 9 + 8 + 1 + 4 + 4) + 1 = 115 bytes
b. Calculate the blocking factor bfr and the number of file blocks b, assuming an unspanned
organization.
Blocking factor bfr = floor (B/R) = floor (512/115) = 4 records per block
Number of blocks needed for file = ceiling(r/bfr) = ceiling (30000/4) = 7500
c. Suppose that the file is ordered by the key field Ssn and we want to construct a primary
index on Ssn. Calculate
(i) The index blocking factor bfri (which is also the index fan-out fo)
Index record size R i = (V SSN + P) = (9 + 6) = 15 bytes
Index blocking factor bfr i = fo = floor (B/R i) = floor (512/15) = 34
(ii) The number of first-level index entries and the number of first-level index blocks
Number of first-level index entries r1 = number of file blocks b = 7500 entries
Number of first-level index blocks b1 = ceiling (r1 / bfr i) = ceiling (7500/34) = 221 blocks
(iii) The number of levels needed if we make it into a multilevel index

Number of second-level index entries r2 = number of first-level blocks b 1= 221 entries
Number of second-level index blocks b2= ceiling (r2 /bfr i) = ceiling (221/34) = 7 blocks
Number of third-level index entries r3 = number of second-level index blocks b2 = 7 entries
Number of third-level index blocks b3 = ceiling (r3 /bfr i) = ceiling (7/34) = 1
Since the third level has only one block, it is the top index level. Hence, the index has x = 3
levels
(iv) The total number of blocks required by the multilevel index

Total number of blocks for the index bi = b 1 + b 2 + b 3 = 221 + 7 + 1 = 229 blocks
(v) The number of block accesses needed to search for and retrieve a record from the file—
given its Ssn value—using the primary index
Number of block accesses to search for a record = x + 1 = 3 + 1 = 4
d. Suppose that the file is not ordered by the key field Ssn and we want to construct a
secondary index on Ssn. Repeat the previous exercise (part c) for the secondary index and
compare with the primary index.
(Same as C )
e. Suppose that the file is ordered by the nonkey field Department_code and we want to
construct a clustering index on Department_code that uses block anchors (every new value of
Department_code starts at the beginning of a new block). Assume there are 1,000 distinct
values of Department_code and that the EMPLOYEE records are evenly distributed among
these values. Calculate
(i) the index blocking factor bfri (which is also the index fan-out fo)
Index record size Ri = (V DEPARTMENTCODE + P) = (9 + 6) = 15 bytes
Index blocking factor bfr i = (fan-out) fo = floor (B/Ri) = floor (512/15)= 34 index records per
block
(ii) The number of first-level index entries and the number of first-level index blocks
No. of first-level index entries r1 = no. of distinct DEPARTMENTCODE values= 1000 entries
Number of first-level index blocks b1 = ceiling(r 1 /bfr i) = ceiling (1000/34) = 30 blocks
(iii) The number of levels needed if we make it into a multilevel index

We can calculate the number of levels as follows:
Number of second-level index entries r 2 = number of first-level index blocks b1= 30 entries
Number of second-level index blocks b 2 = ceiling(r 2 /bfr i) = ceiling (30/34) = 1
Since the second level has one block, it is the top index level.
Hence, the index has x = 2 levels
(iv) The total number of blocks required by the multilevel index

Total number of blocks for the index b i = b 1 + b 2 = 30 + 1 = 31 blocks
(v) The number of block accesses needed to search for and retrieve all records in the file that
have a specific Department_code value, using the clustering index (assume that multiple blocks
in a cluster are contiguous).
Number of block accesses to search for the first block in the cluster of blocks = x + 1 = 2 + 1 = 3
The 30 records are clustered in ceiling (30/bfr) = ceiling (30/4) = 8 blocks.
Hence, total block accesses needed on average to retrieve all the records with a given
DEPARTMENTCODE = x + 8 = 2 + 8 = 10 block accesses.

Numerical Based On Indexing: Problem 1.2

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Numerical Based On Indexing: Problem 1.2

Uploaded by

Copyright:

Available Formats

Numerical Based on Indexing

e) What is the time required to read the file sequentially?

The fields are of fixed-length.

a) Calculate the average record length R in bytes.

For the other fields (PHONE, MAJORDEPTCODE, MINORDEPTCODE DEGREEPROGRAM),

b) Calculate the number of blocks needed for the file.

The number of blocks b needed for the file is:

a. Calculate the record size R in bytes.

(iii) The number of levels needed if we make it into a multilevel index

(iv) The total number of blocks required by the multilevel index

(iii) The number of levels needed if we make it into a multilevel index

(iv) The total number of blocks required by the multilevel index

You might also like