Answers to Selected Exercises
Chapter 18: Indexing Structures for Files
Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley.

18.18 - Consider a disk with block size B = 512 bytes. A block pointer is P = 6 bytes long, and a record pointer is P_R = 7 bytes long. A file has r = 30,000 EMPLOYEE records of fixed length. Each record has the following fields: NAME (30 bytes), SSN (9 bytes), DEPARTMENTCODE (9 bytes), ADDRESS (40 bytes), PHONE (9 bytes), BIRTHDATE (8 bytes), SEX (1 byte), JOBCODE (4 bytes), SALARY (4 bytes, real number). An additional byte is used as a deletion marker.

(a) Calculate the record size R in bytes.

(b) Calculate the blocking factor bfr and the number of file blocks b, assuming an unspanned organization.

(c) Suppose the file is ordered by the key field SSN and we want to construct a primary index on SSN. Calculate (i) the index blocking factor bfr_i (which is also the index fan-out fo); (ii) the number of first-level index entries and the number of first-level index blocks; (iii) the number of levels needed if we make it into a multi-level index; (iv) the total number of blocks required by the multi-level index; and (v) the number of block accesses needed to search for and retrieve a record from the file, given its SSN value, using the primary index.

(d) Suppose the file is not ordered by the key field SSN and we want to construct a secondary index on SSN. Repeat the previous exercise (part c) for the secondary index and compare with the primary index.

(e) Suppose the file is not ordered by the non-key field DEPARTMENTCODE and we want to construct a secondary index on DEPARTMENTCODE using Option 3 of Section 18.1.3, with an extra level of indirection that stores record pointers. Assume there are 1000 distinct values of DEPARTMENTCODE, and that the EMPLOYEE records are evenly distributed among these values. Calculate (i) the index blocking factor bfr_i (which is also the index fan-out fo); (ii) the number of blocks needed by the level of indirection that stores record pointers; (iii) the number of first-level index entries and the number of first-level index blocks; (iv) the number of levels needed if we make it a multi-level index; (v) the total number of blocks required by the multi-level index and the blocks used in the extra level of indirection; and (vi) the approximate number of block accesses needed to search for and retrieve all records in the file having a specific DEPARTMENTCODE value using the index.

(f) Suppose the file is ordered by the non-key field DEPARTMENTCODE and we want to construct a clustering index on DEPARTMENTCODE that uses block anchors (every new value of DEPARTMENTCODE starts at the beginning of a new block). Assume there are 1000 distinct values of DEPARTMENTCODE, and that the EMPLOYEE records are evenly distributed among these values. Calculate (i) the index blocking factor bfr_i (which is also the index fan-out fo); (ii) the number of first-level index entries and the number of first-level index blocks; (iii) the number of levels needed if we make it a multi-level index; (iv) the total number of blocks required by the multi-level index; and (v) the number of block accesses needed to search for and retrieve all records in the file having a specific DEPARTMENTCODE value using the clustering index (assume that multiple blocks in a cluster are either contiguous or linked by pointers).

(g) Suppose the file is not ordered by the key field SSN and we want to construct a B+-tree access structure (index) on SSN. Calculate (i) the orders p and p_leaf of the B+-tree; (ii) the number of leaf-level blocks needed if blocks are approximately 69% full (rounded up for convenience); (iii) the number of levels needed if internal nodes are also 69% full (rounded up for convenience); (iv) the total number of blocks required by the B+-tree; and (v) the number of block accesses needed to search for and retrieve a record from the file, given its SSN value, using the B+-tree.

Answer:

(a) Record length R = (30 + 9 + 9 + 40 + 9 + 8 + 1 + 4 + 4) + 1 = 115 bytes

(b) Blocking factor bfr = floor(B/R) = floor(512/115) = 4 records per block
Number of blocks needed for file b = ceiling(r/bfr) = ceiling(30000/4) = 7500

(c) i. Index record size R_i = (V_SSN + P) = (9 + 6) = 15 bytes
Index blocking factor bfr_i = fo = floor(B/R_i) = floor(512/15) = 34
ii. Number of first-level index entries r_1 = number of file blocks b = 7500 entries
Number of first-level index blocks b_1 = ceiling(r_1/bfr_i) = ceiling(7500/34) = 221 blocks
iii. We can calculate the number of levels as follows:
Number of second-level index entries r_2 = number of first-level blocks b_1 = 221 entries
Number of second-level index blocks b_2 = ceiling(r_2/bfr_i) = ceiling(221/34) = 7 blocks
Number of third-level index entries r_3 = number of second-level index blocks b_2 = 7 entries
Number of third-level index blocks b_3 = ceiling(r_3/bfr_i) = ceiling(7/34) = 1
Since the third level has only one block, it is the top index level. Hence, the index has x = 3 levels
iv. Total number of blocks for the index b_i = b_1 + b_2 + b_3 = 221 + 7 + 1 = 229 blocks
v. Number of block accesses to search for a record = x + 1 = 3 + 1 = 4

(d) i. Index record size R_i = (V_SSN + P) = (9 + 6) = 15
bytes. Index blocking factor bfr_i (fan-out) = fo = floor(B/R_i) = floor(512/15) = 34 index records per block. (This has not changed from part (c) above.)
(Alternative solution: The previous solution assumes that leaf-level index blocks contain block pointers; it is also possible to assume that they contain record pointers, in which case the index record size would be V_SSN + P_R = 9 + 7 = 16 bytes. In this case, the calculations for leaf nodes would have to use R_i = 16 bytes rather than R_i = 15 bytes, so we get:
Index record size R_i = (V_SSN + P_R) = (9 + 7) = 16 bytes
Leaf-level index blocking factor bfr_i = floor(B/R_i) = floor(512/16) = 32 index records per block
However, for internal nodes, block pointers are always used, so the fan-out for internal nodes fo would still be 34.)
ii. Number of first-level index entries r_1 = number of file records r = 30000
Number of first-level index blocks b_1 = ceiling(r_1/bfr_i) = ceiling(30000/34) = 883 blocks
(Alternative solution:
Number of first-level index entries r_1 = number of file records r = 30000
Number of first-level index blocks b_1 = ceiling(r_1/bfr_i) = ceiling(30000/32) = 938 blocks)
iii. We can calculate the number of levels as follows:
Number of second-level index entries r_2 = number of first-level index blocks b_1 = 883 entries
Number of second-level index blocks b_2 = ceiling(r_2/bfr_i) = ceiling(883/34) = 26 blocks
Number of third-level index entries r_3 = number of second-level index blocks b_2 = 26 entries
Number of third-level index blocks b_3 = ceiling(r_3/bfr_i) = ceiling(26/34) = 1
Since the third level has only one block, it is the top index level. Hence, the index has x = 3 levels
(Alternative solution:
Number of second-level index entries r_2 = number of first-level index blocks b_1 = 938 entries
Number of second-level index blocks b_2 = ceiling(r_2/fo) = ceiling(938/34) = 28 blocks
Number of third-level index entries r_3 = number of second-level index blocks b_2 = 28 entries
Number of third-level index blocks b_3 = ceiling(r_3/fo) = ceiling(28/34) = 1
Since the third level has only one block, it is the top index level. Hence, the index has x = 3 levels)
iv. Total number of blocks for the index b_i = b_1 + b_2 + b_3 = 883 + 26 + 1 = 910
(Alternative solution: Total number of blocks for the index b_i = b_1 + b_2 + b_3 = 938 + 28 + 1 = 967)
v. Number of block accesses to search for a record = x + 1 = 3 + 1 = 4

(e) i. Index record size R_i = (V_DEPARTMENTCODE + P) = (9 + 6) = 15
bytes. Index blocking factor bfr_i (fan-out) = fo = floor(B/R_i) = floor(512/15) = 34 index records per block
ii. There are 1000 distinct values of DEPARTMENTCODE, so the average number of records for each value is (r/1000) = (30000/1000) = 30. Since a record pointer size P_R = 7 bytes, the number of bytes needed at the level of indirection for each value of DEPARTMENTCODE is 7 * 30 = 210 bytes, which fits in one block. Hence, 1000 blocks are needed for the level of indirection.
iii. Number of first-level index entries r_1 = number of distinct values of DEPARTMENTCODE = 1000 entries
Number of first-level index blocks b_1 = ceiling(r_1/bfr_i) = ceiling(1000/34) = 30 blocks
iv. We can calculate the number of levels as follows:
Number of second-level index entries r_2 = number of first-level index blocks b_1 = 30 entries
Number of second-level index blocks b_2 = ceiling(r_2/bfr_i) = ceiling(30/34) = 1
Hence, the index has x = 2 levels
v. Total number of blocks for the index b_i = b_1 + b_2 + b_indirection = 30 + 1 + 1000 = 1031 blocks
vi. Number of block accesses to search for and retrieve the block containing the record pointers at the level of indirection = x + 1 = 2 + 1 = 3 block accesses
If we assume that the 30 records are distributed over 30 distinct blocks, we need an additional 30 block accesses to retrieve all 30 records. Hence, total block accesses needed on average to retrieve all the records with a given value for DEPARTMENTCODE = x + 1 + 30 = 33

(f) i. Index record size R_i = (V_DEPARTMENTCODE + P) = (9 + 6) = 15
bytes. Index blocking factor bfr_i (fan-out) = fo = floor(B/R_i) = floor(512/15) = 34 index records per block
ii. Number of first-level index entries r_1 = number of distinct DEPARTMENTCODE values = 1000 entries
Number of first-level index blocks b_1 = ceiling(r_1/bfr_i) = ceiling(1000/34) = 30 blocks
iii. We can calculate the number of levels as follows:
Number of second-level index entries r_2 = number of first-level index blocks b_1 = 30 entries
Number of second-level index blocks b_2 = ceiling(r_2/bfr_i) = ceiling(30/34) = 1
Since the second level has one block, it is the top index level. Hence, the index has x = 2 levels
iv. Total number of blocks for the index b_i = b_1 + b_2 = 30 + 1 = 31 blocks
v. Number of block accesses to search for the first block in the cluster of blocks = x + 1 = 2 + 1 = 3
The 30 records are clustered in ceiling(30/bfr) = ceiling(30/4) = 8 blocks. Hence, total block accesses needed on average to retrieve all the records with a given DEPARTMENTCODE = x + 8 = 2 + 8 = 10 block accesses

(g) i. For a B+-tree of order p, the following inequality must be satisfied for each internal tree node:
(p * P) + ((p - 1) * V_SSN) <= B, or (p * 6) + ((p - 1) * 9) <= 512, which gives 15p <= 521, so p = 34
For leaf nodes, assuming that record pointers are included in the leaf nodes, the following inequality must be satisfied:
(p_leaf * (V_SSN + P_R)) + P <= B, or (p_leaf * (9 + 7)) + 6 <= 512, which gives 16 p_leaf <= 506, so p_leaf = 31
ii. Assuming that nodes are 69% full on the average, the average number of key values in a leaf node is 0.69 * p_leaf = 0.69 * 31 = 21.39. If we round this up for convenience, we get 22 key values (and 22 record pointers) per leaf node. Since the file has 30000 records and hence 30000 values of SSN, the number of leaf-level nodes (blocks) needed is b_1 = ceiling(30000/22) = 1364 blocks
iii. We can calculate the number of levels as follows:
The average fan-out for the internal nodes (rounded up for convenience) is fo = ceiling(0.69 * p) = ceiling(0.69 * 34) = ceiling(23.46) = 24
Number of second-level
tree blocks b_2 = ceiling(b_1/fo) = ceiling(1364/24) = 57 blocks
Number of third-level tree blocks b_3 = ceiling(b_2/fo) = ceiling(57/24) = 3
Number of fourth-level tree blocks b_4 = ceiling(b_3/fo) = ceiling(3/24) = 1
Since the fourth level has only one block, the tree has x = 4 levels (counting the leaf level).
Note: We could use the formula: x = ceiling(log_fo(b_1)) + 1 = ceiling(log_24(1364)) + 1 = 3 + 1 = 4 levels
iv. Total number of blocks for the tree b_i = b_1 + b_2 + b_3 + b_4 = 1364 + 57 + 3 + 1 = 1425 blocks
v. Number of block accesses to search for a record = x + 1 = 4 + 1 = 5

18.19 - A PARTS file with Part# as key field includes records with the following Part# values: 23, 65, 37, 60, 46, 92, 48, 71, 56, 59, 18, 21, 10, 74, 78, 15, 16, 20, 24, 28, 39, 43, 47, 50, 69, 75, 8, 49, 33, 38. Suppose the search field values are inserted in the given order in a B+-tree of order p = 4 and p_leaf = 3; show how the tree will expand and what the final tree looks like.

Answer:

A B+-tree of order p = 4 implies that each internal node in the tree (except possibly the root) should have at least 2 keys (3 pointers) and at most 4 pointers. For p_leaf = 3, leaf nodes must have at least 2 keys and at most 3 keys. The figure on page 50 shows how the tree progresses as the keys are inserted. We will only show a new tree when insertion causes a split of one of the leaf nodes, and then show how the split propagates up the tree. Hence, step 1 below shows the tree after insertion of the first 3 keys 23, 65, and 37, and before inserting 60, which causes overflow and splitting. The trees given below show how the keys are inserted in order. Below, we give the keys inserted for each tree:

1: 23, 65, 37; 2: 60; 3: 46; 4: 92; 5: 48, 71; 6: 56; 7: 59, 18; 8: 21; 9: 10; 10: 74; 11: 78; 12: 15; 13: 16; 14: 20; 15: 24; 16: 28, 39; 17: 43, 47; 18: 50, 69; 19: 75; 20: 8, 49, 33, 38

18.20: No solution provided.

18.21 - Suppose that the following search field values are deleted, in the given order, from the B+-tree of Exercise 18.19; show how the tree will shrink and show the final tree. The deleted values are: 65, 75, 43, 18, 20, 92, 59, 37.

Answer:

An important note about a deletion algorithm for a B+-tree is that deletion of a key value from a leaf node will result in a reorganization of the tree if:
(i) The leaf node is less than half full; in this case, we will combine it with the next leaf node (other algorithms combine it with either the next or the previous leaf nodes, or both).
(ii) The key value deleted is the rightmost (last) value in the leaf node, in which case its value will appear in an internal node; in this case, the key value to the left of the deleted key in the leaf node replaces the deleted key value in the internal node.

Following is what happens to tree number 19 after the specified deletions (not tree number 20):

Deleting 65 will only affect the leaf node. Deleting 75 will cause a leaf node to be less than half full, so it is combined with the next node; also, 75 is removed from the internal node, leading to the following tree:

Deleting 43 causes a leaf node to be less than half full, and it is combined with the next node. Since the next node has 3 entries, its leftmost (first) entry 46 can replace 43 in
both the leaf and internal nodes, leading to the following tree:

Next, we delete 18, which is a rightmost entry in a leaf node and hence appears in an internal node of the B+-tree. The leaf node is now less than half full, and is combined with the next node. The value 18 must also be removed from the internal node, causing underflow in the internal node. One approach for dealing with underflow in internal nodes is to reorganize the values of the underflow node with its child nodes, so 21 is moved up into the underflow node, leading to the following tree:

Deleting 20 and 92 will not cause underflow. Deleting 59 causes underflow, and the remaining value 60 is combined with the next leaf node. Hence, 60 is no longer a rightmost entry in a leaf node and must be removed from the internal node. This is normally done by moving 56 up to replace 60 in the internal node, but since this leads to underflow in the node that used to contain 56, the nodes can be reorganized as follows:

Finally, removing 37 causes serious underflow, leading to a reorganization of the whole tree. One approach to deleting the value in the root node is to use the rightmost value in the next leaf node (the first leaf node in the right subtree) to replace the root, and move this leaf node to the left subtree. In this case, the resulting tree may look as follows:

18.22–18.28: No solutions provided.
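
The block counts in the answer to Exercise 18.18 can be re-derived mechanically. The following Python sketch is not part of the original solutions; the helper name `multilevel_index` and all variable names are illustrative assumptions. It takes the parameters stated in the exercise (B = 512, P = 6, P_R = 7, r = 30000, a 9-byte key) and repeatedly divides by the fan-out until a level fits in one block.

```python
import math

# Parameters stated in Exercise 18.18 (all sizes in bytes).
B, P, P_R = 512, 6, 7        # block size, block pointer, record pointer
r = 30000                    # number of EMPLOYEE records
V = 9                        # size of SSN (and of DEPARTMENTCODE)
R = (30 + 9 + 9 + 40 + 9 + 8 + 1 + 4 + 4) + 1   # record size incl. deletion marker


def multilevel_index(entries, leaf_bfr, fan_out):
    """Hypothetical helper: blocks per level, bottom-up, until one block remains."""
    levels = [math.ceil(entries / leaf_bfr)]
    while levels[-1] > 1:
        levels.append(math.ceil(levels[-1] / fan_out))
    return levels


bfr = B // R                 # (b) data blocking factor: 4
b = math.ceil(r / bfr)       # (b) file blocks: 7500
fo = B // (V + P)            # index fan-out with block pointers: 34

primary = multilevel_index(b, fo, fo)                    # (c) [221, 7, 1]
secondary = multilevel_index(r, fo, fo)                  # (d) [883, 26, 1]
secondary_alt = multilevel_index(r, B // (V + P_R), fo)  # (d) alt. [938, 28, 1]
clustering = multilevel_index(1000, fo, fo)              # (e), (f) [30, 1]

# (g) B+-tree: largest p with p*P + (p-1)*V <= B, largest p_leaf with
# p_leaf*(V + P_R) + P <= B, then 69% average occupancy rounded up.
p = (B + V) // (P + V)                  # 34
p_leaf = (B - P) // (V + P_R)           # 31
keys_per_leaf = math.ceil(0.69 * p_leaf)   # 22
avg_fan_out = math.ceil(0.69 * p)          # 24
bplus = multilevel_index(r, keys_per_leaf, avg_fan_out)  # [1364, 57, 3, 1]

for name, lv in [("primary", primary), ("secondary", secondary),
                 ("secondary (record pointers)", secondary_alt),
                 ("clustering", clustering), ("B+-tree", bplus)]:
    # A single-record lookup reads one block per level plus one data block.
    print(f"{name}: levels={len(lv)}, blocks={sum(lv)}, lookup={len(lv) + 1}")
```

Each returned list gives the per-level block counts from the first (or leaf) level upward, so its length is the number of levels x, its sum is the total index size, and x + 1 is the cost of a single-record lookup. For part (e) the 1000 indirection blocks are added separately to the sum.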