
DATABASE MANAGEMENT SYSTEMS

CHAPTER III
STORAGE AND FILE STRUCTURE

PURPOSE
This chapter presents the issues involved in storing data on external storage, mainly hard disks. Storage must be organized so that a large, possibly very large, amount of data can be kept; more important still, it must allow the data that is needed to be retrieved quickly. The structures presented to support fast access are indices, B+-trees, hashing, and so on. Storage devices (disks) can fail unpredictably; RAID techniques offer an effective solution to this problem.
REQUIREMENTS
- Understand the characteristics of storage devices, how storage is organized, and how disks are accessed.
- Understand the principles and techniques of RAID disk organization.
- Understand the techniques for organizing records in a file.
- Understand the techniques for organizing files.
- Understand and apply the techniques that support fast retrieval of information: indices (ordered, B+-tree, hashing).


OVERVIEW OF PHYSICAL STORAGE MEDIA

Several types of data storage exist in computer systems. Storage media are classified by access speed, by cost, and by reliability. The media currently available are:

- Cache: the fastest and also the most expensive form of storage. Cache memory is small; its use is managed by the computer system hardware and the operating system.

- Main memory: the storage medium used for data that are available to be operated on. General-purpose machine instructions operate on main memory. Although main memory may hold many megabytes of data, it is still too small (and too expensive) to store an entire database. The contents of main memory are usually lost when power fails.

- Flash memory: also known as electrically erasable programmable read-only memory (EEPROM). Flash memory differs from main memory in that its data survive a power failure. Reading data from flash memory takes less than 100 ns, about as fast as reading from main memory. Writing to flash memory is much more complicated, however. Data can be written once (a write takes about 4 to 10 microseconds) but cannot be overwritten directly: to overwrite memory that has already been written, the whole memory must first be erased, after which it can be written again.

- Magnetic-disk storage (here meaning hard disks): the primary medium for long-term, on-line storage of data. Usually the entire database is stored on magnetic disk. Data must be moved from disk into main memory before they can be accessed, and data modified in main memory must be written back to disk. Disk storage is called direct-access storage because data on disk can be read in any order. Disk storage survives power failure. Disks can themselves fail, although not often.

- Optical storage: the most familiar form of optical disk is the CD-ROM (Compact-Disk Read-Only Memory). Data are stored on the disk optically and read by a laser. CD-ROM disks can only be read. Other versions of optical disks are write-once, read-many (WORM) disks, which allow data to be written once but not erased and rewritten, rewritable disks, and so on.

- Tape storage: used mainly for backing up data. Tape is cheaper than disk, but access is slower because tape must be accessed sequentially. Tapes usually have very large capacity.

The storage media can be organized in a hierarchy according to access speed and cost. The highest level is the fastest but also the most expensive, and both speed and cost decrease toward the lower levels. The fast media (cache, main memory) are referred to as primary storage; the devices at the next level down, such as magnetic disks, are referred to as secondary storage or on-line storage; and the devices at the lowest and near-lowest levels, such as optical disks and magnetic tapes, together with floppy disks, are classed as tertiary storage or off-line storage.
Besides speed and cost, we must also consider how durable the storage media are.

[Figure: Storage device hierarchy, from top to bottom: cache, main memory, flash memory, magnetic disk, optical disk, magnetic tape]

MAGNETIC DISKS
PHYSICAL CHARACTERISTICS OF DISKS
Each disk platter is circular, and its two surfaces are covered with magnetic material on which information is recorded. A disk consists of several platters. We use the term disk to mean hard disk.
When the disk is in use, a drive motor spins it at a constant speed. A read-write head is positioned just above the surface of the platter. The platter surface is logically divided into tracks, and each track is divided into sectors; a sector is the smallest unit of information that can be read from or written to the disk. Depending on the type of disk, sector sizes range from 32 bytes to 4096 bytes, and are usually 512 bytes. There are from 4 to 32 sectors per track and from 20 to 1500 tracks per surface. Each surface of a platter has its own read-write head, which can move along the radius of the disk to reach the different tracks. A disk consists of several platters, and the read-write heads of all the surfaces are mounted on a single assembly called the disk arm, so they move together. The platters are mounted on a spindle. Because the heads move together, when the head on one platter is on track i, the heads on all the other platters are also on track i; the i-th tracks of all the platters are therefore called the i-th cylinder.
A disk controller is the interface between the computer system and the actual hardware of the disk drive. It accepts high-level commands to read or write a sector, and initiates actions such as moving the disk arm to the right track and actually reading or writing the data. The disk controller also attaches a checksum to each sector that is written. The checksum is computed from the data written to the sector. When the sector is read back, the checksum is computed again from the retrieved data and compared with the stored checksum. If the data are corrupted, the newly computed checksum will not match the stored one. If such an error occurs, the controller retries the read several times; if the error persists, the controller reports a read failure. The disk controller also performs remapping of bad sectors: it maps a bad sector to a different physical location. The figure below shows how disks are connected to a computer system:

[Figure: disks attached to a disk controller, which is attached to the system bus]

Disks are connected to a computer system, or to a disk controller, through a high-speed interconnection. The Small Computer System Interface (SCSI) is commonly used to connect disks to personal computers and workstations. Mainframes and server systems usually have faster and more expensive buses to connect to the disks.
The read-write heads are kept as close as possible to the disk surface in order to increase the recording density. A fixed-head disk has a separate head for each track; this arrangement lets the computer switch from track to track quickly, without having to move the head. It needs a very large number of heads, however, which raises the cost of the device.

MEASURES OF DISK PERFORMANCE

The main measures of the quality of a disk are capacity, access time, data-transfer rate, and reliability.

- Access time: the time from when a read or write request is issued to when data transfer begins. To access data on a given sector of a disk, the arm must first move to the correct track and then wait for the sector to appear under it. The time to position the arm is called the seek time; it increases with the distance the arm must move. Seek times lie in the range 2 to 30 ms, depending on how far the track is from the current arm position.

  Average seek time: the average of the seek times, measured over a sequence of (uniformly distributed) random requests. It is about one-third of the worst-case seek time.

  Rotational latency time: the time spent waiting for the sector to be accessed to appear under the read-write head. Disk rotation speeds lie in the range 60 to 120 rotations per second, and on average half a rotation is needed for the required sector to come under the head. The average latency time is therefore half the time of one full rotation.

  The access time is the sum of the seek time and the latency, and ranges from 10 to 40 ms.

- Data-transfer rate: the rate at which data can be retrieved from or stored to the disk. At present this rate is about 1 to 5 megabytes per second.

- Mean time to failure: the amount of time that, on average, the system runs continuously without any failure. Current disks have a mean time to failure of about 30,000 to 800,000 hours, that is, about 3.4 to 91 years.
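
The arithmetic behind the figures quoted above can be checked with a few lines of Python. This is only a sketch: the function names are illustrative, and a year is assumed to be 365 days of continuous operation.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years (assumption)


def average_rotational_latency_ms(rotations_per_second: float) -> float:
    """Average latency is half the time of one full rotation, in milliseconds."""
    rotation_time_ms = 1000.0 / rotations_per_second
    return rotation_time_ms / 2.0


def average_seek_time_ms(worst_case_seek_ms: float) -> float:
    """Rule of thumb from the text: about one-third of the worst-case seek time."""
    return worst_case_seek_ms / 3.0


def mttf_in_years(mttf_hours: float) -> float:
    """Convert a mean time to failure from hours of continuous running to years."""
    return mttf_hours / HOURS_PER_YEAR


if __name__ == "__main__":
    for rps in (60, 120):
        latency = average_rotational_latency_ms(rps)
        print(f"{rps} rotations/s: average latency {latency:.2f} ms")
    for hours in (30_000, 800_000):
        print(f"MTTF {hours} hours: {mttf_in_years(hours):.1f} years")
```

At 60 to 120 rotations per second the average latency comes out at roughly 4 to 8 ms, which is consistent with the 10 to 40 ms access-time range once seek time is added.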

OPTIMIZATION OF DISK-BLOCK ACCESS

Disk I/O requests are generated both by the file system and by the virtual-memory manager found in most operating systems. Each request specifies the address on disk to be referenced, in the form of a block number. A block is a sequence of contiguous sectors on a single track. Block sizes range from 512 bytes to several kilobytes. Data are transferred between disk and main memory in units of blocks. The lower levels of the file-system manager convert block addresses into the hardware-level cylinder, surface, and sector numbers.
Access to data on disk is much slower than access to data in main memory, so a strategy is needed to speed up disk-block access. Below we discuss several techniques aimed at this goal.
- Scheduling: If several blocks from one cylinder need to be transferred from disk to main memory, we can save access time by requesting the blocks in the order in which they pass under the heads. If the desired blocks are on different cylinders, we request them in an order that minimizes disk-arm movement. Disk-arm-scheduling algorithms order the accesses to tracks so as to increase the number of accesses that can be processed. A commonly used one is the elevator algorithm. Suppose that initially the arm is moving from the innermost track toward the outside of the disk. At each track for which there is an access request, the arm stops, services the requests for that track, and then continues outward until there are no waiting requests for tracks farther out. At this point the arm changes direction and moves inward, again stopping at each track for which there is a request, until no track closer to the center has a request; then it reverses again, and so on. The disk controller usually takes on the task of reordering read requests to improve performance.

- File organization: To reduce block-access time, we can organize the blocks on disk in a way that corresponds closely to the way the data will be accessed. For example, if we expect a file to be accessed sequentially, we place its blocks sequentially on adjacent cylinders. This contiguous allocation is broken up as the file grows, however, and the file can no longer be allocated on adjacent blocks; this phenomenon is called fragmentation. Many operating systems provide utilities that reduce fragmentation (defragmentation) in order to improve file-access performance.

- Nonvolatile write buffers: Because the contents of main memory are lost on power failure, information about database updates must be recorded on disk to survive a crash. The performance of update-intensive applications depends heavily on the speed of disk writes. We can use nonvolatile random-access memory (nonvolatile RAM) to speed up disk writes. The contents of nonvolatile RAM are not lost on power failure. A common way to implement nonvolatile RAM is battery-backed-up RAM. When the database requests that a block be written to disk, the disk controller writes the block to the nonvolatile RAM buffer and immediately notifies the operating system that the write completed successfully. The controller writes the data to their destination on disk whenever the disk is idle or the nonvolatile RAM buffer becomes full. When the database system requests a block write, it notices a delay only if the nonvolatile RAM buffer is full.

- Log disk: Another approach to reducing write latency is to use a log disk, that is, a disk devoted to writing a sequential log. All access to the log disk is sequential, which essentially eliminates seek time, and several consecutive blocks can be written at once, making writes to the log disk several times faster than random writes. As with nonvolatile RAM, the data must also be written to their actual location on disk, but this write can be carried out without the database system having to wait for it to complete. The log disk can also be used to recover data. A log-based file system is a version of the log-disk approach: data are not written back to their original destination on disk; instead, the file system keeps track of where in the log disk each block was written most recently, and retrieves it from that location. The log disk is compacted periodically. This approach improves write performance, but generates fragmentation for files that are updated often.
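
The elevator algorithm described under Scheduling above can be sketched as follows. This is a simplified illustration, not a controller implementation: it assumes that tracks are numbered from 0 at the innermost track, that the set of pending requests is fixed (no new requests arrive during the sweep), and the function names are hypothetical.

```python
def elevator_order(requests, start_track, moving_outward=True):
    """Return the order in which the pending track requests are serviced.

    The arm keeps moving in its current direction, servicing every requested
    track it passes, and reverses once no request remains ahead of it.
    """
    pending = sorted(set(requests))
    if moving_outward:
        first_sweep = [t for t in pending if t >= start_track]         # outward
        second_sweep = [t for t in pending if t < start_track][::-1]   # then inward
    else:
        first_sweep = [t for t in pending if t <= start_track][::-1]   # inward
        second_sweep = [t for t in pending if t > start_track]         # then outward
    return first_sweep + second_sweep


def total_arm_movement(order, start_track):
    """Number of tracks the arm crosses when servicing requests in this order."""
    moved, position = 0, start_track
    for track in order:
        moved += abs(track - position)
        position = track
    return moved


if __name__ == "__main__":
    requests = [98, 183, 37, 122, 14, 124, 65, 67]   # hypothetical request queue
    order = elevator_order(requests, start_track=53)
    print("service order:", order)
    print("elevator movement:", total_arm_movement(order, 53))
    print("arrival-order movement:", total_arm_movement(requests, 53))
```

For this hypothetical queue the arm crosses far fewer tracks than it would by servicing the requests in arrival order, which is the point of reordering them.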

RAID
In a system with many disks, we can improve the rate at which data are read or written by operating the disks in parallel. A multi-disk system also helps improve the reliability of storage, by storing redundant information on different disks so that the failure of one disk does not lose data. A variety of disk-organization techniques, collectively called RAID (Redundant Arrays of Inexpensive Disks), have been proposed to improve performance and reliability.

IMPROVEMENT OF RELIABILITY VIA REDUNDANCY

The solution to the reliability problem is to introduce redundancy: we store extra information that is not normally needed, but that can be used to rebuild lost information in the event of a disk failure. The mean time to failure, taken over the disk system as a whole, thus increases.
The simplest approach is to duplicate every disk. This technique is called mirroring or shadowing. A logical disk then consists of two physical disks, and every write is carried out on both. If one disk fails, the data can be read from the other. The mean time to failure of a mirrored disk depends on the mean time to failure of the individual disks and on the mean time to repair, which is the average time it takes to replace a failed disk and restore the data on it.
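
The text does not give a formula for this dependence. A common textbook approximation, valid only under the assumption that the two disks fail independently of each other, is that data are lost when the second disk fails while the first is still being repaired, which gives a mean time to data loss of MTTF squared divided by twice the MTTR. The sketch below uses that approximation; the input values are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: 365-day year of continuous operation


def mirrored_mean_time_to_data_loss(mttf_hours: float, mttr_hours: float) -> float:
    """Approximate mean time to data loss of a mirrored pair, in hours.

    Assumes independent disk failures: MTTF**2 / (2 * MTTR).
    """
    return mttf_hours ** 2 / (2 * mttr_hours)


if __name__ == "__main__":
    # Hypothetical figures: each disk fails every 100,000 hours on average,
    # and a failed disk is replaced and restored within 10 hours.
    hours = mirrored_mean_time_to_data_loss(100_000, 10)
    print(f"{hours:.0f} hours, about {hours / HOURS_PER_YEAR:.0f} years")
```

Real systems do worse than this estimate, because failures are not fully independent (power failures, disks of the same age and batch), so the result is best read as an upper bound.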

IMPROVEMENT IN PERFORMANCE VIA PARALLELISM

With disk mirroring, the rate at which read requests can be handled can double, because a read request can be sent to either disk. With multiple disks, we can also improve the transfer rate by striping data across the disks. In its simplest form, data striping splits the bits of each byte across multiple disks; this is called bit-level striping. For example, with an array of eight disks, we write bit i of each byte to disk i. The array of eight disks can then be treated as a single disk whose sectors are eight times the normal size and, more important, whose transfer rate is eight times higher. In such an organization every disk participates in every access (read or write), so the number of accesses that can be processed per second is about the same as on a single disk, but each access can read or write eight times as much data.
Bit-level striping can be generalized to a number of disks that is a multiple or a divisor of 8. For example, with an array of four disks, bit i and bit 4 + i of each byte go to disk i. Moreover, striping need not be at the level of the bits of a byte. For example, in block-level striping, the blocks of a file are striped across multiple disks: with n disks, block i of the file goes to disk (i mod n) + 1. We can also stripe at the level of bytes, of sectors, or of the sectors of a block. The two goals of parallelism in a disk system are:
1. Load-balance multiple small accesses (page accesses), so that the throughput of such accesses increases.
2. Parallelize large accesses, so that the response time of large accesses is reduced.
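
The two striping rules given above, block i on disk (i mod n) + 1 and bits i and 4 + i on disk i, can be written out as small mapping functions. This is a sketch with illustrative names; disks and bits are numbered from 1 as in the text, and the bit-level function only covers the case where the number of disks divides 8.

```python
def disk_for_block(i: int, n: int) -> int:
    """Block-level striping: block i of a file is stored on disk (i mod n) + 1."""
    return (i % n) + 1


def bits_on_disk(disk: int, n: int) -> list:
    """Bit-level striping of one byte over n disks, where n divides 8.

    Disk `disk` receives bits disk, n + disk, 2n + disk, ... of every byte.
    """
    if 8 % n != 0:
        raise ValueError("this sketch only handles n that divides 8")
    return list(range(disk, 9, n))


if __name__ == "__main__":
    print([disk_for_block(i, 4) for i in range(8)])     # blocks 0..7 over 4 disks
    print({d: bits_on_disk(d, 4) for d in range(1, 5)})  # bits of a byte over 4 disks
```

With block-level striping, consecutive blocks land on different disks, so a large sequential read can keep all n disks busy at once while small independent reads are spread across them.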

RAID LEVELS
Mirroring provides high reliability, but it is expensive. Striping provides high data-transfer rates, but does not improve reliability. A number of schemes provide redundancy at lower cost by combining the idea of striping with parity bits. These schemes have different cost-performance trade-offs and are classified into levels called RAID levels.
RAID level 0: refers to disk arrays with block-level striping but without any redundancy.
RAID level 1: refers to disk mirroring.
RAID level 2: also known as memory-style error-correcting-code (ECC) organization. Memory systems detect errors by means of parity bits: each byte in a memory system may have a parity bit associated with it. Error-correcting schemes store two or more extra bits, and can reconstruct the data if a single bit is damaged. The idea of error-correcting codes can be used directly in disk arrays by striping bytes across the disks. For example, the first bit of each byte could be stored on disk 1, the second bit on disk 2, and so on until the eighth bit is stored on disk 8, with the error-correction bits stored on additional disks. If one of the disks fails, the remaining bits of the byte and the associated error-correction bits can be read from the other disks and used to rebuild the lost bit, and thus the data. For an array of four data disks, RAID level 2 needs only three additional disks to store the error-correction bits (these additional disks are called overhead disks), compared with four overhead disks for RAID level 1.
RAID level 3: also called bit-interleaved parity organization. The disk controller can detect whether a sector has been read correctly, so a single parity bit suffices for error correction. If one of the sectors is damaged, we know exactly which sector it is, and for each bit in that sector we can work out whether it is a 1 or a 0 by computing the parity of the corresponding bits from the sectors on the other disks. If the parity of the remaining bits equals the stored parity, the missing bit is 0; otherwise it is 1. RAID level 3 is as good as level 2 but less expensive (it needs only one overhead disk).
RAID level 4: also called block-interleaved parity organization. It stores blocks exactly as regular disks do, without striping them across disks, but keeps a parity block on a separate disk for the corresponding blocks from N other disks. If one of the disks fails, the parity block can be used with the corresponding blocks from the other disks to restore the blocks of the failed disk.
A read of a single block accesses only one disk, which allows other requests to be processed by the other disks. The data-transfer rate for each access is therefore lower, but multiple read accesses can proceed in parallel, leading to a higher overall I/O rate. The transfer rates for large reads (of many blocks) are high, because all the disks can be read in parallel; large writes (of many blocks) also have high transfer rates, because the data and the parity can be written in parallel. A write of a single block, however, has to access both the disk on which the block is stored and the parity disk (because the parity block must also be updated). A single-block write thus requires four disk accesses: two to read the two old blocks, and two to write the two new blocks.
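
The parity mechanics of RAID levels 3 and 4 can be illustrated with byte-wise XOR, which computes even parity for every bit position at once. The sketch below assumes even parity and equally sized blocks held in memory as `bytes`; the function names are illustrative. It shows both the rebuilding of a lost block and why a single-block write needs the old data block and the old parity block.

```python
from functools import reduce


def parity_block(blocks):
    """Even-parity block: byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity):
    """Rebuild the block of the failed disk from the survivors and the parity."""
    return parity_block(list(surviving_blocks) + [parity])


def parity_after_small_write(old_data, new_data, old_parity):
    """New parity for a single-block write, without reading the other disks.

    Two reads (old data, old parity) and two writes (new data, new parity):
    the four accesses mentioned in the text.
    """
    return parity_block([old_data, new_data, old_parity])


if __name__ == "__main__":
    data = [b"\x0f\xa0", b"\x33\x01", b"\xff\x10"]   # hypothetical 2-byte blocks
    parity = parity_block(data)
    # Disk 2 fails: rebuild its block from the other disks and the parity disk.
    print(reconstruct([data[0], data[2]], parity) == data[1])
```

Because XOR is its own inverse, the same function serves for computing parity, for reconstruction, and for the incremental update.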

RAID level 5: also called block-interleaved distributed parity. It improves on level 4 by partitioning data and parity among all N + 1 disks, rather than storing data on N disks and parity on one separate disk as in RAID 4. In RAID 5 all the disks can take part in satisfying read requests, which increases the total number of requests that can be met in a given amount of time. For each block, one disk stores the parity and the other disks store data. For example, with an array of five disks, the parity for the n-th block is stored on disk (n mod 5) + 1; the n-th blocks of the other four disks store the actual data for that block.
RAID level 6: also called the P + Q redundancy scheme. It is much like RAID 5, but stores extra redundant information to guard against multiple disk failures. Instead of parity, error-correcting codes are used.
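
The RAID 5 placement rule quoted above, parity for block n on disk (n mod 5) + 1, can be sketched as follows. Disks are numbered from 1 as in the text, and the function name is illustrative.

```python
def raid5_layout(n: int, disks: int = 5):
    """For block number n, return (parity_disk, data_disks) in a RAID 5 array."""
    parity_disk = (n % disks) + 1
    data_disks = [d for d in range(1, disks + 1) if d != parity_disk]
    return parity_disk, data_disks


if __name__ == "__main__":
    for n in range(6):
        parity_disk, data_disks = raid5_layout(n)
        print(f"block {n}: parity on disk {parity_disk}, data on disks {data_disks}")
```

Over any five consecutive block numbers each disk holds the parity exactly once, so no single disk becomes the parity bottleneck that the dedicated parity disk is in RAID 4.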

CHOICE OF THE RIGHT RAID LEVEL
If a disk fails, the time needed to rebuild its data can be significant, and it varies with the RAID level in use. Rebuilding is easiest for RAID level 1. For the other levels, all the other disks in the array must be accessed to rebuild the data of the failed disk. The rebuild performance of a RAID system can be an important factor if continuous availability of data is required (as is often the case in high-performance or interactive database systems). Rebuild performance also influences the mean time to failure.
Since RAID levels 2 and 4 are subsumed by levels 3 and 5, the choice narrows to the remaining levels. RAID level 0 is used in high-performance applications where data loss is not critical. RAID level 1 is popular for applications such as storing log files in a database system. Because level 1 has a high overhead, levels 3 and 5 are often preferred for storing large volumes of data. The difference between level 3 and level 5 is one of data-transfer rate versus overall I/O rate: level 3 is preferred if a high data-transfer rate is required, and level 5 is preferred if random reads are important. Level 6, although rarely used at present, offers better reliability than level 5.

EXTENSIONS
The concepts of RAID have been generalized to other storage devices, including arrays of tapes, and even to the broadcast of data over wireless systems. Applied to arrays of tapes, the RAID structure makes it possible to recover data even if one of the tapes is damaged. Applied to the broadcast of data, a block of data is split into short units that are broadcast along with a parity unit; if one of the units is not received, it can be reconstructed from the other units.

TERTIARY STORAGE

OPTICAL DISKS
CD-ROM has the advantages of large storage capacity and portability (a disk can be inserted into and removed from the drive like a floppy disk), and it is also cheap. Compared with hard disks, however, CD-ROM seek times are much slower (about 250 ms) and rotation speeds are slower (about 400 rpm), which leads to higher latency; data-transfer rates are also slower (about 150 KB per second). Recently a new optical-disk format, the digital video disk (DVD), has been standardized; these disks have capacities between 4.7 GB and 17 GB. WORM and rewritable disks have also become popular. WORM jukeboxes are devices that can store a large number of WORM disks and can load disks automatically, on demand, into one of a small number of WORM drives.

MAGNETIC TAPES
Magnetic tape can hold a large volume of data, but it is slow compared with magnetic and optical disks. Access to tape is necessarily sequential, so tape is unsuitable for most secondary-storage requirements. Tapes are used mainly for backup, for storing information that is used infrequently, and as an off-line medium for transferring information from one system to another. Positioning the tape at the required data can take up to several minutes. Tape jukeboxes hold a large number of tapes with a few drives, and can store many terabytes (10^12 bytes).

STORAGE ACCESS
A database is mapped into a number of different files, which are maintained by the underlying operating system. These files reside permanently on disks, with backups on tape. Each file is partitioned into fixed-length storage units called blocks, which are the units of both storage allocation and data transfer.
A block may contain several data items. We assume that no data item spans two blocks. A major goal of the database system is to minimize the number of block transfers between disk and memory. One way to reduce the number of disk accesses is to keep as many blocks as possible in main memory. The aim is that when a block is accessed it is already in main memory, so that no disk access is needed.
Since it is not possible to keep all blocks in main memory, we need to manage the allocation of the main-memory space available for storing blocks. The buffer is the part of main memory available for storing copies of disk blocks. There is always a copy on disk of every block, but the copy on disk may be an older version than the one in the buffer. The subsystem responsible for allocating buffer space is called the buffer manager.

BUFFER MANAGER
Programs in a database system make requests to the buffer manager when they need a block from disk. If the block is already in the buffer, the address of the block in main memory is passed to the requester. If the block is not in the buffer, the buffer manager first allocates space in the buffer for the block, throwing out some other block if necessary to make room for the new one. The block thrown out is written back to disk only if it has been modified since the last time it was written to disk. The buffer manager then reads the requested block from disk into the buffer and passes the address of the block in main memory to the requester. The buffer manager is not very different from a virtual-memory manager; one difference is that a database may be far too large to fit in main memory, so the buffer manager must use techniques more sophisticated than typical virtual-memory management schemes:
Replacement strategy. When there is no room left in the buffer, a block must be removed from the buffer before a new one can be read in. Operating systems typically use a least recently used (LRU) scheme, in which the block used least recently is written back to disk and removed from the buffer. This approach can be improved for database applications.
Pinned blocks. For the database system to be able to recover from a crash, it is necessary to restrict the times at which a block may be written back to disk. A block that is not allowed to be written back to disk is said to be pinned.
Forced output of blocks. There are situations in which a block must be written back to disk even though the buffer space it occupies is not needed. This write is called the forced output of a block. In brief, the reason for forced output is that the contents of main memory are lost in a crash, whereas data on disk survive it.

BUFFER-REPLACEMENT POLICIES

The goal of a replacement strategy for blocks in the buffer is to minimize accesses to the disk. Operating systems usually use the LRU strategy to replace blocks. A database system, however, can predict the pattern of future references. A user request to the database system involves several steps, and the database system can determine in advance which blocks will be needed by looking at each of the steps required to carry out the operation the user requested. Thus, unlike an operating system, a database system can have information about the future, at least the near future. In many cases the best block-replacement strategy for a database system turns out to be MRU (most recently used): the block replaced is the one that was used most recently!
The buffer manager can use statistical information about the probability that a request will reference a particular relation. The data dictionary is one of the most frequently accessed parts of the database, so the buffer manager should not remove data-dictionary blocks from main memory unless other factors force it to. An index for a file may be accessed more often than the file itself, so the buffer manager should likewise not remove index blocks from main memory if it has a choice.
The ideal database block-replacement strategy requires knowledge of the database operations being performed. No single strategy is known that handles all possible scenarios well. Surprisingly, most database systems nevertheless use LRU, despite its shortcomings.
The strategy used by the buffer manager is also influenced by factors other than the time at which a block will be referenced again. If the system is processing the requests of several users concurrently, the concurrency-control subsystem may have to delay some requests in order to preserve the consistency of the database. If the buffer manager is given information from the concurrency-control subsystem indicating which requests are being delayed, it can use this information to alter its block-replacement strategy. In particular, blocks needed by active requests can be retained in the buffer at the expense of blocks needed by delayed requests.
The crash-recovery subsystem imposes the most stringent constraints on block replacement. If a block has been modified, the buffer manager is not free to write the new version of the block in the buffer back to disk, since that would destroy the old version. Instead, the buffer manager must seek permission from the crash-recovery subsystem before writing out a block. The crash-recovery subsystem may demand that certain other blocks be force-output before it grants the buffer manager permission to output the block requested.
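
The behaviour described in the last two sections (LRU replacement, write-back of modified blocks only, pinned blocks, forced output) can be put together in a small sketch. It is a toy model under stated assumptions, not the design of any real system: the "disk" is a Python dictionary, the class and method names are hypothetical, and the interaction with the recovery and concurrency-control subsystems is left out.

```python
from collections import OrderedDict


class BufferManager:
    """Toy buffer manager: LRU replacement, pinned blocks, forced output."""

    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk                # dict: block id -> contents (stands in for the disk)
        self.frames = OrderedDict()     # buffered blocks, least recently used first
        self.dirty = set()              # blocks modified since they were last written
        self.pinned = set()             # blocks that must not be written out or evicted

    def get(self, block_id):
        """Return the block's contents, reading it from disk if necessary."""
        if block_id in self.frames:
            self.frames.move_to_end(block_id)      # mark as most recently used
            return self.frames[block_id]
        if len(self.frames) >= self.capacity:
            self._evict()
        self.frames[block_id] = self.disk[block_id]
        return self.frames[block_id]

    def write(self, block_id, contents):
        """Modify a block in the buffer; the disk copy becomes the older version."""
        self.get(block_id)
        self.frames[block_id] = contents
        self.dirty.add(block_id)

    def pin(self, block_id):
        self.get(block_id)
        self.pinned.add(block_id)

    def unpin(self, block_id):
        self.pinned.discard(block_id)

    def force_output(self, block_id):
        """Write a modified block to disk now, even if its space is not needed."""
        if block_id in self.dirty:
            self.disk[block_id] = self.frames[block_id]
            self.dirty.discard(block_id)

    def _evict(self):
        victim = next((b for b in self.frames if b not in self.pinned), None)
        if victim is None:
            raise RuntimeError("all buffered blocks are pinned")
        self.force_output(victim)       # written back only if it was modified
        del self.frames[victim]
```

Switching the victim choice from the first unpinned block to the last one would turn this into the MRU strategy mentioned above; a real database buffer manager would pick between such policies per operation.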

FILE ORGANIZATION
A file is organized logically as a sequence of records. These records are mapped onto disk blocks. Files are provided as a basic construct in operating systems, so we assume the existence of an underlying file system. We need to consider ways of representing logical data models in terms of files.


Blocks are of a fixed size, determined by the physical properties of the disk and by the operating system, whereas record sizes vary. In a relational database, the tuples of different relations are generally of different sizes.
One approach to mapping a database to files is to use several files, and to store records of only one fixed length in any given file. An alternative is to structure the files so that they can accommodate records of several lengths. Files of fixed-length records are easier to implement than files of variable-length records.

FIXED-LENGTH RECORDS

Consider a file of account records for the bank database. Each record of this file is defined as follows:
type
    depositor = record
        branch_name: char(22);
        account_number: char(10);
        balance: real;
    end
Assume that each character occupies 1 byte and each real number occupies 8 bytes, so that an account record is 40 bytes long. A simple approach is to use the first 40 bytes for the first record, the next 40 bytes for the second record, and so on. This simple approach raises the following problems:

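1. File F containing the account records

record 0    Perryridge    A-102    400
record 1    Round Hill    A-305    350
record 2    Mianus        A-215    700
record 3    Downtown      A-101    500
record 4    Redwood       A-222    700
record 5    Perryridge    A-201    900
record 6    Brighton      A-217    750
record 7    Downtown      A-110    600
record 8    Perryridge    A-218    700

2. File F after deleting record 2 and moving up all the records that follow it

record 0    Perryridge    A-102    400
record 1    Round Hill    A-305    350
record 3    Downtown      A-101    500
record 4    Redwood       A-222    700
record 5    Perryridge    A-201    900
record 6    Brighton      A-217    750
record 7    Downtown      A-110    600
record 8    Perryridge    A-218    700

3. File F after deleting record 2 and moving the last record into its place

record 0    Perryridge    A-102    400
record 1    Round Hill    A-305    350
record 8    Perryridge    A-218    700
record 3    Downtown      A-101    500
record 4    Redwood       A-222    700
record 5    Perryridge    A-201    900
record 6    Brighton      A-217    750
record 7    Downtown      A-110    600

File F with a header and a free list, after records 1, 4, and 6 have been deleted

header      (points to record 1)
record 0    Perryridge    A-102    400
record 1    (free, points to record 4)
record 2    Mianus        A-215    700
record 3    Downtown      A-101    500
record 4    (free, points to record 6)
record 5    Perryridge    A-201    900
record 6    (free, end of free list)
record 7    Downtown      A-110    600
record 8    Perryridge    A-218    700
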

1. It is difficult to delete a record from this structure. The space occupied by the deleted record must be filled with some other record of the file, or we must have a way of marking deleted records.
2. Unless the block size happens to be a multiple of 40, some records will cross block boundaries: part of the record is stored in one block and part in another. Reading or writing such a straddling record then requires two block accesses.
When a record is deleted, we could move the record that came after it into the space formerly occupied by the deleted record, then move the next record into the space vacated by the record just moved, and so on, until every record that follows the deleted one has been moved ahead. This approach requires moving a large number of records. A simpler alternative is to move the final record of the file into the space occupied by the deleted record, but this approach requires additional block accesses. Since insertions tend to be more frequent than deletions, it is acceptable to leave open the space occupied by the deleted record and to wait for a subsequent insertion to reuse it. A simple marker on the deleted record is not sufficient, since it would make it hard to find the free space when an insertion is done. We therefore introduce an additional structure. At the beginning of the file we allocate a certain number of bytes as a file header, which holds information about the file. The header stores the address of the first deleted record; that record in turn holds the address of the second deleted record, and so on. The deleted records thus form a linked list, called the free list. When a new record is inserted, we use the list-head pointer stored in the header: if the free list is not empty, the new record is placed in the space that the header points to, and the header is updated to point to the next free record; otherwise the new record is added at the end of the file.
Insertion and deletion for files of fixed-length records are simple to implement, because the space freed by a deleted record is exactly the space needed to insert a record. For files of variable-length records the problem becomes much more complicated.
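
The free-list scheme described above can be sketched in a few lines. This is an in-memory illustration only: slots are Python tuples rather than 40-byte regions of a disk file, slot indices play the role of record addresses, and the class and method names are hypothetical.

```python
class FixedLengthFile:
    """File of fixed-length records with a free list threaded through deleted slots."""

    def __init__(self):
        self.header = None   # index of the first free slot, or None if the list is empty
        self.slots = []      # each slot is ("rec", record) or ("free", next_free_index)

    def insert(self, record):
        """Reuse the first free slot if there is one, else append at the end of the file."""
        if self.header is None:
            self.slots.append(("rec", record))
            return len(self.slots) - 1
        index = self.header
        _, next_free = self.slots[index]
        self.header = next_free              # header now points to the next free slot
        self.slots[index] = ("rec", record)
        return index

    def delete(self, index):
        """Free a slot and link it in at the head of the free list."""
        if self.slots[index][0] != "rec":
            raise KeyError(f"slot {index} holds no record")
        self.slots[index] = ("free", self.header)
        self.header = index


if __name__ == "__main__":
    f = FixedLengthFile()
    for name in ["Perryridge", "Round Hill", "Mianus", "Downtown", "Redwood",
                 "Perryridge", "Brighton", "Downtown", "Perryridge"]:
        f.insert(name)
    # Deleting 6, then 4, then 1 yields the chain header -> 1 -> 4 -> 6,
    # because each deleted slot is pushed onto the front of the list.
    for index in (6, 4, 1):
        f.delete(index)
    print(f.header, f.slots[1], f.slots[4], f.slots[6])
```

Both operations touch only the header and one slot, which is why insertion and deletion stay cheap for fixed-length records.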

VARIABLE-LENGTH RECORDS

Variable-length records arise in a database for the following reasons:
- Storage of several record types in one file
- Record types that allow variable lengths for fields
- Record types that allow repeating fields


There are many techniques for implementing variable-length records. For illustration, we consider different representations of variable-length records with the following format:
type account_list = record
        branch_name: char(20);
        account_info: array [1 .. ∞] of record
                account_number: char(10);
                balance: real;
        end;
end

Byte-String Representation

A simple way to implement variable-length records is to attach a special end-of-record symbol (⊥) to the end of each record. We can then store each record as a string of consecutive bytes. Instead of using a special symbol at the end of each record, a variant of the byte-string representation stores the record length at the beginning of each record.
0    Perryridge    A-102  400    A-201  900    A-210  700    ⊥
1    Round Hill    A-301  350    ⊥
2    Mianus        A-101  800    ⊥
3    Downtown      A-211  500    A-222  600    ⊥
4    Redwood       A-300  650    A-200  1200   A-255  950    ⊥
5    Brighton      A-111  750    ⊥

Byte-string representation of variable-length records
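
The length-prefixed variant mentioned above can be sketched with the standard `struct` module. The 2-byte little-endian length field is an assumption made for this example (it limits a record to 65,535 bytes), and the function names are illustrative.

```python
import struct

LENGTH_FORMAT = "<H"   # assumption: 2-byte unsigned length prefix
LENGTH_SIZE = struct.calcsize(LENGTH_FORMAT)


def pack_records(records):
    """Store each record as its length followed by its bytes, back to back."""
    out = bytearray()
    for record in records:
        out += struct.pack(LENGTH_FORMAT, len(record)) + record
    return bytes(out)


def unpack_records(data):
    """Walk the byte string, using each length prefix to find the next record."""
    records, position = [], 0
    while position < len(data):
        (length,) = struct.unpack_from(LENGTH_FORMAT, data, position)
        position += LENGTH_SIZE
        records.append(data[position:position + length])
        position += length
    return records


if __name__ == "__main__":
    stored = pack_records([b"Perryridge|A-102|400|A-201|900", b"Mianus|A-101|800"])
    print(unpack_records(stored))
```

Reading the third record means walking past the first two, and a record that grows no longer fits in its place; both observations lead to the disadvantages of this representation.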

The byte-string representation has the following disadvantages:

- It is hard to reuse the space formerly occupied by a deleted record, which leads to a large number of small fragments of wasted disk storage.
- There is no room for records to grow. If a variable-length record becomes longer, it must be moved, and the move is costly if the record is pinned.

The byte-string representation is therefore not usually used to implement variable-length records, but a modified form of it, called the slotted-page structure, is commonly used to organize records within a single block.
In the slotted-page structure there is a header at the beginning of each block, which contains the following information:
- The number of record entries in the header
- The end of free space in the block
- An array whose entries contain the location and size of each record
The actual records are allocated contiguously in the block, starting from the end of the block. The free space in the block is contiguous, and lies between the final entry in the header array and the first record. When a record is inserted, space is allocated for it at the end of the free space, and the corresponding entry is added to the header.
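[Figure: slotted-page structure. The block header holds the number of entries (#entries), the end-of-free-space pointer, and an array of entries giving the size and location of each record; the free space lies between the header and the records, which are stored at the end of the block.]
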

If a record is deleted, the space it occupies is freed and its entry is marked as deleted (for example, its size is set to -1). The records in the block that precede the deleted record are then moved, so that the free space of the block is again the region between the final entry in the header array and the first record. The end-of-free-space pointer and the pointers of the records that were moved are updated. Records can grow or shrink by a similar technique (provided the block still has room for the record to grow). The cost of moving records is not too high, because blocks are not large (typically 4 KB).
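
The slotted-page structure and its deletion step can be sketched as follows. This is a simplified in-memory model: the header is kept as a Python list instead of being stored inside the block, so the space the header itself would consume is ignored, and the class and method names are hypothetical.

```python
class SlottedPage:
    """Records are packed at the end of the block; the header lists (location, size)."""

    def __init__(self, size=4096):
        self.data = bytearray(size)
        self.entries = []        # header array: [location, size]; size -1 marks a deleted record
        self.free_end = size     # end of free space, that is, start of the first record

    def insert(self, record: bytes) -> int:
        if len(record) > self.free_end:          # header space is ignored in this sketch
            raise ValueError("not enough free space in the block")
        self.free_end -= len(record)
        self.data[self.free_end:self.free_end + len(record)] = record
        self.entries.append([self.free_end, len(record)])
        return len(self.entries) - 1             # slot number of the new record

    def read(self, slot: int) -> bytes:
        location, size = self.entries[slot]
        if size < 0:
            raise KeyError(f"slot {slot} is deleted")
        return bytes(self.data[location:location + size])

    def delete(self, slot: int) -> None:
        location, size = self.entries[slot]
        if size < 0:
            raise KeyError(f"slot {slot} is already deleted")
        # Shift the records that lie before the deleted one (between the free
        # space and the hole) toward the end of the block to close the hole.
        self.data[self.free_end + size:location + size] = self.data[self.free_end:location]
        for entry in self.entries:
            if entry[1] >= 0 and entry[0] < location:
                entry[0] += size                 # update pointers of moved records
        self.free_end += size
        self.entries[slot] = [0, -1]             # mark the entry as deleted
```

Because other structures refer to a record by block and slot number rather than by byte offset, records can be moved inside the block without those outside references changing.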

Fixed-Length Representation
Another way to implement variable-length records efficiently in a file system is to use one or more fixed-length records to represent one variable-length record. Two techniques implement files of variable-length records by means of fixed-length records:
1. Reserved space. We assume that the length of a record never exceeds a given bound (the maximum length). We can then use fixed-length records of that maximum length. The unused space is filled with a special symbol: null or end-of-record.
2. Pointers. A variable-length record is represented by a list of fixed-length records, chained together by pointers.
A disadvantage of the pointer structure is that space is wasted in every record except the first one in a chain (the first record needs the branch_name field, but the later records in the chain do not). To deal with this problem, the blocks in the file are divided into two kinds:
- Anchor block: contains only the first records of the chains
- Overflow block: contains the remaining records of the chains
All the records within a block thus have the same length, even though the file may contain records of different lengths.
0    Perryridge    A-102  400    A-201  900    A-210  700
1    Round Hill    A-301  350    ⊥             ⊥
2    Mianus        A-101  800    ⊥             ⊥
3    Downtown      A-211  500    A-222  600    ⊥
4    Redwood       A-300  650    A-200  1200   A-255  950
5    Brighton      A-111  750    ⊥             ⊥

Using the reserved-space method

[Figure: the same file using the pointer method; the accounts of each branch are stored as fixed-length records chained together by pointers]

Anchor block
Perryridge    A-102    400
Round Hill    A-301    350
Mianus        A-101    800
Downtown      A-211    500
Redwood       A-300    650
Brighton      A-111    750

Overflow block
A-201    900
A-210    700
A-222    600
A-200    1200
A-255    950

Anchor-block and overflow-block structure

ORGANIZATION OF RECORDS IN FILES

We now consider how records are represented in a file structure. An instance of a relation is a set of records. Given a set of records, the question is how to organize them in a file. Several organizations are possible:
- Heap file organization. Any record can be stored anywhere in the file where there is space for it. There is no ordering of the records. Typically there is one file for each relation.
- Sequential file organization. Records are stored in sequential order, based on the value of the search key of each record.
- Hashing file organization. A hash function is computed on some attribute of each record. The result of the hash function determines the block of the file in which the record is placed. This organization is closely related to index structures.
CHNG III. LU TR V CU TRC TP TIN

trang

48

H QUN TR C S D LIU

T chc file cm (Clustering File Organization). Trong t chc ny, cc mu tin ca


mt vi quan h khc nhau c th c lu tr trong cng mt file. Cc mu tin c lin h ca cc
quan h khc nhau c lu tr trn cng mt khi sao cho mt hot ng I/O em li cc mu
tin c lin h t tt c cc quan h.

SEQUENTIAL FILE ORGANIZATION

A sequential file is designed for efficient processing of records in sorted order based on some search key. To permit fast retrieval of records in search-key order, we chain the records together by pointers: the pointer in each record points to the next record in search-key order. Furthermore, to minimize the number of block accesses in sequential file processing, we store the records physically in search-key order, or as close to it as possible.
A sequential file lets records be read in sorted order, which is useful for display purposes as well as for certain query-processing algorithms.
Brighton     A-217   750
Downtown     A-101   500
Downtown     A-110   600
Mianus       A-215   700
Perryridge   A-102   400
Perryridge   A-201   900
Perryridge   A-218   700
Redwood      A-222   850
Round Hill   A-301   550

The difficulty with this organization is maintaining the physical sequential order of the records as insertions and deletions occur, since moving records on every insertion or deletion is costly. Deletion can be managed with pointer chains, as described earlier. For insertion, we apply the following rules:
1. Locate the record in the file that comes before the record to be inserted in search-key order.
2. If there is a free record (the space of a deleted record) within the same block, insert the new record there. Otherwise, insert the new record in an overflow block. In either case, adjust the pointers so that they chain the records together in search-key order.
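The two insertion rules can be sketched as follows. This is a simplified illustration, not a full implementation: the file is modeled as a Python list of slots, a free slot is None, any free slot is reused (the sketch ignores the "same block" condition), and appending to the list stands in for writing to an overflow block. The names insert_sorted and scan are hypothetical.

```python
def insert_sorted(slots, head, key):
    """slots: list of [key, next_index] or None (free). Returns the new head."""
    # Rule 1: locate the record that precedes the new one in search-key order.
    prev, cur = None, head
    while cur is not None and slots[cur][0] <= key:
        prev, cur = cur, slots[cur][1]
    # Rule 2: reuse a free slot if one exists, otherwise append (overflow).
    try:
        pos = slots.index(None)
    except ValueError:
        slots.append(None)
        pos = len(slots) - 1
    slots[pos] = [key, cur]
    # Adjust the pointers so the chain stays in search-key order.
    if prev is None:
        return pos
    slots[prev][1] = pos
    return head


def scan(slots, head):
    """Read the keys by following the pointer chain."""
    out = []
    while head is not None:
        out.append(slots[head][0])
        head = slots[head][1]
    return out
```

Inserting North Town into a file holding Brighton, Downtown, Mianus and Perryridge places the new record physically last while the chain still yields the keys in sorted order.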
Brighton     A-217   750
Downtown     A-101   500
Downtown     A-110   600
Mianus       A-215   700
Perryridge   A-102   400
Perryridge   A-201   900
Perryridge   A-218   700
Redwood      A-222   850
Round Hill   A-301   550

Overflow block:
North Town   A-777   1100

CLUSTERING FILE ORGANIZATION
Many relational database systems store each relation in a separate file, so that they can take full advantage of the file system provided by the operating system. Usually the tuples of a relation are represented as fixed-length records, so relations map directly onto a simple file structure. This simple implementation suits database systems designed for personal computers. In such systems the database is small. Moreover, on some personal computers the overall size of the object code of the database system must be small, and a simple file structure reduces the amount of code needed to implement the system.
This simple approach to implementing a relational database becomes less satisfactory as the database grows. There are performance advantages to be gained from assigning records to blocks carefully and from organizing the blocks themselves carefully. A more complex file structure can therefore pay off, even if we keep the strategy of storing each relation in a separate file.
However, many large-scale database systems do not rely directly on the underlying operating system for file management. Instead, one large operating-system file is allocated to the database system. All relations are stored in this one file, and the database system manages the file itself. To see the advantage of storing several relations in the same file, consider the following SQL query:
   SELECT account_number, customer_name, customer_street, customer_city
   FROM   depositor, customer
   WHERE  depositor.customer_name = customer.customer_name;

This query computes a join of the depositor and customer relations. For each tuple of depositor, the system must find the customer tuples with the same customer_name value. Ideally these records are located with the help of an index. Regardless of how the records are located, they must be transferred from disk into main memory. In the worst case each record resides on a different block, which forces one block read per record required by the query. We now present a file structure designed to execute queries involving the join of depositor and customer efficiently. The depositor tuples for each customer_name are stored near the customer tuple with the same customer_name. This structure mixes tuples of the two relations together, but allows the join to be processed efficiently. When a tuple of the customer relation is read, the entire block containing it is copied from disk into main memory. Since the corresponding depositor tuples are stored on disk near the customer tuple, the block containing the customer tuple also contains the depositor tuples needed to process the query. If a customer has so many accounts that the depositor records do not fit in one block, the remaining records appear on nearby blocks. This file structure, called clustering, allows many of the required records to be read with a single block read, so this particular query is processed more efficiently.
customer_name   customer_street   customer_city
Hays            Main              Brooklyn
Turner          Putnam            Stamford
The customer relation

customer_name   account_number
Hays            A-102
Hays            A-220
Hays            A-503
Turner          A-305
The depositor relation

Hays     Main     Brooklyn
Hays     A-102
Hays     A-220
Hays     A-503
Turner   Putnam   Stamford
Turner   A-305
Clustering file structure

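However, this clustering structure is less advantageous than storing each relation in its own file for some other queries, for example:
   SELECT *
   FROM customer
With clustering, the customer records are interleaved with depositor records, so this query has to read more blocks than it would if customer were stored in a file of its own. Deciding when to use clustering depends on the types of query that the database designer believes will be most frequent. Careful use of clustering can yield significant performance gains in query processing.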

DATA-DICTIONARY STORAGE
A database system needs to maintain data about the relations, such as their schemas. This information is called the data dictionary or system catalog. Among the kinds of information the system must store are:
   Names of the relations
   Names of the attributes of each relation
   Domains and lengths of the attributes
   Names of the views defined on the database, and the definitions of those views
   Integrity constraints
Many systems also keep information about the users of the system:
   Names of authorized users
   Accounting information about users
Statistical and descriptive data about the relations may also be stored:
   Number of tuples in each relation
   Storage method used for each relation (clustered or nonclustered)
Information about each index on each relation must be stored as well:
   Name of the index
   Name of the relation being indexed
   Attributes on which the index is defined
   Type of index formed
All this information in effect constitutes a miniature database. Some database systems store it using special-purpose data structures and code. It is generally preferable to store the data about the database in the database itself. Using the database to store system data simplifies the overall structure of the system and lets the full power of the database be used for fast access to system data.
The exact choice of how to represent system data using relations is made by the system designer. One possible representation is:
   System_catalog_schema = (relation_name, number_of_attributes)
   Attribute_schema = (attribute_name, relation_name, domain_type, position, length)
   User_schema = (user_name, encrypted_password, group)
   Index_schema = (index_name, relation_name, index_type, index_attributes)
   View_schema = (view_name, definition)

INDEXING
Consider looking for a book in a library. Suppose we want a book by a particular author. We first look in the author catalog, where a card tells us where to find the book. The cards in the catalog are kept in alphabetical order, so we can find the card we need quickly without going through all the cards. An index for a file in a database system works in much the same way as a library catalog. However, an index kept as a simple sorted list like the catalog described above would in practice be too large to be handled efficiently. More sophisticated indexing techniques are used instead. There are two basic kinds of index:
   Ordered indices. Based on a sorted ordering of the values.
   Hash indices. Based on a uniform distribution of values across a range of buckets. The bucket to which a value is assigned is determined by a function called a hash function.
For both kinds we will describe several techniques. No one technique is the best; each is best suited to particular database applications. Each technique must be evaluated on the basis of the following factors:
   Access types: the types of access that are supported efficiently. These include finding records with a specified attribute value and finding records whose attribute value falls in a specified range.
   Access time: the time it takes to find a particular data item or set of items.
   Insertion time: the time it takes to insert a new data item. This includes the time to find the correct place to insert the item and the time to update the index structure.
   Deletion time: the time it takes to delete a data item. This includes the time to find the item to be deleted and the time to update the index structure.
   Space overhead: the additional space occupied by an index structure.
A file often has several indices. An attribute or set of attributes used to look up records in a file is called a search key. Note that this definition differs from the definitions of primary key, candidate key, and superkey. If there are several indices on a file, there are several corresponding search keys.

ORDERED INDICES
An ordered index stores the values of the search keys in sorted order and associates with each search key the records that contain it. The records in the indexed file may themselves be stored in some sorted order. A file may have several indices on different search keys. If the file containing the records is sequentially ordered, the index whose search key defines that sequential order is the primary index. Primary indices are also called clustering indices. The search key of a primary index is usually, though not necessarily, the primary key. Indices whose search key specifies an order different from the sequential order of the file are called secondary indices or nonclustering indices.

Primary index

Brighton     A-217   750
Downtown     A-101   500
Downtown     A-110   600
Mianus       A-215   700
Perryridge   A-102   400
Perryridge   A-201   900
Perryridge   A-218   700
Redwood      A-222   850
Round Hill   A-301   550

Sequential file of account records

In this section we assume that all files are ordered sequentially on some search key. Such files, with a primary index on the search key, are called index-sequential files. They represent one of the oldest index schemes used in database systems. They are designed for applications that require both sequential processing of the entire file and random access to individual records.

Dense and Sparse Indices

[Figure: dense index on the sequential account file, with one index entry for each branch name.]
[Figure: sparse index on the same file, with index entries for Brighton, Mianus and Redwood.]

There are two kinds of ordered index:
   Dense index. An index record (index entry) appears for every search-key value in the file. The index record contains the search-key value and a pointer to the first data record with that search-key value.
   Sparse index. An index record is created for only some of the search-key values. As with a dense index, each index record contains a search-key value and a pointer to the first data record with that search-key value. To locate a record, we find the index entry with the largest search-key value that is less than or equal to the search-key value we are looking for. We start at the record pointed to by that index entry and follow the pointers in the (data) file until we find the desired record.
Example: suppose we are looking up the records for the Perryridge branch using a dense index. We first find Perryridge in the index (by binary search), follow the corresponding pointer to the first data record with Branch_name = Perryridge, and process it. We then follow the pointer in that record to the next record in search-key order, process it, and continue in this way until we reach a record whose Branch_name is not Perryridge.
With a sparse index, we first find the index entry with the largest Branch_name that is less than or equal to Perryridge; this is the entry for Mianus. We follow the corresponding pointer to the Mianus data record, then follow the pointers in the file from record to record in search-key order until we reach the first Perryridge data record, and begin processing from that point.
A dense index generally locates records faster than a sparse index, but a sparse index requires less space. A sparse index also imposes less maintenance overhead for insertions and deletions.
The system designer must trade off access time against space overhead. A good compromise is a sparse index with one index entry per block, because the dominant cost in processing a database request is the time to bring a block from disk into main memory. Once a block has been brought in, the time to scan the entire block is negligible. Using this sparse index, we locate the block containing the record we are looking for. Thus, unless the record is on an overflow block, we minimize block accesses while keeping the size of the index as small as possible.
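The dense and sparse lookups described above can be sketched in Python. This is an illustration under stated assumptions: the account file is an in-memory list already sorted on branch name, a pointer is a list position, and the names ACCOUNT_FILE, build_dense, build_sparse and lookup are hypothetical.

```python
import bisect

ACCOUNT_FILE = [
    ("Brighton", "A-217", 750), ("Downtown", "A-101", 500),
    ("Downtown", "A-110", 600), ("Mianus", "A-215", 700),
    ("Perryridge", "A-102", 400), ("Perryridge", "A-201", 900),
    ("Perryridge", "A-218", 700), ("Redwood", "A-222", 850),
    ("Round Hill", "A-301", 550),
]


def build_dense(file):
    """One (key, position of first record) entry per search-key value."""
    index = []
    for pos, rec in enumerate(file):
        if not index or index[-1][0] != rec[0]:
            index.append((rec[0], pos))
    return index


def build_sparse(file, every):
    """Keep only every `every`-th entry of the dense index."""
    return build_dense(file)[::every]


def lookup(index, file, key):
    """Works for both index kinds: find the largest entry <= key, then scan."""
    keys = [k for k, _ in index]
    i = bisect.bisect_right(keys, key) - 1
    if i < 0:
        return []
    pos = index[i][1]
    while pos < len(file) and file[pos][0] < key:   # sparse: skip ahead
        pos += 1
    out = []
    while pos < len(file) and file[pos][0] == key:  # collect matching records
        out.append(file[pos])
        pos += 1
    return out
```

With every=2 the sparse index holds Brighton, Mianus and Redwood, so a lookup of Perryridge starts at the Mianus record and scans forward, as in the example in the text.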

Multilevel indices
An index can be very large, even if it is sparse, and may not fit in main memory. Searching for an index entry then requires several disk block reads. Binary search can be used on the index file to locate an entry, but it still requires about ⌈log2(B)⌉ block reads, where B is the number of disk blocks occupied by the index. If B is large, this access time is significant. Moreover, if overflow blocks are used, binary search is not possible and the search must be sequential, which can require up to B block reads.
To deal with this problem, we treat the index file as a sequential file and build a sparse index on it. To locate an index entry, we binary-search the outer index for the record with the largest search-key value that is less than or equal to the one we want. The corresponding pointer points to a block of the inner index. Within that block we look for the record with the largest search-key value less than or equal to the one we want; the pointer field of that record points to the block containing the record we are looking for. Because the outer index is small, it can be kept in main memory, and a lookup then needs only one index block read. We can repeat this construction as many times as necessary. An index with two or more levels is called a multilevel index. Searching for records with a multilevel index requires significantly fewer block reads than binary search.

[Figure: two-level sparse index. An outer index points to index blocks 0, 1, ... of the inner index.]
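A small calculation makes the comparison concrete. The figures used in the usage note (10,000 index entries, 100 entries per index block) are illustrative assumptions, not values from the text, and the function names are hypothetical.

```python
def binary_search_reads(index_blocks):
    """Block reads for binary search over B index blocks: ceil(log2(B))."""
    return (index_blocks - 1).bit_length()


def index_levels(n_entries, entries_per_block):
    """Levels needed until the outermost index fits in a single block."""
    levels = 1
    blocks = -(-n_entries // entries_per_block)   # integer ceiling division
    while blocks > 1:
        blocks = -(-blocks // entries_per_block)
        levels += 1
    return levels
```

Under these assumptions an index of 10,000 entries occupies 100 blocks: binary search needs binary_search_reads(100) = 7 block reads, whereas index_levels(10_000, 100) = 2 levels suffice, and with the outer level held in memory a lookup reads one index block.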

Index update
Whenever a record is inserted into or deleted from a file, every index on the file must be updated. We describe the update algorithms for single-level indices.
   Deletion. To delete a record, we first look it up. If the deleted record is the first record in the chain of records identified by the pointer of the index entry found during the lookup, there are two cases. If the deleted record was the only record in the chain, we delete the corresponding index entry. Otherwise, we replace the search key in the index entry by the search key of the record that follows the deleted record in the chain, and the pointer by the address of that next record. In all other cases, deleting the record requires no change to the index.
   Insertion. We first perform a lookup using the search-key value of the record to be inserted. If the index is dense and the search-key value does not appear in the index, we insert this value and a pointer to the record into the index. If the index is sparse and stores one entry per block, no change is needed unless a new block is created. In that case, the first search-key value in the new block is inserted into the index.
The insertion and deletion algorithms for multilevel indices are a simple extension of the algorithms just described.

Secondary indices
A secondary index on a candidate key looks just like a dense primary index, except that the records pointed to by successive values in the index are not stored sequentially. In general, secondary indices may be structured differently from primary indices. If the search key of a primary index is not a candidate key, it suffices for the index to point to the first record with a particular search-key value, since the other records with that value can be fetched by a sequential scan of the file.
If the search key of a secondary index is not a candidate key, pointing to only the first record with each search-key value is not enough: the records in the file are no longer ordered sequentially by the secondary index's search key and may be anywhere in the file. A secondary index must therefore contain pointers to all the records. We can use an extra level of indirection to implement secondary indices on search keys that are not candidate keys. The pointers in such a secondary index do not point directly to the records; each points to a bucket that contains pointers to the file.
Index entries: 350, 400, 500, 600, 700, 750, 900

Brighton     A-217   750
Downtown     A-101   500
Downtown     A-110   600
Mianus       A-215   700
Perryridge   A-102   400
Perryridge   A-201   900
Perryridge   A-218   700
Redwood      A-222   700
Round Hill   A-305   350

Secondary index on a search key that is not a candidate key

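A secondary index must be dense, with an index entry for every search-key value and a pointer to every record in the file. Secondary indices improve the performance of queries that use search keys other than the search key of the primary index, but they impose a significant overhead on database modification. The database designer decides which secondary indices are desirable on the basis of an estimate of the relative frequency of queries and modifications.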
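A secondary index with buckets can be sketched as a sorted mapping from each search-key value to a bucket of record positions. The sketch assumes that the third column of the account file is the balance (the text does not name the columns), that a pointer is a list position, and that the names below are hypothetical.

```python
from collections import defaultdict

ACCOUNTS = [
    ("Brighton", "A-217", 750), ("Downtown", "A-101", 500),
    ("Downtown", "A-110", 600), ("Mianus", "A-215", 700),
    ("Perryridge", "A-102", 400), ("Perryridge", "A-201", 900),
    ("Perryridge", "A-218", 700), ("Redwood", "A-222", 700),
    ("Round Hill", "A-305", 350),
]


def build_secondary_index(file, field):
    """Map each value of `field` to a bucket of pointers (positions)."""
    buckets = defaultdict(list)
    for pos, rec in enumerate(file):
        buckets[rec[field]].append(pos)
    return dict(sorted(buckets.items()))   # index entries in sorted order


def fetch(file, index, value):
    """Follow every pointer in the bucket for `value`."""
    return [file[pos] for pos in index.get(value, [])]
```

The bucket for 700 holds three pointers to records scattered through the file, which is why pointing only at the first matching record would not be enough here.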

B+-TREE INDEX FILES

The main disadvantage of the index-sequential file organization is that performance degrades as the file grows. This degradation can be remedied only by reorganizing the file. The B+-tree index structure is the most widely used of the index structures that maintain their efficiency despite insertions and deletions. A B+-tree index is a balanced tree: every path from the root to a leaf has the same length. Each nonleaf node has between ⌈m/2⌉ and m children, where m is a fixed number called the order of the B+-tree. The B+-tree structure imposes performance overhead on insertion and deletion and adds space overhead. This overhead is acceptable even for files with a high modification frequency.

Structure of a B+-tree
A B+-tree index is a multilevel index, but its structure differs from that of a multilevel index-sequential file. A typical B+-tree node contains up to n-1 search-key values K1, K2, ..., Kn-1 and n pointers P1, P2, ..., Pn. The search-key values within a node are kept in sorted order: if i < j, then Ki < Kj.

   P1 | K1 | P2 | K2 | ... | Pn-1 | Kn-1 | Pn

We first consider the structure of the leaf nodes. For i = 1, 2, ..., n-1, pointer Pi points either to a file record with search-key value Ki or to a bucket of pointers, each of which points to a file record with search-key value Ki. The bucket structure is used only if the search key is not a primary key and the file is not sorted in search-key order. Pointer Pn has a special purpose: it chains the leaf nodes together in search-key order, which allows efficient sequential processing of the file. Now consider how search-key values are assigned to a particular leaf. Each leaf can hold up to n-1 values. The ranges of values in the leaves do not overlap: if Li and Lj are leaf nodes and i < j, then every search-key value in Li is less than every search-key value in Lj. If the B+-tree index is dense, every search-key value must appear in some leaf node.
[Figure: B+-tree for the account file, with the keys Brighton, Downtown, Mianus, Perryridge, Redwood and Round Hill in the leaf nodes and leaf pointers into the account file.]

The nonleaf nodes of a B+-tree form a multilevel index on the leaf nodes. The structure of a nonleaf node is the same as that of a leaf node, except that all pointers point to tree nodes. A nonleaf node may hold up to m pointers and must hold at least ⌈m/2⌉ pointers, except for the root, which must hold at least 2 pointers. The number of pointers in a node is called the fanout of the node.
Consider a nonleaf node containing p pointers. For 1 < i < p, pointer Pi points to the subtree containing search-key values less than Ki and greater than or equal to Ki-1. Pointer P1 points to the subtree containing search-key values less than K1, and pointer Pp points to the subtree containing search-key values greater than or equal to Kp-1.

Queries on B+-trees

Suppose we want to find all records with search-key value k. We first examine the root node and look for the smallest search-key value greater than k; suppose it is Ki. We follow pointer Pi to another node. If the node has p pointers and k ≥ Kp-1, we follow pointer Pp. At the node reached, we repeat the process, looking for the smallest search-key value greater than k and following the corresponding pointer, and so on until we reach a leaf node. The corresponding pointer in the leaf directs us to the desired record or bucket.
The number of blocks accessed is at most ⌈log_⌈m/2⌉(K)⌉, where K is the number of search-key values in the B+-tree and m is the order of the tree.
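The bound ⌈log_⌈m/2⌉(K)⌉ can be evaluated with integer arithmetic, which avoids floating-point rounding at exact powers. The function name and the sample values in the usage note (one million keys, order 100) are illustrative assumptions.

```python
def max_nodes_on_path(num_keys, order):
    """Smallest h with ceil(order/2)**h >= num_keys."""
    fanout = -(-order // 2)            # ceil(m/2), the minimum fanout
    if fanout < 2:
        raise ValueError("order must be at least 3")
    height, reach = 0, 1
    while reach < num_keys:
        reach *= fanout                # one more level multiplies the reach
        height += 1
    return height
```

For example, with 1,000,000 search-key values and order 100 the minimum fanout is 50, and since 50**3 = 125,000 < 1,000,000 ≤ 50**4, a lookup visits at most 4 nodes.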

Updates on B+-trees

   Insertion. Using the same technique as for lookup, we find the leaf node in which the search-key value would appear. If the search-key value already appears in the leaf, we add the record to the file and add a pointer to the record to the corresponding bucket. If the search-key value is not yet present in the leaf, we add the record to the file and insert the search-key value into the leaf at the correct position (preserving the ordering), creating a new bucket with the corresponding pointer. If the leaf has no room for the new key value, a new block is requested from the operating system, half of the key values in the leaf are moved to the new node, and the new key value is inserted at its correct position in one of the two blocks. This in turn requires inserting the first key value of the new block, together with a pointer to the new block, into the parent node. Inserting this key-pointer pair into the parent may itself cause the parent to be split in two. The process can propagate all the way to the root. If the root is split, a new root is created whose two children are the two nodes obtained from the old root, and the height of the tree increases by one.

procedure insert(value V, pointer P)
    find the leaf node L that should contain value V
    insert_entry(L, V, P)
end procedure

procedure insert_entry(node L, value V, pointer P)
    if (L has space for (V, P)) then
        insert (V, P) in L
    else begin   /* split L */
        create node L'
        if (L is a leaf) then begin
            let V' be the value such that exactly ⌈m/2⌉ of the values L.K1, L.K2, ..., L.Km-1, V are less than V'
            let n be the lowest index such that L.Kn ≥ V'
            move L.Pn, L.Kn, ..., L.Pm-1, L.Km-1 to L'
            if (V < V') then insert (V, P) in L else insert (V, P) in L'
        end else begin
            let V' be the value such that exactly ⌈m/2⌉ of the values L.K1, L.K2, ..., L.Km-1, V are greater than or equal to V'
            let n be the lowest index such that L.Kn ≥ V'
            add Nil, L.Kn, L.Pn+1, L.Kn+1, ..., L.Pm-1, L.Km-1, L.Pm to L'
            delete L.Kn, L.Pn+1, L.Kn+1, ..., L.Pm-1, L.Km-1, L.Pm from L
            if (V < V') then insert (V, P) in L else insert (V, P) in L'
            delete (Nil, V') from L'
        end
        if (L is not the root) then insert_entry(parent(L), V', L')
        else begin
            create a new node R with children L and L' and the single value V'
            make R the root of the tree
        end
        if (L is a leaf) then begin
            set L'.Pm = L.Pm
            set L.Pm = L'
        end
    end
end procedure
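A compact, runnable Python sketch of B+-tree lookup and insertion follows. It is a simplification rather than a transcription of the pseudocode: nodes live in memory, search keys are assumed unique (no buckets), deletion is omitted, and a node is split roughly in half after it overflows instead of computing V' beforehand. The class and method names are hypothetical.

```python
import bisect


class Node:
    def __init__(self, leaf):
        self.leaf = leaf
        self.keys = []       # sorted search-key values
        self.children = []   # child nodes (nonleaf) or record pointers (leaf)
        self.next = None     # next leaf in key order (the role of Pn)


class BPlusTree:
    def __init__(self, order=4):
        self.order = order   # m: maximum number of pointers per node
        self.root = Node(leaf=True)

    def _find_leaf(self, key):
        node, path = self.root, []
        while not node.leaf:
            path.append(node)
            # Follow Pi, where K(i-1) <= key < Ki.
            node = node.children[bisect.bisect_right(node.keys, key)]
        return node, path

    def search(self, key):
        leaf, _ = self._find_leaf(key)
        i = bisect.bisect_left(leaf.keys, key)
        if i < len(leaf.keys) and leaf.keys[i] == key:
            return leaf.children[i]
        return None

    def insert(self, key, pointer):
        node, path = self._find_leaf(key)
        i = bisect.bisect_left(node.keys, key)
        node.keys.insert(i, key)
        node.children.insert(i, pointer)
        while len(node.keys) > self.order - 1:       # node overflows: split
            mid = len(node.keys) // 2
            right = Node(node.leaf)
            if node.leaf:
                right.keys, node.keys = node.keys[mid:], node.keys[:mid]
                right.children, node.children = node.children[mid:], node.children[:mid]
                right.next, node.next = node.next, right
                up_key = right.keys[0]               # copied up to the parent
            else:
                up_key = node.keys[mid]              # moved up to the parent
                right.keys, node.keys = node.keys[mid + 1:], node.keys[:mid]
                right.children, node.children = node.children[mid + 1:], node.children[:mid + 1]
            if not path:                             # the root was split
                new_root = Node(leaf=False)
                new_root.keys, new_root.children = [up_key], [node, right]
                self.root = new_root
                break
            parent = path.pop()
            j = bisect.bisect_right(parent.keys, up_key)
            parent.keys.insert(j, up_key)
            parent.children.insert(j + 1, right)
            node = parent

    def keys_in_order(self):
        node = self.root
        while not node.leaf:
            node = node.children[0]
        out = []
        while node is not None:                      # walk the leaf chain
            out.extend(node.keys)
            node = node.next
        return out
```

Walking the leaf chain in keys_in_order shows why the last leaf pointer matters: a sequential scan never has to revisit the nonleaf levels.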

   Deletion. Using the lookup technique, we find the record to be deleted and remove it from the file. We remove the search-key value from the leaf node of the B+-tree if there is no bucket associated with that value, or if the bucket becomes empty once the corresponding pointer has been removed from it. Removing a key value from a leaf may leave the leaf empty, in which case the leaf is returned to the system, and its parent may then have fewer children than the permitted minimum. In that case we move a child from a sibling of the parent into the parent, provided the sibling still has at least ⌈m/2⌉ children after giving one up. Otherwise we merge the parent with one of its siblings, which removes an internal node from the tree and then removes an entry from its own parent, and so on; the process can propagate all the way to the root. If the root is left with only one child after a deletion, the root is replaced by that child, the old root is returned to the system, and the height of the tree decreases by one.

procedure delete(value V, pointer P)
    find the leaf node L that contains (V, P)
    delete_entry(L, V, P)
end procedure

procedure delete_entry(node L, value V, pointer P)
    delete (V, P) from L
    if (L is the root and L has only one remaining child) then
        make the child of L the new root of the tree and delete L
    else if (L has too few values/pointers) then begin
        let L' be the left or right sibling of L
        let V' be the value between the pointers to L and L' in parent(L)
        if (the entries of L and L' fit in a single block) then begin   /* merge */
            if (L precedes L') then swap_variables(L, L')
            if (L is not a leaf) then append V' and all pointers and values in L to L'
            else begin append all (K, P) pairs in L to L'; set L'.Pp = L.Pp end
            delete_entry(parent(L), V', L); delete node L
        end
        else begin   /* redistribute: borrow an entry from L' */
            if (L' precedes L) then begin
                if (L is not a leaf) then begin
                    let p be the index such that L'.Pp is the last pointer in L'
                    remove (L'.Kp-1, L'.Pp) from L'
                    insert (L'.Pp, V') as the first entry of L (shifting all other entries of L right)
                    replace V' in parent(L) by L'.Kp-1
                end else begin
                    let p be the index such that (L'.Pp, L'.Kp) is the last pair in L'
                    remove (L'.Pp, L'.Kp) from L'
                    insert (L'.Pp, L'.Kp) as the first entry of L (shifting all other entries of L right)
                    replace V' in parent(L) by L'.Kp
                end
            end
            else < symmetric to the then case >
        end
    end
end procedure

B+-tree file organization

In a B+-tree file organization, the leaf nodes of the tree store records instead of pointers to the file. Because records are usually larger than pointers, the maximum number of records that can be stored in a leaf is smaller than the number of pointers in a nonleaf node. Leaf nodes are still required to be at least half full.
Insertion and deletion in a B+-tree file organization are handled in the same way as in a B+-tree index.
When a B+-tree is used for file organization, space utilization is particularly important, because the space occupied by records is much greater than the space occupied by (key, pointer) pairs. We can improve space utilization in a B+-tree by involving more sibling nodes in redistribution during splits and merges. On insertion, if a leaf is full, we first try to redistribute some of its entries to one of the adjacent nodes to make room for the new entry. If that attempt fails, we split the node and divide the entries among one of the adjacent nodes and the two nodes obtained by the split. On deletion, if a node falls below ⌊2m/3⌋ entries, we try to borrow an entry from one of its two adjacent siblings. If both siblings have exactly ⌊2m/3⌋ records, we redistribute the entries of the node between the two siblings and delete the third node. If k nodes are used in redistribution (k-1 siblings), each node is guaranteed to contain at least ⌊(k-1)m/k⌋ entries. However, the cost of updates becomes higher with this approach.

B-TREE INDEX FILES

B-tree indices are similar to B+-tree indices. The difference is that a B-tree eliminates the redundant storage of search-key values: in a B-tree, each search-key value appears only once. Because search keys that appear in nonleaf nodes appear nowhere else in the B-tree, we must include an additional pointer field for each search key in a nonleaf node. These additional pointers point either to file records or to the buckets for the associated search key.
A generalized B-tree leaf node has the form:

   P1 | K1 | P2 | K2 | ... | Pm-1 | Km-1 | Pm

A nonleaf node has the form:

   P1 | B1 | K1 | P2 | B2 | K2 | ...

The pointers Pi are tree pointers, used as in a B+-tree. The pointers Bi in nonleaf nodes are record pointers or bucket pointers. Clearly, a nonleaf node holds fewer key values than a leaf node. The number of nodes accessed in a B-tree lookup depends on where the search key is located.
Deletion in a B-tree is more complicated than in a B+-tree. Deleting an entry that appears in a nonleaf node requires selecting an appropriate value from the subtree of the node containing the deleted entry. If key Ki is deleted, the smallest search key in the subtree pointed to by Pi+1 must be moved into the position of Ki. If the leaf node is then left with too few entries, further actions are needed.
III.9.4 Index definition in SQL

An index is created with the CREATE INDEX command, whose syntax is:
   CREATE INDEX <index-name> ON <relation_name> (<attribute-list>)

attribute-list is the list of attributes of the relation that form the search key of the index. To declare that the search key is a candidate key, add the keyword UNIQUE:
   CREATE UNIQUE INDEX <index-name> ON <relation_name> (<attribute-list>)
The attributes in attribute-list must form a candidate key; otherwise an error is reported.
An index is removed with the DROP command:
   DROP INDEX <index-name>

HASHING
STATIC HASHING
One disadvantage of sequential file organization is that we must access an index structure to locate data, or must use binary search, and either way this results in more I/O operations. File organizations based on hashing let us avoid accessing an index structure. Hashing also provides a way of constructing indices.

Hash file organization
In a hash file organization, we obtain the address of the disk block containing a desired record directly by computing a function on the search-key value of the record. The term bucket denotes a unit of storage. A bucket is typically a disk block, but it may be chosen to be smaller or larger than a disk block.
Let K denote the set of all search-key values and B the set of all bucket addresses. A hash function h is a function from K to B:
   h: K → B

To insert a record with search-key value K, we compute h(K), which gives the address of the bucket for that record. If there is space in the bucket, the record is stored there.
To look up a record with search-key value K, we first compute h(K) to find the corresponding bucket, and then search that bucket for the record with the desired key value.
To delete a record with search-key value K, we compute h(K), search the corresponding bucket for the record, and delete it from the bucket.

Hash functions
The worst possible hash function maps all search-key values to the same bucket. An ideal hash function distributes the key values uniformly across the buckets, so that every bucket holds the same number of records. We want to choose a hash function that satisfies the following criteria:
   o Uniform distribution: each bucket is assigned the same number of search-key values from the set of all possible search-key values.
   o Random distribution: in the average case, each bucket is assigned nearly the same number of search-key values.
Hash functions must be designed carefully. A bad hash function can result in lookups taking time proportional to the number of search keys in the file.

Handling bucket overflow

When a record is inserted, it is stored in the corresponding bucket if the bucket has room; otherwise a bucket overflow occurs. Bucket overflow can occur for the following reasons:
   Insufficient buckets. The number of buckets nB must satisfy nB > nr / fr, where nr is the total number of records to be stored and fr is the number of records that fit in a bucket.
   Skew. Some buckets are assigned more records than others, so a bucket may overflow while other buckets still have space. This situation is called bucket skew. Skew can occur for two reasons:
   1. Multiple records have the same search key.
   2. The chosen hash function distributes the key values nonuniformly.
We handle bucket overflow with overflow buckets. If a record must be inserted into bucket B and B is full, an overflow bucket is allocated for B and the record is inserted into it. If the overflow bucket is also full, another overflow bucket is allocated, and so on. All the overflow buckets of a given bucket are chained together in a linked list. Overflow handling using such a linked list is called overflow chaining. With overflow chaining, the lookup algorithm changes only slightly: we compute the hash function on the search key as before to obtain bucket B, and then examine the records in bucket B and in all its overflow buckets for a key value that matches.
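Insert, lookup and delete with overflow chaining can be sketched as below. The sketch assumes in-memory buckets of a fixed capacity and a caller-supplied hash function; the class names HashBucket and HashFile are hypothetical.

```python
class HashBucket:
    def __init__(self, capacity):
        self.capacity = capacity
        self.records = []      # (key, record) pairs stored in this bucket
        self.overflow = None   # next bucket in the overflow chain


class HashFile:
    def __init__(self, n_buckets, capacity, h):
        self.h = h             # hash function: key -> bucket address
        self.capacity = capacity
        self.buckets = [HashBucket(capacity) for _ in range(n_buckets)]

    def insert(self, key, record):
        b = self.buckets[self.h(key)]
        while len(b.records) >= b.capacity:          # bucket is full
            if b.overflow is None:
                b.overflow = HashBucket(self.capacity)
            b = b.overflow
        b.records.append((key, record))

    def lookup(self, key):
        b, found = self.buckets[self.h(key)], []
        while b is not None:                         # bucket plus its chain
            found += [r for k, r in b.records if k == key]
            b = b.overflow
        return found

    def delete(self, key):
        b = self.buckets[self.h(key)]
        while b is not None:
            b.records = [(k, r) for k, r in b.records if k != key]
            b = b.overflow
```

Lookup must compare keys inside the bucket because several distinct keys can hash to the same bucket address.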
An alternative way to handle bucket overflow is as follows. When a record must be inserted into a bucket that is full, instead of allocating an overflow bucket we apply the next hash function in a chosen sequence of hash functions to find another bucket for the record; if that bucket is also full, we apply the next hash function, and so on. A commonly used sequence is hi(K) = (hi-1(K) + 1) mod nB for 1 ≤ i ≤ nB-1, where h0 is the base hash function.
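The sequence hi(K) = (hi-1(K) + 1) mod nB can be sketched as a generator that yields the buckets to try in order. The table is assumed to be a Python list of fixed-capacity lists, and the function names are hypothetical.

```python
def probe_sequence(h0, key, n_buckets):
    """Yield h0(K), h1(K), ..., with hi(K) = (h(i-1)(K) + 1) mod nB."""
    b = h0(key) % n_buckets
    for _ in range(n_buckets):
        yield b
        b = (b + 1) % n_buckets


def probing_insert(table, capacity, h0, key, record):
    """Store the record in the first bucket of the sequence that has room."""
    for b in probe_sequence(h0, key, len(table)):
        if len(table[b]) < capacity:
            table[b].append((key, record))
            return b
    raise RuntimeError("all buckets are full")
```

Because the set of buckets is fixed, an insert fails once every bucket in the sequence is full, which is one reason a growing file eventually needs reorganization under this scheme.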
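Terminology varies between texts. The structure with overflow chains is called closed hashing in some database textbooks and open hashing (separate chaining) elsewhere; the structure that probes a sequence of hash functions over a fixed set of buckets is correspondingly called open hashing or closed hashing (open addressing). Database systems usually prefer the overflow-chaining form, because deletion under the probing form is troublesome.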

Hash indices
A hash index organizes the search keys, together with their associated pointers, into a hash file structure as follows: we apply a hash function to a search key to identify a bucket, and store the key and its associated pointers in that bucket (or in overflow buckets). Hash indices are usually secondary indices.
For example, a hash function on the account number can be computed as:
   h(Account_number) = (sum of the digits in the account number) mod 7
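The digit-sum hash function is easy to check in Python. The function and variable names are hypothetical, and the account numbers in the usage note are the ones that appear in the account file of the accompanying figure.

```python
def h(account_number, n_buckets=7):
    """(sum of the digits in the account number) mod n_buckets."""
    return sum(int(ch) for ch in account_number if ch.isdigit()) % n_buckets


def build_hash_index(account_numbers, n_buckets=7):
    """Group account numbers by the bucket their hash value selects."""
    buckets = {b: [] for b in range(n_buckets)}
    for acct in account_numbers:
        buckets[h(acct, n_buckets)].append(acct)
    return buckets
```

For instance, A-215 hashes to (2 + 1 + 5) mod 7 = 1 and A-217 to (2 + 1 + 7) mod 7 = 3, while no account number in the file hashes to bucket 0.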

DYNAMIC HASHING

In static hashing, the set of bucket addresses must be fixed. Databases, however, grow over time. If we use static hashing for such a database, we have three classes of options:
1. Choose a hash function based on the current file size. This option results in performance degradation as the database grows.
2. Choose a hash function based on the anticipated size of the file at some point in the future. Although performance degradation is avoided, a significant amount of space may be wasted initially.
3. Periodically reorganize the hash structure in response to file growth. Such a reorganization involves choosing a new hash function, recomputing the hash function on every record in the file, and generating new bucket assignments. Reorganization is a time-consuming operation, and access to the file must be forbidden while it is in progress.

bucket 0: (empty)
bucket 1: A-215, A-305
bucket 2: A-101, A-110
bucket 3: A-217, A-102
bucket 4: A-218
bucket 5: A-203
bucket 6: A-222

Brighton     A-217   750
Downtown     A-101   500
Downtown     A-110   600
Mianus       A-215   700
Perryridge   A-102   400
Perryridge   A-203   900
Perryridge   A-218   700
Redwood      A-222   850
Round Hill   A-305   550

Hash index on search key account-number of the account file


Dynamic hashing techniques allow the hash function to be modified to accommodate the growth or shrinkage of the database. One form of dynamic hashing, called extendable hashing, works as follows. We choose a hash function h with the desirable properties of uniformity and randomness and with a relatively large range, namely b-bit integers (b is typically 32). Initially we do not use all b bits of the hash value. At any point we use only i bits, where 0 ≤ i ≤ b. These i bits are used as an offset into an additional table of bucket addresses. The value of i grows and shrinks with the size of the database.
The value i shown above the bucket address table indicates that i bits of the hash value h(K) are required to determine the correct bucket for K; this number changes as the file grows or shrinks. Although i bits are required to find the correct entry in the bucket address table, several consecutive table entries may point to the same bucket. All such entries share a common hash prefix, but the length of this prefix may be less than i. We therefore associate with each bucket an integer giving the length of the common hash prefix; the integer associated with bucket j is denoted ij. The number of bucket-address-table entries that point to bucket j is 2^(i - ij).
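[Figure: a bucket address table indexed by an i-bit hash prefix (entries ..00, ..01, ..10, ..11, ...) whose entries point to bucket 1, bucket 2, bucket 3, ..., each bucket labelled with its own prefix length i1, i2, i3.]

General extendable hash structure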


nh v bucket cha gi tr kho tm kim K , ta ly i bit cao u tin ca h(K), tm
trong u vo bng tng ng vi chui bit ny v ln theo con tr trong u vo bng ny.
xen mt mu tin vi gi tr kho tm kim K, tin hnh th tc dnh v trn, ta c bucket, gi s
l bucket j. Nu cn cho cho mu tin, xen mu tin vo trong bucket . Nu khng, ta phi tch
bucket ra v phn phi li cc mu tin hin c cng mu tin mi. tch bucket, u tin ta xc
nh t gi tr bm c cn tng s bit ln hay khng.

If i = i_j, only one entry in the bucket address table points to bucket j. We must enlarge the bucket address table so that it can hold pointers to the two buckets produced by splitting bucket j, which we do by considering one more bit of the hash value. We increase i by one, which doubles the size of the bucket address table. Each entry is replaced by two entries, both containing the same pointer as the original entry, so two entries in the bucket address table now point to bucket j. We allocate a new bucket (bucket z) and set the second entry to point to it. We set i_j and i_z to i. Each record in bucket j is then rehashed and, depending on its first i bits, either stays in bucket j or is placed in the newly created bucket.

If i > i_j, more than one entry in the bucket address table points to bucket j, so we can split bucket j without enlarging the bucket address table. We allocate a new bucket (bucket z) and set i_j and i_z to the original value of i_j plus 1. Next we adjust the bucket address table entries that previously pointed to bucket j: the first half of these entries is left unchanged, and all the remaining entries are set to point to the new bucket z. Each record in bucket j is then rehashed and placed in either bucket j or bucket z.
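
The lookup and insertion procedure, including both split cases, can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the class and method names are hypothetical, CRC32 is an assumed stand-in for the hash function h, and records are kept in an in-memory dictionary instead of disk blocks.

```python
import zlib

B = 32  # b in the text: number of bits in a hash value


def h(key):
    """Assumed hash function: CRC32 gives a deterministic 32-bit value."""
    return zlib.crc32(str(key).encode())


class Bucket:
    def __init__(self, prefix_len):
        self.prefix_len = prefix_len  # i_j: length of the common hash prefix
        self.records = {}             # search key -> record


class ExtendableHash:
    def __init__(self, bucket_capacity=2):
        self.capacity = bucket_capacity
        self.i = 0                    # number of hash bits currently in use
        self.table = [Bucket(0)]      # bucket address table with 2**i entries

    def _entry(self, key):
        # The i high-order bits of h(K) are the offset into the table.
        return h(key) >> (B - self.i)

    def lookup(self, key):
        return self.table[self._entry(key)].records.get(key)

    def insert(self, key, record):
        while True:
            bucket = self.table[self._entry(key)]
            if key in bucket.records or len(bucket.records) < self.capacity:
                bucket.records[key] = record
                return
            if bucket.prefix_len == B:
                raise OverflowError("all b bits used; an overflow bucket is needed")
            self._split(bucket)

    def _split(self, bucket):
        if bucket.prefix_len == self.i:
            # Case i = i_j: double the table, each entry becomes two copies.
            self.table = [b for b in self.table for _ in range(2)]
            self.i += 1
        # Case i > i_j (always true here): split without growing the table.
        bucket.prefix_len += 1
        new_bucket = Bucket(bucket.prefix_len)
        bit = self.i - bucket.prefix_len  # position of the newly used prefix bit
        for idx, b in enumerate(self.table):
            if b is bucket and (idx >> bit) & 1:
                self.table[idx] = new_bucket  # second half of the entries
        old_records, bucket.records = bucket.records, {}
        for k, r in old_records.items():      # rehash into bucket j or bucket z
            self.table[self._entry(k)].records[k] = r
```

Because the table is indexed by high-order bits, the entries that point to one bucket always form a contiguous run, so "the second half of the entries" is exactly the set of indices whose newly used prefix bit is 1.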

To delete a record with search-key value K, we first follow the lookup procedure and arrive at the corresponding bucket, say bucket j. We remove both the search key from the bucket and the record from the file. The bucket is removed too if it becomes empty. Note that at this point some buckets can be coalesced, and the size of the bucket address table may then be cut in half.
The main advantage of extendable hashing is that performance does not degrade as the file grows. Moreover, space overhead is minimal, even though extra space is needed for the bucket address table. One drawback of extendable hashing is that a lookup involves an additional level of indirection: we must access the bucket address table before accessing the bucket itself. Even so, extendable hashing is a very attractive technique.

INDEXING OR HASHING?
We have looked at several schemes: ordered indices and hashing. A file of records can be organized as an index-sequential file, as a B+-tree, by hashing, and so on. Each scheme has advantages in certain situations. A database system implementor could provide many schemes and leave the choice of which one to use to the database designer. To make a sound choice, the implementor or the database designer must consider the following factors:

- Is the cost of periodically reorganizing the index or hash structure acceptable?
- What is the relative frequency of insertions and deletions?
- Is it desirable to optimize average access time at the expense of a longer worst-case access time?
- What types of queries are users likely to pose?

STORAGE STRUCTURES FOR OBJECT-ORIENTED DATABASES

MAPPING OBJECTS TO FILES
The data part of an object can be stored using the file structures described earlier, with some changes because objects vary in size and may be very large. Set-valued fields with few elements can be implemented as linked lists; set-valued fields with many elements can be implemented as B-trees or as separate relations in the database. Set-valued fields can also be eliminated at the storage level by normalization. Extremely large objects, which are hard to break into smaller components, can each be stored in a separate file.

IMPLEMENTATION OF OBJECT IDENTIFIERS


Because an object is identified by its object identifier (OID), an object storage system needs a mechanism for locating an object given its OID. If OIDs are logical, that is, they do not specify the object's location, the storage system must maintain an index that maps each OID to the object's current location. If OIDs are physical, that is, they encode the object's location, the object can be found directly. A physical OID typically has the following three fields:
1. A volume or file identifier
2. A page identifier within the volume or file
3. An offset within the page
In addition, a physical OID may contain a unique identifier, an integer that distinguishes this OID from the identifiers of other objects that were previously stored at the same location and have since been deleted or moved. The unique identifier is also stored with the object, and the identifiers in an OID and in the corresponding object should match. If the unique identifier in a physical OID does not match the unique identifier in the object the OID points to, the system detects that the pointer is dangling and reports an error. Such a pointer error occurs when a physical OID refers to an old object that was deleted by accident. If the space occupied by the old object has been reallocated, a new object may now sit at that location and be addressed incorrectly through the old object's identifier. If it goes undetected, use of the dangling pointer could corrupt the new object stored at the same location. The unique identifier helps detect such errors.
Suppose an object has to move to a new page because it has grown and the old page has no spare space. The physical OID then points to the old page, which no longer contains the object. Instead of changing the object's OID (which would require changing every object that points to it), we leave a forwarding address at the old location. When the database looks up the object, it finds the forwarding address in place of the object and uses it to locate the object.
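
A small Python sketch may help make the unique-identifier check and the forwarding address concrete. All names here (PhysicalOID, ObjectStore, DanglingPointerError and so on) are hypothetical, and a dictionary keyed by (volume, page, offset) is an assumed stand-in for real disk pages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalOID:
    volume: int      # volume or file identifier
    page: int        # page identifier within the volume or file
    offset: int      # offset within the page
    unique_id: int   # tells this object apart from earlier occupants of the slot

    @property
    def location(self):
        return (self.volume, self.page, self.offset)


@dataclass
class Stored:
    unique_id: int
    data: object


@dataclass
class Forward:
    unique_id: int
    target: PhysicalOID  # forwarding address left at the old location


class DanglingPointerError(Exception):
    pass


class ObjectStore:
    def __init__(self):
        self.slots = {}  # location -> Stored or Forward

    def put(self, oid, data):
        self.slots[oid.location] = Stored(oid.unique_id, data)

    def delete(self, oid):
        self.slots.pop(oid.location, None)

    def move(self, oid, new_location):
        # Relocate the object but keep its OID valid via a forwarding address.
        data = self.deref(oid)
        new_oid = PhysicalOID(*new_location, oid.unique_id)
        self.put(new_oid, data)
        self.slots[oid.location] = Forward(oid.unique_id, new_oid)

    def deref(self, oid):
        slot = self.slots.get(oid.location)
        # A missing slot or a mismatched unique identifier means a dangling pointer.
        if slot is None or slot.unique_id != oid.unique_id:
            raise DanglingPointerError(oid)
        if isinstance(slot, Forward):
            return self.deref(slot.target)
        return slot.data
```

In this sketch an OID that refers to a deleted object whose slot has been reused raises DanglingPointerError instead of silently returning the new occupant.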

MANAGEMENT OF PERSISTENT POINTERS


Persistent pointers in a persistent programming language are implemented with OIDs. A persistent pointer may be a physical or a logical OID. An important difference between persistent pointers and in-memory pointers is their size. An in-memory pointer only needs to be large enough to address all of virtual memory; on a 32-bit machine it is typically 4 bytes. A persistent pointer must address all the data in a database, so it is at least 8 bytes long.

Pointer Swizzling
The action of looking up an object given its identifier is called dereferencing. Given an in-memory pointer, finding the object is simply a memory reference. Given a persistent pointer, dereferencing has an extra step: the current location of the object in memory must be found by looking up the persistent pointer in a table. If the object is not yet in memory, it must be loaded from disk. The table lookup can be implemented quite efficiently with hashing, but it is still slow compared with following an in-memory pointer.
Pointer swizzling is a way to cut the cost of locating persistent objects that are already present in memory. The idea is that when a persistent pointer is first dereferenced, the object is located and brought into memory if it is not already there. An extra step is then carried out: an in-memory pointer to the object is stored in place of the persistent pointer. The next time the same persistent pointer is dereferenced, the in-memory location can be read out directly. If persistent objects may have to be moved out to disk to make room for other persistent objects, an extra step is also needed to make sure that the object is still in memory. When an object is written out, any persistent pointers it contains that were swizzled must be unswizzled, that is, converted back to their persistent representation. Pointer swizzling on pointer dereference, as described here, is called software swizzling. Buffer management is more complicated when pointer swizzling is used.
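
As a rough illustration of software swizzling, the Python sketch below counts how often the lookup table is consulted. The names PersistentRef and ObjectCache are hypothetical, a dictionary stands in for the disk, and a Python object reference plays the role of the in-memory pointer.

```python
class PersistentRef:
    """A pointer field holding an OID and, once swizzled, an in-memory reference."""

    def __init__(self, oid):
        self.oid = oid       # persistent representation
        self.target = None   # in-memory pointer, set when the field is swizzled


class ObjectCache:
    def __init__(self, disk):
        self.disk = disk         # assumed stand-in for disk: OID -> object
        self.resident = {}       # lookup table: OID -> object already in memory
        self.table_lookups = 0
        self.disk_reads = 0

    def deref(self, ref):
        if ref.target is not None:
            return ref.target            # swizzled: a plain memory reference
        self.table_lookups += 1          # extra step for a persistent pointer
        obj = self.resident.get(ref.oid)
        if obj is None:
            self.disk_reads += 1         # object not in memory: load it
            obj = self.disk[ref.oid]
            self.resident[ref.oid] = obj
        ref.target = obj                 # swizzle: store the in-memory pointer
        return obj

    def unswizzle(self, ref):
        # Before the containing object is written out, restore the persistent form.
        ref.target = None
```

Dereferencing the same PersistentRef repeatedly touches the lookup table only once; after unswizzle the next dereference pays for the table lookup again.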

Hardware swizzling
Having two kinds of pointers, persistent pointers and transient (in-memory) pointers, is rather inconvenient. Programmers have to remember the type of each pointer and may have to write code twice, once for persistent pointers and once for transient pointers. It would be more convenient if both kinds of pointer had the same type. A simple way to merge the two is to extend the length of in-memory pointers to the size of persistent pointers and to use one bit of the identifier to distinguish them. Doing so increases the storage cost of transient pointers. We now describe a technique called hardware swizzling, which uses the memory-management hardware to address this problem. Hardware swizzling has two advantages over software swizzling. First, it allows persistent pointers to be stored in objects in the same amount of space that in-memory pointers require. Second, it converts between persistent pointers and transient pointers transparently, cleverly and efficiently. Software written to deal with in-memory pointers can deal with persistent pointers without any change.
Hardware swizzling uses the following representation for persistent pointers contained in objects on disk. A persistent pointer is split into two parts: a page identifier and an offset within the page. The page identifier is in fact a small indirect pointer, called the short page identifier: each page has a translation table that maps short page identifiers to full database page identifiers. The system must look up the short page identifier of a persistent pointer in the translation table to find the full page identifier. In the worst case, the translation table is only as large as the maximum number of pointers that can be contained in the objects in a page. With a page size of 4096 bytes and a pointer size of 4 bytes, the maximum number of pointers is 1024. In practice the number is much smaller than this. The short page identifier needs only enough bits to identify a row in the table; if there are at most 1024 rows, 10 bits suffice. The translation table thus allows an entire persistent pointer to fit into the same space as an in-memory pointer.
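
The arithmetic above can be checked with a few lines of Python. The pack and unpack helpers are hypothetical and only illustrate that a 10-bit short page identifier plus a 12-bit offset (an assumed layout, derived from the 4096-byte page) fits comfortably in a 4-byte pointer.

```python
PAGE_SIZE = 4096     # bytes per page (from the text)
POINTER_SIZE = 4     # bytes per pointer (from the text)

# Worst case: the page holds nothing but pointers, one table row per pointer.
max_pointers = PAGE_SIZE // POINTER_SIZE

# Bits needed to name one translation table row and one byte offset in the page.
short_id_bits = (max_pointers - 1).bit_length()
offset_bits = (PAGE_SIZE - 1).bit_length()


def pack(short_page_id, offset):
    """Pack a (short page identifier, offset) pair into one integer (assumed layout)."""
    assert 0 <= short_page_id < max_pointers and 0 <= offset < PAGE_SIZE
    return (short_page_id << offset_bits) | offset


def unpack(pointer):
    """Inverse of pack: recover the short page identifier and the offset."""
    return pointer >> offset_bits, pointer & (PAGE_SIZE - 1)
```

With these values the packed pointer uses 22 bits, leaving room to spare in the 32 bits of an in-memory pointer.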

Figure 1. Page image before swizzling

            PageID   Off.
Object 1    2395     255
Object 2    4867     020
Object 3    2395     170

Translation Table
PageID   FullPageID
2395     679.34.28000
4867     519.56.84000

Figure 2. Page image after swizzling

            PageID   Off.
Object 1    5001     255
Object 2    4867     020
Object 3    5001     170

Translation Table
PageID   FullPageID
5001     679.34.28000
4867     519.56.84000

Figure 1 shows the persistent-pointer representation: the page holds three objects, each containing one persistent pointer. The translation table gives the mapping from short page identifier to full database page identifier for every short page identifier used in these persistent pointers. Database page identifiers are shown in the form volume.file.offset. Extra information is kept with each page so that all persistent pointers in the page can be found; it is updated whenever an object is created in or deleted from the page.
When an in-memory pointer is dereferenced, if the operating system detects that the page of virtual address space pointed to has no storage allocated for it or is access-protected, a segmentation violation is said to occur. Many operating systems provide a mechanism for specifying a function to be called when a segmentation violation occurs, a mechanism for allocating storage for pages in the virtual address space, and a way to set page access permissions.
Consider first an in-memory pointer to a page v that is dereferenced when no storage has yet been allocated for that page. A segmentation violation occurs and results in a call to a function in the database system. The database system first determines which database page was assigned to virtual-memory page v; let P be the full page identifier of that database page. If no database page was assigned to v, an error is reported. Otherwise, the database system allocates storage for page v and loads database page P into v.
Pointer swizzling is now done for page P as follows. The system finds all persistent pointers contained in objects in the page, using the extra information stored in the page. Consider one such pointer and call it (p_i, o_i), where p_i is the short page identifier and o_i is the offset within the page. Let P_i be the full page identifier for p_i found in the translation table of page P. If page P_i does not yet have a virtual-memory page assigned to it, a free page of virtual address space is now assigned to it. Page P_i will reside at this virtual address if and when it is brought in. At this point the page in virtual address space has no storage allocated for it, either in memory or on disk; it is merely a range of addresses reserved for the database page. Now let v_i be the virtual-memory page assigned to P_i. We update the pointer (p_i, o_i) by replacing p_i with v_i. Finally, after all persistent pointers in P have been swizzled, the dereference that caused the segmentation violation is allowed to continue, and it finds the object it was looking for in memory.
Figure 2 shows the state of the page from Figure 1 after it has been brought into memory and the pointers in it have been swizzled. Here we assume that the page whose database page identifier is 679.34.28000 has been mapped to page 5001 in memory, while the page whose identifier is 519.56.84000 has been mapped to page 4867. All the pointers in the objects have been updated to reflect the new correspondence and can now be used as in-memory pointers. The page identifiers in the translation table are updated in the same way, so the entry for 679.34.28000 is now listed under 5001. At the end of the translation phase for a page, the objects in the page satisfy an important property: all persistent pointers contained in objects in the page have been converted to in-memory pointers.
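
The translation step for one page can be sketched in Python using the values from Figures 1 and 2. The function swizzle_page and its parameters are hypothetical; real hardware swizzling runs inside a segmentation-violation handler and rewrites the page image in place, which this sketch does not attempt to model.

```python
def swizzle_page(pointers, translation, vpage_of, allocate_vpage):
    """Rewrite the (short page id, offset) pointers of one page as in-memory pointers.

    pointers:       list of (p_i, o_i) pairs found in the page's objects
    translation:    the page's translation table, p_i -> full page identifier P_i
    vpage_of:       P_i -> virtual-memory page v_i, shared by all pages
    allocate_vpage: returns a free virtual page; only the address range is
                    reserved, no storage is allocated yet
    """
    short_to_v = {}
    for short_id, full_id in translation.items():
        if full_id not in vpage_of:
            vpage_of[full_id] = allocate_vpage()
        short_to_v[short_id] = vpage_of[full_id]
    # Replace p_i by v_i in every pointer and in the translation table.
    new_pointers = [(short_to_v[p], o) for p, o in pointers]
    new_translation = {short_to_v[p]: full for p, full in translation.items()}
    return new_pointers, new_translation


# Values from Figure 1; the virtual pages 5001 and 4867 are the ones assumed in the text.
pointers = [(2395, 255), (4867, 20), (2395, 170)]
translation = {2395: "679.34.28000", 4867: "519.56.84000"}
free_pages = iter([5001, 4867])
vpage_of = {}
swizzled, new_table = swizzle_page(pointers, translation, vpage_of, lambda: next(free_pages))
```

The new values are computed from the original translation table before anything is overwritten, which avoids confusion when a short page identifier happens to equal a virtual page number, as 4867 does in the figures.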

QUESTIONS AND EXERCISES FOR CHAPTER III
III.1 Consider the following arrangement of data blocks and parity blocks on four disks:

Disk 1   Disk 2   Disk 3   Disk 4
B1       B2       B3       B4
P1       B5       B6       B7
B8       P2       B9       B10
...      ...      ...      ...

Here the B_i are data blocks and the P_i are parity blocks. Block P_i is the parity block for data blocks $B_{4i-3}, B_{4i-2}, B_{4i-1}, B_{4i}$. What problems does this arrangement present?
III.2 A power failure that occurs while a block is being written may leave the block only partially written. Assume that partially written blocks can be detected. An atomic block write is one in which either the entire block is written or nothing is written (there are no partially written blocks). Suggest schemes for achieving efficient atomic block writes on the following RAID schemes:
1. Level 1 (mirroring)
2. Level 5 (block interleaved, distributed parity)
III.3 RAID systems typically allow a failed disk to be replaced without stopping access to the system. The data on the failed disk must therefore be rebuilt and written to the replacement disk while the system keeps running. With which RAID level is the interference between the rebuild and ongoing disk accesses the least? Explain.
III.4 Consider the deletion of record 5 from the file:

0   Perryridge   A-102   400
    Round Hill   A-305   350
    Perryridge   A-218   700
    Downtown     A-101   500
    Redwood      A-222   700
    Perryridge   A-201   900
    Brighton     A-217   750
    Downtown     A-110   600

Compare the relative merits and drawbacks of the following deletion techniques:
1. Move record 6 to the space occupied by record 5, then move record 7 to the space occupied by record 6.
2. Move record 7 to the space occupied by record 5.
3. Mark record 5 as deleted.
III.5 Show the structure of the file:

header
Perryridge   A-102   400
Mianus       A-215   700
Downtown     A-101   500
Perryridge   A-201   900
Downtown     A-110   600
Perryridge   A-218   700
4   5   6

after each of the following steps:
1. Insert(Brighton, A-323, 1600)
2. Delete record 2
3. Insert(Brighton, A-636, 2500)
III.6 Show the structure of the file:

[File figure; the record layout did not survive extraction. Surviving entries in source order: 0, Perryridge, A-102, 400, A-201, Round Hill, A-301, 350, Mianus, A-101, 800, Downtown, A-211, 500, Redwood, A-300, Brighton, A-111, 900, A210, A-222, 600, 650, A-200, 1200, A-255, 750, 700, 950]

after each of the following steps:
1. Insert(Mianus, A-101, 2800)
2. Insert(Brighton, A-323, 1600)
3. Delete(Perryridge, A-102, 400)
III.7 What happens if the record (Perryridge, A-999, 5000) is inserted into the file in III.6?

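III.8 Show the structure of the file below after each of the following steps:
1. Insert(Mianus, A-101, 2800)
2. Insert(Brighton, A-323, 1600)
3. Delete(Perryridge, A-102, 400)

[File figure; the record layout and pointers did not survive extraction. Surviving entries in source order: 0, Perryridge, A-102, 400, A-201, 900, A-210, 700, Round Hill, A-301, 350, Mianus, A-101, 800, Downtown, A-211, 500, Redwood, A-300, 650, Brighton, A-111, 750, A-222, 600, A-200, 1200, 10, A-255, 950. The figure legend defines a symbol, also lost, that denotes the nil pointer.]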

III.9 Give an example in which the reserved-space method of representing variable-length records is preferable to the pointer method.
III.10 Give an example in which the pointer method of representing variable-length records is preferable to the reserved-space method.
III.11 If a block becomes empty as a result of deletions, for what purposes can the block be reused?
III.12 In the sequential file organization, why is an overflow block used even if, at the moment, there is only one overflow record?
III.13 List the advantages and disadvantages of each of the following strategies for storing a relational database:
1. Store each relation in one file
2. Store multiple relations in one file
III.14 Give an example of a relational-algebra expression and a query-processing strategy in which:
1. MRU is preferable to LRU
2. LRU is preferable to MRU
III.15 When is it preferable to use a dense index rather than a sparse index? Explain.
III.16 State the differences between a primary index and a secondary index.
III.17 Is it possible to have two primary indices on the same relation for two different search keys? Explain.
III.18 Construct a B+-tree for the following set of key values: (2, 3, 5, 7, 11, 15, 19, 25, 29, 33, 37, 41, 47). Assume that the tree is initially empty and that the values are inserted in ascending order. Consider the following cases:
1. Each node holds at most 4 pointers
2. Each node holds at most 6 pointers
3. Each node holds at most 8 pointers
III.19 For each B+-tree of exercise III.18, show the steps involved in the following queries:
1. Find the record with search-key value 11
2. Find the records with search-key values in the range [7..19]
III.20 For each B+-tree of exercise III.18, show the tree after each of the following operations in sequence:
1. Insert 9
2. Insert 11
3. Insert 11
4. Delete 25
5. Delete 19
III.21 Repeat exercise III.18 for a B-tree.
III.22 State and explain the difference between closed hashing and open hashing. List the advantages and disadvantages of each technique.
III.23 What causes bucket overflow in a hash file organization? What can be done to reduce the occurrence of such overflows?
III.24 Suppose that we are using extendable hashing on a file containing records with the following search-key values:
2, 3, 5, 7, 11, 17, 19, 23, 37, 31, 35, 41, 49, 55
Show the extendable hash structure for this file if the hash function is h(x) = x mod 8 and each bucket can hold at most three records.
III.25 Show the extendable hash structure of exercise III.24 after each of the following steps:
1. Delete 11
2. Delete 55
3. Insert 1
4. Insert 15
