
ASSOCIATION RULE MINING
INTRODUCTION
Consider a database that surveys the appliances used by households:

  Household  Appliances owned
  1          TV, Computer
  2          Refrigerator, Air conditioner
  3          TV, Washing machine, Air conditioner
  4          TV, Refrigerator, Air conditioner
  5          TV, Washing machine, Computer
  6          TV, Refrigerator, Washing machine
  7          TV, Refrigerator, Computer
  8          TV, Refrigerator, Washing machine, Air conditioner, Computer

ASSOCIATION RULES
An association rule is an implication of the form:
  TV → Computer [50%, 57%]
or
  uses:TV → uses:Computer [50%, 57%]
Meaning: 57% of the households that use a TV also use a Computer (the confidence), and TV and Computer appear together in 50% of the rows of the database (the support).

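The two percentages can be checked directly against the household table. The sketch below is only an illustration: the English item names and the helper names `count` and `rule_stats` are assumptions made for this example, not part of the slides.

```python
# Household survey from the introduction (item names translated to English).
households = {
    1: {"TV", "Computer"},
    2: {"Refrigerator", "Air conditioner"},
    3: {"TV", "Washing machine", "Air conditioner"},
    4: {"TV", "Refrigerator", "Air conditioner"},
    5: {"TV", "Washing machine", "Computer"},
    6: {"TV", "Refrigerator", "Washing machine"},
    7: {"TV", "Refrigerator", "Computer"},
    8: {"TV", "Refrigerator", "Washing machine", "Air conditioner", "Computer"},
}

def count(itemset, db):
    """Number of rows that contain every item of `itemset`."""
    return sum(1 for items in db.values() if itemset <= items)

def rule_stats(lhs, rhs, db):
    """Return (support, confidence) of the rule lhs -> rhs as fractions."""
    both = count(lhs | rhs, db)
    return both / len(db), both / count(lhs, db)

support, confidence = rule_stats({"TV"}, {"Computer"}, households)
print(f"TV -> Computer [{support:.0%}, {confidence:.0%}]")
# TV -> Computer [50%, 57%]
```

Four of the eight households own both items (support 4/8), and seven own a TV (confidence 4/7).
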
ASSOCIATION RULE MINING
Association rule mining is divided into two phases:
1. Mining frequent itemsets (FIs, Frequent Itemsets).
2. Mining rules from the frequent itemsets (ARs, Association Rules).

ASSOCIATION RULE MINING (cont.)
[Figure: Transaction database → Find frequent itemsets → FIs → Mine rules → Rule database]

1. Finding frequent itemsets
2. Finding association rules

FINDING FREQUENT ITEMSETS
Proposed by Agrawal in 1993.
Purpose: find relationships among the items (products) sold in a supermarket.
Many methods have been developed since then, such as:
  The Apriori method (Agrawal)
  The IT-tree method (M. Zaki)
  The FP-tree method (J. Han)

SOME ALGORITHMS FOR FINDING FREQUENT ITEMSETS
1. The Apriori method.
2. The FP-tree method (Frequent Patterns Tree).
3. The IT-tree method (Itemset-Tidset Tree).

DEFINITIONS
1. Support:
Let D be a transaction database and X ⊆ I an itemset. The support of X in D, written σ(X), is the number of transactions of D in which X appears.
2. Frequent itemset:
An itemset X ⊆ I is called frequent if σ(X) ≥ minSup (where minSup is a value chosen by the user).

21-Dec-10
THE APRIORI PROPERTY
1. Every subset of a frequent itemset is frequent: for X ⊆ Y, if σ(Y) ≥ minSup then σ(X) ≥ minSup.
2. Every superset of an infrequent itemset is infrequent: for Y ⊇ X, if σ(X) < minSup then σ(Y) < minSup.

THE APRIORI ALGORITHM
Input: a transaction database D and a minimum support threshold minSup.
Output: FIs, the set of all frequent itemsets of D.
Pseudocode:
  Let C_k be the set of candidates of size k
  Let L_k be the set of frequent itemsets of size k
  L_1 = { i ∈ I : σ(i) ≥ minSup }
  for (k = 2; L_{k-1} ≠ ∅; k++) do
      C_k = { candidates generated from L_{k-1} }
      for each t ∈ D do
          for each c ∈ C_k do
              if c ⊆ t then c.count++
      L_k = { c ∈ C_k | c.count ≥ minSup }
  FIs = ∪_k L_k

CANDIDATE GENERATION IN APRIORI
The Apriori principle:
Recall the property: every subset of a frequent itemset is also frequent.
Suppose we have L_3 = {abc, abd, acd, ace, bcd}.
Generating the candidates C_4 by the join L_3 * L_3:
  abcd is produced from abc and abd
  acde is produced from acd and ace
Pruning:
  acde is removed because ade is not in L_3
Result: C_4 = {abcd}

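The join and prune steps above can be sketched as follows. The function name `apriori_gen` and the use of sorted tuples to represent itemsets are choices made for this illustration only.

```python
from itertools import combinations

def apriori_gen(prev_frequent):
    """Build C_k from L_{k-1}: join on a shared (k-2)-prefix, then prune."""
    prev = sorted(tuple(sorted(s)) for s in prev_frequent)
    prev_set = set(prev)
    candidates = []
    for a, b in combinations(prev, 2):
        if a[:-1] != b[:-1]:
            continue  # join step: both itemsets must share the same prefix
        cand = a + (b[-1],)
        # prune step: every (k-1)-subset of the candidate must be frequent
        if all(sub in prev_set for sub in combinations(cand, len(cand) - 1)):
            candidates.append("".join(cand))
    return candidates

L3 = ["abc", "abd", "acd", "ace", "bcd"]
print(apriori_gen(L3))  # ['abcd']
```

The join produces abcd and acde; acde is dropped in the prune step because its subset ade is missing from L_3.
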
WORKED EXAMPLE
Table 1: sample database

  Transaction ID  Transaction contents
  1               A, C, T, W
  2               C, D, W
  3               A, C, T, W
  4               A, C, D, W
  5               A, C, D, T, W
  6               C, D, T

Supports of the single items: σ(A) = 4, σ(C) = 6, σ(D) = 4, σ(T) = 4, σ(W) = 5.

WORKED EXAMPLE (cont.)
With minSup = 50% (50 * 6 / 100 = 3 transactions), scanning the database D of Table 1 gives L_1:

  Itemset  Support
  A        4
  C        6
  D        4
  T        4
  W        5

WORKED EXAMPLE (cont.)
C_2 (candidates) and L_2 (frequent itemsets) for the Table 1 database:

  C_2                  L_2
  Itemset  Support     Itemset  Support
  AC       4           AC       4
  AD       2           AT       3
  AT       3           AW       4
  AW       4           CD       4
  CD       4           CT       4
  CT       4           CW       5
  CW       5           DW       3
  DT       2           TW       3
  DW       3
  TW       3

WORKED EXAMPLE (cont.)
C_3 and L_3 for the Table 1 database:

  C_3                  L_3
  Itemset  Support     Itemset  Support
  ACT      3           ACT      3
  ACW      4           ACW      4
  ATW      3           ATW      3
  CDW      3           CDW      3
  CTW      3           CTW      3

Note: CDT is not in C_3 because DT is not in L_2.

WORKED EXAMPLE (cont.)
C_4 and L_4 for the Table 1 database:

  C_4                  L_4
  Itemset  Support     Itemset  Support
  ACTW     3           ACTW     3

C_5 = ∅ and L_5 = ∅, so the algorithm stops.

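The whole level-by-level run can be reproduced with a short script. This is a minimal sketch rather than an optimized implementation: the names `TABLE_1` and `apriori` are assumptions for this example, and minSup is passed as an absolute count (3) rather than a percentage. Unlike the pseudocode, it counts each candidate with its own pass over the transactions.

```python
from itertools import combinations

# Table 1 from the slides: transaction id -> items
TABLE_1 = {1: "ACTW", 2: "CDW", 3: "ACTW", 4: "ACDW", 5: "ACDTW", 6: "CDT"}

def apriori(db, min_count):
    """Return {itemset (sorted tuple): support count} for all frequent itemsets."""
    transactions = [frozenset(items) for items in db.values()]

    def support(cand):
        return sum(1 for t in transactions if t.issuperset(cand))

    items = sorted({i for t in transactions for i in t})
    level = {(i,): support((i,)) for i in items}
    level = {c: n for c, n in level.items() if n >= min_count}  # L_1
    frequent = dict(level)
    while level:
        prev = sorted(level)
        k = len(prev[0]) + 1
        next_level = {}
        for a, b in combinations(prev, 2):
            if a[:-1] != b[:-1]:
                continue  # join only itemsets sharing a (k-2)-prefix
            cand = a + (b[-1],)
            if any(sub not in level for sub in combinations(cand, k - 1)):
                continue  # prune: some (k-1)-subset is infrequent
            n = support(cand)
            if n >= min_count:
                next_level[cand] = n  # candidate joins L_k
        frequent.update(next_level)
        level = next_level
    return frequent

fis = apriori(TABLE_1, min_count=3)
print(len(fis))                   # 19
print(fis[("A", "C", "T", "W")])  # 3
```

The 19 itemsets are the 5 + 8 + 5 + 1 entries of L_1 to L_4 in the tables above.
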
THE FP-TREE METHOD
Scan the database a first time to find all frequent single items (single item patterns).
Sort the items in decreasing order of support to obtain the f-list.
Scan the database a second time and build the FP-tree.

FP-TREE: BUILDING THE TREE
Using the Table 1 database, the item supports are:

  Item  A  C  D  T  W
  σ     4  6  4  4  5

Sorted by decreasing σ (the f-list):

  Item  C  W  A  D  T
  σ     6  5  4  4  4

FP-TREE: BUILDING THE TREE (cont.)
Each transaction is reordered according to the f-list and inserted from the root {}:

  TID  Items          Ordered by f-list
  1    A, C, T, W     C, W, A, T
  2    C, D, W        C, W, D
  3    A, C, T, W     C, W, A, T
  4    A, C, D, W     C, W, A, D
  5    A, C, D, T, W  C, W, A, D, T
  6    C, D, T        C, D, T

Header table (Item, σ, Link): C 6, W 5, A 4, D 4, T 4.

Resulting tree (item:count, children indented under their parent):

  {}
    C:6
      W:5
        A:4
          T:2
          D:2
            T:1
        D:1
      D:1
        T:1

FP-tree on the Table 1 database with minSup = 50%.

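The two scans and the insertion step can be sketched as below. The class `Node` and the functions `build_fp_tree` and `paths` are hypothetical names for this illustration, the header-table links are omitted, and ties in support are broken alphabetically, which happens to reproduce the f-list C, W, A, D, T used in the slides.

```python
from collections import Counter

TABLE_1 = {1: "ACTW", 2: "CDW", 3: "ACTW", 4: "ACDW", 5: "ACDTW", 6: "CDT"}

class Node:
    def __init__(self, item, parent=None):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}

def build_fp_tree(db, min_count):
    """Two scans: count the items, then insert each reordered transaction."""
    counts = Counter(i for items in db.values() for i in items)  # scan 1
    f_list = sorted((i for i in counts if counts[i] >= min_count),
                    key=lambda i: (-counts[i], i))
    rank = {item: r for r, item in enumerate(f_list)}
    root = Node(None)
    for items in db.values():                                    # scan 2
        node = root
        for item in sorted((i for i in items if i in rank), key=rank.get):
            if item not in node.children:
                node.children[item] = Node(item, node)
            node = node.children[item]
            node.count += 1  # shared prefixes only increment a counter
    return root, f_list

def paths(node, prefix=""):
    """List every root-to-node path as 'items:count'."""
    out = []
    for child in node.children.values():
        label = prefix + child.item
        out.append(f"{label}:{child.count}")
        out.extend(paths(child, label))
    return out

root, f_list = build_fp_tree(TABLE_1, min_count=3)
print(f_list)       # ['C', 'W', 'A', 'D', 'T']
print(paths(root))
```

The printed paths correspond to the nodes of the tree shown above, for example CWA:4 for the node A:4 and CDT:1 for the T:1 leaf under C, D.
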
PROJECTION ON THE FP-TREE: THE FP-GROWTH ALGORITHM
Projecting on node T yields the following local database:
  {CWA:2, CWAD:1, CD:1}
[Figure: the FP-tree of Table 1 with its header table (C 6, W 5, A 4, D 4, T 4); the three T nodes (T:2, T:1, T:1) and the paths leading to them are highlighted.]

PROJECTION ON T:4
Local database: {CWA:2, CWAD:1, CD:1}
Local item counts: C 4, W 3, A 3. D occurs only twice, so it is dropped: CWAD:1 becomes CWA:1 and CD:1 becomes C:1.
The local tree for the database projected on T is:

  {}
    C:4
      W:3
        A:3

This is a single path, so finding the frequent itemsets only requires enumerating the subsets of {C, W, A}. The subsets are:
  {A:3, W:3, C:4, AW:3, AC:3, WC:3, AWC:3}
Therefore, projecting on T produces the frequent itemsets:
  {T:4, TA:3, TW:3, TC:4, TAW:3, TAC:3, TWC:3, TAWC:3}

PROJECTION ON D:4
Local database: {CWA:2, CW:1, C:1}
Local item counts: C 4, W 3. The local tree is:

  {}
    C:4
      W:3

Single path. The subsets are: {W:3, C:4, WC:3}
Projecting on D produces the frequent itemsets: {D:4, DW:3, DC:4, DWC:3}

PROJECTION ON A:4
Local database: {CW:4}
Local item counts: C 4, W 4. The local tree is:

  {}
    C:4
      W:4

Single path. The subsets are: {W:4, C:4, WC:4}
Projecting on A produces the frequent itemsets: {A:4, AW:4, AC:4, AWC:4}

PROJECTION ON W AND C
Projecting on W:5 gives the local database {C:5}. The local tree is:

  {}
    C:5

Single path. The subsets are: {C:5}
Projecting on W produces the frequent itemsets: {W:5, WC:5}
Finally, projecting on C:6 produces the frequent itemset {C:6}.

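The projections above can be reproduced with a simplified pattern-growth sketch. As an assumption made to keep the example short, each local database is kept as a plain list of (prefix path, count) pairs instead of a compressed local FP-tree, and the single-path shortcut is not used. The names `pattern_growth` and `fp_growth` are hypothetical.

```python
from collections import Counter

TABLE_1 = {1: "ACTW", 2: "CDW", 3: "ACTW", 4: "ACDW", 5: "ACDTW", 6: "CDT"}

def pattern_growth(pattern_base, min_count, suffix=()):
    """Mine a local database given as a list of (ordered item tuple, count).

    Returns {itemset (tuple in f-list order): support}. Each locally frequent
    item x is prepended to the suffix; the items that precede x in every path
    containing x form the database projected on x.
    """
    counts = Counter()
    for path, n in pattern_base:
        for item in path:
            counts[item] += n
    result = {}
    for item, total in counts.items():
        if total < min_count:
            continue  # locally infrequent, e.g. D in the projection on T
        found = (item,) + suffix
        result[found] = total
        projected = [(path[:path.index(item)], n)
                     for path, n in pattern_base if item in path]
        result.update(pattern_growth(projected, min_count, found))
    return result

def fp_growth(db, min_count):
    counts = Counter(i for items in db.values() for i in items)
    f_list = sorted((i for i in counts if counts[i] >= min_count),
                    key=lambda i: (-counts[i], i))
    base = [(tuple(i for i in f_list if i in items), 1) for items in db.values()]
    return pattern_growth(base, min_count)

fis = fp_growth(TABLE_1, min_count=3)
print(len(fis))                   # 19
print(fis[("C", "W", "A", "T")])  # 3, the itemset TAWC:3 of the T projection
```

Because T is last in the f-list, every itemset containing T comes from the projection on T, which gives the eight itemsets listed on the T:4 slide.
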
FP-TREE: REMARKS
FP-tree scans the database twice, then uses projection to create the local database of each single item, builds a local FP-tree from it, and mines that local tree recursively.
It uses a divide-and-conquer strategy to mine frequent itemsets.
It is a method that does not generate candidates.
It is usually very efficient on databases with a high degree of duplicated data.

THE IT-TREE METHOD
Galois connection:
Let the binary relation δ ⊆ I × T hold the database to be mined, with X ⊆ I and Y ⊆ T.
Define two mappings between P(I) (the set of all subsets of I) and P(T) as follows:
  t: P(I) → P(T),  t(X) = { y ∈ T | ∀x ∈ X, x δ y }
  i: P(T) → P(I),  i(Y) = { x ∈ I | ∀y ∈ Y, x δ y }

THE IT-TREE METHOD (cont.)
IT-tree structure and equivalence classes:
For X ⊆ I, define the function p(X, k) = X[1:k], consisting of the first k elements of X, and a prefix-based equivalence relation: two itemsets X and Y are equivalent when they share the same k-prefix, that is, p(X, k) = p(Y, k).
Each node of the IT-tree has two components, Itemset-Tidset: X × t(X), called an IT-pair, which in effect is a prefix class. The children of X belong to the equivalence class of X because they all share the prefix X (t(X) is the set of transactions that contain X).

REMARKS ON THE IT-TREE
1. σ(X) = |t(X)|
2. Combining only the elements at the same level of one equivalence class is enough to generate the frequent itemsets.

ALGORITHM FOR FINDING FREQUENT ITEMSETS

  ECLAT()
      [∅] = { i ∈ I | σ(i) ≥ minSup }
      ENUMERATE_FREQUENT([∅])

  ENUMERATE_FREQUENT([P])
      for all l_i ∈ [P] do
          [P_i] = ∅
          for all l_j ∈ [P] with j > i do
              X = l_i ∪ l_j
              T = t(l_i) ∩ t(l_j)
              if |T| ≥ minSup then
                  [P_i] = [P_i] ∪ {X × T}
          ENUMERATE_FREQUENT([P_i])

Here t(X) = { y ∈ T | X appears in transaction y } is called the Tidset of X.

WORKED EXAMPLE
Consider the sample database of Table 1, stored in vertical format:

  Horizontal format                      Vertical format
  Transaction ID  Transaction contents   Item  Transactions containing the item
  1               A, C, T, W             A     1, 3, 4, 5
  2               C, D, W                C     1, 2, 3, 4, 5, 6
  3               A, C, T, W             D     2, 4, 5, 6
  4               A, C, D, W             T     1, 3, 5, 6
  5               A, C, D, T, W          W     1, 2, 3, 4, 5
  6               C, D, T

t(A) = 1345; t(AD) = t(A) ∩ t(D) = 1345 ∩ 2456 = 45

IT-TREE WITH minSup = 50%
[Figure: IT-tree for the Table 1 database, each node written as itemset × tidset]

  Root:    {}×123456
  Level 1: A×1345, C×123456, D×2456, T×1356, W×12345
  Level 2: AC×1345, AD×45, AT×135, AW×1345, CD×2456, CT×1356, CW×12345, DT×56, DW×245, TW×135
  Level 3: ACT×135, ACW×1345, ATW×135, CDT×56, CDW×245, CTW×135
  Level 4: ACTW×135

AD×45, DT×56 and CDT×56 have only two transactions, so they do not meet minSup and are pruned.
There are 19 frequent itemsets that satisfy minSup = 50%.

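The ECLAT enumeration over tidsets can be sketched as follows. The function names and the use of strings for itemsets are assumptions made for this illustration; minSup is passed as an absolute count.

```python
TABLE_1 = {1: "ACTW", 2: "CDW", 3: "ACTW", 4: "ACDW", 5: "ACDTW", 6: "CDT"}

def eclat(db, min_count):
    """Return {itemset (str): tidset} for every frequent itemset."""
    vertical = {}
    for tid, items in db.items():          # the single scan of the database
        for item in items:
            vertical.setdefault(item, set()).add(tid)
    root = sorted(((item, tids) for item, tids in vertical.items()
                   if len(tids) >= min_count), key=lambda pair: pair[0])
    found = {}

    def enumerate_frequent(eq_class):
        for idx, (x, tx) in enumerate(eq_class):
            found[x] = tx
            children = []                   # the class [P_i]
            for y, ty in eq_class[idx + 1:]:
                tids = tx & ty              # T = t(l_i) ∩ t(l_j)
                if len(tids) >= min_count:
                    children.append((x + y[-1], tids))  # X = l_i ∪ l_j
            enumerate_frequent(children)

    enumerate_frequent(root)
    return found

fis = eclat(TABLE_1, min_count=3)
print(len(fis))             # 19
print(sorted(fis["ACTW"]))  # [1, 3, 5]
```

Members of one class share a prefix and differ only in their last item, which is why `x + y[-1]` is enough to form the union.
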
REMARKS
The algorithm computes support quickly by intersecting Tidsets, so it scans the database only once.
Diffsets can be used to compute support quickly while reducing the space needed to store Tidsets.
Because the algorithm does not generate candidates, mining is usually more efficient than with the families of candidate-generating algorithms.
When the number of frequent itemsets is large, the time to mine rules is large, so a more efficient mining method is needed.

DIFFSET: FAST SUPPORT COMPUTATION
The Diffset of PX (the item X relative to its prefix P), written d(PX), is defined as:
  d(PX) = t(P) \ t(X)
Support can then be computed as:
  σ(PXY) = σ(PX) - |d(PXY)|      (1)
We also have:
  d(PXY) = d(PY) \ d(PX)         (2)
Diffsets are usually quite small compared with Tidsets. (3)
From (1), (2) and (3), Diffsets can be used in place of Tidsets.

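Identities (1) and (2) can be checked numerically. The sketch below uses the level-1 tidsets of the five-item illustration (items A to E) that appears in the Diffset figure of these slides; the helper names `t` and `d` are assumptions for this example.

```python
# Level-1 tidsets from the Diffset illustration in these slides (items A to E).
TIDSETS = {"A": {1, 3, 4, 5}, "B": {1, 2, 3, 4, 5}, "C": {2, 4, 5},
           "D": {1, 3, 5}, "E": {2, 3, 4}}

def t(itemset):
    """Tidset of an itemset: intersection of the tidsets of its items."""
    return set.intersection(*(TIDSETS[i] for i in itemset))

def d(prefix, x):
    """Diffset d(PX): tids of t(P) that are missing from t(X)."""
    return t(prefix) - t(x)

d_AD = d("A", "D")                  # level 2: t(A) minus t(D)
d_BCE = d("B", "E") - d("B", "C")   # identity (2) with P = B, X = C, Y = E

support_BC = len(t("B")) - len(d("B", "C"))   # identity (1) at level 2
support_BCE = support_BC - len(d_BCE)         # identity (1) at level 3

print(d_AD, d_BCE, support_BCE)     # {4} {5} 2
```

The values agree with the nodes AD×4 and BCE×5 in the figure, and the support obtained from diffsets equals |t(BCE)| computed directly.
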
DIFFSET (cont.)
[Figure: IT-tree storing Diffsets, for a five-transaction database over items A to E]

  Root:    {}×12345
  Level 1 (uses Tidsets): A×1345, B×12345, C×245, D×135, E×234
  Level 2, d(PX) = t(P) \ t(X):
           AB×∅, AC×13, AD×4, AE×15, BC×13, BD×24, BE×15, CD×24, CE×5, DE×15
           Example: d(AB) = t(A) \ t(B) = 1345 \ 12345 = ∅
  From level 3 on, d(PXY) = d(PY) \ d(PX):
           ABD×4, BCD×24, BCE×5, BDE×15

REMARKS
Diffsets are usually quite small compared with Tidsets, which saves memory and the time needed to compute set differences.

Comparison of the average length of Tidsets and Diffsets on benchmark databases [4]:

  Database      MinSup (%)  Avg. Diffset length  Avg. Tidset length  Tidset/Diffset ratio
  chess         0.5         26                   1820                70
  connect       90          143                  62204               434.99
  mushroom      5           60                   622                 10.37
  pumsb_star    35          301                  18977               63.04
  pumsb         90          330                  45036               136.47
  T10I4D100K    0.1         31                   230                 7.42
  T40I10D100K   0.5         96                   755                 7.86

Example for chess: ratio = 1820/26 = 70.

FINDING FREQUENT CLOSED ITEMSETS (FCI)
Closure operator:
Let X ⊆ I. Define c_it: P(I) → P(I) by c_it(X) = i(t(X)). The mapping c_it is called the closure operator.
Example: c_it(AW) = i(t(AW)) = i(1345) = ACW
Closed itemset:
Let X ⊆ I. X is called a closed itemset if c_it(X) = X.

FINDING FREQUENT CLOSED ITEMSETS (FCI) (cont.)
Example: consider the Table 1 database, with t(AW) = t(A) ∩ t(W) = 1345.
Since c_it(AW) = i(t(AW)) = i(1345) = ACW ≠ AW, AW is not a closed itemset.
Since c_it(ACW) = i(t(ACW)) = i(1345) = ACW, ACW is a closed itemset.

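The mappings t and i and the closure operator translate almost literally into code. This is a small sketch on the Table 1 database; the function names mirror the notation of the slides, and `is_closed` is a helper name assumed for this example.

```python
TABLE_1 = {1: "ACTW", 2: "CDW", 3: "ACTW", 4: "ACDW", 5: "ACDTW", 6: "CDT"}

def t(X):
    """Tidset of X: the transactions that contain every item of X."""
    return {tid for tid, items in TABLE_1.items() if set(X) <= set(items)}

def i(Y):
    """Items common to every transaction in Y (Y is assumed non-empty)."""
    common = set.intersection(*(set(TABLE_1[tid]) for tid in Y))
    return "".join(sorted(common))

def closure(X):
    """Closure operator c_it(X) = i(t(X))."""
    return i(t(X))

def is_closed(X):
    return closure(X) == "".join(sorted(X))

print(sorted(t("AW")))   # [1, 3, 4, 5]
print(closure("AW"))     # ACW
print(is_closed("AW"))   # False
print(is_closed("ACW"))  # True
```
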
PROPERTIES OF IT-PAIRS
Theorem 1:
Let X_i × t(X_i) and X_j × t(X_j) be any two members of the equivalence class [P]. The following four properties hold (c denotes c_it):
1. If t(X_i) = t(X_j), then c(X_i) = c(X_j) = c(X_i ∪ X_j).
2. If t(X_i) ⊂ t(X_j), then c(X_i) ≠ c(X_j) but c(X_i) = c(X_i ∪ X_j).
3. If t(X_i) ⊃ t(X_j), then c(X_i) ≠ c(X_j) but c(X_j) = c(X_i ∪ X_j).
4. Otherwise (none of cases 1, 2 and 3 holds): c(X_i) ≠ c(X_j) ≠ c(X_i ∪ X_j).

REMARKS ON IT-PAIRS
1. Property 1 says that if the two Tidsets are equal, then |t(X_i)| = |t(X_j)| = |t(X_i ∪ X_j)|. Since X_i ⊂ X_i ∪ X_j and X_j ⊂ X_i ∪ X_j, neither X_i nor X_j is a closed itemset.
2. By property 2, c(X_i) = c(X_i ∪ X_j), so X_i is not a closed itemset. In addition, since t(X_i) ≠ t(X_j), X_i and X_j belong to two different closed itemsets.
3. Analogous to property 2.
4. By property 4, X_i, X_j and X_i ∪ X_j belong to three different closed itemsets.

ALGORITHM FOR FINDING FREQUENT CLOSED ITEMSETS (CHARM)

  CHARM(D, minSup)
      [∅] = { l_i × t(l_i) : l_i ∈ I ∧ Sup(l_i) ≥ minSup }
      CHARM-EXTEND([∅], C = ∅)
      return C

  CHARM-EXTEND([P], C)
      for each l_i × t(l_i) in [P] do
          P_i = P ∪ l_i and [P_i] = ∅
          for each l_j × t(l_j) in [P] with j > i do
              X = l_i ∪ l_j and Y = t(l_i) ∩ t(l_j)
              CHARM-PROPERTY(X × Y, l_i, l_j, [P_i], [P])
          SUBSUMPTION-CHECK(C, P_i)
          CHARM-EXTEND([P_i], C)
          delete [P_i]

  CHARM-PROPERTY(X × Y, l_i, l_j, [P_i], [P])
      if Sup(X) ≥ minSup then
          if t(l_i) = t(l_j) then
              Remove l_j from [P]
              P_i = P_i ∪ l_j
          elseif t(l_i) ⊂ t(l_j) then
              P_i = P_i ∪ l_j
          elseif t(l_i) ⊃ t(l_j) then
              Remove l_j from [P]
              Add X × Y to [P_i]
          else
              Add X × Y to [P_i]

  SUBSUMPTION-CHECK(C, P)
      for all Y ∈ HASHTABLE[|t(P)|] do
          if P ⊆ Y then return       // P is subsumed by Y, so P is not closed
      C = C ∪ P

A hash table is used to check whether the itemset P is closed or not.

CHARM ILLUSTRATION (minSup = 50%)
Vertical format of the Table 1 database:

  Item  TIDs
  A     1, 3, 4, 5
  C     1, 2, 3, 4, 5, 6
  D     2, 4, 5, 6
  T     1, 3, 5, 6
  W     1, 2, 3, 4, 5

Items are sorted in increasing order of |t(X)|: D, T, A, W, C.

[Figure: CHARM search tree, each node written as itemset × tidset]

  Root:    {}×123456
  Level 1: D×2456 → DC×2456;  T×1356 → TC×1356;  A×1345 → AW×1345 → AWC×1345;  W×12345 → WC×12345;  C×123456
  Level 2, under DC: DT×56, DA×45, DW×245 → DCW×245
  Level 2, under TC: TA×135 → TCA×135 → TCAW×135;  TW×135 → TCW×135 (deleted)

Notes on the figure:
  t(D) ⊂ t(C) satisfies property 2, so D is not a closed itemset. Replace D with DC.
  Likewise, replace DW with DCW.
  Since t(TCA) = t(TCW), replace TCA with TCAW and delete TCW.

In total there are 7 frequent closed itemsets that satisfy minSup = 50%: DC, TC, AWC, WC, C, DWC, TAWC.

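The list of seven closed itemsets can be cross-checked with a brute-force reference. Note that this is deliberately not CHARM: it simply tests every itemset against its closure, which is only practical for tiny databases such as Table 1. The function name is hypothetical and `min_count` is assumed to be at least 1.

```python
from itertools import combinations

TABLE_1 = {1: "ACTW", 2: "CDW", 3: "ACTW", 4: "ACDW", 5: "ACDTW", 6: "CDT"}

def frequent_closed_itemsets(db, min_count):
    """Brute-force reference: keep each frequent itemset equal to its closure."""
    transactions = [set(items) for items in db.values()]
    items = sorted(set().union(*transactions))
    closed = {}
    for k in range(1, len(items) + 1):
        for cand in combinations(items, k):
            covering = [tr for tr in transactions if tr.issuperset(cand)]
            if len(covering) < min_count:
                continue  # not frequent
            # closure = items shared by every transaction that contains cand
            if set.intersection(*covering) == set(cand):
                closed["".join(cand)] = len(covering)
    return closed

fcis = frequent_closed_itemsets(TABLE_1, min_count=3)
print(len(fcis))  # 7
print(fcis)
```

The keys are written in alphabetical order, so ACTW here is the itemset the slides call TAWC, CDW is DWC, and so on.
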
REMARKS
The number of frequent closed itemsets is usually much smaller than the number of frequent itemsets, so mining rules from them is more efficient.
The search depth on the IT-tree when finding FCIs is lower than when finding FIs, so the memory required by the recursive calls is smaller.

1. Finding frequent itemsets
2. Finding association rules

MINING TRADITIONAL ASSOCIATION RULES
Definition:
An association rule is an expression of the form X → Y \ X (q, p), where X and Y are frequent itemsets with X, Y ≠ ∅ and X ⊂ Y. The value p = σ(Y) / σ(X) ≥ minConf is called the confidence of the rule, and q = σ(Y) ≥ minSup is called the support of the rule.
In other words, association rules are generated between frequent itemsets X, Y ∈ FI with X ⊂ Y.

TRADITIONAL RULES: ALGORITHM

  EXTRACT_AR(FI, minConf)
      SORT(FI)      // sort FI in increasing order of itemset size k
      AR = ∅
      for each Y ∈ FI do
          for each X ∈ FI with Y after X do
              if X ⊂ Y then
                  conf = Sup(Y) / Sup(X)
                  if conf ≥ minConf then
                      AR = AR ∪ { X → Y \ X (Sup(Y), conf) }
      return AR
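
A direct transcription of EXTRACT_AR is sketched below. The input dictionary `FI` is a small illustrative subset of the frequent itemsets of Table 1 (supports taken from the worked example), and the tuple layout of each rule is an assumption made for this example.

```python
def extract_rules(fi, min_conf):
    """fi maps frozenset -> support count.

    Returns a list of (X, Y minus X, support of Y, confidence) tuples.
    """
    ordered = sorted(fi, key=lambda s: (len(s), sorted(s)))  # SORT(FI)
    rules = []
    for pos, y in enumerate(ordered):
        for x in ordered[:pos]:          # X comes before Y in the sorted list
            if x < y:                    # X is a proper subset of Y
                conf = fi[y] / fi[x]
                if conf >= min_conf:
                    rules.append((x, y - x, fi[y], conf))
    return rules

# Part of the frequent itemsets of Table 1 with their supports.
FI = {frozenset(k): v for k, v in
      {"A": 4, "C": 6, "W": 5, "AC": 4, "AW": 4, "CW": 5, "ACW": 4}.items()}

rules = extract_rules(FI, min_conf=1.0)
for x, rest, sup, conf in rules:
    print("".join(sorted(x)), "->", "".join(sorted(rest)), (sup, conf))
```

With minConf = 100% this keeps six rules, for example A → CW (4, 1.0), while C → W is rejected because its confidence is 5/6.
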