Example: biology

頻出パターンマイニング - kamishima.net

Frequent Pattern Mining . 1.. 2.. Association Rule 3.. Association Rule X (antecedent). X Y Y (consequent). X Y . ( ). X Y .. { , } { }. (X ) . (Y ) . 2. 4.. X i . X i . X = { , }. T1 = { , , }. T2 = { , }. X 1 1 . X 2 2 .. X Y . 5.. X Y . { , } { }. { , } { }.. , , , .. 6.. X Y . { } { }.. X Y . { } { }.. 7.. Basket Data ( ) .. T1 = { , , }. T2 = { , }.. TN = { , }. T1 . ( ) .. 8.. support(X). X .. X {a, b} . T1 = {a, b, c} T2 = {a, d}. T3 = {b, d, e} T4 = {a, b, e}. T5 = {a, b, c} T6 = {d, e}. = 6. ( ) = 3. X 3. support(X) = . = 6. = 9.. con dence(X,Y). X X . Y . X Y . con dence(X,Y) =. X . X Y X Y . X Y Ti support(X Y). con dence(X,Y) =. support(X). 10. Apriori 11. Apriori Apriori .. and "Fast Algorithms for Mining Association Rules", VLDB 1994.. minsup minconf . X Y . support(X Y) minsup con dence(X,Y) minconf 12. Apriori : .. ! 10 . 57,002 .. ! . 13. Apriori Step 1.. X Y . support(X Y) support(X).

頻出パターンマイニング 2 頻出パターン データ集合の要素アイテム集合,系列データ,時系列,木,グラフ…

Information

Domain:

Source:

Link to this page:

Please notify us if you found a problem with this document:

Other abuse

Advertisement

Transcription of 頻出パターンマイニング - kamishima.net

1 Frequent Pattern Mining . 1.. 2.. Association Rule 3.. Association Rule X (antecedent). X Y Y (consequent). X Y . ( ). X Y .. { , } { }. (X ) . (Y ) . 2. 4.. X i . X i . X = { , }. T1 = { , , }. T2 = { , }. X 1 1 . X 2 2 .. X Y . 5.. X Y . { , } { }. { , } { }.. , , , .. 6.. X Y . { } { }.. X Y . { } { }.. 7.. Basket Data ( ) .. T1 = { , , }. T2 = { , }.. TN = { , }. T1 . ( ) .. 8.. support(X). X .. X {a, b} . T1 = {a, b, c} T2 = {a, d}. T3 = {b, d, e} T4 = {a, b, e}. T5 = {a, b, c} T6 = {d, e}. = 6. ( ) = 3. X 3. support(X) = . = 6. = 9.. con dence(X,Y). X X . Y . X Y . con dence(X,Y) =. X . X Y X Y . X Y Ti support(X Y). con dence(X,Y) =. support(X). 10. Apriori 11. Apriori Apriori .. and "Fast Algorithms for Mining Association Rules", VLDB 1994.. minsup minconf . X Y . support(X Y) minsup con dence(X,Y) minconf 12. Apriori : .. ! 10 . 57,002 .. ! . 13. Apriori Step 1.. X Y . support(X Y) support(X).

2 X Y X . support(X Y) support(X). support(X) minsup .. X (frequent itemset). Step 2.. 14. Apriori .. support(X) minsup X .. F . F k . k- Fk : support({a,b}) minsup . {a, b} 2 2- F2 . 15. Apriori .. F1 Fk . Fk+1 . Xk k . Xk+1 Xk . Xk Xk+1 . Xk+1 Xk support(Xk) support(Xk+1). support(Xk) minsup support(Xk+1) minsup Xk+1 . Fk Xk+1 . downward closure property . 16. Apriori . Xk+1 Fk . Xk+1 .. k+1- Ck+1 Xk+1 . Fk . Fk {a,b} {b,c} {a,c} {a,d} . {a,b,c} {a,b} {b,c}. {a,c} Fk Ck+1 . {a,b,d} {b,d} Fk Ck+1 .. Ck+1 . 17. Apriori . k=1. F1 .. Fk Ck+1 . Ck+1 . Fk+1 . Fk+1 . k = k + 1; . F1, F2 Fk+1. 18. Apriori .. F.. X Y. X Y F. support(X Y). con dence(X,Y) = minconf support(X). X Y F X Y F . 19. Apriori . X Y X' Y' X X' . X Y X' Y' . support(X Y) support(X' Y'). con dence(X,Y) con dence(X',Y') . support(X) support(X'). X Y X' Y' support(X Y) = support(X' Y'). X X' support(X) support(X'). con dence(X,Y) con dence(X',Y').

3 {a,b} {c} {a} {b,c} . con dence({a,b},{c}) con dence({a},{b,c}).. 20. Apriori . 2 F G . X Y=G X Y . Y k . Y' k+1 . con dence(X',Y') minconf . Y' X' 1 . X Y con dence(X,Y) minconf . {a,b,c} {d} {a,b,d} {c} . m i n c o n f { a , b } { c , d } . con dence({a,b},{b,c}) minconf . 21. Apriori . F2 {A,B} {A} {B} {B}. {A} . k 3 Fk . j=1. {k-1 } {1 } .. {k-{j+1} } {j+1 } {k-j }. {j } . minconf .. j = j +1 . 22. FP-growth Frequent Pattern-growth 23. FP-growth FP-growth .. trie FP-tree . , , and Mining Frequent Patterns without Candidate Generation SIGMOD 2000. DB . minsup . support(X) minsup Apriori . Apriori . FP-tree ( . ). 24. FP-tree FP-tree . a {a} . a .. a .. 25. FP-tree Apriori DB . ( Apriori F1 .. DB minsup = 2/9 . T1 a, b, e T4 a, b, d T7 a, c T2 b, d, f T5 a, c, g T8 a, b, c, e T3 b, c T6 b, c T9 a, b, c . L = {b:7} {a:6} {c:6} {d:2} {e:2} {f:1} {g:1} .. 26. FP-tree L .. T1 = {a,b,e}. L. T1 = {b,a,e} 1.)

4 FP-tree .. b 7.. a 6. c 6 b:1. d 2 . a:1 . e 2 .. e:1. 27. FP-tree .. 1 T2 = {b,d} 2 . L 1 .. 2. b 7. b:1 b:2. a 6. c 6 a:1 d:1 a:1. d 2. e 2. e:1 e:1. 1. 28. FP-tree . b .. T5 = T7 = {a,c}. L .. b:7 a:2.. b 7. c:2 a:4 d:1 c:2. a 6. c 6. e:1 c:2 d:1. d 2. e 2. e:1. node-link . 29. FP-growth (Conditional Pattern Base; CPB). e .. L. e .. b:7 a:2. b 7 (e }. c:2 a:4 d:1 c:2. a 6 {b, a: 1}.. c 6. e:1 c:2 d:1. d 2. e 2 . e:1.. e . {e:2} {b, a, c: 1}. 30. FP-growth CPB FP-tree tree . e CPB {b, a: 1} {b, a, c: 1}. FP-tree . e . tree . FP-tree {a} {b} {b, a}. L .. {b: 2} {a: 2} {b, a: 2}.. b:2 e FP- b 2 tree e . c a 2 a:2 . c 1 {b, e: 2} {a, e: 2} {b, a, e: 2}. { }-{b:3}-{a:2}-{c:1} FP-tree . {b: 3} {a: 2} {c: 1} {b, a: 2} {b, c: 1} {b, a, c: 1} . 31. FP-growth CPB FP-tree tree . c CPB {b, a: 2} {b: 2} {a: 2}. c a . FP-tree FP-tree L .. b:2 a:2 b:2. b 4. a 4 a:2. b . FP-tree FP-tree . FP-growth .. 32. FP-growth.)

5 A . c . FP-tree FP-tree L 1 . a {b:4} {a:4} {b,a:2}.. {b, a: 2} . b:2 b:2 a:2. b 4. a 4 a:2. b .. FP-tree . {b:4} {a:4} c . {b, c: 4} {a, c: 4}.. {b, a, c: 2}.. 33. FP-growth L i . 1. i {i} . 2. i (CPB) . 3. CPB FP-tree . a) FP-tree . i .. b) FP-tree FP-tree . FP-growth . i . 34.. Sequence Data 35.. T1 = { , , }. T2 = { , }.. 1 10 1 12 1 30 . , , . , .. 1 . c a (bd) a . c 2 a 3 b d 4 a . 36.. S2 S1 . S1 b (a c) e (d f) . S2 (a c) d .. (a b) a b .. a b c d e f g h a h . 37.. minconf DB .. S1 a b c S2 b (a c) S3 a c DB . minconf=2/3 . a c S1 S3 . a b S1 .. ( ) .. 38. Pre xSpan PREFIX-projected Sequential PAtterN mining 39. Pre xSpan Pre xSpan .. post x post x . , , B. Mortazavi-Asl, , , , PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Groush ICDE 2001. DB . minsup P . support(P) minsup . AprioriAll GSP Generalized Sequential Pattern . 40. Pre xSpan {a g} DB minsup = 2/4.

6 S1 a (abc) (ac) d (cf) S3 (ef) (ab) (df) c b . S2 (ad) c (bc) (ae) S4 e g (af) c b c . a g . post x S T post x S T . S . S S1 a (abc) (ac) d (cf) . T= a a (abc) (ac) d (cf) post x . (abc) (ac) d (cf) . T= a a a (abc) (ac) d (cf) post x . (_bc) (ac) d (cf) _ T . T= a b a (abc) (ac) d (cf) post x . (_c) (ac) d (cf) abc b a . 41. Pre xSpan DB (projected database). DB post x . S1 a (abc) (ac) d (cf) S3 (ef) (ab) (df) c b . S2 (ad) c (bc) (ae) S4 e g (af) c b c .. a - DB . (abc) (ac) d (cf) (_d) c (bc) (ae) (_b) (df) c b (_f) c b c . e - DB . (_f) (ab) (df) c b (af) c b c . S2 e post x . a b - DB . (_c) (ac) d (cf) (_c) (ae) c . a - DB . 42. Pre xSpan Pre xSpan . S1 a (abc) (ac) d (cf) S3 (ef) (ab) (df) c b . S2 (ad) c (bc) (ae) S4 e g (af) c b c . 1 . a : 4 b : 4 c : 4 d : 3 e : 3 f : 3. 2 g . 2 a a - DB . a - DB (abc) (ac) d (cf) (_d) c (bc) (ae) (_b) (df) c b (_f) c b c . 3 . a . (ab) (ac) (ad) (af) (aa).

7 A . a a a b a c a f a a . 43. Pre xSpan S1 a (abc) (ac) d (cf) S3 (ef) (ab) (df) c b . S2 (ad) c (bc) (ae) S4 e g (af) c b c . a - DB (abc) (ac) d (cf) (_d) c (bc) (ae) (_b) (df) c b (_f) c b c . (ab) a - DB _ . b (ab) .. S1 S3 a - DB . a b _ b . S1 S2 S3 S4 a - DB . 44. Pre xSpan S1 a (abc) (ac) d (cf) S3 (ef) (ab) (df) c b . S2 (ad) c (bc) (ae) S4 e g (af) c b c . a - DB (abc) (ac) d (cf) (_d) c (bc) (ae) (_b) (df) c b (_f) c b c . a . a a : 2 a b : 4 (ab) : 2 a c : 4 a d : 2 a f : 4. 4 . a DB .. a . b f .. 45. Pre xSpan Pre xSpan( , l, DB) . Pre xSpan( , l DB). l DB . b . a) b . b) b . 2. b b .. 3. - DB Pre xSpan( , l+1, - . DB) .. bi-level projection . pseudo-projection DB . 46.. 47.. (frequent closed item set). DB .. support(X)=support(Y) Y X X . X . minconf = 3/6 . 1,2,5,6,7,9 {1} {1,7} {1,9} {1,7,9}. 2,3,4,5 {2}. 1,2,7,8,9 {7} {9} {7,9}. 1,7,9 {2,7} {2,9} {2,7,9}. 2,3,5,7,9. 2,7,9 .. 48.

8 Web .. ACBABA ACBABA. A B.. Winepi A .. Minepi . minimal occurence . , , and Discovery of Frequent Episodes in Event Sequences Data Mining and Knowledge Discovery (1997). Pre xSpan . 49.. XML .. 50.. Web XML .. 51.. DB . Apriori FP-growth . X Y .. Pre xSpan .. 52.. Apriori . 53. Oms-B . 5 . A. B. C. D. E.. N=5.. minsup = minconf = 54. Oms-B . a b c d e . T1. T2. T3. T4. T5.. 55.. a b c d e . 1- . T1. T2. N=5 T3. minsup = T4. T5. 1 . {a} T2 T3 2 . support({a}) = 2/N = minsup {a} . {e} T3 1 . support({e}) = 1/N = < minsup {e} . F1={a} {b} {c} {d}. 56.. a b c d e . 2- . T1. N=5 T2. minsup = T3. T4. F1={a} {b} {c} {d}. T5. F1 C2 . {a,b} {a} {b} F1 C2 . {a,e} {e} F1 C2 . C2={a,b} {a,c} {a,d} {b,c} {b,d} {c,d}. {a,b} T2 1 . support({a,b}) = 1/N = < minsup {a,b} . {b,c} T1 T4 2 . support({b,c}) = 2/N = minsup {b,c} . F2={b,c} {b,d} {c,d}. 57.. a b c d e . 3- . T1. N=5 T2. minsup = T3. T4. F2= {b,c} {b,d} {c,d}.

9 T5. F2 C3 . {a,b,c} {a,b} {a,c} F2 C3 . {b,c,d} {b,c} {b,d} {c,d} F2 C3 . C3={b,c,d}. {b,c,d} T1 T4 2 . support({b,c,d}) = 2/N = minsup {b,c,d} . F3={b,c,d}. C4 . 58.. C 1: {a} {b} {c} {d} {e}. F1: {a} {b} {c} {d} {e}. C 2: {a,b} {a,c} {a,d} {b,c} {b,d} {c,d}. F2: {a,b} {a,c} {a,d} {b,c} {b,d} {c,d}. C 3: {b, c, d}. F3: {b, c, d}. 59.. a b c d e . N=5, minsup= , minconf= T1. F1={a} {b} {c} {d} T2. F2= {b,c} {b,d} {c,d} T3. F3={b,c,d} T4. T5. F2 . {b,c} {b} {c} {c} {b} . support({b,c})=2/N, support({b})=4/N, support({c})=2/N. {b} {c} . con dence({b},{c}) = support({b,c})/support({b}) = 2/4 < minconf {b} {c} . {c} {b} . con dence({c},{b}) = support({b,c})/support({c}) = 2/2 minconf {c} {b} . F2 . {c} {b}, {b} {d}, {d} {b}, {c} {d}. 60.. a b c d e . N=5, minsup= , minconf= T1. F1={a} {b} {c} {d} T2. F2= {b,c} {b,d} {c,d} T3. F3={b,c,d} T4. T5. {b,c,d} (Y 1 ). {b,c,d}, {b,c}, {b,d}, {c,d} 2/N, 2/N, 3/N, 2/N.

10 {b,c} {d} . con dence({b,c},{d}) = support({b,c,d})/support({b,c}) = 2/2. minconf {b,c} {d} . {b,d} {c} . con dence({b,d},{c}) = support({b,c,d})/support({b,d}) = 2/3. minconf {b,d} {c} . {c,d} {b} . con dence({c,d},{b}) = support({b,c,d})/support({c,d}) = 2/2. minconf {c,d} {b} . 61.. {b,c,d} (Y 2 ). {c,d} {b} {b,c} {d}. {b,d} {c} minconf {b} {c,d} {d} {b,c}. minconf . {b} {c,d} {d} {b,c} . {c} {b,d} . con dence({c},{b,d}) = support({b,c,d})/support({c}) = 2/2. minconf {c} {b,d} . F3 . {c,d} {b}, {b,c} {d}, {c} {b,d}.. {c} {b}, {b} {d}, {d} {b}, {c} {d}. {c,d} {b}, {b,c} {d}, {c} {b,d}. 62.


Related search queries