Improved pattern tree for incremental frequent-pattern mining

Ming Zhou; Taiyong Wang

doi:10.1007/s12209-010-0023-4

Transactions of Tianjin University ›› 2010, Vol. 16 ›› Issue (2) :129 -134. DOI: 10.1007/s12209-010-0023-4

Article

Improved pattern tree for incremental frequent-pattern mining

Ming Zhou ¹
, Taiyong Wang ¹

Author information +

History +

PDF

Abstract

By analyzing the existing prefix-tree data structure, an improved pattern tree was introduced for processing new transactions. It firstly stored transactions in a lexicographic order tree and then restructured the tree by sorting each path in a frequency-descending order. While updating the improved pattern tree, there was no need to rescan the entire new database or reconstruct a new tree for incremental updating. A test was performed on synthetic dataset T10I4D100K with 100,000 transactions and 870 items. Experimental results show that the smaller the minimum support threshold, the faster the improved pattern tree achieves over CanTree for all datasets. As the minimum support threshold increased from 2% to 3.5%, the runtime decreased from 452.71 s to 186.26 s. Meanwhile, the runtime required by CanTree decreased from 1,367.03 s to 432.19 s. When the database was updated, the execution time of improved pattern tree consisted of construction of original improved pattern trees and reconstruction of initial tree. The experiment results showed that the runtime was saved by about 15% compared with that of CanTree. As the number of transactions increased, the runtime of improved pattern tree was about 25% shorter than that of FP-tree. The improved pattern tree also required less memory than CanTree.

Keywords

data mining / association rules / improved pattern tree / incremental mining

Cite this article

Download citation ▾

Ming Zhou, Taiyong Wang. Improved pattern tree for incremental frequent-pattern mining. Transactions of Tianjin University, 2010, 16 (2) : 129-134 DOI:10.1007/s12209-010-0023-4

登录浏览全文

4963

注册一个新账户忘记密码

References

Publishing order | Descend order by publishing year | Descend order by cited within

[1]	Agrawal R, Imielinski T, Swami A. Mining association rules between sets of items in large databases[C]. In: Proceedings of the 1993 ACM-SIGMOD International Conference on Management of Data (SIGMOD’93). Washington DC, 1993. 207–216.

[2]	Wua F., Chiang S. W., Linb J. R. A new approach to mine frequent patterns using item-transformation methods[J]. Information Systems, 2007, 32(7): 1056-1072.

[3]	Lin C., Hong T., Lu Wenhsiang. The Pre-FUFP algorithm for incremental mining[J]. Expert Systems with Application, 2009, 36(5): 9498-9505.

[4]	Park J S, Chen M S, Yu P S. An effective hash-based algorithm for mining association rules[C]. In: Proceedings of the 1995 ACM-SIGMOD International Conference on Management of Data (SIGMOD’95). San Jose, 1995. 175–186.

[5]	Han J, Pei J, Yin Y. Mining frequent patterns without candidate generation[C]. In: Proceedings of the 2000 ACMSIGMOD International Conference on Management of Data (SIGMOD’2000). Dallas, 2000. 1–12.

[6]	Grahne G., Zhu J. Fast algorithms for frequent itemset mining using FP-trees[J]. IEEE Transactions on Knowledge and Data Engineering, 2005, 17(10): 1347-1362.

[7]	Leung C. K., Khan Q. I., Li Z. CanTree: A canonical-order tree for incremental frequent-pattern mining[J]. Knowledge and Information Systems, 2007, 11(3): 287-311.

[8]	Han J., Cheng H., Xin D., et al. Frequent pattern mining: Current status and future directions[J]. Data Mining and Knowledge Discovery, 2007, 15(1): 55-86.

[9]	Agarwal R. C., Aggarwal C. C., Prasad V. V. V. A tree projection algorithm for generation of frequent itemsets[J]. Journal of Parallel and Distributed Computing, 2001, 61(3): 350-361.

[10]	Pei J, Han J, Lu H et al. H-Mine: Hyper-structure mining of frequent patterns in large databases[C]. In: First IEEE International Conference on Data Mining (ICDM’2001). Midland, 2001. 441–448.

[11]	Liu G., Lu X., Yu J. X. CFP-tree: A compact disk-based structure for storing and querying frequent itemsets[J]. Information Sciences, 2007, 32(2): 295-319.

[12]	Koh J. L., Shieh S. F. An efficient approach for maintaining association rules based on adjusting FP-tree structures[C] Proceedings of the DASFAA, 2004, Berlin Heidelberg. New York: Springer-Verlag 417-424.

[13]	Li X, Deng Z H, Tang S. A fast algorithm for maintenance of association rules in incremental databases[C]. In: ADMA. Xi’an, 2006. 56–63.

[14]	Cheung W, Zaïane OR. Incremental mining of frequent patterns without candidate generation or support constraint [C]. In: Proc IDEAS 2003. Hongkong, China, 2003. 111–116.