Improved Parallel Pattern Growth Data Mining Algorithm

P. Asha; T. Jebarajan

doi:10.15866/irecos.v9i1.1033

Abstract

Data mining techniques that extract information from huge amounts of data have become popular in many applications. Many algorithms have been designed to analyze such volumes of data automatically and efficiently. To improve the performance of a data mining task, parallel mining is preferable to sequential mining. Association Rule Mining (ARM) is a data mining technique that aims to discover patterns/rules among items in a large database of variable-length transactions. This paper proposes a parallel Frequent Pattern Tree Growth algorithm. The task is parallelized by partitioning the database and sending the partitions to the compute nodes; the partial results are then merged at the head node. Efficient partitioning and parallelization yield good performance. The retrieved association rules are filtered using various rule interestingness measures. The performance of the parallel FP-Tree algorithm is then compared with that of the RapidMiner toolkit.
Copyright © 2014 Praise Worthy Prize - All rights reserved.
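
The partition-and-merge scheme outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: plain itemset counting stands in for FP-Tree growth, a thread pool stands in for the compute nodes, and confidence and lift stand in for the unspecified interestingness measures. All function names (partition, local_counts, mine, rules) and the max_len limit are assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def partition(transactions, n_parts):
    """Split the transaction list into n_parts roughly equal slices."""
    return [transactions[i::n_parts] for i in range(n_parts)]


def local_counts(part, max_len=2):
    """Worker step: count every itemset of size <= max_len in one partition.

    Assumption: exhaustive counting replaces FP-Tree growth for brevity.
    """
    counts = Counter()
    for transaction in part:
        items = sorted(set(transaction))
        for size in range(1, max_len + 1):
            counts.update(combinations(items, size))
    return counts


def mine(transactions, min_support, n_parts=2, max_len=2):
    """Head-node step: sum the local counts, then apply the global threshold."""
    parts = partition(transactions, n_parts)
    merged = Counter()
    # Threads are a stand-in for compute nodes (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        for counts in pool.map(lambda p: local_counts(p, max_len), parts):
            merged.update(counts)
    threshold = min_support * len(transactions)
    return {itemset: c for itemset, c in merged.items() if c >= threshold}


def rules(frequent, n_transactions, min_confidence):
    """Derive A -> B rules from frequent pairs, scored by confidence and lift."""
    out = []
    for itemset, count in frequent.items():
        if len(itemset) != 2:
            continue
        for a, b in (itemset, itemset[::-1]):
            confidence = count / frequent[(a,)]
            lift = confidence / (frequent[(b,)] / n_transactions)
            if confidence >= min_confidence:
                out.append((a, b, confidence, lift))
    return out
```

Because support counts are additive over disjoint partitions, summing the local counts before thresholding gives the same frequent itemsets as a sequential pass, regardless of the number of partitions. Calling mine(transactions, 0.5) followed by rules(frequent, len(transactions), 0.9) on a small list of item lists is enough to try it out.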

Keywords

Frequent Patterns Mining; Sequential Mining; Parallel Processing; Association Rule; Interestingness Measures
