Frequent Itemset Mining Based on Development of FP-growth Algorithm and Use MapReduce Technique
Abstract
Finding frequent itemsets in big data is an important task in data mining and knowledge
discovery. With the exponential daily growth of data, known as “Big Data”, mining frequent patterns from
huge volumes of data poses many challenges, including memory requirements, multiple data dimensions,
and heterogeneity of data. The complexity of mining frequent itemsets from Big Data can be reduced
by using a modified FP-growth algorithm and parallelizing the mining task with the MapReduce framework in
Hadoop. This paper presents a modified FP-growth algorithm, based on a directed graph and run on the
Hadoop framework, that reduces the execution time for massive databases and works efficiently across
multiple nodes (computers). Our experimental results demonstrate that the proposed algorithm scales well
and processes large datasets efficiently. In addition, it improves both the memory consumption needed to
store frequent patterns and the time complexity.