Bhat, Anup B and Harish, S V and Geetha, M (2018) An Apache Spark based implementation of Single Pass Counting Algorithm for Frequent Itemset Mining. In: NCAMSE-2018, 20/06/2018, Manipal.
Abstract
Frequent Itemset Mining has evolved into a popular and general data mining task since its inception. It is an indispensable step in discovering associations between the items in a transactional database. As databases grow massively in size, sequential approaches to mining suffer bottlenecks in both memory and execution time. In this study, we explore the applicability of a customised Apriori algorithm for a parallel and distributed framework, called the Single Pass Counting algorithm, using Apache Spark and the Hadoop Distributed File System (HDFS).
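The abstract does not give implementation details, so the following is only a minimal sketch of the general idea, not the authors' code. It assumes that Single Pass Counting makes one counting pass over the data per itemset size k, in the Apriori manner: candidates are counted independently on each data partition (the map step) and the partial counts are then summed and filtered by a minimum support (the reduce step). Plain Python lists stand in for Spark partitions, and the names `count_partition`, `generate_candidates`, `spc`, and `min_support` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_partition(partition, candidates):
    """Map step (assumed): count candidate occurrences within one partition."""
    counts = Counter()
    for transaction in partition:
        items = set(transaction)
        for candidate in candidates:
            if candidate <= items:  # candidate is a subset of the transaction
                counts[candidate] += 1
    return counts


def generate_candidates(frequent, k):
    """Apriori candidate generation: join frequent (k-1)-itemsets and keep
    only those k-itemsets whose every (k-1)-subset is itself frequent."""
    candidates = set()
    for a in frequent:
        for b in frequent:
            union = a | b
            if len(union) == k and all(
                frozenset(s) in frequent for s in combinations(union, k - 1)
            ):
                candidates.add(union)
    return candidates


def spc(partitions, min_support):
    """Return {itemset: support count} for all frequent itemsets.

    `partitions` is a list of lists of transactions; it is a stand-in for
    a distributed dataset and is an assumption of this sketch.
    """
    # Pass 1: every distinct item is a candidate 1-itemset.
    candidates = {frozenset([i]) for p in partitions for t in p for i in t}
    result = {}
    k = 1
    while candidates:
        # Reduce step (assumed): sum the per-partition counts.
        total = Counter()
        for partition in partitions:
            total.update(count_partition(partition, candidates))
        frequent = {c: n for c, n in total.items() if n >= min_support}
        result.update(frequent)
        k += 1
        candidates = generate_candidates(set(frequent), k)
    return result
```

In an actual Spark job, the per-partition counting and the summation would presumably be expressed with RDD transformations rather than explicit loops; that mapping is not described in the abstract and is left out here.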
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Uncontrolled Keywords: | Frequent Itemsets, Parallel and Distributed Computing, Apache Spark, Apriori algorithm, Single Pass Counting |
Subjects: | Engineering > MIT Manipal > Computer Science and Engineering |
Depositing User: | MIT Library |
Date Deposited: | 16 Jul 2018 06:52 |
Last Modified: | 16 Jul 2018 06:52 |
URI: | http://eprints.manipal.edu/id/eprint/151590 |