Add abstract
Want to add your dissertation abstract to this database? It only takes a minute!
Search abstract
Search for abstracts by subject, author or institution
Want to add your dissertation abstract to this database? It only takes a minute!
Search for abstracts by subject, author or institution
High Utility Itemsets Identification in Big Data
by Ashish Tamrakar
Institution: | University of Nevada Las Vegas |
---|---|
Year: | 2017 |
Keywords: | Data mining; itemset mining; parallel computing; spark; Computer Sciences |
Posted: | 02/01/2018 |
Record ID: | 2151495 |
Full text PDF: | http://digitalscholarship.unlv.edu/thesesdissertations/3044 |
High utility itemset mining is an important data mining problem which considers profit factors besides quantity from the transactional database. It helps find the most valuable products/items that are difficult to track using only the frequent data mining set. An item that has a high-profit value might be rare in the transactional database despite its tremendous importance. While there are many existing algorithms which generate comparatively large candidate sets while finding high utility itemsets, the major focus is to reduce the computational time significantly with the introduction of pruning strategies. Another aspect of high utility itemset mining is to compute the large dataset. There are very few algorithms that can handle a large dataset to find high utility itemset mining in a parallel (distributed) system. In this thesis, there are two proposed methods: 1) High utility itemset mining using pruning strategies approach (HUI-PR) and 2) Parallel EFIM (EFIM-Par). In the method I, the proposed algorithm constructs the candidate sets in the form of a tree structure, which traverses the itemsets with High Transaction-Weighted Utility (HTWUIs). It uses a pruning strategies to reduce the computational time by refraining the visit to unnecessary nodes of an itemset to reduce the search space. It significantly minimizes the transaction database generated on each node. In the method II, the distributed approach is proposed dividing the search space among different worker nodes to compute high utility itemsets which are aggregated to find the result. The experimental results for both methods show that they significantly improve the execution time for computing the high utility itemsets. Advisors/Committee Members: Justin Zhan, Laxmi Gewali, Fatma Nasoz, Ge Lin Kan.
Want to add your dissertation abstract to this database? It only takes a minute!
Search for abstracts by subject, author or institution
Electric Cooperative Managers' Strategies to Enhan...
|
|
Bullied!
Coping with Workplace Bullying
|
|
The Filipina-South Floridian International Interne...
Agency, Culture, and Paradox
|
|
Solution or Stalemate?
Peace Process in Turkey, 2009-2013
|
|
Performance, Managerial Skill, and Factor Exposure...
|
|
The Deritualization of Death
Toward a Practical Theology of Caregiving for the ...
|
|
Emotional Intelligence and Leadership Styles
Exploring the Relationship between Emotional Intel...
|
|
Commodification of Sexual Labor
Contribution of Internet Communities to Prostituti...
|
|
The Census of Warm Debris Disks in the Solar Neigh...
|
|
Risk Factors and Business Models
Understanding the Five Forces of Entrepreneurial R...
|
|