
Ensemble rapid centroid estimation : a semi-stochastic consensus particle swarm approach for large scale cluster optimization

by Mitchell Yuwono




Institution: University of Technology, Sydney
Year: 2015
Record ID: 1043659
Full text PDF: http://hdl.handle.net/10453/34454


Abstract

This thesis presents rigorous theoretical and empirical analyses of prior clustering work based on Particle Swarm Optimization (PSO) principles. In particular, we identify shortcomings in Van Der Merwe and Engelbrecht’s PSO clustering, Cohen and de Castro’s Particle Swarm Clustering (PSC), Szabo’s modified PSC (mPSC), and Szabo’s Fuzzy PSC (FPSC). We show, both theoretically and empirically, that Van Der Merwe and Engelbrecht’s PSO clustering algorithm is not significantly better than conventional k-means. We propose that, under random initialization, the performance of their algorithm diminishes exponentially as the number of classes or dimensions increases. We find that the PSC, mPSC, and FPSC algorithms suffer from significant complexity that does not translate into performance gains. Their cognitive and social parameters have a negligible effect on convergence, and the algorithms generalize to k-means, retaining all of its characteristics, including the most severe: the curse of initial position. Furthermore, we observe that the three algorithms, although proposed under different names and at different times, behave similarly to the original PSC. The thesis then analyzes, both theoretically and empirically, the strengths and limitations of our proposed semi-stochastic particle swarm clustering algorithm, Rapid Centroid Estimation (RCE), together with self-evolutionary Ensemble RCE (ERCE) and Consensus Engrams, which were developed mainly to address the fundamental issues in PSO clustering and the PSC family. These algorithms extend the scalability, applicability, and reliability of earlier approaches to handle large-scale non-convex cluster optimization with quasilinear complexity in both time and space. This thesis establishes the fundamentals in considerably greater depth than our published manuscripts.