Parallel Hybrid Clustering Using Genetic Programming and Multi-Objective Fitness with Density (PYRAMID)
Proceedings of the 2006 International Conference on Data Mining
  • Junping Sun, Nova Southeastern University
  • Samir Tout
  • William Sverdlik
Document Type
Article
Publication Date
6-1-2006
Abstract

Clustering is the process of locating patterns in large data sets. It is an active research area that provides value to scientific as well as business applications. Practical clustering faces several challenges, including identifying clusters of arbitrary shapes, sensitivity to the order of input, dynamic determination of the number of clusters, outlier handling, processing speed on massive data sets, handling higher dimensions, and dependence on user-supplied parameters. Many studies have addressed one or more of these challenges. This study proposes an algorithm called parallel hybrid clustering using genetic programming and multi-objective fitness with density (PYRAMID). PYRAMID combines data parallelism, a form of genetic programming, and a multi-objective density-based fitness function to resolve most of these challenges in the context of clustering, although significant ones remain unresolved, such as handling higher dimensions and dependence on user-supplied parameters. Preliminary experiments have yielded promising results.

Citation Information
Junping Sun, Samir Tout and William Sverdlik. "Parallel Hybrid Clustering Using Genetic Programming and Multi-Objective Fitness with Density (PYRAMID)" Proceedings of the 2006 International Conference on Data Mining (2006) pp. 197-203 ISSN: 0-542-56083-6
Available at: http://works.bepress.com/junping-sun/66/