Empirical studies show that random initialization in Particle Swarm Optimization (PSO)-based Fuzzy c-means (FCM) methods affects computational performance, especially on big data. Because data points in high-density areas are more likely to lie near cluster centroids, we design a new algorithm that guides initialization according to data density patterns. The algorithm is initialized by fusing the data characteristics near the cluster centers. Our evaluation on real data shows that the approach can significantly improve the computational performance of PSO-based fuzzy clustering methods while preserving comparable clustering performance.
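The idea of density-guided initialization can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given here. It assumes a Gaussian-kernel density estimate, a greedy choice of the densest points as starting centers, and small random jitter to spread the PSO particles around them. The function names `density_guided_centers` and `init_swarm` and the parameters `radius` and `jitter` are hypothetical.

```python
import numpy as np

def density_guided_centers(X, c, radius):
    """Pick up to c starting centers from high-density regions of X (n x d)."""
    # Pairwise squared distances. This needs O(n^2) memory, so a real
    # big-data implementation would need sampling or a spatial index.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    # Assumed density measure: sum of Gaussian kernels with bandwidth `radius`.
    density = np.exp(-d2 / (2.0 * radius ** 2)).sum(axis=1)
    available = np.ones(len(X), dtype=bool)
    centers = []
    for _ in range(c):
        if not available.any():
            break  # fewer than c well-separated dense regions were found
        idx = int(np.argmax(np.where(available, density, -np.inf)))
        centers.append(X[idx])
        # Exclude points within 2 * radius of the chosen center, so the
        # next center comes from a different dense region.
        available &= d2[idx] > (2.0 * radius) ** 2
    return np.array(centers)

def init_swarm(X, c, n_particles, radius, jitter=0.1, seed=0):
    """Build PSO particles as jittered copies of the density-guided centers."""
    rng = np.random.default_rng(seed)
    base = density_guided_centers(X, c, radius)
    noise = rng.normal(scale=jitter * radius, size=(n_particles,) + base.shape)
    # Resulting shape: (n_particles, number of centers found, d).
    return base[None, :, :] + noise
```

Each particle then starts near plausible centroids rather than at uniformly random positions, which is the intuition behind the reported gain in computational performance. The sketch makes no claim about the size of that gain.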
- Particle Swarm Optimization
- Big Data
Available at: http://works.bepress.com/hua_fang/36/