The explosion of Big Data in recent years has created significant problems in data management and an urgent need for new methods. Data aggregation leads to information loss, so new approaches are needed to handle data appropriately. The Symbolic Data Analysis (SDA) approach considers symbolic data (i.e. interval, boxplot, or histogram data), which take into account the internal structure of the data without aggregation. In this sense, our proposal is to use beanplots to capture the variation within a specific observation. The beanplots are obtained by means of a kernel density estimate, which makes it possible to represent the original data and show their relevant features. In the temporal framework, we consider beanplot time series, that is, ordered sequences of beanplots over time. The beanplot data can be parameterized by means of mixture distribution models to retain the relevant structural information. In particular, the resulting parameters can be used in clustering and in forecasting. An important element is the possibility of also taking into account the goodness of fit of the different models obtained in the analysis. In this work we present a new clustering approach for beanplot data that takes into account constraints over time. The resulting clusters make it possible to identify homogeneous temporal periods, which can be used in applied contexts.
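As a rough illustration of the pipeline described above, the following Python sketch builds a toy beanplot time series and groups it into contiguous periods. It is not the method of this work: it compares the kernel density curves directly with an L2 distance instead of clustering mixture-model parameters, and the function names, the fixed bandwidth, the simulated data, and the adjacent-merge rule are all assumptions made for the example.

```python
import math
import random


def gaussian_kde(sample, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `sample` on `grid`."""
    norm = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)
    return [
        sum(math.exp(-0.5 * ((g - x) / bandwidth) ** 2) for x in sample) / norm
        for g in grid
    ]


def l2_distance(d1, d2, step):
    """Approximate the L2 distance between two densities on a common grid."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(d1, d2)) * step)


def mean_density(densities):
    """Pointwise average of several densities evaluated on the same grid."""
    n = len(densities)
    return [sum(column) / n for column in zip(*densities)]


def time_constrained_clusters(densities, step, n_clusters):
    """Agglomerative clustering where only temporally adjacent clusters merge."""
    clusters = [[t] for t in range(len(densities))]
    while len(clusters) > n_clusters:
        centroids = [mean_density([densities[t] for t in c]) for c in clusters]
        gaps = [
            l2_distance(centroids[i], centroids[i + 1], step)
            for i in range(len(clusters) - 1)
        ]
        i = gaps.index(min(gaps))  # closest pair of neighbouring clusters
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters


# Simulated data (an assumption for the example): six periods with 200
# observations each; the level shifts from 0 to 5 after the third period.
random.seed(0)
periods = [
    [random.gauss(0.0 if t < 3 else 5.0, 1.0) for _ in range(200)]
    for t in range(6)
]

step = 0.1
grid = [-5.0 + step * k for k in range(151)]  # common grid from -5 to 10

# One density curve per period: a toy beanplot time series.
beanplots = [gaussian_kde(sample, grid, bandwidth=0.4) for sample in periods]

clusters = time_constrained_clusters(beanplots, step, n_clusters=2)
print(clusters)
```

On this simulated series the two recovered clusters are the periods before and after the level shift. Restricting merges to neighbouring clusters is what keeps every cluster a contiguous block of time, which is one simple way to encode a temporal constraint.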
- Time Series
- Constrained Clustering
- Symbolic Data Analysis
Available at: http://works.bepress.com/carlo_drago/118/