Improving clustering performance by incorporating uncertainty
Author(s)
Bakoben, M
Bellotti, AG
Adams, NM
Type
Journal Article
Abstract
In more challenging clustering problems, the input is not the raw data objects but parametric statistical summaries of them. For example, time series of different lengths may be clustered on the basis of parameters estimated from autoregressive models. Such summary procedures usually provide uncertainty estimates for the parameters, and ignoring this source of uncertainty affects the recovery of the true clusters. This paper is concerned with incorporating this source of uncertainty into the clustering procedure. A new dissimilarity measure is developed based on the geometric overlap of the confidence ellipsoids implied by the uncertainty estimates. In extensive simulation studies and on a synthetic time series benchmark dataset, this new measure is shown to yield improved performance over standard approaches.
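As a rough illustration of the setting the abstract describes, the sketch below summarises two time series of different lengths by least-squares autoregressive fits and keeps the covariance of each estimate. Everything in it is an assumption made for illustration: the helper names (`simulate_ar`, `fit_ar`, `uncertainty_scaled_distance`), the AR(2) coefficients, and the series lengths do not come from the paper. In particular, the distance function is a simple covariance-scaled stand-in, not the geometric-overlap measure the paper develops, which the abstract does not specify.

```python
import numpy as np


def simulate_ar(coef, n, rng, burn_in=200):
    """Simulate n points from an AR(p) process with unit-variance Gaussian noise."""
    p = len(coef)
    x = np.zeros(n + burn_in + p)
    noise = rng.standard_normal(len(x))
    for t in range(p, len(x)):
        # x[t - p:t][::-1] is (x[t-1], ..., x[t-p]), matching the coefficient order.
        x[t] = np.dot(coef, x[t - p:t][::-1]) + noise[t]
    return x[-n:]


def fit_ar(series, order=2):
    """Least-squares AR(order) fit; returns (coefficients, covariance of the estimate)."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    # Column k of the design matrix holds the lag-(k+1) values.
    X = np.column_stack([x[order - k - 1:len(x) - k - 1] for k in range(order)])
    y = x[order:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - order)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef, cov


def uncertainty_scaled_distance(a, cov_a, b, cov_b):
    """Illustrative stand-in only: parameter difference scaled by the summed
    covariances. This is NOT the overlap-based measure from the paper."""
    diff = a - b
    return float(np.sqrt(diff @ np.linalg.solve(cov_a + cov_b, diff)))


rng = np.random.default_rng(0)
true_coef = np.array([0.6, -0.3])  # hypothetical stationary AR(2) process

# Same generating process, very different series lengths.
coef_short, cov_short = fit_ar(simulate_ar(true_coef, 50, rng))
coef_long, cov_long = fit_ar(simulate_ar(true_coef, 2000, rng))

plain = float(np.linalg.norm(coef_short - coef_long))
scaled = uncertainty_scaled_distance(coef_short, cov_short, coef_long, cov_long)
print("Euclidean distance between estimates:", plain)
print("Covariance-scaled distance:", scaled)
print("Total variance (short, long):", np.trace(cov_short), np.trace(cov_long))
```

The short series yields a much less certain estimate than the long one, which is the information a plain Euclidean distance between parameter vectors discards.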
Date Issued
2016-03-11
Date Accepted
2016-03-02
Citation
Pattern Recognition Letters, 2016, 77, pp.28-34
ISSN
1872-7344
Publisher
Elsevier
Start Page
28
End Page
34
Journal / Book Title
Pattern Recognition Letters
Volume
77
Copyright Statement
© 2016, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Subjects
Artificial Intelligence & Image Processing
0801 Artificial Intelligence And Image Processing
0906 Electrical And Electronic Engineering
1702 Cognitive Science
Publication Status
Published