High-Dimensional Cluster Analysis with the Masked EM Algorithm

File: neco_a_00661.pdf
Description: Published version
Size: 555.09 kB
Format: Adobe PDF
Title: High-Dimensional Cluster Analysis with the Masked EM Algorithm
Author(s): Kadir, SN
Goodman, DFM
Harris, KD
Item Type: Journal Article
Abstract: Cluster analysis faces two problems in high dimensions: the "curse of dimensionality" that can lead to overfitting and poor generalization performance, and the sheer time taken for conventional algorithms to process large amounts of high-dimensional data. We describe a solution to these problems, designed for the application of spike sorting for next-generation, high-channel-count neural probes. In this problem, only a small subset of features provides information about the cluster membership of any one data vector, but this informative feature subset is not the same for all data points, rendering classical feature selection ineffective. We introduce a "masked EM" algorithm that allows accurate and time-efficient clustering of up to millions of points in thousands of dimensions. We demonstrate its applicability to synthetic data and to real-world high-channel-count spike sorting data.
Publication Date: 10-Oct-2014
Date of Acceptance: 23-May-2014
ISSN: 0899-7667
Publisher: Massachusetts Institute of Technology
Start Page: 2379
End Page: 2394
Journal / Book Title: Neural Computation
Volume: 26
Issue: 11
Copyright Statement: © 2014 Massachusetts Institute of Technology. Published under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.
Sponsor/Funder: Engineering & Physical Science Research Council (EPSRC)
Funder's Grant Number: EP/I005102/1
Keywords: Artificial Intelligence & Image Processing
MD Multidisciplinary
Publication Status: Published
Appears in Collections: Faculty of Engineering
Electrical and Electronic Engineering

Items in Spiral are protected by copyright, with all rights reserved, unless otherwise indicated.
