High-dimensional cluster analysis with the masked EM algorithm
File(s)
neco_a_00661.pdf (555.09 KB)
Published version
Author(s)
Kadir, SN
Goodman, DFM
Harris, KD
Type
Journal Article
Abstract
Cluster analysis faces two problems in high dimensions: the "curse of dimensionality" that can lead to overfitting and poor generalization performance and the sheer time taken for conventional algorithms to process large amounts of high-dimensional data. We describe a solution to these problems, designed for the application of spike sorting for next-generation, high-channel-count neural probes. In this problem, only a small subset of features provides information about the cluster membership of any one data vector, but this informative feature subset is not the same for all data points, rendering classical feature selection ineffective. We introduce a "masked EM" algorithm that allows accurate and time-efficient clustering of up to millions of points in thousands of dimensions. We demonstrate its applicability to synthetic data and to real-world high-channel-count spike sorting data.
Date Issued
2014-11-01
Date Acceptance
2014-05-23
Citation
Neural Computation, 2014, 26 (11), pp.2379-2394
ISSN
0899-7667
Publisher
Massachusetts Institute of Technology Press
Start Page
2379
End Page
2394
Journal / Book Title
Neural Computation
Volume
26
Issue
11
Copyright Statement
© 2014 Massachusetts Institute of Technology. Published under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.
License URL
Sponsor
Engineering & Physical Science Research Council (EPSRC)
Identifier
https://www.mitpressjournals.org/doi/full/10.1162/NECO_a_00661
Grant Number
EP/I005102/1
Subjects
Science & Technology
Technology
Life Sciences & Biomedicine
Computer Science, Artificial Intelligence
Neurosciences
Computer Science
Neurosciences & Neurology
MODEL
MIXTURES
Action Potentials
Algorithms
Cluster Analysis
Humans
Models, Neurological
Models, Theoretical
Neurons
q-bio.QM
cs.LG
q-bio.NC
stat.AP
Artificial Intelligence & Image Processing
Publication Status
Published
Date Publish Online
2014-10-10