Speaker segmentation and clustering
File(s)
SP_Elsevier_2008_Margarita_Kotti.pdf (347.29 KB)
Accepted version
Author(s)
Kotti, Margarita
Moschou, Vassiliki
Kotropoulos, Constantine
Type
Journal Article
Abstract
This survey focuses on two challenging speech processing topics: speaker segmentation and speaker clustering. Speaker segmentation aims at finding speaker change points in an audio stream, whereas speaker clustering aims at grouping speech segments based on speaker characteristics. Model-based, metric-based, and hybrid speaker segmentation algorithms are reviewed. Concerning speaker clustering, deterministic and probabilistic algorithms are examined. A comparative assessment of the reviewed algorithms is undertaken, their advantages and disadvantages are indicated, insight into the algorithms is offered, and deductions as well as recommendations are given. Rich transcription and movie analysis are candidate applications that benefit from combined speaker segmentation and clustering. © 2007 Elsevier B.V. All rights reserved.
Date Issued
2008-05
Citation
Signal Processing, 2008, 88 (5), pp.1091-1124
ISSN
0165-1684
Publisher
Elsevier
Start Page
1091
End Page
1124
Journal / Book Title
Signal Processing
Volume
88
Issue
5
Copyright Statement
© 2007 Elsevier B.V. All rights reserved. NOTICE: this is the author’s version of a work that was accepted for publication in Signal Processing. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in SIGNAL PROCESSING, Vol.:88, Issue:5, (2008), DOI: 10.1016/j.sigpro.2007.11.017
Description
07.08.13 KB. OK to add the accepted version to Spiral; Elsevier says OK while mandate not enforced.