Analysis of phonetic dependence of segmentation errors in speaker diarization
File | Description | Size | Format |
---|---|---|---|
EUSIPCO2020_McKnight.pdf | Submitted version | 716.71 kB | Adobe PDF |
Title: | Analysis of phonetic dependence of segmentation errors in speaker diarization |
Authors: | McKnight, SW; Hogg, A; Naylor, P |
Item Type: | Conference Paper |
Abstract: | Evaluation of speaker segmentation and diarization normally makes use of forgiveness collars around ground truth speaker segment boundaries such that estimated speaker segment boundaries within such collars are considered completely correct. This paper shows that the popular recent approach of removing forgiveness collars from speaker diarization evaluation tools can unfairly penalize speaker diarization systems that correctly estimate speaker segment boundaries. The uncertainty in identifying the start and/or end of a particular phoneme means that the ground truth segmentation is not perfectly accurate, and even trained human listeners are unable to identify phoneme boundaries with full consistency. This research analyses the phoneme dependence of this uncertainty, and shows that it depends on (i) whether the phoneme being detected is at the start or end of an utterance and (ii) what the phoneme is, so that the use of a uniform forgiveness collar is inadequate. This analysis is expected to point the way towards more indicative and repeatable assessment of the performance of speaker diarization systems. |
Date of Acceptance: | 29-May-2020 |
URI: | http://hdl.handle.net/10044/1/80786 |
DOI: | 10.23919/Eusipco47968.2020.9287552 |
ISBN: | 978-9-0827-9705-3 |
ISSN: | 2076-1465 |
Publisher: | IEEE |
Journal / Book Title: | 2020 28th European Signal Processing Conference (EUSIPCO) |
Copyright Statement: | © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
Conference Name: | European Signal Processing Conference (EUSIPCO) |
Keywords: | Science & Technology; Technology; Acoustics; Computer Science, Software Engineering; Engineering, Electrical & Electronic; Imaging Science & Photographic Technology; Computer Science; Engineering; Speaker diarization; forgiveness collar; phoneme boundary; diarization scoring |
Publication Status: | Published online |
Start Date: | 2021-01-18 |
Finish Date: | 2021-01-22 |
Conference Place: | Amsterdam, NL |
Online Publication Date: | 2020-12-18 |
Appears in Collections: | Electrical and Electronic Engineering; Faculty of Engineering |