A polynomial subspace projection approach for the detection of weak voice activity
Author(s)
Neo, Vincent Weisheng
Weiss, Stephan
Naylor, Patrick A
Type
Conference Paper
Abstract
A voice activity detection (VAD) algorithm identifies whether or not time frames contain speech. It is essential for many military and commercial speech processing applications, including speech enhancement, speech coding, speaker identification, and automatic speech recognition. In this work, we adopt earlier work on detecting weak transient signals and propose a polynomial subspace projection pre-processor to improve an existing VAD algorithm. The proposed multi-channel pre-processor projects the microphone signals onto a lower-dimensional subspace, which attempts to remove the interferer components and thus eases the detection of the speech target. Compared with applying the same VAD directly to the microphone signal, the proposed approach almost always improves the F1 and balanced accuracy scores, even in adverse environments such as a signal-to-interference ratio (SIR) of -30 dB, which may be typical of operations involving noisy machinery and signal-jamming scenarios.
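For readers unfamiliar with the two scores named in the abstract, the following minimal sketch shows how F1 and balanced accuracy are commonly computed from per-frame binary VAD decisions. The function name `vad_scores` and the example frame labels are hypothetical illustrations; this record does not describe the paper's own evaluation setup, so this should not be read as the authors' procedure.

```python
def vad_scores(reference, predicted):
    """Return (F1, balanced accuracy) for per-frame binary VAD decisions.

    reference, predicted: equal-length sequences of 0/1 labels,
    where 1 marks a frame containing speech. Hypothetical helper,
    not taken from the paper.
    """
    if len(reference) != len(predicted):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)          # speech detected as speech
    tn = sum(1 for r, p in pairs if not r and not p)  # non-speech kept as non-speech
    fp = sum(1 for r, p in pairs if not r and p)      # false alarms
    fn = sum(1 for r, p in pairs if r and not p)      # missed speech frames

    # F1 is the harmonic mean of precision and recall.
    f1_den = 2 * tp + fp + fn
    f1 = 2 * tp / f1_den if f1_den else 0.0

    # Balanced accuracy averages the true positive and true negative rates,
    # so it is not dominated by the majority class when speech is sparse.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return f1, (tpr + tnr) / 2


# Made-up example: 8 frames, 3 of which contain speech.
reference = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]
f1, balanced_accuracy = vad_scores(reference, predicted)
print(f"F1 = {f1:.3f}, balanced accuracy = {balanced_accuracy:.3f}")
```

Balanced accuracy is a natural companion to F1 when speech frames are much rarer than non-speech frames, because plain accuracy would then reward a detector that never reports speech.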
Date Issued
2022-09-23
Date Acceptance
2022-06-28
Citation
Sensor Signal Processing for Defence, 2022, pp. 1-5
Publisher
IEEE
Start Page
1
End Page
5
Journal / Book Title
Sensor Signal Processing for Defence
Copyright Statement
Copyright © 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Identifier
https://ieeexplore.ieee.org/document/9896222
Source
Sensor Signal Processing for Defence conference (SSPD)
Publication Status
Published
Start Date
2022-09-13
Finish Date
2022-09-14
Coverage Spatial
London, UK
Date Publish Online
2022-09-23