Theory of polynomial matrix eigenvalue decomposition with applications to speech and acoustics signal processing

File: Neo-V-2022-PhD-Thesis.pdf (Thesis, 14.45 MB, Adobe PDF)
Title: Theory of polynomial matrix eigenvalue decomposition with applications to speech and acoustics signal processing
Authors: Neo Weisheng, Vincent
Item Type: Thesis or dissertation
Abstract: The eigenvalue decomposition (EVD) of covariance matrices is used in signal processing applications such as data compression, noise reduction, direction-of-arrival estimation, source separation and adaptive beamforming. These multi-channel covariance matrices are usually computed using the instantaneous data vector under the assumption of narrowband models. However, when broadband sources are involved, correlations across different sensors and temporal lags need to be considered. Hence, an EVD that removes correlations at a single time lag becomes inadequate for completely decorrelating the signals. To simultaneously capture the correlations in space, time and frequency, space-time covariance polynomial matrices have been proposed as an appropriate model for multi-channel broadband signals. The processing of polynomial matrices has motivated the development of a number of polynomial matrix eigenvalue decomposition (PEVD) algorithms. This thesis first presents an algorithmic enhancement to the well-known second-order sequential best rotation (SBR2) algorithm using Householder transformations motivated by ideas for the symmetric eigenvalue problem in numerical linear algebra. The thesis then introduces and extends polynomial matrices and PEVD to the field of speech and acoustics signal processing. A novel PEVD-based speech enhancement algorithm is proposed for noise reduction and dereverberation. Despite being a blind and unsupervised algorithm, the approach achieves good speech enhancement, and experiments show that it does not introduce any noticeable processing artefacts into the enhanced signal. While the method performs well for arbitrary microphone arrays, the computational complexity scales at best cubically with the number of signals used for processing. When the array geometry information is known and exploited, certain pre-processing reduces the dimensions for PEVD processing. 
The thesis compares different signal representations using the Karhunen-Loève transform (KLT), the spherical harmonic transform (SHT) and PEVD for a spherical microphone array geometry. PEVD is shown to provide the most compact spatio-temporal representation of the microphone signals at the cost of high complexity. In contrast, spherical harmonics provide a compact spatial representation of the sound field using eigenbeam signals at a relatively low cost. The thesis provides both theoretical and experimental evidence showing that the PEVD of a smaller number of eigenbeams achieves good speech enhancement and source separation without noticeable audible distortions. One of the goals of this thesis is to provide a bridge between the theoretical developments of PEVD and the applications for signal enhancement, compact signal representation and source separation. The work in this thesis also points the way to further development of PEVD and its applications to other speech and acoustics signal processing tasks.
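The abstract's point that an EVD at a single time lag cannot fully decorrelate broadband signals can be illustrated with a small numerical sketch. This is not code from the thesis: the two-channel toy signal, the delay of three samples, and the helper names `space_time_covariance` and `offdiag_energy` are all assumptions made for illustration, and no PEVD algorithm is implemented here. The sketch only estimates the space-time covariance R[tau] = E{x[n] x^H[n - tau]} and shows that diagonalising R[0] leaves cross-correlation at other lags.

```python
import numpy as np


def space_time_covariance(x, max_lag):
    """Estimate R[tau] = E{x[n] x^H[n - tau]} for tau = -max_lag..max_lag.

    x has shape (channels, samples). The result has shape
    (2 * max_lag + 1, channels, channels); lag tau is stored at index tau + max_lag.
    """
    m, n = x.shape
    r = np.zeros((2 * max_lag + 1, m, m), dtype=x.dtype)
    for tau in range(-max_lag, max_lag + 1):
        if tau >= 0:
            a, b = x[:, tau:], x[:, :n - tau]
        else:
            a, b = x[:, :n + tau], x[:, -tau:]
        r[tau + max_lag] = a @ b.conj().T / n
    return r


def offdiag_energy(r):
    """Sum of squared magnitudes of the off-diagonal entries over all lags."""
    mask = ~np.eye(r.shape[1], dtype=bool)
    return float(np.sum(np.abs(r[:, mask]) ** 2))


# Toy broadband scene (assumed): one white source, channel 2 is channel 1
# delayed by `delay` samples, so the channels are uncorrelated at lag zero.
rng = np.random.default_rng(0)
n_samples, delay, max_lag = 20000, 3, 5
s = rng.standard_normal(n_samples + delay)
x = np.vstack([s[delay:], s[:n_samples]])

R = space_time_covariance(x, max_lag)

# Conventional EVD of the lag-zero covariance only.
_, Q = np.linalg.eigh(R[max_lag])
y = Q.conj().T @ x
R_y = space_time_covariance(y, max_lag)

lag0_cross = abs(R_y[max_lag][0, 1])   # cross-correlation left at lag zero
residual = offdiag_energy(R_y)         # cross-correlation left over all lags

print("lag-zero cross-correlation after EVD:", lag0_cross)
print("off-diagonal energy before EVD:", offdiag_energy(R))
print("off-diagonal energy after EVD: ", residual)
```

In this toy case the lag-zero cross-correlation is removed to numerical precision, but the off-diagonal energy summed over all lags stays well above zero, because a single unitary matrix cannot align the delayed channels. This is the gap that polynomial (lag-dependent) transformations such as the PEVD are intended to close.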
Content Version: Open Access
Issue Date: May-2022
Date Awarded: Dec-2022
URI: http://hdl.handle.net/10044/1/109492
DOI: https://doi.org/10.25560/109492
Copyright Statement: Creative Commons Attribution NonCommercial Licence
Supervisor: Naylor, Patrick; Evers, Christine
Sponsor/Funder: Defence Science and Technology Agency
Department: Electrical and Electronic Engineering
Publisher: Imperial College London
Qualification Level: Doctoral
Qualification Name: Doctor of Philosophy (PhD)
Appears in Collections: Electrical and Electronic Engineering PhD theses

This item is licensed under a Creative Commons License.