
Cross-relation based blind identification of acoustic SIMO systems and applications

File: Hu-M-2017-PhD-Thesis.pdf (Description: Thesis; Size: 3.41 MB; Format: Adobe PDF)
Title: Cross-relation based blind identification of acoustic SIMO systems and applications
Authors: Hu, Mathieu
Item Type: Thesis or dissertation
Abstract: Speech signals captured by microphones placed at a distance from the speaker are corrupted by reverberation, i.e. sound waves reflected off hard surfaces such as walls and objects. The spectral distortion caused by reverberation drastically decreases the performance of automatic speech recognition systems and may degrade the intelligibility and the quality of speech for human listeners. The increased use of devices controlled by distant speech therefore creates a need for dereverberation. A possible approach to dereverberation is that of system equalization, which consists of the blind estimation of the room impulse responses from noisy reverberant signals followed by an inversion of these impulse responses. This thesis investigates the first part of this two-stage approach. The cross-relation method is adopted and exploited in two different ways. The first way follows the adaptive filter framework, which was first introduced for blind identification of room impulse responses in the Multi-Channel Least Mean Square algorithm. By considering a block update of this stochastic gradient algorithm, a noise-robust algorithm is developed. The convergence rate of the resulting algorithm is then increased by using a locally optimal adaptive step-size. The cross-relation, expressed in the frequency domain, is then shown to contain the transfer function relating any of the microphones to a reference microphone. This relative transfer function can be used to reduce the number of variables to be estimated. However, the performance of the previous methods severely degrades when realistically long room impulse responses are considered. An alternative interpretation of the cross-relation, from an annihilation filter perspective, is therefore explored. The resulting algorithm is shown to be able to estimate room impulse responses of thousands of taps.
From a more practical perspective, the use of room impulse responses estimated with poor accuracy is investigated for the problem of speaker diarization. The spatial information captured in the direct-to-reverberant ratio is shown to be robust to high levels of error in the estimated room impulse responses. Blindly estimated direct-to-reverberant ratios combined with speech features in a single-channel diarization system are shown to provide additional information, which improves the performance of the diarization system.
Content Version: Open Access
Issue Date: Mar-2017
Date Awarded: Sep-2017
URI: http://hdl.handle.net/10044/1/52430
DOI: https://doi.org/10.25560/52430
Supervisor: Naylor, Patrick A.; Brookes, Mike
Sponsor/Funder: European Union
Funder's Grant Number: grant agreement no. ITN-GA-2012-316969; grant agreement no. 609465
Department: Electrical and Electronic Engineering
Publisher: Imperial College London
Qualification Level: Doctoral
Qualification Name: Doctor of Philosophy (PhD)
Appears in Collections: Electrical and Electronic Engineering PhD theses



Unless otherwise indicated, items in Spiral are protected by copyright and are licensed under a Creative Commons Attribution NonCommercial NoDerivatives License.
