Cross-relation based blind identification of acoustic SIMO systems and applications
File | Description | Size | Format | |
---|---|---|---|---|
Hu-M-2017-PhD-Thesis.pdf | Thesis | 3.41 MB | Adobe PDF | View/Open |
Title: | Cross-relation based blind identification of acoustic SIMO systems and applications |
Authors: | Hu, Mathieu |
Item Type: | Thesis or dissertation |
Abstract: | Speech signals captured by microphones placed at a distance from the speaker are corrupted by reverberation, i.e. sound waves reflected off hard surfaces such as walls and objects. The spectral distortion caused by reverberation drastically decreases the performance of automatic speech recognition systems and may degrade the intelligibility and the quality of speech for human listeners. The increased use of devices controlled by distant speech therefore creates a need for dereverberation. A possible approach to dereverberation is system equalization, which consists of the blind estimation of the room impulse responses from noisy reverberant signals followed by an inversion of these impulse responses. This thesis investigates the first part of this two-stage approach. The cross-relation method is adopted and exploited in two different ways. The first way follows the adaptive filter framework, which was first introduced in the context of blind identification of room impulse responses in the Multi-Channel Least Mean Square algorithm. By considering a block update of this stochastic gradient algorithm, a noise-robust algorithm is developed. The convergence rate of the resulting algorithm is then increased by using a locally optimal adaptive step-size. The cross-relation, expressed in the frequency domain, is then shown to contain the transfer function relating any of the microphones to a reference microphone. This relative transfer function can be used to reduce the number of variables to be estimated. However, the performance of the previous methods severely degrades when realistically long room impulse responses are considered. An alternative interpretation of the cross-relation, from an annihilation filter perspective, is therefore explored. The resulting algorithm is shown to be able to estimate room impulse responses of thousands of taps.
From a more practical perspective, the use of room impulse responses estimated with poor accuracy is investigated for the problem of speaker diarization. The spatial information captured in the direct-to-reverberant ratio is shown to be robust to high levels of error in the estimated room impulse responses. Blindly estimated direct-to-reverberant ratios combined with speech features in a single-channel diarization system are shown to provide additional information, which improves the performance of the diarization system. |
Content Version: | Open Access |
Issue Date: | Mar-2017 |
Date Awarded: | Sep-2017 |
URI: | http://hdl.handle.net/10044/1/52430 |
DOI: | https://doi.org/10.25560/52430 |
Supervisor: | Naylor, Patrick A.; Brookes, Mike |
Sponsor/Funder: | European Union |
Funder's Grant Number: | grant agreement no. ITN-GA-2012-316969; grant agreement no. 609465 |
Department: | Electrical and Electronic Engineering |
Publisher: | Imperial College London |
Qualification Level: | Doctoral |
Qualification Name: | Doctor of Philosophy (PhD) |
Appears in Collections: | Electrical and Electronic Engineering PhD theses |
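
As an illustration of the cross-relation mentioned in the abstract, the following minimal sketch identifies two short channels from noise-free microphone signals. It is not taken from the thesis: it uses the classical batch least-squares formulation (smallest right singular vector of a stacked convolution matrix) rather than the adaptive or annihilation-filter algorithms the thesis develops, and the helper names `convolution_matrix` and `cross_relation_estimate`, the channel length `L = 8`, and the random test signals are all assumptions chosen for the example.

```python
import numpy as np

def convolution_matrix(x, L):
    """Return the (len(x) + L - 1) x L matrix C with C @ h == np.convolve(x, h)."""
    N = len(x)
    C = np.zeros((N + L - 1, L))
    for k in range(L):
        C[k:k + N, k] = x
    return C

def cross_relation_estimate(x1, x2, L):
    """Estimate two length-L channels, up to a common scale factor (illustrative helper)."""
    # Cross-relation for a noise-free SIMO system: x1 * h2 - x2 * h1 = 0,
    # written as the linear system [-C(x2)  C(x1)] [h1; h2] = 0.
    A = np.hstack([-convolution_matrix(x2, L), convolution_matrix(x1, L)])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A, full_matrices=False)
    h = vt[-1]
    return h[:L], h[L:]

# Synthetic test case (assumed values, not data from the thesis).
rng = np.random.default_rng(0)
L = 8
s = rng.standard_normal(400)          # unknown source signal
h1_true = rng.standard_normal(L)      # channel to microphone 1
h2_true = rng.standard_normal(L)      # channel to microphone 2
x1 = np.convolve(s, h1_true)          # observed microphone signals
x2 = np.convolve(s, h2_true)

h1_est, h2_est = cross_relation_estimate(x1, x2, L)

# The estimate is only defined up to scale and sign, so compare directions.
h_true = np.concatenate([h1_true, h2_true])
h_est = np.concatenate([h1_est, h2_est])
alignment = abs(h_true @ h_est) / (np.linalg.norm(h_true) * np.linalg.norm(h_est))
print(f"normalized alignment with true channels: {alignment:.6f}")
```

In this noise-free setting with channels that share no common zeros, the alignment should be very close to 1. With additive noise or realistically long room impulse responses this batch formulation degrades, which is the regime the abstract says the thesis addresses.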