Audio-visual tracking by density approximation in a sequential Bayesian filtering framework
Title: | Audio-visual tracking by density approximation in a sequential Bayesian filtering framework |
Authors: | Gebru, ID; Evers, C; Naylor, PA; Horaud, R |
Item Type: | Conference Paper |
Abstract: | This paper proposes a novel audio-visual tracking approach that constructively exploits the audio and visual modalities to estimate the trajectories of multiple people in a joint state space. The tracking problem is modeled using a sequential Bayesian filtering framework. Within this framework, we propose to represent the posterior density with a Gaussian Mixture Model (GMM). To ensure that a GMM representation can be retained sequentially over time, the predictive density is approximated by a GMM using the Unscented Transform. A density interpolation technique is introduced to obtain a continuous representation of the observation likelihood, which is also a GMM. Furthermore, to prevent the number of mixture components from growing exponentially over time, a density approximation based on the Expectation Maximization (EM) algorithm is applied, resulting in a compact GMM representation of the posterior density. Recordings made with a camcorder and a microphone array are used to evaluate the proposed approach, demonstrating significant improvements in tracking performance of the proposed audio-visual approach compared to two benchmark visual trackers. |
Date of Acceptance: | 24-Jan-2017 |
URI: | http://hdl.handle.net/10044/1/44900 |
DOI: | 10.1109/HSCMA.2017.7895564 |
ISBN: | 9781509059256 |
Publisher: | IEEE |
Start Page: | 71 |
End Page: | 75 |
Journal / Book Title: | Hands-free Speech Communication and Microphone Arrays |
Copyright Statement: | © 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
Sponsor/Funder: | Commission of the European Communities |
Funder's Grant Number: | 609465 |
Conference Name: | HSCMA 2017 |
Keywords: | Science & Technology; Technology; Engineering, Electrical & Electronic; Telecommunications; Engineering; Motion estimation; Speech processing; Machine vision; Bayes methods; Audio-visual systems |
Publication Status: | Published |
Start Date: | 2017-03-01 |
Finish Date: | 2017-03-03 |
Conference Place: | San Francisco |
Online Publication Date: | 2017-04-13 |
Appears in Collections: | Electrical and Electronic Engineering; Faculty of Engineering |
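
Illustrative note on the abstract: the prediction step it describes, in which each Gaussian component is passed through a motion model with the Unscented Transform so that the predictive density stays a GMM, can be sketched as follows. This is a minimal one-dimensional sketch under stated assumptions, not the authors' implementation. The function names (`unscented_transform_1d`, `predict_gmm`), the scaling parameter `kappa`, the additive process-noise variance, and the example motion model are all hypothetical and are not taken from the paper.

```python
import math


def unscented_transform_1d(mean, var, f, kappa=2.0):
    """Propagate a 1-D Gaussian N(mean, var) through f using sigma points.

    Uses the classic three-point scheme for state dimension n = 1.
    kappa is an assumed tuning parameter (n + kappa = 3 is a common choice).
    """
    n = 1
    spread = math.sqrt((n + kappa) * var)
    # Sigma points: the mean and one point on either side of it.
    points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    # Push every sigma point through the (possibly nonlinear) function.
    ys = [f(x) for x in points]
    # Recover the mean and variance of the transformed density.
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var


def predict_gmm(gmm, f, process_var=0.0):
    """Apply the transform to each (weight, mean, var) component.

    Mixture weights are kept unchanged; process_var is an assumed
    additive process-noise variance.
    """
    predicted = []
    for weight, mean, var in gmm:
        m, v = unscented_transform_1d(mean, var, f)
        predicted.append((weight, m, v + process_var))
    return predicted


if __name__ == "__main__":
    # Hypothetical two-component prior and a mildly nonlinear motion model.
    prior = [(0.3, 0.0, 1.0), (0.7, 2.0, 0.5)]
    for component in predict_gmm(prior, lambda x: x + 0.1 * math.sin(x), 0.05):
        print(component)
```

For a linear motion model the transform reproduces the exact mean and variance, which gives a quick sanity check. The EM-based reduction step mentioned in the abstract is not shown here.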