Event modelling and recognition in video
| File | Description | Size | Format | |
| --- | --- | --- | --- | --- |
| Gkalelis-N-2014-PhD-Thesis.pdf | Thesis | 51.15 MB | Adobe PDF | View/Open |
Title: | Event modelling and recognition in video |
Authors: | Gkalelis, Nikolaos |
Item Type: | Thesis or dissertation |
Abstract: | The management of digital video has become a very challenging problem as the amount of video content continues to grow at a phenomenal rate. This trend necessitates the development of advanced techniques for the efficient and effective manipulation of video information. However, the performance of current video processing tools has not yet reached satisfactory levels, mainly due to the gap between computer-generated semantic descriptions of video content and the interpretations of the same content by humans, a discrepancy commonly referred to as the semantic gap. Inspired by recent studies in neuroscience suggesting that humans remember real life through past experience structured in events, in this thesis we investigate the use of appropriate models and machine learning approaches for representing and recognizing events in video. Specifically, a joint content-event model is proposed for describing video content (e.g., shots, scenes, etc.), as well as real-life events (e.g., demonstration, birthday party, etc.) and their key semantic entities (participants, location, etc.). At the core of this model is a referencing mechanism that utilizes a set of video analysis algorithms for the automatic generation of event model instances and their enrichment with semantic information extracted from the video content. In particular, a set of subclass discriminant analysis and support vector machine methods is proposed for handling data nonlinearities and addressing several limitations of current state-of-the-art approaches. These approaches are evaluated using several publicly available benchmarks particularly suited for testing the robustness and reliability of nonlinear classification methods, such as the facial image collection of the Four Face database, datasets from the UCI repository, and others. 
Moreover, the most efficient of the proposed methods are additionally evaluated on a large-scale video collection, consisting of the datasets provided in the TRECVID multimedia event detection (MED) track of 2010 and 2011, which are among the most challenging in this field, for the tasks of event detection and event recounting. This experiment is designed so that it can serve as a fundamental evaluation of the proposed joint content-event model. |
Content Version: | Open Access |
Issue Date: | Jun-2013 |
Date Awarded: | Feb-2014 |
URI: | http://hdl.handle.net/10044/1/23932 |
DOI: | https://doi.org/10.25560/23932 |
Supervisor: | Stathaki, Tania |
Department: | Electrical and Electronic Engineering |
Publisher: | Imperial College London |
Qualification Level: | Doctoral |
Qualification Name: | Doctor of Philosophy (PhD) |
Appears in Collections: | Electrical and Electronic Engineering PhD theses |