An attention model and its application in man-made scene interpretation

Jahangiri, Mohammad

File	Description	Size	Format
Jahangiri-M-2009-PhD-Thesis.pdf		10.07 MB	Adobe PDF

Title:	An attention model and its application in man-made scene interpretation
Authors:	Jahangiri, Mohammad
Item Type:	Thesis or dissertation
Abstract:	The ultimate aim of computer vision research is to design a system that interprets its surrounding environment as effortlessly as a human does. However, the current state of technology is far from achieving this goal. This thesis describes the components of a computer vision system designed to interpret man-made scenes, in particular images of buildings. The flow of information in the proposed system is bottom-up: the image is first segmented into its meaningful components, and the resulting regions are then labelled using a contextual classifier. Starting from simple observations about the human visual system and the Gestalt laws of perception, such as the law of “good (simple) shape” and “perceptual grouping”, a blob detector is developed that identifies components in a 2D image. These components are convex regions of interest, where interest is defined as significant gradient magnitude content. An eye-tracking experiment shows that the regions identified by the blob detector correlate significantly with the regions that attract the attention of viewers. It is then postulated that each blob represents an object with its own semantic name; in other words, a blob may contain a window, a door or a chimney in a building. The blobs are used to identify and segment higher-order structures in a building, such as the facade and window arrays, as well as environmental regions such as sky and ground. Because the unary features of buildings are inconsistent, a contextual learning algorithm is used to classify the segmented regions. A model learns the spatial and topological relationships between different objects from a set of hand-labelled data and uses this information in a Markov random field (MRF) to achieve consistent labellings of new scenes.
Issue Date:	2009
Date Awarded:	Feb-2010
URI:	http://hdl.handle.net/10044/1/5521
DOI:	https://doi.org/10.25560/5521
Supervisor:	Petrou, Maria
Author:	Jahangiri, Mohammad
Department:	Electrical and Electronic Engineering
Publisher:	Imperial College London
Qualification Level:	Doctoral
Qualification Name:	Doctor of Philosophy (PhD)
Appears in Collections:	Electrical and Electronic Engineering PhD theses

Unless otherwise indicated, items in Spiral are protected by copyright and are licensed under a Creative Commons Attribution NonCommercial NoDerivatives License.