Complex Filters and Higher-Order Spatial Information for Image Categorization
Author(s)
Alexiou, Ioannis
Type
Thesis or dissertation
Abstract
This thesis applies complex spatial filters to the front-end filtering of a
computer vision framework for object recognition and scene categorization.
This involves careful filter design in the Fourier domain based
on discrete frame properties. The biological plausibility of the proposed
filtering is compared against a common model found in the computer
vision literature. The designed complex filter bank is equipped with
focus-of-attention operators. Specifically, two possible keypoint detection
methodologies are examined and compared with state-of-the-art keypoint
detection methods. This includes an investigation of scale-estimation
methods. In addition, three image patch descriptor arrangements are
proposed to sample the complex filter responses, and an initial evaluation
of categorization performance is undertaken. Next, the spatial pooling
arrangement of the best-performing descriptor is further optimised and
the performance of different complex filter bandwidths is examined in
class separation tasks. A further study is conducted on the effects of a
Winner-Take-All (WTA) approach to modifying filter responses before
pooling. A thorough evaluation of descriptor performance is undertaken
to reveal any advantages or disadvantages from a variety of perspectives.
Next, the clustering behaviour of descriptors of various types is inspected
in the descriptor feature space. A reverse look-up of visual words attempts
to relate clustering behaviour to descriptor performance. Typical
grouping approaches, such as spatial pyramids, are then compared with
a novel method for coupling visual words in which a linear kernel SVM
learns class separability. A final evaluation of this stage is presented
and discussed, leading to conclusive arguments about the importance of
careful approaches to word-pairing for good-quality categorization.
Date Issued
2013-02
Date Awarded
2013-07
Advisor
Bharath, Anil
Publisher Department
Bioengineering
Publisher Institution
Imperial College London
Qualification Level
Doctoral
Qualification Name
Doctor of Philosophy (PhD)