Spatial and temporal analysis of facial actions
File(s)
Author(s)
Jiang, Bihan
Type
Thesis or dissertation
Abstract
Facial expression recognition has been an active topic in computer vision since the 1990s due
to its wide applications in human-computer interaction, entertainment, security, and health
care. Previous work on automatic analysis of facial expressions has focused mostly on
detecting prototypic expressions of basic emotions like happiness and anger. In contrast,
the Facial Action Coding System (FACS) is one of the most comprehensive and objective
ways to describe facial expressions. It associates facial expressions with the actions of
the muscles that produce them by defining a set of atomic movements called Action Units
(AUs). The system allows any facial expression to be uniquely described by a combination of AUs. Over the past decades, extensive research has been conducted by psychologists
and neuroscientists on various applications of facial expression analysis using FACS. Automating FACS coding would make this research faster and more widely applicable, opening up new avenues to understanding how we communicate through facial expressions.
Morphology and dynamics are the two aspects of facial actions that are crucial for the
interpretation of human facial behaviour. The focus of this thesis is how to represent and
learn the rich facial texture changes in both the spatial and temporal domains. The effectiveness of spatial and spatio-temporal facial representations and their roles in detecting the activation and temporal dynamics of facial actions are explored. In the spatial domain, a novel feature extraction strategy is proposed based on heuristically defined regions; a separate classifier is trained for each region and the classifiers are fused at the decision level. In the temporal
domain, a novel dynamic appearance descriptor is presented, obtained by extending the static appearance descriptor Local Phase Quantisation (LPQ) to the temporal domain using Three Orthogonal Planes (TOP). The resulting dynamic appearance descriptor, LPQ-TOP,
is applied to capture the latent temporal information representing facial appearance changes and to explicitly model the facial dynamics of AUs in terms of their temporal segments. Finally, a parametric temporal alignment method is proposed. This strategy accommodates very flexible time-warp functions and can handle both sequence-to-sequence and subsequence alignment. The method also opens up a new approach to the problem of AU temporal segment detection.
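The Three Orthogonal Planes idea underlying LPQ-TOP can be sketched as follows: a per-plane descriptor is computed on the spatial (XY) plane and the two spatio-temporal (XT, YT) planes of a video volume, and the three histograms are concatenated. The sketch below is illustrative only: it uses a simple gradient-orientation histogram as a stand-in for the actual LPQ code histogram, and samples only the central slice of each plane, whereas the real descriptor aggregates over all slices.

```python
import numpy as np

def plane_histogram(plane, bins=8):
    # Placeholder per-plane descriptor: a gradient-orientation histogram.
    # In LPQ-TOP this would be the histogram of Local Phase Quantisation codes.
    gy, gx = np.gradient(plane.astype(float))
    angles = np.arctan2(gy, gx)                      # orientations in [-pi, pi]
    hist, _ = np.histogram(angles, bins=bins, range=(-np.pi, np.pi))
    return hist / max(hist.sum(), 1)                 # normalise to sum 1

def top_descriptor(volume, bins=8):
    """Concatenate per-plane descriptors from the Three Orthogonal Planes
    of a T x H x W video volume: XY (appearance), XT and YT (dynamics)."""
    t, h, w = volume.shape
    xy = volume[t // 2, :, :]   # central spatial slice (appearance)
    xt = volume[:, h // 2, :]   # horizontal space-time slice (motion)
    yt = volume[:, :, w // 2]   # vertical space-time slice (motion)
    return np.concatenate([plane_histogram(p, bins) for p in (xy, xt, yt)])

video = np.random.rand(16, 32, 32)  # toy 16-frame clip
desc = top_descriptor(video)
print(desc.shape)  # (24,) -> 3 planes x 8 bins
```

Because the XT and YT planes mix one spatial axis with time, their histograms encode how the texture evolves across frames, which is what lets a TOP-style descriptor represent facial dynamics rather than only static appearance.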
This thesis contributes to facial action recognition by modelling the spatial and temporal texture changes for AU activation detection and AU temporal segmentation. We
advance the state of the art in facial action recognition, as demonstrated on a number of commonly used databases.
Version
Open Access
Date Issued
2014-06
Date Awarded
2014-11
Advisor
Pantic, Maja
Sponsor
European Community
Grant Number
231287
Publisher Department
Computing
Publisher Institution
Imperial College London
Qualification Level
Doctoral
Qualification Name
Doctor of Philosophy (PhD)