
3D hand pose estimation: methods, datasets, and challenges

File: Yuan-S-2019-PhD-thesis.pdf
Description: Thesis
Size: 34.88 MB
Format: Adobe PDF
Title: 3D hand pose estimation: methods, datasets, and challenges
Authors: Yuan, Shanxin
Item Type: Thesis or dissertation
Abstract: 3D hand pose estimation is an important task in computer vision because of its wide range of applications, including human-computer interaction, virtual and augmented reality, sign language recognition, and medical image analysis. The task is challenging because of the high number of degrees of freedom of the human hand, self-occlusion, variation in hand shape, and ambiguity among fingers. Progress has been held back by a lack of suitable methods and by the limitations of current datasets. This thesis therefore investigates three aspects: methods, datasets, and challenges. Its contributions are: (1) a large-scale hand pose dataset, BigHand2.2M, collected using a novel capture method; (2) a depth-based 3D hand pose challenge that attracted top research groups from around the world, used to evaluate state-of-the-art methods, identify best practices, and indicate promising research directions; (3) a method for 3D hand pose estimation from RGB images that uses depth data as privileged information.
Real datasets are limited in quantity and coverage, mainly because they are difficult to annotate. To address this, the thesis proposes a tracking system that uses six magnetic 6D sensors and inverse kinematics to automatically obtain 21-joint hand pose annotations for depth maps captured with minimal restriction on the range of motion. This automatic annotation method allows us to build the largest real dataset, with higher joint annotation accuracy. To establish the current state of depth-based 3D hand pose estimation and its next challenges, we hosted the Hands In the Million Challenge (HIM2017) and investigated state-of-the-art methods on three tasks: single-frame 3D pose estimation, 3D hand tracking, and hand pose estimation during object interaction. The thesis analyses the performance of different CNN structures with regard to hand shape, joint visibility, viewpoint, and articulation distributions. Because progress in hand pose estimation from RGB images has lagged behind that from depth images, the thesis also proposes a method for RGB-based hand pose estimation that uses both external large-scale depth image datasets and paired depth and RGB images as privileged information at training time. We show that providing depth information during training significantly improves the performance of pose estimation from RGB images at test time.
Content Version: Open Access
Issue Date: Oct-2018
Date Awarded: Jun-2019
URI: http://hdl.handle.net/10044/1/70791
DOI: https://doi.org/10.25560/70791
Copyright Statement: Creative Commons Attribution NonCommercial Licence
Supervisor: Kim, Tae-Kyun
Department: Electrical and Electronic Engineering
Publisher: Imperial College London
Qualification Level: Doctoral
Qualification Name: Doctor of Philosophy (PhD)
Appears in Collections: Electrical and Electronic Engineering PhD theses