Dyn-HaMR: Recovering 4D Interacting Hand Motion from a Dynamic Camera
Author(s)
Yu, Zhengdi
Zafeiriou, Stefanos
Birdal, Tolga
Type
Conference Paper
Abstract
We propose Dyn-HaMR, to the best of our knowledge, the first approach to reconstruct 4D global hand motion from monocular videos recorded by dynamic cameras in the wild. Reconstructing accurate 3D hand meshes from monocular videos is a crucial task for understanding human behaviour, with significant applications in augmented and virtual reality (AR/VR). However, existing methods for monocular hand reconstruction typically rely on a weak-perspective camera model, which simulates hand motion within a limited camera frustum. As a result, these approaches struggle to recover the full 3D global trajectory and often produce noisy or incorrect depth estimates, particularly when the video is captured by dynamic or moving cameras, which is common in egocentric scenarios. Our Dyn-HaMR consists of a multi-stage, multi-objective optimization pipeline that factors in (i) simultaneous localization and mapping (SLAM) to robustly estimate relative camera motion, (ii) an interacting-hand prior for generative infilling and refinement of the interaction dynamics, ensuring plausible recovery under (self-)occlusions, and (iii) hierarchical initialization through a combination of state-of-the-art hand tracking methods. Through extensive evaluations on both in-the-wild and indoor datasets, we show that our approach significantly outperforms state-of-the-art methods in terms of 4D global mesh recovery. This establishes a new benchmark for hand motion reconstruction from monocular video with moving cameras. Our project page is at https://dyn-hamr.github.io/.
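For intuition, the sketch below illustrates what a multi-stage, multi-objective optimization of this shape can look like. It is a minimal toy, not the authors' implementation: the stage stand-ins, the quadratic objective, the loss weights, and all names are illustrative assumptions, whereas the actual pipeline uses SLAM camera estimates, a learned interacting-hand prior with generative infilling, and hand-tracker initialization as the abstract describes.

```python
# Hypothetical sketch of a multi-stage, multi-objective optimization in the
# spirit of the abstract. Every stage below is a placeholder stand-in, and
# the objective, weights, and names are illustrative assumptions.
import numpy as np

T, D = 60, 48  # frames; per-hand pose dimension (MANO-like, assumed)

# Stage (i) stand-in: SLAM would supply per-frame world-from-camera
# transforms; identity matrices here.
cams = np.tile(np.eye(4), (T, 1, 1))

# Stage (iii) stand-in: hierarchical initialization from off-the-shelf hand
# trackers, simulated as noisy pose parameters for two interacting hands.
rng = np.random.default_rng(0)
observed = rng.normal(scale=0.2, size=(T, 2, D))
theta = observed.copy()  # optimization variable

# Stage (ii) + refinement: minimize a weighted sum of a data term, a
# temporal-smoothness term, and a crude zero-mean "prior" term (a toy
# stand-in for the learned interacting-hand prior) by gradient descent.
w_data, w_smooth, w_prior, lr = 1.0, 1.0, 0.01, 0.05
for _ in range(200):
    g_data = 2.0 * (theta - observed)                  # fit tracker output
    g_smooth = np.zeros_like(theta)                    # temporal smoothness
    g_smooth[1:-1] = 2.0 * (2.0 * theta[1:-1] - theta[:-2] - theta[2:])
    g_smooth[0] = 2.0 * (theta[0] - theta[1])
    g_smooth[-1] = 2.0 * (theta[-1] - theta[-2])
    g_prior = 2.0 * theta                              # pull toward mean pose
    theta -= lr * (w_data * g_data + w_smooth * g_smooth + w_prior * g_prior)

# Composing per-frame hand roots with the SLAM trajectory is what yields a
# global motion rather than one confined to the camera frustum.
root_world = (cams @ np.array([0.0, 0.0, 0.0, 1.0]))[:, :3]
print("pose std after optimization:", float(theta.std()))
print("global root trajectory shape:", root_world.shape)
```

The design point the toy preserves is the decomposition: camera motion is estimated independently (SLAM), hand state is initialized from trackers, and a single optimization then trades off data fidelity, temporal smoothness, and a motion prior before the result is lifted into the world frame.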
Date Issued
2025-08-13
Date Acceptance
2025-02-26
Citation
2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025, pp. 27716-27726
ISBN
979-8-3315-4364-8
ISSN
2575-7075
Publisher
IEEE
Start Page
27716
End Page
27726
Journal / Book Title
2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Copyright Statement
© 2025 IEEE. This CVPR paper is the Open Access version, provided by the Computer Vision Foundation. Except for this watermark, it is identical to the accepted version; the final published version of the proceedings is available on IEEE Xplore.
Source
The IEEE/CVF Conference on Computer Vision and Pattern Recognition
Publication Status
Published
Start Date
2025-06-11
Finish Date
2025-06-15
Coverage Spatial
Nashville