Diffusion model with temporal constraint for 3D human pose estimation

Abstract

3D human pose estimation has received increasing attention because it underpins many downstream tasks. However, the task is challenging due to inherent depth ambiguity and occlusion. Because diffusion models can generate multiple hypotheses, they are promising for reducing uncertainty in the results. Inspired by this, we propose DTCPose, a diffusion-based framework with temporal constraints for 3D human pose estimation. Conditioned on 2D poses, it generates multiple candidate 3D poses and synthesizes them into a final pose to improve estimation accuracy. To ensure the temporal stability of the 3D output sequences, we also introduce temporal constraints into the model to reduce jitter in the results. Extensive experiments on the Human3.6M and MPI-INF-3DHP datasets demonstrate that our approach performs strongly in both single-hypothesis and multi-hypothesis 3D human pose estimation.
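
The two ideas in the abstract, combining several candidate poses into one and penalizing frame-to-frame jitter, can be illustrated with a small NumPy sketch. This is not the DTCPose implementation: the abstract does not specify how candidates are synthesized or how the temporal constraint is formulated, so plain averaging and a squared-velocity penalty are assumptions made here for illustration, as are all function names, array shapes, and the synthetic data.

```python
import numpy as np


def aggregate_hypotheses(hypotheses):
    """Average H candidate sequences of shape (H, T, J, 3) into one (T, J, 3).

    Assumption: simple averaging stands in for the unspecified synthesis step.
    """
    return hypotheses.mean(axis=0)


def temporal_smoothness_penalty(poses):
    """Mean squared frame-to-frame joint displacement for poses of shape (T, J, 3).

    Assumption: a velocity penalty stands in for the unspecified temporal constraint.
    """
    velocity = np.diff(poses, axis=0)  # shape (T - 1, J, 3)
    return float(np.mean(np.sum(velocity ** 2, axis=-1)))


rng = np.random.default_rng(0)
H, T, J = 5, 8, 17  # hypothetical: 5 candidates, 8 frames, 17 joints

# Hypothetical ground truth: a static pose, so any motion in a candidate is jitter.
static_pose = np.zeros((T, J, 3))

# Hypothetical candidates: the static pose plus independent per-frame noise.
hypotheses = static_pose[None] + 0.05 * rng.standard_normal((H, T, J, 3))

final_pose = aggregate_hypotheses(hypotheses)
jitter_single = temporal_smoothness_penalty(hypotheses[0])
jitter_final = temporal_smoothness_penalty(final_pose)
```

In this toy setting, averaging independent noisy candidates lowers the jitter measure relative to any single candidate, which is one intuition for why multi-hypothesis generation and temporal constraints complement each other. A learned model would typically apply such a penalty as a training loss term rather than as a post hoc measurement.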

Publication
The Visual Computer
Ju Dai(代菊)

My research interests include distributed robotics, mobile computing and programmable matter.

Junjun Pan(潘俊君)
Professor of Beihang University

My research interests include computer vision, computer graphics, animation and medical simulation.