A lightweight pose estimation network with multi-scale receptive field

Abstract

Existing lightweight networks lag behind large-scale models in human pose estimation because of their shallow depth and limited receptive fields. Current approaches use large convolution kernels or attention mechanisms to capture long-range context, at the cost of model redundancy. In this paper, we propose a novel Multi-scale Field Lightweight High-resolution Network (MFite-HRNet) for human pose estimation. Specifically, our model is built mainly from two lightweight blocks, a Multi-scale Receptive Field Block (MRB) and a Large Receptive Field Block (LRB), which learn informative multi-scale and long-range spatial context. The MRB uses grouped depthwise dilated convolutions with varied dilation rates to extract multi-scale spatial relationships from different feature maps. The LRB uses large depthwise convolution kernels to model long-range spatial dependencies in the low-level features. We apply MFite-HRNet to single-person and multi-person pose estimation tasks. Experiments on the COCO, MPII, and CrowdPose datasets show that our network outperforms current state-of-the-art lightweight networks in both single-person and multi-person pose estimation. The source code will be publicly available at https://github.com/lskdje/MFite-HRNet.git.
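
The receptive-field reasoning behind the two blocks can be illustrated with a little arithmetic. The sketch below is not taken from the MFite-HRNet code: the 3x3 kernel, the dilation rates 1 to 4, the 7x7 kernel, and the 40-channel width are assumed values chosen for illustration, and both helper functions are hypothetical. It relies only on two standard facts: a dilated kernel of size k with dilation d spans d(k - 1) + 1 pixels, and a depthwise convolution needs one k x k filter per channel.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by one dilated convolution kernel."""
    return dilation * (kernel_size - 1) + 1


def conv_weight_count(in_channels: int, out_channels: int,
                      kernel_size: int, groups: int = 1) -> int:
    """Number of weights (bias excluded) in a 2D grouped convolution."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (in_channels // groups) * out_channels * kernel_size ** 2


# Multi-scale idea (MRB-like): the same 3x3 depthwise kernel covers
# different extents when the dilation rate changes.
# The dilation rates here are assumed, not the paper's settings.
dilation_rates = [1, 2, 3, 4]
multi_scale_extents = {d: effective_kernel_size(3, d) for d in dilation_rates}

# Large-kernel idea (LRB-like): a depthwise 7x7 convolution stays cheap
# compared with a standard 7x7 convolution of the same width.
# The channel count and kernel size are assumed values.
channels = 40
depthwise_7x7 = conv_weight_count(channels, channels, 7, groups=channels)
standard_7x7 = conv_weight_count(channels, channels, 7)

print(multi_scale_extents)
print(depthwise_7x7, standard_7x7)
```

Under these assumed values, a single 3x3 kernel covers extents from 3 to 9 pixels as the dilation grows, and the depthwise 7x7 layer uses a factor of `channels` fewer weights than its standard counterpart, which is why both choices suit a lightweight network.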

Publication
The Visual Computer 2023(39)
Shuo Li(李烁)
Master's student, School of Software (2020 cohort)
Ju Dai(代菊)

My research interests include distributed robotics, mobile computing and programmable matter.

Junjun Pan(潘俊君)
Professor of Beihang University

My research interests include computer vision, computer graphics, animation and medical simulation.