Radar hand gesture recognition based on three-dimensional united
attention network
Abstract
Radar-based dynamic gesture recognition has great potential in
human-computer interaction (HCI) applications. With the development of
wideband radar, the radar signal of a hand gesture is often represented
as a three-dimensional (3D) range-Doppler-time cube, which is mainly
processed by multichannel two-dimensional (2D) CNNs or 3D CNNs.
However, existing networks use and fuse the different kinds of features
in a simple way that is not well optimized. In this paper, an efficient
attention method operating in 3D space, the three-dimensional united
attention (3D-UA) module, is proposed. The 3D-UA module applies a
multi-scale pyramid convolution spatially, extracts channel attention
weights on feature maps and captures the global temporal cues
simultaneously. Furthermore, a network named 3D-UANet is built by
replacing the 3×3×3 convolution in the 3D-ResNet with the 3D-UA module.
3D-UANet can efficiently extract the range-Doppler-time features of
gestures. Experimental results show that the proposed method generalizes
well to data from subjects in complex scenes.
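
To make the input representation concrete, the following is a minimal
sketch of how a range-Doppler-time cube could be formed from raw radar
frames. It is an illustration only, not the preprocessing used in this
paper: it assumes an FMCW-style radar whose raw data are arranged as
(frames, chirps, samples), and the function name
range_doppler_time_cube, the Hann windowing and the dB scaling are
hypothetical choices made for this example.

```python
import numpy as np


def range_doppler_time_cube(raw, window=True):
    """Turn raw frames into a range-Doppler-time cube (illustrative sketch).

    raw: complex array of shape (frames, chirps, samples). This layout is
         an assumption of the example, not something stated in the paper.
    Returns a real array of shape (frames, doppler_bins, range_bins) in dB.
    """
    raw = np.asarray(raw, dtype=complex)
    n_frames, n_chirps, n_samples = raw.shape

    if window:
        # Hann windows along fast time and slow time to reduce sidelobes.
        raw = raw * np.hanning(n_samples)[None, None, :]
        raw = raw * np.hanning(n_chirps)[None, :, None]

    # FFT over fast time (samples within a chirp) resolves range.
    range_profiles = np.fft.fft(raw, axis=2)

    # FFT over slow time (chirps within a frame) resolves Doppler;
    # fftshift moves zero Doppler to the middle of the axis.
    rd_maps = np.fft.fftshift(np.fft.fft(range_profiles, axis=1), axes=1)

    # Log-magnitude; the small offset avoids log(0).
    return 20.0 * np.log10(np.abs(rd_maps) + 1e-12)


def synthetic_target(n_frames, n_chirps, n_samples, range_bin, doppler_bin):
    """Noise-free single point target placed exactly on one FFT bin."""
    n = np.arange(n_samples)[None, :]
    m = np.arange(n_chirps)[:, None]
    phase = 2j * np.pi * (range_bin * n / n_samples + doppler_bin * m / n_chirps)
    frame = np.exp(phase)
    return np.repeat(frame[None, :, :], n_frames, axis=0)


if __name__ == "__main__":
    raw = synthetic_target(4, 16, 32, range_bin=5, doppler_bin=3)
    cube = range_doppler_time_cube(raw)
    print(cube.shape)
```

Stacking the per-frame range-Doppler maps along the first axis gives the
time dimension, so the cube can be passed to a 3D CNN as a
single-channel volume or split frame by frame for a multichannel 2D CNN.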