Fig. 5 Mean per-joint position error on interacting-hand
sequences across various test sets.
We evaluated the compared methods on images of interacting hands. For
each joint, the reported error is the average of the errors of the
corresponding left-hand and right-hand joints. As seen in Fig. 5,
joints near the fingertips were harder to predict than those near the
palm. For all joints, the average error of our method was lower than
that of the compared methods. Fig. 6 shows the hand pose estimation
results of PoseNet, InterNet, and MS-FF. Because most joints are
flexible, interacting hands often occlude each other, which makes
estimating hand poses from a single RGB image more difficult. As seen
in Fig. 6, our results are better than those of the other methods.
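
The per-joint averaging described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the evaluation code used for Fig. 5: the function name `per_joint_error`, the array layout (frames, 2 hands, joints, 3), and the use of NumPy are all hypothetical choices made for the example.

```python
import numpy as np

def per_joint_error(pred, gt):
    """Mean position error per joint, averaged over frames and both hands.

    Assumed layout (hypothetical): arrays of shape
    (frames, 2, joints, 3), where axis 1 is (left hand, right hand)
    and the last axis holds 3D coordinates in the same unit.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if pred.shape != gt.shape:
        raise ValueError("pred and gt must have the same shape")

    # Euclidean distance between prediction and ground truth per joint.
    dist = np.linalg.norm(pred - gt, axis=-1)   # (frames, 2, joints)

    # Average over frames to get one error per hand and joint.
    per_hand = dist.mean(axis=0)                # (2, joints)

    # Average the left- and right-hand errors for each joint.
    return per_hand.mean(axis=0)                # (joints,)
```

The result has one value per joint index, so fingertip and palm joints can be compared directly. The sketch assumes every joint is annotated in every frame; a real evaluation would likely need a validity mask for missing annotations.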