Point-voxel dual transformer for LiDAR 3D object detection
Jigang Tong, Fanhang Yang, Sen Yang, Shengzhi Du
Optoelectronics Letters, 2025, Vol. 21, Issue 9: 547-554.
This paper presents a two-stage light detection and ranging (LiDAR) three-dimensional (3D) object detection framework, the point-voxel dual transformer (PV-DT3D), a transformer-based method that uses point-voxel fusion features for proposal refinement. Specifically, keypoints are sampled from the entire point cloud scene and used to encode representative scene features via a proposal-aware voxel set abstraction module. After the region proposal network (RPN) generates proposals, the encoded keypoints inside the proposals are fed into the dual transformer encoder-decoder architecture. PV-DT3D combines a point-wise transformer with a channel-wise architecture to capture contextual information along both the spatial and channel dimensions. Experiments on the highly competitive KITTI 3D car detection leaderboard show that PV-DT3D achieves superior detection accuracy among state-of-the-art point-voxel-based methods.
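To make the distinction between attention over the spatial dimension and attention over the channel dimension concrete, the following is a minimal NumPy sketch. It is not the PV-DT3D implementation: the function names are hypothetical, and learned query/key/value projections, multiple heads, and the decoder are all omitted as simplifying assumptions. It only shows that, for N keypoints with C feature channels, point-wise attention mixes information across keypoints through an N x N weight matrix, while channel-wise attention mixes information across channels through a C x C weight matrix.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def point_wise_attention(feats):
    """Hypothetical helper: self-attention across keypoints.

    feats: (N, C) array, one row per keypoint.
    Returns the attended features (N, C) and the weights (N, N).
    Assumption: queries, keys, and values are the raw features.
    """
    n, c = feats.shape
    scores = feats @ feats.T / np.sqrt(c)   # (N, N) keypoint similarity
    weights = softmax(scores, axis=-1)      # each row sums to 1
    return weights @ feats, weights

def channel_wise_attention(feats):
    """Hypothetical helper: self-attention across feature channels.

    feats: (N, C) array, one column per channel.
    Returns the attended features (N, C) and the weights (C, C).
    Assumption: each channel is described by its N keypoint responses.
    """
    n, c = feats.shape
    scores = feats.T @ feats / np.sqrt(n)   # (C, C) channel similarity
    weights = softmax(scores, axis=-1)      # each row sums to 1
    # Output channel i is a weighted sum of the input channels.
    return feats @ weights.T, weights

# Toy example: 8 keypoints with 4 feature channels each.
rng = np.random.default_rng(0)
keypoint_feats = rng.normal(size=(8, 4))
spatial_out, spatial_w = point_wise_attention(keypoint_feats)
channel_out, channel_w = channel_wise_attention(keypoint_feats)
print(spatial_w.shape, channel_w.shape)
```

Both branches return features of the same (N, C) shape, so in a sketch like this their outputs could be summed or concatenated; how PV-DT3D actually fuses them is not stated in the abstract.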
Tianjin University of Technology