Point-voxel dual transformer for LiDAR 3D object detection

Jigang Tong, Fanhang Yang, Sen Yang, Shengzhi Du

Optoelectronics Letters ›› 2025, Vol. 21 ›› Issue (9): 547-554. DOI: 10.1007/s11801-025-3134-9


Abstract

This paper presents a two-stage, transformer-based light detection and ranging (LiDAR) three-dimensional (3D) object detection framework, named the point-voxel dual transformer (PV-DT3D). In the proposed PV-DT3D, point-voxel fusion features are used for proposal refinement. Specifically, keypoints are sampled from the entire point cloud scene and used to encode representative scene features via a proposal-aware voxel set abstraction module. After proposals are generated by the region proposal network (RPN), the encoded keypoints inside each proposal are fed into a dual transformer encoder-decoder architecture. The PV-DT3D thus exploits both a point-wise and a channel-wise transformer branch to capture contextual information along the spatial and channel dimensions. Experiments on the highly competitive KITTI 3D car detection leaderboard show that the PV-DT3D achieves superior detection accuracy among state-of-the-art point-voxel-based methods.
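
The abstract only sketches the dual transformer at a high level, so the following is a minimal PyTorch-style sketch of the point-wise versus channel-wise attention idea it describes. All class names, the sequential fusion order, and the hyperparameters (head count, feature width, keypoint count) are illustrative assumptions and are not taken from the paper.

    import torch
    import torch.nn as nn


    class PointWiseAttention(nn.Module):
        """Self-attention over the point (spatial) dimension: every keypoint
        inside a proposal attends to every other keypoint."""

        def __init__(self, dim, num_heads=4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            self.norm = nn.LayerNorm(dim)

        def forward(self, x):              # x: (B, N_keypoints, C)
            out, _ = self.attn(x, x, x)
            return self.norm(x + out)      # residual + layer norm


    class ChannelWiseAttention(nn.Module):
        """Attention computed across feature channels: the affinity matrix is
        C x C (channel-to-channel) instead of the usual N x N (point-to-point)."""

        def __init__(self, dim):
            super().__init__()
            self.qkv = nn.Linear(dim, dim * 3)
            self.proj = nn.Linear(dim, dim)
            self.norm = nn.LayerNorm(dim)

        def forward(self, x):                                 # x: (B, N, C)
            q, k, v = self.qkv(x).chunk(3, dim=-1)            # each (B, N, C)
            scale = x.shape[1] ** -0.5                        # normalise over the N points
            attn = torch.softmax(q.transpose(1, 2) @ k * scale, dim=-1)  # (B, C, C)
            out = v @ attn.transpose(1, 2)                    # mix channels: (B, N, C)
            return self.norm(x + self.proj(out))              # residual + layer norm


    class DualTransformerBlock(nn.Module):
        """One encoder block applying the two branches in sequence: spatial
        context first, channel context second (assumed order)."""

        def __init__(self, dim, num_heads=4):
            super().__init__()
            self.point_attn = PointWiseAttention(dim, num_heads)
            self.channel_attn = ChannelWiseAttention(dim)
            self.ffn = nn.Sequential(nn.Linear(dim, 2 * dim), nn.ReLU(), nn.Linear(2 * dim, dim))
            self.norm = nn.LayerNorm(dim)

        def forward(self, x):              # x: (B, N_keypoints, C) encoded keypoint features
            x = self.point_attn(x)
            x = self.channel_attn(x)
            return self.norm(x + self.ffn(x))


    if __name__ == "__main__":
        block = DualTransformerBlock(dim=128)
        feats = torch.randn(2, 256, 128)   # 2 proposals, 256 keypoints, 128-dim features
        print(block(feats).shape)          # torch.Size([2, 256, 128])

Whether PV-DT3D actually fuses the two branches sequentially or in parallel, and how its decoder consumes proposal queries, is not specified in the abstract; the sketch only fixes the tensor shapes involved.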

Keywords


Cite this article

Jigang Tong, Fanhang Yang, Sen Yang, Shengzhi Du. Point-voxel dual transformer for LiDAR 3D object detection. Optoelectronics Letters, 2025, 21(9): 547-554. DOI: 10.1007/s11801-025-3134-9
