PointNetV3: feature extraction with position encoding

Jun Wang, Xuefei Wang, Boxiong Zhou, Dongyan Guo

Optoelectronics Letters, 2024, Vol. 20, Issue 8: 483-489. DOI: 10.1007/s11801-024-3172-8
Article


Abstract

Feature extraction from point clouds is a fundamental component of three-dimensional (3D) vision tasks. Existing feature extraction networks primarily focus on enhancing geometric perception and overlook the crucial role played by point coordinates. For instance, although the two wings of an airplane share the same shape, they demand distinct feature representations because their positions differ. In this paper, we introduce a novel module called the position aware module (PAM), which uses the coordinate features of points for positional encoding and integrates this encoding into the feature extraction network to provide essential positional context. Furthermore, we embed PAM into the PointNet++ framework to design a novel feature extraction network, named PointNetV3. To validate the effectiveness of PointNetV3, we conducted comprehensive experiments on point clouds covering classification, object tracking and object detection. The remarkable improvements on all three tasks demonstrate the effectiveness of PointNetV3 for point cloud processing.
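
As a rough illustration of the idea in the abstract, the Python sketch below encodes each point's coordinates with a small two-layer perceptron and adds the result to that point's feature vector. It is not the authors' PAM: the abstract does not describe the module's layers, so the network shape, the additive fusion, the random weights, and all function names (`init_linear`, `position_encode`, `add_position_context`) are assumptions made only for illustration.

```python
import random


def linear(x, weight, bias):
    """Apply y = W x + b to one vector x (weight is a list of rows)."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def relu(x):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in x]


def init_linear(in_dim, out_dim, rng):
    """Hypothetical initializer: small random weights, zero bias."""
    weight = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias


def position_encode(xyz, params):
    """Map one (x, y, z) coordinate to a vector with the feature dimension."""
    (w1, b1), (w2, b2) = params
    return linear(relu(linear(xyz, w1, b1)), w2, b2)


def add_position_context(coords, feats, params):
    """Add a coordinate-derived encoding to each point's feature vector.

    Additive fusion is an assumption; the abstract only says the encoding
    is integrated into the feature extraction network.
    """
    fused = []
    for xyz, feat in zip(coords, feats):
        enc = position_encode(xyz, params)
        fused.append([f + e for f, e in zip(feat, enc)])
    return fused


rng = random.Random(0)  # fixed seed so the sketch is reproducible
feat_dim = 4
params = (init_linear(3, 8, rng), init_linear(8, feat_dim, rng))

# Two points with identical geometric features but mirrored positions,
# like the two airplane wings mentioned in the abstract.
coords = [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]
feats = [[0.2, 0.1, 0.0, 0.3], [0.2, 0.1, 0.0, 0.3]]
fused = add_position_context(coords, feats, params)
```

The two points start with identical features, yet their fused features differ because their coordinates differ, which is the behavior the wing example calls for.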

Cite this article

Jun Wang, Xuefei Wang, Boxiong Zhou, Dongyan Guo. PointNetV3: feature extraction with position encoding. Optoelectronics Letters, 2024, 20(8): 483‒489 https://doi.org/10.1007/s11801-024-3172-8

