PointNetV3: feature extraction with position encoding

Jun Wang; Xuefei Wang; Boxiong Zhou; Dongyan Guo

doi:10.1007/s11801-024-3172-8

Optoelectronics Letters, 2024, Vol. 20, Issue 8: 483-489.

Abstract

Feature extraction of point clouds is a fundamental component of three-dimensional (3D) vision tasks. However, existing feature extraction networks primarily focus on enhancing the geometric perception abilities of networks and overlook the crucial role played by coordinates. For instance, although the two wings of an airplane share the same shape, they demand distinct feature representations because of their differing positions. In this paper, we introduce a novel module called the position aware module (PAM), which leverages the coordinate features of points for positional encoding and integrates this encoding into the feature extraction network to provide essential positional context. Furthermore, we embed PAM into the PointNet++ framework and design a novel feature extraction network, named PointNetV3. To validate the effectiveness of PointNetV3, we conducted comprehensive experiments on point clouds, covering classification, object tracking and object detection. The improvements obtained on all three tasks demonstrate the effectiveness of PointNetV3 in point cloud processing.
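The airplane-wing example can be illustrated with a minimal sketch. It is not the authors' PAM, whose internal design the abstract does not specify. The sketch assumes a small two-layer MLP with random, untrained weights that maps xyz coordinates to an encoding, and assumes additive fusion with the per-point features. All function names and dimensions here are hypothetical.

```python
import numpy as np


def init_position_encoder(in_dim=3, hidden_dim=16, out_dim=32, seed=0):
    """Create random weights for a hypothetical two-layer coordinate encoder."""
    rng = np.random.default_rng(seed)
    return {
        "w1": rng.normal(0.0, 0.5, size=(in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "w2": rng.normal(0.0, 0.5, size=(hidden_dim, out_dim)),
        "b2": np.zeros(out_dim),
    }


def encode_positions(xyz, params):
    """Map (N, 3) coordinates to (N, out_dim) encodings: linear, ReLU, linear."""
    hidden = np.maximum(xyz @ params["w1"] + params["b1"], 0.0)
    return hidden @ params["w2"] + params["b2"]


def add_position_context(features, xyz, params):
    """Fuse features with the encoding by addition (an assumed fusion choice)."""
    return features + encode_positions(xyz, params)


params = init_position_encoder()

# Two wings with the same local shape, hence the same geometric feature,
# located at mirrored positions on either side of the fuselage.
shape_feature = np.ones((1, 32))
left_wing_xyz = np.array([[-1.0, 0.0, 0.0]])
right_wing_xyz = np.array([[1.0, 0.0, 0.0]])

left = add_position_context(shape_feature, left_wing_xyz, params)
right = add_position_context(shape_feature, right_wing_xyz, params)

# The geometric features are identical, but the fused features are not.
print(np.allclose(left, right))
```

Without the encoding term, the two wings would be indistinguishable to later layers; with it, their features differ only through position. In a trained network the encoder weights would be learned rather than random.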

Cite this article

Jun Wang, Xuefei Wang, Boxiong Zhou, Dongyan Guo. PointNetV3: feature extraction with position encoding. Optoelectronics Letters, 2024, 20(8): 483-489. https://doi.org/10.1007/s11801-024-3172-8
