PointGeo: Geometry Transformer for Point Cloud Analysis

Li An; Pengbo Zhou; Mingquan Zhou; Yong Wang; Guohua Geng; Yangyang Liu

doi:10.1049/cit2.70062

CAAI Transactions on Intelligence Technology, 2025, Vol. 10, Issue 6: 1880-1892. DOI: 10.1049/cit2.70062

ORIGINAL RESEARCH

Abstract

Point cloud processing plays a crucial role in tasks such as point cloud classification, part segmentation and semantic segmentation. However, existing frameworks face several challenges: recognising features in irregular and complex spatial structures, large numbers of attention parameters and limited generalisation across different scenes. We propose PointGeo, a geometry transformer for point cloud analysis that addresses these concerns. The method uses a geometry transformation network to process point cloud data, capturing both local and global features and improving the modelling of irregular structures. We test the method extensively on multiple datasets: ModelNet and ScanObjectNN for classification, ShapeNet for part segmentation, and S3DIS and SemanticKITTI for semantic segmentation. Experimental results show that our approach performs strongly across all tasks, supporting its effectiveness and generalisation capability in handling point cloud data.

Keywords

3D / artificial intelligence / image classification / image segmentation

Cite this article

Li An, Pengbo Zhou, Mingquan Zhou, Yong Wang, Guohua Geng, Yangyang Liu. PointGeo: Geometry Transformer for Point Cloud Analysis. CAAI Transactions on Intelligence Technology, 2025, 10(6): 1880-1892. DOI: 10.1049/cit2.70062
