HybridPC: a hybrid implicit-explicit framework for zero-shot point cloud completion

Yongwei MIAO; Yijun LI; Ran FAN; Zhenghui HU; Fuchang LIU

doi:10.1007/s11704-025-50876-1

Front. Comput. Sci., 2027, Vol. 21, Issue 4: 2104703. DOI: 10.1007/s11704-025-50876-1

Image and Graphics

RESEARCH ARTICLE


Abstract

Point cloud completion is a fundamental task in 3D perception and 3D vision. Existing point cloud completion methods typically rely on supervised learning with limited 3D data, which leads to poor generalization and suboptimal recovery for complex shape structures or large missing regions. To overcome these limitations, we propose HybridPC, a novel zero-shot point cloud completion framework that achieves high-fidelity 3D reconstruction without any 3D supervision or task-specific training. HybridPC leverages powerful 2D diffusion priors and a progressive implicit-explicit architecture to address severe incompleteness and complex geometries. The framework comprises three key stages: 1) Edge-aware neural field initialization: ControlNet-guided Stable Diffusion synthesizes multi-view images conditioned on text prompts and orthographic edge projections of the incomplete point cloud, providing strong shape constraints to initialize a coarse neural radiance field (NeRF) via Score Distillation Sampling (SDS). 2) Multi-view diffusion collaborative completion: A pre-trained multi-view diffusion model enforces cross-view consistency and collaboratively completes the entire NeRF with globally coherent geometry. To reconcile gradient conflicts between ControlNet and multi-view diffusion during joint SDS optimization, a PCGrad-based multi-objective optimization strategy is introduced to balance the structural and semantic guidance, yielding higher-fidelity shape completion. 3) Geometry-aware tetrahedral refinement: The implicit field is converted into a tetrahedral mesh using DMTet, which is further refined via implicit SDS-based normal optimization and explicit geometric constraints on the mesh surface, ensuring structural fidelity to the partial input. Extensive experiments on the ShapeNetPart and Redwood datasets demonstrate that HybridPC outperforms existing supervised and zero-shot methods in both qualitative and quantitative comparisons. Specifically, HybridPC preserves the input structure more faithfully, completes missing regions more accurately, and generalizes better, with particularly significant improvements on real-world scans from the Redwood dataset. Our results show the strong potential of coupling 2D diffusion priors with 3D geometric modeling for scalable, training-free point cloud completion.
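The gradient-conflict step in stage 2 can be illustrated with the standard PCGrad projection rule for two objectives. The sketch below is not the authors' implementation: the abstract does not give the exact formulation, so the function names (`project_conflict`, `pcgrad_two`) and the gradient names (`g_struct` for the ControlNet SDS gradient, `g_sem` for the multi-view diffusion SDS gradient) are hypothetical, and gradients are assumed to be flattened into plain lists of floats.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_conflict(g, other):
    """Return g with its component along `other` removed, but only when the
    two gradients conflict (negative inner product). Otherwise return a copy."""
    d = dot(g, other)
    norm_sq = dot(other, other)
    if d >= 0.0 or norm_sq == 0.0:
        return list(g)
    scale = d / norm_sq
    return [x - scale * y for x, y in zip(g, other)]


def pcgrad_two(g_struct, g_sem):
    """Combine two task gradients PCGrad-style: project each onto the normal
    plane of the other (using the original gradients), then sum the results.
    g_struct and g_sem are hypothetical names for the two SDS gradients."""
    p_struct = project_conflict(g_struct, g_sem)
    p_sem = project_conflict(g_sem, g_struct)
    return [a + b for a, b in zip(p_struct, p_sem)]


if __name__ == "__main__":
    # Conflicting pair: the inner product is negative, so both are projected.
    print(pcgrad_two([1.0, 0.0], [-1.0, 1.0]))
    # Non-conflicting pair: the gradients are simply added.
    print(pcgrad_two([1.0, 0.0], [1.0, 1.0]))
```

After projection, neither gradient has a component that opposes the other, so an update along the combined direction does not increase either objective to first order. In an actual pipeline the same rule would presumably be applied to the flattened gradients of the shared NeRF parameters at each optimization step.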

Keywords

point cloud completion / zero-shot / neural radiance field (NeRF) / stable diffusion / differentiable tetrahedral mesh

Cite this article

Yongwei MIAO, Yijun LI, Ran FAN, Zhenghui HU, Fuchang LIU. HybridPC: a hybrid implicit-explicit framework for zero-shot point cloud completion. Front. Comput. Sci., 2027, 21(4): 2104703. DOI: 10.1007/s11704-025-50876-1
