2.5D-GS: sparse-view geometry-aware Gaussian splatting via depth and normal clues

Yan XING, Yali GUO, Pan WANG, Yongxin WU, Jieqing TAN, Xiaonan LUO

Front. Comput. Sci. ›› 2027, Vol. 21 ›› Issue (4): 2104702
DOI: 10.1007/s11704-025-50355-7
Image and Graphics
RESEARCH ARTICLE

Abstract

3D Gaussian Splatting has recently shown impressive performance by representing scenes explicitly and synthesizing high-quality novel views. However, recovering accurate Gaussian geometry becomes extremely challenging when only a few RGB images are available. We propose 2.5D-GS, which projects Gaussians into structured 2D spaces and uses the 2.5D representations produced by monocular models to separately optimize the projected depth and normal maps, ultimately achieving consistent and accurate Gaussian geometry. First, Depth Plane Constraints ensure the spatial accuracy of the Gaussians. Because monocular depth maps capture only coarse shapes, Normal Plane Constraints are then applied to refine the orientations of the Gaussians and enhance surface connectivity. In addition, we introduce Density Ratio-Based Pruning to eliminate redundant Gaussians generated during optimization, leading to compact and efficient scene representations. Extensive experiments on the LLFF, DTU, Blender, and Mip-NeRF360 datasets demonstrate that 2.5D-GS accurately reconstructs scene geometry and renders high-quality novel views from sparse inputs.
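To make the three components of the abstract more concrete, the following is a minimal, hypothetical PyTorch-style sketch of how rendered depth and normal maps could be aligned with monocular priors, and how a density-ratio criterion could flag redundant Gaussians. The function names, the median-based depth normalization, and the pruning criterion are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def depth_plane_loss(rendered_depth, mono_depth, mask=None):
    """Align the splatted depth map with a monocular depth prior.

    Monocular depth is defined only up to an affine ambiguity, so both maps
    are normalized (median / mean-absolute-deviation) before comparison.
    'mono_depth' is a hypothetical tensor from any off-the-shelf estimator.
    """
    def normalize(d):
        med = d.median()
        scale = (d - med).abs().mean().clamp(min=1e-6)
        return (d - med) / scale

    diff = normalize(rendered_depth) - normalize(mono_depth)
    if mask is not None:
        diff = diff[mask]
    return diff.abs().mean()

def normal_plane_loss(rendered_normal, mono_normal):
    """Penalize angular disagreement between rendered and prior normals.

    Both inputs are (H, W, 3) unit-vector maps; the loss is one minus the
    cosine similarity, averaged over pixels.
    """
    cos = F.cosine_similarity(rendered_normal, mono_normal, dim=-1)
    return (1.0 - cos).mean()

def density_ratio_prune(opacities, neighbor_counts, ratio_thresh=0.1):
    """Toy stand-in for density-ratio-based pruning.

    Gaussians whose opacity is small relative to the local point density are
    flagged for removal; the exact criterion used in 2.5D-GS may differ.
    """
    density = neighbor_counts.float().clamp(min=1.0)
    ratio = opacities / density
    return ratio < ratio_thresh  # boolean prune mask
```

In such a setup, the two regularizers would be added to the photometric loss during sparse-view optimization, and the prune mask would be applied at periodic densification steps; the weighting and scheduling here are left unspecified, as they depend on details not given in the abstract.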


Keywords

3D Gaussian splatting / sparse-view novel view synthesis / depth regularization / normal regularization

Cite this article

Yan XING, Yali GUO, Pan WANG, Yongxin WU, Jieqing TAN, Xiaonan LUO. 2.5D-GS: sparse-view geometry-aware Gaussian splatting via depth and normal clues. Front. Comput. Sci., 2027, 21(4): 2104702. DOI: 10.1007/s11704-025-50355-7


