A deep learning-based framework for intelligent modeling: From architectural sketch to 3D model

Yuqian Li, Weiguo Xu

Front. Archit. Res., 2025, Vol. 14, Issue 6: 1567-1584.

DOI: 10.1016/j.foar.2025.04.007
RESEARCH ARTICLE

Abstract

This paper proposes a deep learning-based intelligent modeling framework for generating 3D architectural models from hand-drawn sketches, addressing the domain gap in 2D-to-3D transformation. By integrating architectural domain knowledge (specifically the phased, selective, and cyclic characteristics of the design process), the framework follows a structured and iterative generative approach. It consists of two phases: a 2D design phase, in which image retrieval, Stable Diffusion, and CycleGAN support conceptual exploration, multi-scheme generation, and depth map extraction; and a 3D design phase, in which Pixel2Mesh generates 3D forms that are then refined through Grasshopper-based parametric optimization. Empirical evaluation demonstrates that the framework preserves structural fidelity while allowing for generative variations. Structural similarity and geometric accuracy metrics validate its performance, confirming its ability to balance AI-driven massing generation with architectural precision. A Mars habitat case study, conducted in an academic research setting, serves as a controlled experiment to assess adaptability. While demonstrating the framework's potential for AI-assisted architectural generation, the study also highlights the need for broader validation across diverse architectural typologies. This research bridges traditional and AI-driven design methodologies by integrating computer vision and generative models into architectural workflows, contributing a cross-disciplinary approach that enhances the efficiency, quality, and innovation of design processes.
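
To make the evaluation step more concrete, the following is a minimal sketch of the two kinds of metrics the abstract mentions. It rests on assumptions that are not stated in the abstract: structural similarity is taken to be SSIM computed over a single global window (standard implementations use a sliding window), and geometric accuracy is taken to be a symmetric Chamfer distance on squared Euclidean distances between sampled point sets. The function names `global_ssim` and `chamfer_distance` are illustrative and are not the authors' code.

```python
import numpy as np


def global_ssim(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """SSIM over one global window (assumption: no sliding window)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    # Stabilizing constants from the standard SSIM definition.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumption: squared Euclidean distances, averaged in each direction
    and then summed. Brute force, so only suitable for small point sets.
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    # Pairwise squared distances, shape (N, M).
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()


if __name__ == "__main__":
    # Hypothetical inputs: a synthetic grayscale image and two tiny point sets.
    image = np.linspace(0.0, 1.0, 64).reshape(8, 8)
    print(global_ssim(image, image))
    print(chamfer_distance([[0, 0, 0]], [[1, 0, 0]]))
```

Under these assumptions, an SSIM near 1 between a source sketch and a generated image indicates that structure is preserved, while a smaller Chamfer distance indicates a generated mesh closer to the reference geometry. For larger point sets, a spatial index would replace the brute-force pairwise distance matrix.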

Keywords

Architectural sketches / 3D models / Deep learning / Intelligent modeling

Cite this article

Yuqian Li, Weiguo Xu. A deep learning-based framework for intelligent modeling: From architectural sketch to 3D model. Front. Archit. Res., 2025, 14(6): 1567-1584. DOI: 10.1016/j.foar.2025.04.007

RIGHTS & PERMISSIONS

The Author(s). Publishing services by Elsevier B.V. on behalf of Higher Education Press and KeAi.
