Multi-Robot Collaborative Complex Indoor Scene Segmentation via Multiplex Interactive Learning

Jinfu Liu, Zhongzien Jiang, Xinhua Xu, Wenhao Li, Mengyuan Liu, Hong Liu

CAAI Transactions on Intelligence Technology, 2025, Vol. 10, Issue (6): 1646-1660. DOI: 10.1049/cit2.70066

ORIGINAL RESEARCH

Abstract

Indoor scene semantic segmentation is essential for enabling robots to understand and interact with their environments effectively. However, numerous challenges remain unresolved, particularly in single-robot systems, which often struggle with the complexity and variability of indoor scenes. To address these limitations, we introduce a novel multi-robot collaborative framework based on multiplex interactive learning (MPIL), in which each robot specialises in a distinct visual task within a unified multitask architecture. During training, the framework employs task-specific decoders and cross-task feature sharing to enhance collaborative optimisation. At inference time, robots operate independently with optimised models, enabling scalable, asynchronous and efficient deployment in real-world scenarios. Specifically, MPIL employs specially designed modules that integrate RGB and depth data, refine feature representations and facilitate the simultaneous execution of multiple tasks, such as instance segmentation, scene classification and semantic segmentation. By leveraging these modules, distinct agents within multi-robot systems can effectively handle specialised tasks, thereby enhancing the overall system's flexibility and adaptability. This collaborative effort maximises the strengths of each robot, resulting in a more comprehensive understanding of the environment. Extensive experiments on two public benchmark datasets demonstrate MPIL's competitive performance compared with state-of-the-art approaches, highlighting the effectiveness and robustness of our multi-robot system in complex indoor environments.
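The abstract describes a concrete training-time pattern: a shared RGB-D encoder, cross-task feature sharing between task branches, and a task-specific decoder head per robot. The sketch below is a minimal, hypothetical PyTorch illustration of that pattern, not the authors' MPIL implementation; all module names, channel sizes, the sum-based RGB-depth fusion and the concatenation-based sharing step are assumptions made for illustration.

```python
# Hypothetical sketch of the multitask pattern the abstract describes.
# Not the paper's code: module names, sizes and fusion choices are assumptions.
import torch
import torch.nn as nn


class SharedRGBDEncoder(nn.Module):
    """Fuses RGB and depth streams into one shared feature map."""

    def __init__(self, feat_ch: int = 64):
        super().__init__()
        self.rgb_stem = nn.Conv2d(3, feat_ch, kernel_size=3, stride=2, padding=1)
        self.depth_stem = nn.Conv2d(1, feat_ch, kernel_size=3, stride=2, padding=1)

    def forward(self, rgb, depth):
        # Element-wise sum is a placeholder for the paper's fusion modules.
        return torch.relu(self.rgb_stem(rgb) + self.depth_stem(depth))


class MultiTaskSketch(nn.Module):
    """Shared encoder, cross-task feature sharing, one head per task."""

    def __init__(self, feat_ch: int = 64, n_classes: int = 40, n_scenes: int = 10):
        super().__init__()
        self.encoder = SharedRGBDEncoder(feat_ch)
        # Lightweight per-task branches (stand-ins for task-specific decoders).
        self.sem_branch = nn.Conv2d(feat_ch, feat_ch, kernel_size=3, padding=1)
        self.inst_branch = nn.Conv2d(feat_ch, feat_ch, kernel_size=3, padding=1)
        # Cross-task sharing approximated by 1x1 convs over concatenated features.
        self.sem_share = nn.Conv2d(2 * feat_ch, feat_ch, kernel_size=1)
        self.inst_share = nn.Conv2d(2 * feat_ch, feat_ch, kernel_size=1)
        self.semantic_head = nn.Conv2d(feat_ch, n_classes, kernel_size=1)
        self.instance_head = nn.Conv2d(feat_ch, 2, kernel_size=1)  # e.g. embeddings
        self.scene_head = nn.Linear(feat_ch, n_scenes)

    def forward(self, rgb, depth):
        f = self.encoder(rgb, depth)
        sem_f = torch.relu(self.sem_branch(f))
        inst_f = torch.relu(self.inst_branch(f))
        # Cross-task interaction: each branch also sees the other's features.
        sem_f = self.sem_share(torch.cat([sem_f, inst_f], dim=1))
        inst_f = self.inst_share(torch.cat([inst_f, sem_f], dim=1))
        scene = self.scene_head(f.mean(dim=(2, 3)))  # global pooling for scene class
        return self.semantic_head(sem_f), self.instance_head(inst_f), scene


# Usage on a single 480x640 RGB-D frame.
model = MultiTaskSketch()
sem, inst, scene = model(torch.randn(1, 3, 480, 640), torch.randn(1, 1, 480, 640))
print(sem.shape, inst.shape, scene.shape)
```

Consistent with the inference setup the abstract describes, each deployed robot would keep the shared encoder plus only its own task head, so the robots can run independently and asynchronously after joint training.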

Keywords

cross-task interaction / learning (artificial intelligence) / multi-modal / multiplex interactive learning / multitask / object segmentation / semantic segmentation

Cite this article

Jinfu Liu, Zhongzien Jiang, Xinhua Xu, Wenhao Li, Mengyuan Liu, Hong Liu. Multi-Robot Collaborative Complex Indoor Scene Segmentation via Multiplex Interactive Learning. CAAI Transactions on Intelligence Technology, 2025, 10(6): 1646-1660. DOI: 10.1049/cit2.70066

