Multi-Robot Collaborative Complex Indoor Scene Segmentation via Multiplex Interactive Learning
Jinfu Liu, Zhongzien Jiang, Xinhua Xu, Wenhao Li, Mengyuan Liu, Hong Liu
CAAI Transactions on Intelligence Technology, 2025, Vol. 10, Issue 6: 1646-1660.
Indoor scene semantic segmentation is essential for enabling robots to understand and interact with their environments effectively. However, numerous challenges remain unresolved, particularly in single-robot systems, which often struggle with the complexity and variability of indoor scenes. To address these limitations, we introduce a novel multi-robot collaborative framework based on multiplex interactive learning (MPIL), in which each robot specialises in a distinct visual task within a unified multitask architecture. During training, the framework employs task-specific decoders and cross-task feature sharing to enhance collaborative optimisation. At inference time, robots operate independently with optimised models, enabling scalable, asynchronous and efficient deployment in real-world scenarios. Specifically, MPIL employs specially designed modules that integrate RGB and depth data, refine feature representations and facilitate the simultaneous execution of multiple tasks, such as instance segmentation, scene classification and semantic segmentation. By leveraging these modules, distinct agents within multi-robot systems can effectively handle specialised tasks, thereby enhancing the overall system's flexibility and adaptability. This collaborative effort maximises the strengths of each robot, resulting in a more comprehensive understanding of environments. Extensive experiments on two public benchmark datasets demonstrate MPIL's competitive performance compared with state-of-the-art approaches, highlighting the effectiveness and robustness of our multi-robot system in complex indoor environments.
cross-task interactive / learning (artificial intelligence) / multi-modal / multiplex interactive learning / multitask / object segmentation / semantic segmentation
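
To make the multitask structure described in the abstract concrete, the sketch below shows one plausible arrangement of a shared RGB-D encoder, a simple cross-task feature-exchange step, and task-specific decoders for semantic segmentation, instance-style segmentation and scene classification. This is a minimal PyTorch illustration under assumed module names, channel sizes and an averaging-based exchange scheme; it is not the authors' MPIL implementation.

```python
# Minimal sketch (assumed design, not the paper's MPIL code): a shared RGB-D
# encoder, a cross-task feature-exchange step, and task-specific decoder heads.
import torch
import torch.nn as nn
import torch.nn.functional as F


class RGBDEncoder(nn.Module):
    """Fuses RGB and depth inputs into a shared feature map (illustrative)."""
    def __init__(self, feat_ch=64):
        super().__init__()
        self.rgb_branch = nn.Sequential(
            nn.Conv2d(3, feat_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True))
        self.depth_branch = nn.Sequential(
            nn.Conv2d(1, feat_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True))
        self.fuse = nn.Conv2d(2 * feat_ch, feat_ch, 1)

    def forward(self, rgb, depth):
        f = torch.cat([self.rgb_branch(rgb), self.depth_branch(depth)], dim=1)
        return self.fuse(f)


class MultiTaskModel(nn.Module):
    """Shared encoder + cross-task feature sharing + task-specific decoders."""
    def __init__(self, num_sem_classes=40, num_inst_masks=16, num_scenes=10, feat_ch=64):
        super().__init__()
        self.encoder = RGBDEncoder(feat_ch)
        # One lightweight projection per task; their mean is fed back to every
        # decoder as a stand-in for the cross-task interaction described above.
        self.task_proj = nn.ModuleDict({
            t: nn.Conv2d(feat_ch, feat_ch, 3, padding=1)
            for t in ("semantic", "instance", "scene")})
        self.sem_head = nn.Conv2d(feat_ch, num_sem_classes, 1)
        self.inst_head = nn.Conv2d(feat_ch, num_inst_masks, 1)
        self.scene_head = nn.Linear(feat_ch, num_scenes)

    def forward(self, rgb, depth):
        shared = self.encoder(rgb, depth)
        task_feats = {t: F.relu(p(shared)) for t, p in self.task_proj.items()}
        # Cross-task sharing: every head also sees the mean of all task features.
        exchanged = torch.stack(list(task_feats.values())).mean(dim=0)
        size = rgb.shape[-2:]
        sem = F.interpolate(self.sem_head(task_feats["semantic"] + exchanged),
                            size=size, mode="bilinear", align_corners=False)
        inst = F.interpolate(self.inst_head(task_feats["instance"] + exchanged),
                             size=size, mode="bilinear", align_corners=False)
        scene = self.scene_head((task_feats["scene"] + exchanged).mean(dim=(2, 3)))
        return {"semantic": sem, "instance": inst, "scene": scene}


if __name__ == "__main__":
    model = MultiTaskModel()
    rgb = torch.randn(2, 3, 240, 320)
    depth = torch.randn(2, 1, 240, 320)
    out = model(rgb, depth)
    print({k: tuple(v.shape) for k, v in out.items()})
```

In a deployment matching the abstract's description, each robot would load only the shared encoder and its own task decoder after collaborative training, so inference stays independent and asynchronous across robots.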