BEDiff: denoising diffusion probabilistic models for building extraction
Yanjing Lei, Yuan Wang, Sixian Chan, Jie Hu, Xiaolong Zhou, Hongkai Zhang
Optoelectronics Letters, 2025, Vol. 21, Issue 5: 298-305.
Accurately identifying the distribution of buildings in remote sensing images with complex backgrounds is challenging. The emergence of diffusion models suggests using the reverse denoising process to distill the building distribution from these backgrounds. Building on this idea, we propose the building extraction diffusion model (BEDiff), a framework that refines building footprints extracted from remote sensing images step by step. We first design booster guidance, a mechanism that extracts structural and semantic features from the remote sensing image and uses them as priors to provide targeted guidance for the diffusion process. We then introduce a cross-feature fusion module (CFM) that bridges the semantic gap between different types of features, so that the features extracted by booster guidance are integrated into the diffusion process more effectively. To our knowledge, BEDiff is the first application of diffusion models to the task of building extraction. Extensive experiments on the Beijing building dataset demonstrate the superior performance of BEDiff and its potential for improving the accuracy of building extraction in complex urban landscapes.
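The reverse denoising idea mentioned in the abstract can be illustrated with a minimal sketch of standard DDPM ancestral sampling, in which a mask starts as Gaussian noise and is refined step by step under the guidance of image-derived features. This is not the authors' implementation: the linear noise schedule, the function names, and the `predict_noise` callable (a hypothetical stand-in for a conditioned denoising network) are all assumptions made for illustration, and booster guidance and the CFM are not modeled.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumption; the abstract names no schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta_t), usually written alpha_bar_t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def reverse_step(x_t, t, eps_pred, betas, alpha_bars, rng):
    """One ancestral sampling step x_t -> x_{t-1} on a flattened mask."""
    beta_t = betas[t]
    coef = beta_t / math.sqrt(1.0 - alpha_bars[t])
    scale = math.sqrt(1.0 - beta_t)
    mean = [(x - coef * e) / scale for x, e in zip(x_t, eps_pred)]
    if t == 0:
        return mean  # no noise is added on the final step
    sigma = math.sqrt(beta_t)
    return [m + sigma * rng.gauss(0.0, 1.0) for m in mean]


def sample_mask(image_features, predict_noise, num_steps=50, seed=0):
    """Denoise a random vector into a binary building mask.

    predict_noise(x_t, t, image_features) is a hypothetical placeholder for
    a denoiser conditioned on priors extracted from the image.
    """
    rng = random.Random(seed)
    betas = linear_beta_schedule(num_steps)
    alpha_bars = cumulative_alphas(betas)
    x = [rng.gauss(0.0, 1.0) for _ in image_features]
    for t in reversed(range(num_steps)):
        eps = predict_noise(x, t, image_features)
        x = reverse_step(x, t, eps, betas, alpha_bars, rng)
    # Masks are encoded in [-1, 1]; threshold at 0 to get building / background.
    return [1 if v > 0.0 else 0 for v in x]
```

In practice `predict_noise` would be a trained network; the sketch only shows where image-derived conditioning enters the sampling loop.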
Tianjin University of Technology