Content annotation in images from outdoor construction jobsites using YOLO V8 and Swin transformer

Layan Farahat; Ehsan Rezazadeh Azar

doi:10.1007/s44268-024-00036-4

Smart Construction and Sustainable Cities, 2024, Vol. 2, Issue 1: 10. DOI: 10.1007/s44268-024-00036-4

Research




Abstract

Digital visual data, such as images and videos, are valuable sources of information for various construction engineering and management purposes. Advances in low-cost image-capturing and storage technologies, along with the emergence of artificial intelligence methods, have resulted in a considerable increase in the use of digital imaging on construction sites. Despite these advances, these rich data sources are not typically used to their full potential because they are processed and documented subjectively, so valuable content can be overlooked. Semantic content analysis and annotation of the images could enhance the retrieval and application of relevant instances in large databases. This research proposes an ensemble approach that uses deep learning-based object recognition, pixel-level segmentation, and text classification for medium-level (ongoing activities) and high-level (project type) annotation of still images from various outdoor construction scenes. The proposed method can annotate images with and without construction actors, i.e., equipment and workers. The experimental results show the potential of this approach in annotating construction activities, with an 82% overall recall rate.

Cite this article


Layan Farahat, Ehsan Rezazadeh Azar. Content annotation in images from outdoor construction jobsites using YOLO V8 and Swin transformer. Smart Construction and Sustainable Cities, 2024, 2(1): 10 DOI:10.1007/s44268-024-00036-4