3D vision-based anomaly detection in manufacturing: A survey

Juan DU; Chengyu TAO; Xuanming CAO; Fugee TSUNG

doi:10.1007/s42524-025-4189-9

Front. Eng, 2025, Vol. 12, Issue 2: 343-360. DOI: 10.1007/s42524-025-4189-9

Industrial Engineering and Intelligent Manufacturing

REVIEW ARTICLE

Abstract

Surface quality monitoring of manufactured products is critical for manufacturing industries to ensure product quality and production efficiency. With the rapid development of 3D scanning technology, high-density 3D point cloud data can be generated by 3D scanners in complex manufacturing systems. However, due to the challenges of modeling complex surfaces and the variety of anomaly types, effective surface anomaly detection methods that meet practical requirements on detection accuracy and speed are still lacking. This survey reviews surface anomaly detection methodologies for manufacturing products based on 3D machine vision. Specifically, machine learning methodologies for 3D point cloud data modeling and anomaly detection are systematically reviewed. Related public data sets are also summarized. Finally, future research directions are pointed out.

Keywords

anomaly detection / 3D Vision / manufacturing / machine learning

Cite this article

Juan DU, Chengyu TAO, Xuanming CAO, Fugee TSUNG. 3D vision-based anomaly detection in manufacturing: A survey. Front. Eng, 2025, 12(2): 343-360 DOI:10.1007/s42524-025-4189-9

1 Introduction

Surface anomaly detection plays a crucial role in the manufacturing industry. Accurate surface anomaly detection will facilitate process adjustments, thereby improving quality and reducing defective rates. In the context of manufacturing, any deviation from manufacturing standards can be referred to as an anomaly (Cheng et al., 2017, 2018; Du et al., 2022). In this paper, “anomaly” refers to geometric anomalies that deviate from the nominal surface. Currently, surface anomaly detection in real manufacturing processes still heavily relies on manual inspection. However, manual surface anomaly detection may lead to practical mistakes due to visual fatigue and subjectivity. Therefore, the development of automatic inspection methods for surface anomaly detection to replace manual inspection is of paramount importance for enhancing the quality of manufacturing systems.

Anomaly detection techniques based on 2D images have been extensively investigated for decades and have shown significant efficacy, successfully applied in practical manufacturing processes. In recent years, the rapid development of advanced 3D scanning technology has made it possible to obtain high-precision 3D point cloud data quickly at a low cost, leading to increasing attention in the industrial sector. Depending on the scanning mechanism, 3D point cloud data can be classified into two types: structured and unstructured 3D point cloud data (Tao et al., 2023). Structured 3D point cloud data, which can be collected by coordinate measuring machines or depth cameras, is typically organized as a grid and can be represented in a matrix or tensor form. In contrast, unstructured 3D point cloud data tends to be acquired by hand-held scanners scanning from various directions, consisting of an unordered set of points and lacking a predefined grid structure.
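To make the distinction between the two data types concrete, the following minimal sketch (an illustration added here, not taken from the cited works) flattens a structured depth grid into an unordered point list. The helper name `grid_to_points` and the uniform cell spacing `pitch` are assumptions made for illustration only.

```python
import random

def grid_to_points(depth, pitch=1.0):
    """Flatten an H x W depth grid into a list of (x, y, z) points.

    `pitch` is an assumed, uniform spacing between grid cells.
    """
    points = []
    for row, line in enumerate(depth):
        for col, z in enumerate(line):
            points.append((col * pitch, row * pitch, z))
    return points

# Structured data: neighbors are implied by the row/column indices.
depth = [
    [0.0, 0.0, 0.0],
    [0.0, 0.5, 0.0],
]
points = grid_to_points(depth, pitch=2.0)

# Unstructured data: the same points in arbitrary order, with no grid indices.
unordered = points[:]
random.Random(0).shuffle(unordered)
```

Once the grid indices are discarded, neighborhood relations have to be recovered from the coordinates themselves, which is why unstructured data cannot be fed directly to grid-based models.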

3D point cloud data offers numerous advantages over 2D images, such as being less affected by the lighting conditions and capturing complete geometric information of object surfaces, including depth, dimensions, and volume. The newly added information is crucial for accurate anomaly detection when the manufacturing surface and anomaly types are complex. However, research on manufacturing surface anomaly detection based on 3D point cloud data is limited compared to image-based anomaly detection, making it a promising research direction.

However, many challenges hinder accurate anomaly detection based on 3D point cloud data in the context of manufacturing:

• Difficult data representation. While structured point clouds can be represented as matrices or tensors, allowing them to be easily processed using existing image-based methods, unstructured point clouds pose a significant challenge in terms of data representation due to the lack of a regular structure (Du et al., 2022; Tao et al., 2023). Furthermore, manufacturing surfaces often exhibit complexity, making modeling and analysis challenging.

• Limited training samples. In modern manufacturing, anomalous data occur infrequently, and collecting and manually annotating them is time-consuming and labor-intensive (Cohen and Ni, 2022). Furthermore, in certain scenarios, obtaining normal data can be difficult due to privacy concerns. As a result, many methods struggle to learn effectively from the limited training data, which prevents them from achieving superior performance.

• Diversity and local sparsity of anomalies. In actual manufacturing processes, anomaly patterns are diverse and difficult to predict. Unseen anomalies may occur at any time, and even the same type of anomaly may exhibit significant variations, presenting challenges for anomaly analysis. In addition, anomalies are typically sparsely distributed on manufacturing surfaces (Wang et al., 2023b), occupying only a small portion of an anomalous sample. This sparsity makes learning anomaly features challenging.

Many algorithms have been proposed to address these challenges. Recently, some surveys have reviewed anomaly detection based on 3D point cloud data. For example, Huo et al. (2023) provided a comprehensive review, focusing primarily on classical machine learning algorithms. However, their review was less inclusive of the latest unsupervised deep learning algorithms, which are increasingly influential in the field. Rani et al. (2024) offered a systematic review of state-of-the-art point cloud segmentation and classification algorithms, as well as their application in anomaly detection tasks. Despite their thoroughness, their review was predominantly centered on supervised algorithms, with minimal coverage of unsupervised techniques that are gaining traction in the mainstream. Cao et al. (2024c) approached anomaly detection from a computer vision perspective, reviewing existing techniques across both 2D and 3D modalities. However, their review lacked an in-depth exploration of anomaly detection methods tailored to specific manufacturing scenarios.

In contrast, this paper not only compiles a summary of existing publicly available anomaly detection data sets, which are not addressed in the previous surveys, but also conducts a systematic review of the most current unsupervised and supervised technologies. The comparison of different surveys is shown in Tab.1. Our survey stands out as the most comprehensive in the domain of 3D-based anomaly detection tasks, bridging the gap between the latest research developments and practical applications in manufacturing and other relevant fields. To sum up, the main contributions of this paper are as follows:

• A comprehensive review of anomaly detection based on 3D point cloud data, targeting real industrial scenarios.

• From a methodological perspective, we offer a systematic classification of anomaly detection methods across various industrial application scenarios.

• Our study presents a detailed analysis of the current problems and challenges in the field of anomaly detection using 3D point cloud data. Additionally, we offer insights into potential research directions to advance the future of this field.

This paper is organized as follows: Sections 2 and 3 provide a detailed review of existing 3D machine-vision-based anomaly detection methods, structured according to our methodology tree as shown in Fig.1. We also highlight milestone techniques and data sets over the years. Section 4 offers a summary of existing public data sets and commonly used evaluation metrics. The paper concludes with a summary and an outlook on future directions.

2 Supervised anomaly detection

In some manufacturing settings, it is possible to manually annotate collected samples. Supervised anomaly detection techniques utilize this annotated data, applying machine learning algorithms to classify each sample as normal or anomalous. This section is divided based on the model architecture, categorizing the supervised approaches into two types: classical machine learning-based methods and deep learning-based methods.

2.1 Classical machine learning-based methods

As normal parts and anomalies always exhibit different underlying 3D geometries, traditional machine learning approaches begin by analyzing local geometric characteristics with defined statistics or descriptors, which addresses the data representation challenge for 3D point clouds. Subsequently, a machine learning model is trained to learn the decision boundary between normal and anomalous characteristics, facilitating effective anomaly detection. The popular local shape descriptors are provided in Tab.2.

Based on the PFH, Madrigal et al. (2017) designed a new descriptor, Model Feature Histograms, for defect detection. Based on the deviations of points from the CAD model, Yacob et al. (2019) formed the histograms of deviations as shape statistics for anomaly detection. Similarly, Li et al. (2021) calculated the difference between the target point cloud and its source counterpart, wherein the discrepancies were aggregated into various patches. Then, the patch-based statistics were defined and used for identification. With the planarity assumption, Chen et al. (2021) clustered points and used descriptive features like maximum height and bounding area size to distinguish these clusters as normal parts or anomalies.
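As a hedged illustration of the descriptor-plus-classifier idea, the sketch below builds a normalized histogram of point-to-nominal deviations, in the spirit of the deviation histograms mentioned above, and applies a toy decision rule. The nominal surface (the plane z = 0), the bin edges, and the threshold are all assumptions chosen for illustration; they are not the settings of the cited works, which train a machine learning model on such features.

```python
def deviation_histogram(points, edges):
    """Normalized histogram of |z| deviations from an assumed nominal plane z = 0.

    Deviations beyond the last edge are not counted in any bin.
    """
    counts = [0] * (len(edges) - 1)
    for _, _, z in points:
        d = abs(z)
        for k in range(len(counts)):
            # The last bin is closed on the right so the largest edge is included.
            upper_ok = d < edges[k + 1] or (k == len(counts) - 1 and d == edges[-1])
            if edges[k] <= d and upper_ok:
                counts[k] += 1
                break
    return [c / len(points) for c in counts]

def is_anomalous(hist, tail_fraction=0.05):
    """Toy rule standing in for a trained classifier: too much mass in the last bin."""
    return hist[-1] > tail_fraction

edges = [0.0, 0.1, 0.2, 1.0]  # hypothetical bin edges, same unit as z
normal_part = [(x, y, 0.01) for x in range(10) for y in range(10)]
dented_part = [(x, y, 0.6 if x < 3 and y < 3 else 0.01)
               for x in range(10) for y in range(10)]

normal_hist = deviation_histogram(normal_part, edges)
dented_hist = deviation_histogram(dented_part, edges)
```

In practice the histogram would be one of several features passed to a classifier, and the deviations would be measured against a CAD model or reference scan rather than a fixed plane.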

2.2 Deep learning-based methods

The deep learning methods can more effectively capture complex patterns and relationships in the data, thus enabling more accurate anomaly detection in an end-to-end manner. We categorize these methods based on the different architectures, including the Pointwise Multi-Layer Perceptrons (MLPs), Convolutional Neural Networks (CNNs), Graph Neural Networks (GNNs), Long Short-term Memory (LSTM), and Transformers.

2.2.1 Pointwise MLP networks

This category employs shared MLPs to extract features from individual points. The features are then aggregated into a global feature using a symmetric aggregation function (e.g., sum, max, mean). Because the aggregation is symmetric, the result is invariant to the ordering of the points, which addresses the representation of irregular, unstructured 3D point cloud data.
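The permutation invariance obtained from a shared per-point layer followed by symmetric pooling can be checked with a small sketch. This is a simplified, untrained single-layer example with random weights; the function names and layer sizes are assumptions for illustration and do not reproduce any published architecture.

```python
import random

def shared_mlp(point, weights, bias):
    """Apply the same linear layer + ReLU to one point (weights shared by all points)."""
    return [max(0.0, sum(w * x for w, x in zip(row, point)) + b)
            for row, b in zip(weights, bias)]

def global_feature(points, weights, bias):
    """Max-pool the per-point features channel by channel (a symmetric function)."""
    feats = [shared_mlp(p, weights, bias) for p in points]
    return [max(channel) for channel in zip(*feats)]

rng = random.Random(0)
weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]  # 3 -> 4 channels
bias = [rng.uniform(-1, 1) for _ in range(4)]
cloud = [(rng.random(), rng.random(), rng.random()) for _ in range(16)]

# Reordering the points must not change the pooled global feature.
shuffled = cloud[:]
rng.shuffle(shuffled)

feature_original = global_feature(cloud, weights, bias)
feature_shuffled = global_feature(shuffled, weights, bias)
```

Replacing `max` with a sum or mean would preserve the invariance, since all three are symmetric in their arguments.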

PointNet (Charles et al., 2017) is a pioneering network, whose architecture is illustrated in Fig.2. PointNet learned pointwise features with multiple shared MLPs. A max-pooling operator was then applied to extract the global feature for classification or segmentation tasks. PointNet++ (Qi et al., 2017) improved PointNet by integrating local geometric information through point neighborhood sampling, grouping, and learning within the PointNet layer. RandLA-Net (Hu et al., 2020), a prominent architecture for processing large-scale point clouds, contained local feature extraction and global feature aggregation modules. Moreover, the random sampling strategy enhanced computational efficiency and scalability in the model.

For anomaly detection, Hutton (2023) proposed a PointNet-based method for spacecraft surface inspection. Furthermore, Bolourian et al. (2023) improved PointNet++ by incorporating surface normals for bridge defect inspection on concrete surfaces. Kaji et al. (2022) adopted RandLA-Net for in situ monitoring of metal additive manufacturing.

2.2.2 CNNs

CNNs are designed to automatically and adaptively learn spatial hierarchies of features from depth images or structured point clouds. Such methods cannot be applied to unstructured 3D point cloud data modeling due to unstructured representation. Fig.3 illustrates the principle of a typical convolutional layer, requiring the grid data structure as input.

Shao et al. (2021) collected diverse depth images from multiple perspectives to capture comprehensive information on solder joints, which were then simultaneously fed into a CNN for defect detection. Dong et al. (2023) converted the unstructured point cloud into a 2D image and employed the YOLOv5 to identify cracks. A similar strategy was proposed by Wang et al. (2024), where YOLOv8 was used to locate defects in images, followed by accurate identification in point clouds. Due to the complementary information from RGB and depth data, various data fusion methods were proposed in the literature. Wang et al. (2022a) proposed a collaborative learning attention network, consisting of a popular CNN, ResNet101, for feature extraction, and an attention module for cross-modal information fusion. Similarly, Gao et al. (2023) proposed a novel attention mechanism with a dynamic self-driven loss reweighting strategy. Furthermore, Yang et al. (2023) proposed a bidirectional feature alignment module to exploit the consistency of cross-modal features.

2.2.3 GNNs

Point clouds can be transformed into graphs by representing each point as a vertex and creating edges based on spatial relationships, like k-nearest neighbors. This conversion preserves the geometric and topological characteristics of point clouds, which addresses unstructured 3D point cloud data representation challenges and facilitates efficient learning using GNNs.
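The graph construction described above can be sketched with a brute-force k-nearest-neighbor search. This is a minimal example under stated assumptions: the function name `knn_graph` and the toy coordinates are hypothetical, and the quadratic-time search is used only for clarity.

```python
import math

def knn_graph(points, k):
    """Return a directed edge list (i, j) linking each point i to its k nearest neighbors j."""
    edges = []
    for i, p in enumerate(points):
        # Brute-force distances; larger clouds would typically use a spatial index.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        edges.extend((i, j) for _, j in dists[:k])
    return edges

points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (5.0, 5.0, 5.0)]
edges = knn_graph(points, k=2)
```

Note that the resulting graph is directed: the distant point at index 3 links to two neighbors, but no other point links back to it.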

GNNs can be categorized into spectral and spatial GNNs (Zhang et al., 2019). Spectral GNNs use graph Laplacian eigenvalues and eigenvectors for convolutions, which captures global graph structure but is computationally expensive for large-scale graphs. In contrast, spatial GNNs operate in the spatial domain, performing convolutions based on the spatial relationships between nodes. The dynamic graph convolutional neural network (DGCNN) is a typical spatial GNN (Wang et al., 2019b), which dynamically updates the graph structure and proposes a novel edge-based convolution module. Wang et al. (2019a) introduced the graph attention convolution (GAC), which accounts for the varying importance of neighboring points, allowing the model to concentrate on the most relevant parts. The general workflow of GNNs is shown in Fig.4.

Based on GAC, Li et al. (2020) proposed a dynamic attention GNN for point cloud segmentation, which was further applied to workpiece defect detection. Furthermore, Bahreini and Hammad (2021) applied the DGCNN for concrete surface inspection. Wang et al. (2022b) introduced a hierarchical attentive edge convolution (HAEConv) and developed a coarse-to-fine anomaly detection framework. This framework segments the input point cloud into semantic parts and employs a classification subnetwork to identify damaged samples.

2.2.4 LSTMs and Transformers

Long Short-Term Memory (LSTM), as a variant of recurrent neural network (RNN), excels in modeling sequential data by capturing long-term dependencies. Leveraging this capability, Xu et al. (2024) treated point cloud lines from a laser profilometer as sequential data, which addresses the data representation challenge. They introduced a bidirectional LSTM-based multi-line detection method, efficiently extracting contextual information for detecting wood defects. The overview of this method is shown in Fig.5.

The Transformer was first proposed by Vaswani (2017), utilizing self-attention mechanisms to capture relationships between different words in a sentence. Zhao et al. (2021) then proposed the Point Transformer, which is designed specifically for processing 3D point cloud data and achieves superior performance in 3D object detection and semantic segmentation. Based on this work, Zhou et al. (2022) proposed a Transformer-based classification network (TransPCNet) to detect sewer defects, significantly outperforming their PointNet and DGCNN-based counterparts. The basic structure of the self-attention module in TransPCNet is illustrated in Fig.6, which applies multiple branches to learn the attention map.
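For readers unfamiliar with self-attention, the following sketch shows plain scaled dot-product attention applied to a small set of point features. It is a generic illustration only: queries, keys, and values are taken to be the raw features without learned projections (an assumption made for brevity), and it does not reproduce the multi-branch module of TransPCNet or the vector attention of the Point Transformer.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention over a set of tokens (here: point features)."""
    d = len(keys[0])
    out, attn = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        attn.append(w)
        # Each output is a weighted average of all value vectors.
        out.append([sum(wj * v[c] for wj, v in zip(w, values))
                    for c in range(len(values[0]))])
    return out, attn

feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy per-point features
out, attn = self_attention(feats, feats, feats)
```

Each row of `attn` is an attention map over all points, so every output feature can draw on the whole set rather than on a fixed local neighborhood.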

2.3 Summary

We summarize the milestones of supervised anomaly detection methodologies in Fig.7. The qualitative results and applications are summarized in Tab.3. To handle the difficult representation of 3D point cloud data, classical methods design suitable features, while pointwise MLPs and GNNs can directly handle the irregular input. Although classical machine learning-based methods offer good interpretability and require less labeled data, their predefined 3D descriptors may lack discriminative power, and the models’ limited size can decrease classification accuracy, hindering their ability to detect complex anomalies across diverse surfaces. In contrast, deep learning methods can handle such cases but require sufficient data. Among the deep learning methods, the CNN-based and LSTM-based methods require depth images and point cloud lines, respectively, which limits their practical applications. Therefore, how to develop deep learning methods given limited samples still needs future research.

3 Unsupervised anomaly detection

Due to limited annotated data, most of the current research has shifted toward the exploration of unsupervised methods. Specifically, based on their requirement for training data, these methods can be further classified into training-based methods and untrained methods. The training-based unsupervised methods assume sufficient anomaly-free training samples to learn the intrinsic patterns of targeted surfaces, based on which anomalies can be detected. By contrast, untrained methods do not need anomaly-free samples for training. Only a single sample is used for untrained methods.

3.1 Training-based methods

Basically, training-based anomaly detection can be primarily achieved through two types of techniques, namely feature embedding and reconstruction.

3.1.1 Feature embedding

Feature embedding-based methods mainly consist of two steps: first, extracting latent features from anomaly-free training data, and then modeling the distribution of anomaly-free features through various approaches such as memory bank, knowledge distillation, etc. Based on this distribution, surface anomalies can be detected during inference.

Memory bank. Memory bank-based methods extract features from anomaly-free training point cloud patches (the pink circles shown in Fig.8) and store them in a memory bank, which is then downsampled using PatchCore (Roth et al., 2022) to represent the distribution of anomaly-free features. During inference, features significantly deviating from this constructed distribution are flagged as anomalies. The general pipeline of memory bank-based methods is shown in Fig.8. A notable approach within this category, Back To the Feature (BTF) (Horwitz and Hoshen, 2023), highlights the effectiveness of the traditional FPFH descriptor in achieving state-of-the-art results in 3D point cloud-based anomaly detection. Considering the advance of deep neural networks, M3DM (Wang et al., 2023a) proposed to utilize a pretrained encoder, i.e., PointMAE (Pang et al., 2022) to extract the latent features from 3D point cloud data. However, the anomaly detection results are proven to be inferior to those obtained using traditional descriptors.
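The memory bank pipeline can be sketched in a few lines. In this hedged example, the 2D "features" are toy values standing in for patch descriptors such as FPFH, and a greedy farthest-point subsampling is used as a simple stand-in for the coreset downsampling step; the function names are hypothetical.

```python
import math

def greedy_coreset(features, m):
    """Farthest-point (greedy) subsampling that keeps m representative features."""
    selected = [features[0]]
    while len(selected) < m:
        # Pick the feature whose nearest selected feature is farthest away.
        nxt = max(features, key=lambda f: min(math.dist(f, s) for s in selected))
        selected.append(nxt)
    return selected

def anomaly_score(feature, bank):
    """Distance from a test feature to its nearest neighbor in the memory bank."""
    return min(math.dist(feature, b) for b in bank)

# Toy 2D patch features from anomaly-free training data (hypothetical values).
train_feats = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (0.0, 1.0)]
bank = greedy_coreset(train_feats, m=3)

score_normal = anomaly_score((0.05, 0.0), bank)   # close to stored features
score_anomalous = anomaly_score((3.0, 3.0), bank)  # far from every stored feature
```

A threshold on the score, chosen for example on held-out anomaly-free data, then separates normal from anomalous patches.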

In addition to the representative methods, several innovative approaches have been developed to improve feature extraction from 3D point clouds. CPMF (Cao et al., 2024b) introduced a technique that employed a 2D pre-trained model, specifically ResNet (He et al., 2016), for feature extraction by first projecting 3D point clouds into 2D images from various perspectives. The extracted 2D features, when combined with 3D features from FPFH, result in enhanced latent features, showing significant promise in image-level detection. Nonetheless, this approach faces challenges such as the critical selection of projection perspectives and the time-consuming process of 2D image rendering. Besides, Chu et al. (2023) introduced the use of neural implicit functions through signed distance fields (SDFs) to capture the local geometry of point clouds, called Shape-guided. Utilizing PointNet for feature extraction, their method effectively harnesses SDF features, delivering superior detection capabilities over both classical descriptors and previous deep learning approaches.

Knowledge distillation (KD). Knowledge distillation (KD)-based methods, originally developed for model compression, involve transferring knowledge from a larger, general model to a smaller, domain-specific model to facilitate deployment in downstream tasks. In the context of anomaly detection, these methods detect anomalies by comparing the output features of a teacher network with those of a student network. 3D-ST (Bergmann and Sattlegger, 2023), as illustrated in Fig.9, innovatively applied this approach by using a pre-trained RandLA-Net (Hu et al., 2021) as the teacher and a learnable model with the same architecture but randomly initialized parameters as the student. The student network is trained on anomaly-free data to mimic the teacher’s output features closely. During inference, discrepancies between the teacher and student outputs on anomalous samples are used to calculate anomaly scores. A notable challenge of this method is the student network’s excessive generalization, which can lead to the accurate representation of anomalous features even without direct exposure during training, thus diminishing detection precision. AST (Rudolph et al., 2023) addressed this issue by proposing an asymmetric architecture with different structures for the teacher and student networks, specifically employing a normalizing flow for the teacher and a conventional convolutional neural network for the student.
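The teacher-student principle can be written compactly as follows. This is a generic formulation under stated assumptions: T denotes the frozen teacher, S_θ the student with parameters θ, and D_normal the anomaly-free training points; the exact loss, feature normalization, and score aggregation used in 3D-ST or AST may differ.

```latex
% Training on anomaly-free points only: the student mimics the frozen teacher
\min_{\theta} \; \sum_{p \in \mathcal{D}_{\text{normal}}}
  \left\| T(p) - S_{\theta}(p) \right\|_2^2

% Inference: a large teacher-student discrepancy indicates an anomaly
s(p) = \left\| T(p) - S_{\theta}(p) \right\|_2^2
```

The over-generalization issue mentioned above corresponds to s(p) remaining small on anomalous points because the student also reproduces the teacher's features there.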

3.1.2 Reconstruction-based methods

Unlike feature embedding-based methods, which detect anomalies at the feature level, reconstruction-based methods aim to detect anomalies at the point level. There are two main kinds of reconstruction-based methods for 3D anomaly detection: autoencoder (AE)-based and principal component analysis (PCA)-based methods.

Autoencoder. Autoencoder-based methods, trained exclusively on anomaly-free data, use a feature encoder and decoder to reconstruct input 3D point cloud data. During inference, they aim to reconstruct anomalous surfaces as anomaly-free, leveraging discrepancies between the input and the reconstruction to calculate anomaly scores. This approach leverages the autoencoder’s limitation in accurately reconstructing unseen features for effective anomaly detection. EasyNet (Chen et al., 2023) employed a multi-scale, multi-modality feature encoder-decoder for 3D depth map reconstruction, which offers real-time detection without relying on large pre-trained models or memory banks. Cheating Depth (Zavrtanik et al., 2024) introduced the Depth-Aware Discrete Autoencoder (DADA) to learn a discrete latent space for both RGB and 3D data, enhancing 3D surface anomaly detection. However, both EasyNet and DADA require structured depth maps as input and struggle with unstructured point clouds, limiting their applicability in some manufacturing scenarios. To overcome this limitation, Li and Xu (2023) proposed a self-supervised Iterative Mask Reconstruction Network (IMRNet) for reconstructing unstructured point clouds and improving anomaly detection accuracy, as shown in Fig.10, and also developed a new anomaly detection data set based on 3D point clouds using a synthetic data set ShapeNet (Chang et al., 2015), addressing data scarcity in 3D anomaly detection.
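The scoring step of reconstruction-based detection can be illustrated independently of any particular network. In the sketch below, the "reconstruction" is simply a flat, anomaly-free surface standing in for the output of a trained autoencoder, and the threshold is an assumed value; none of this reproduces EasyNet, DADA, or IMRNet.

```python
import math

def pointwise_scores(input_cloud, reconstruction):
    """Score each input point by its distance to the nearest reconstructed point."""
    return [min(math.dist(p, r) for r in reconstruction) for p in input_cloud]

# Hypothetical reconstruction: a flat 5 x 5 surface patch at z = 0.
reconstruction = [(float(x), float(y), 0.0) for x in range(5) for y in range(5)]

# Input scan: identical to the reconstruction except for one raised point.
input_cloud = [(x, y, 0.8 if (x, y) == (2.0, 2.0) else 0.0)
               for x, y, _ in reconstruction]

scores = pointwise_scores(input_cloud, reconstruction)
flagged = [p for p, s in zip(input_cloud, scores) if s > 0.5]  # assumed threshold
```

Because the score is computed per point, the detected region follows the anomaly boundary directly, provided the reconstruction of the normal surface is accurate.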

PCA. PCA-based methods identify anomaly-free patterns via principal components from training data to form a basis set. This set is then used during inference to reconstruct test samples’ surfaces for anomaly detection. von Enzberg and Al-Hamadi (2016) introduced a method that integrates PCA with multiresolution analysis for surface approximation in anomaly detection, applied to measurements of a door handle cup and a car front hood. Zhang et al. (2018) developed a 3D pavement anomaly detection approach, integrating crack and deformation detection. This approach reconstructs pavement profiles from line laser scanner data using PCA, with anomalies identified by calculating the residuals between the input and the reconstruction.
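A minimal sketch of PCA-based reconstruction is given below, assuming a single principal component estimated by power iteration on toy 1D profiles. The data, the function names, and the use of only one component are assumptions for illustration; the cited works use their own bases and, in one case, multiresolution analysis.

```python
import math

def fit_pca_1d(profiles, iters=200):
    """Mean and leading principal direction via power iteration (one component only)."""
    n, d = len(profiles), len(profiles[0])
    mean = [sum(p[j] for p in profiles) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in profiles]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        # Multiply by the (unnormalized) covariance X^T X without forming it.
        proj = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return mean, v

def residuals(profile, mean, v):
    """Residual between a profile and its rank-1 PCA reconstruction."""
    c = [x - m for x, m in zip(profile, mean)]
    coef = sum(a * b for a, b in zip(c, v))
    return [a - coef * b for a, b in zip(c, v)]

# Anomaly-free training profiles: a common shape with varying tilt (toy data).
train = [[t * j for j in range(6)] for t in (0.8, 0.9, 1.0, 1.1, 1.2)]
mean, v = fit_pca_1d(train)

clean = [1.05 * j for j in range(6)]
defective = clean[:]
defective[3] += 0.5  # hypothetical local defect

res_clean = residuals(clean, mean, v)
res_defective = residuals(defective, mean, v)
```

The clean profile lies in the learned subspace and leaves an essentially zero residual, while the defective profile leaves its largest residual at the perturbed position.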

The quantitative results of training-based methods reported in their papers are illustrated in Tab.4.

3.1.3 Summary

Training-based methods are effective in addressing the data representation challenge and the diversity of anomalies, but they are limited by their requirement for extensive anomaly-free training data, which can be a barrier in scenarios with limited data samples. Their well-designed feature extractors excel in handling unstructured point clouds. In addition, they learn from anomaly-free samples without making assumptions about anomalies, theoretically enabling them to detect any type of anomaly. However, in practical applications, it has been observed that anomalies that are sparse or that resemble anomaly-free patterns can lead to inaccurate detection results.

Reconstruction-based methods have the advantage of obtaining more accurate anomaly boundaries but struggle to accurately reconstruct complex surfaces, leading to higher false positive rates. In contrast, feature embedding-based methods excel in precisely localizing anomalies and exhibit lower false positive rates. However, due to the reduced resolution of the feature map used for calculating anomaly scores, feature embedding-based methods face challenges in obtaining accurate anomaly boundaries. We summarize the representative milestones of unsupervised training-based methods in Fig.11.

Overall, unsupervised training-based methods effectively model complex manufacturing surfaces by learning from a large amount of anomaly-free data, without the need for annotated anomaly samples, overcoming the problem of a lack of anomaly samples. These methods focus on learning normal patterns without modeling anomaly patterns, thereby avoiding addressing the challenges related to anomalies. Additionally, most methods based on deep neural networks or descriptors can directly handle unstructured point clouds. Therefore, unsupervised training-based methods are suitable for large-scale manufacturing scenarios where a large amount of anomaly-free data can be collected.

3.2 Untrained methods

The training methods require annotated (for supervised methods) or anomaly-free (for unsupervised methods) data sets for learning a model. However, these requirements may not be practically satisfied due to: (i) the burden of manually annotating millions of points in a high-density point cloud, and (ii) the limited production capabilities and small sample sizes of customized products. Therefore, the training methods reviewed above may not work for all industrial scenarios. In contrast, the untrained methods can handle a single sample directly, without the need for training on a large data set. In this subsection, we review the untrained methods for anomaly detection.

3.2.1 Local geometry-based methods

Surface anomalies may exhibit different local geometries compared to those of normal structures. Concretely, existing methods explore various local geometric characteristics to capture local geometries. Subsequently, different rules and criteria are proposed to identify anomalies associated with abnormal characteristics.

Jovančević et al. (2017) employed a region-growing segmentation algorithm to segment point cloud airplane surfaces into defective and non-defective areas, leveraging local normal and curvature data. For micro-indentation detection via laser scanning confocal microscopy, Hitchcox and Zhao (2018) introduced a graph-based segmentation technique for anomaly point labeling, necessitating manual seed point selection. Wei et al. (2021) developed three innovative local features to compute anomaly scores by kernel density estimation, where a higher score signals an anomalous point. Miao et al. (2022) implemented a multi-step detection algorithm utilizing FPFH and normal vector aggregation for identifying defects, such as scratches and deformations on gas turbine blades. The diagram for calculating FPFH is illustrated in Fig.12. Furthermore, Wang et al. (2023b) calculated the covariance matrix within the neighborhood of each point, where the eigenvalues were analyzed to indicate an anomaly point. Additionally, Lee et al. (2023) converted the point cloud into surface normals, which were analyzed by the Gabor filter to identify surface defects.

3.2.2 Global geometry-based methods

Instead of adopting the local geometric information, the global geometry methods make use of the global shapes of manufacturing parts. There are mainly two categories: statistical model-based and CAD model-based methods.

Statistical-based methods. This category of methods is based on statistical assumptions for the shape of the product, such as low-rankness and smoothness. Assuming a globally smooth surface of products, locally smooth anomalies, and locally non-smooth outliers, Tao et al. (2023) formulated the anomaly detection problem within a probabilistic framework, and proposed a statistical inference algorithm to identify the type of each point, which was validated by the steel production process. The overview of this method can be referred to in Fig.13. To deal with more general smooth free-form surfaces, the PointSGRADE proposed by Tao and Du (2023) formulated the problem into a penalized regression problem, assuming sparse anomalies. For complex surfaces with axial symmetry, Cao et al. (2024a) transformed the point cloud into low-rank profiles and applied the robust PCA algorithm to detect the sparse anomalies.
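The low-rank plus sparse assumption is commonly expressed through the standard robust PCA (principal component pursuit) problem shown below. Here M is the matrix of observed profiles, L the low-rank anomaly-free component, S the sparse anomaly component, and λ > 0 a tuning parameter; these symbols are introduced for illustration, and the exact objectives used in the cited works may differ.

```latex
% Robust PCA (principal component pursuit)
% \|L\|_* : nuclear norm (sum of singular values), promotes low rank
% \|S\|_1 : entrywise l1 norm, promotes sparsity
\min_{L,\,S} \; \|L\|_{*} + \lambda \, \|S\|_{1}
\quad \text{subject to} \quad M = L + S
```

Points whose entries in the estimated S are nonzero, or exceed a chosen threshold, are then reported as anomalies.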

CAD model-based methods. With the provided standard CAD designs, it is possible to compare the point cloud with CAD models by rigid registration, where the anomalies exhibit large differences. The standard iterative closest point (ICP) algorithm for rigid registration is adopted by Zhao and Del Castillo (2021a) and Zhao et al. (2023) to detect small 3D printing defects. To accelerate the ICP algorithm, He et al. (2023) developed an octree-based registration algorithm. Considering the inherited shape shrinkage of printed parts in additive manufacturing, Decker et al. (2020) proposed a non-rigid registration algorithm based on the ICP algorithm by incorporating manufacturing process knowledge. Finally, considering the challenge of finding the optimal registration parameters, Zhao and Del Castillo (2021b) proposed a registration-free method to align the point cloud with the associated CAD model.

3.3 Summary

We summarize the untrained methods in terms of milestones in Fig.14. Instead of learning the normal patterns from training data sets in unsupervised methods, untrained methods rely more on prior knowledge that enables the modeling of anomaly-free reference surface or possible anomalies, thus ensuring the capability of anomaly detection on a single sample. The local geometrical-based methods are susceptible to noise and outliers, which may lead to false detection. In contrast, the global methods utilizing the prior information about manufacturing products can alleviate the above issues. For unsupervised anomaly detection, real-time anomaly detection may not be easily achieved, especially for untrained scenarios. By comparison, most supervised methods can achieve fast calculation for online anomaly detection when equipped with a GPU, although the training time can be relatively long. How to achieve real-time anomaly detection for untrained methods needs future investigations.

We summarize the qualitative results and applications of the above untrained methods in Tab.5.

4 Data sets and metrics

4.1 Public data sets

Data are of vital importance for validating research methodologies. According to our review, the current public data sets can be categorized as point cloud-based and RGBD-based, where the depth modality in RGBD data can be transformed into structured 3D point cloud data. In Tab.6, we provide a brief description of the public data sets available to date for 3D industrial anomaly detection.
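As an illustration of how a depth map can be turned into a structured point cloud, the following minimal sketch back-projects each pixel with a pinhole camera model. The intrinsics `fx`, `fy`, `cx`, `cy` and the toy depth values are assumptions made for illustration; individual data sets may use a different camera model or ship the 3D coordinates directly.

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map (list of rows) into an H x W grid of
    (x, y, z) points; pixels with non-positive depth become None."""
    grid = []
    for v, row in enumerate(depth):
        out_row = []
        for u, z in enumerate(row):
            if z <= 0:
                out_row.append(None)  # invalid or missing measurement
            else:
                # Pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy
                out_row.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
        grid.append(out_row)
    return grid

# Hypothetical 2 x 3 depth map and intrinsics.
depth = [[1.0, 1.0, 0.0],
         [2.0, 2.0, 2.0]]
points = depth_to_points(depth, fx=2.0, fy=2.0, cx=1.0, cy=0.5)
```

Because the output keeps the image grid layout, each 3D point retains its pixel neighborhood, which is what makes the resulting point cloud "structured".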

(1) MVTec 3D-AD (Bergmann et al., 2022)

Data description. As shown in Fig.15, the data set includes 10 categories of real scanned objects, with data modalities including RGB images and their corresponding depth maps. Each object is divided into training, validation, and test data sets, with the training data set ranging from 210 to 361 samples, the validation data sets ranging from 22 to 42 samples, and the test data set ranging from 100 to 159 samples. The training and validation data sets only consist of normal samples, and the test data set encompasses 4 to 5 types of anomalies such as scratches, dents, and contaminations, with approximately 20 samples per anomaly category.

Potential studies. This data set is annotated with pixel-level labels, enabling its utilization for anomaly detection tasks. The inclusion of RGB and Depth modalities also facilitates research in multimodal data fusion and 3D-based anomaly detection. The experiments of most unsupervised training-based methods are conducted on this data set. Furthermore, the test set encompasses multiple anomaly categories, making it suitable for anomaly classification studies. However, the limited sample size may motivate research on few-shot anomaly classification.

(2) Real3D-AD (Liu et al., 2024)

Data description. The Real3D-AD data set exclusively comprises the 3D point cloud modality and encompasses 12 object categories. Each category contains 4 samples for training, with the test data set comprising 100 samples, including 50 anomalous samples. The point cloud samples vary in point size, ranging from 35k to 780k points.

Potential studies. This data set has detailed pixel-level labels, making it suitable for anomaly detection research based on 3D point clouds. However, due to the limited number of training samples, with only 4 training samples per object category, it is only suitable for few-shot-related research.

(3) Anomaly-ShapeNet (Li and Xu, 2023)

Data description. The Anomaly-ShapeNet data set is constructed based on a synthetic public 3D point cloud data set, ShapeNet. It comprises 40 object categories and 6 types of anomalies, as illustrated in Fig.16. Similar to the Real3D-AD data set, there are 4 normal training samples for each object category, with 29 anomalous samples in the test data set, and each sample contains between 8k and 30k points.

Potential studies. This data set contains object-level and pixel-level labels, making it suitable for few-shot anomaly detection and classification tasks.

(4) NEU RSDDS-AUG (Wang et al., 2022a)

Data description. This data set contains RGB-D data of no-service P60 rails, including defects such as scars and holes. There are 704 images with only one defect and 1158 images with multiple defects. The image resolutions cover a wide range, with most concentrated around 500 × 500. The data set has been divided into a training data set with 1500 images and a test data set with 362 images, at a ratio of roughly 8:2.

Potential studies. This data set is designed for anomaly or defect detection tasks based on RGBD data. Since some samples have multiple defects, this data set can be utilized for multi-label classification tasks.

(5) Concrete surface defect data set (Bolourian et al., 2023)

Data description. This data set is scanned from four RC bridges in Montreal using a FARO Focus3D X 130 scanner, which automatically captures images during scanning and detects the color of each point. The scanned point clouds were annotated manually and labeled into three classes: cracks, spalls, and normal (i.e., without anomalies). In the training and evaluation data sets, there are a total of 81 segments, comprising 475 cracks and 588 spalls. In the test data set, there are 21 segments, including 120 cracks and 185 spalls. Each segment contains multiple types of defects, with the number of points per segment ranging from 79,719 to 486,062.

Potential studies. Because each segment in the data set is annotated at the pixel level, it is evidently suitable for research in anomaly detection. Additionally, as each segment contains multiple types of defects, the data set can also be utilized for research in multi-label classification.

(6) Sewer 3D point cloud data set (Haurum et al., 2021)

Data description. The Sewer Defect Data set is a pure 3D point cloud data set, which consists of two parts: synthetic data and real scanned sewer 3D point cloud data. The synthetic samples, which are used to augment model training, significantly outnumber the real scanned samples. The data set has been divided into training (11074), validation (2768), and test (3185) data sets, each containing four categories: normal, displacement, brick, and rubber ring. Each point cloud sample consists of 1024 points.

Potential studies. The data set provides object-level annotations, making it suitable for supervised anomaly classification research.

4.2 Evaluation metrics

There are various metrics to evaluate the performance of 3D anomaly detection. Treating normal surface points as the positive class and anomaly points as the negative class, and defining FP as the anomaly points incorrectly predicted as normal, TN as the correctly predicted anomaly points, FN as the normal surface points erroneously predicted as anomalous, and TP as the correctly predicted normal surface points, Tao and Du (2023) adopted the following four metrics: the false omission rate (FOR), the false positive rate (FPR), the balanced accuracy (BA), and the Dice coefficient (DICE):

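$$\mathrm{FOR} = \frac{\mathrm{FN}}{\mathrm{FN} + \mathrm{TN}}, \qquad \mathrm{FPR} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TN}}, \tag{1}$$

$$\mathrm{BA} = \frac{\mathrm{TP}}{2(\mathrm{TP} + \mathrm{FN})} + \frac{\mathrm{TN}}{2(\mathrm{TN} + \mathrm{FP})}, \qquad \mathrm{DICE} = \frac{2\,\mathrm{TN}}{2\,\mathrm{TN} + \mathrm{FP} + \mathrm{FN}}. \tag{2}$$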

Moreover, two popular metrics for binary classification, precision and recall, are defined as:

$$\mathrm{Precision} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}, \qquad \mathrm{Recall} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}. \tag{3}$$
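Equations (1) to (3) can be evaluated directly from per-point labels, as in the minimal sketch below. It reads the definitions above with normal surface points as the positive class, so TP and FN count normal points while TN and FP count anomaly points; this reading, the function names, and the toy labels are assumptions, and the sketch presumes that no denominator is zero.

```python
def confusion_counts(true_anomaly, pred_anomaly):
    """Count (TP, TN, FP, FN) with normal = positive, anomaly = negative."""
    tp = tn = fp = fn = 0
    for t, p in zip(true_anomaly, pred_anomaly):
        if t and p:
            tn += 1      # anomaly point correctly predicted as anomalous
        elif t and not p:
            fp += 1      # anomaly point incorrectly predicted as normal
        elif not t and p:
            fn += 1      # normal point incorrectly predicted as anomalous
        else:
            tp += 1      # normal point correctly predicted as normal
    return tp, tn, fp, fn

def point_metrics(tp, tn, fp, fn):
    """Evaluate Eqs. (1)-(3); assumes every denominator is non-zero."""
    return {
        "FOR": fn / (fn + tn),
        "FPR": fp / (fp + tn),
        "BA": tp / (2 * (tp + fn)) + tn / (2 * (tn + fp)),
        "DICE": 2 * tn / (2 * tn + fp + fn),
        "Precision": tp / (tp + fp),
        "Recall": tp / (tp + fn),
    }

# Hypothetical labels for 10 points: True marks an anomaly point.
true_anomaly = [True, True, True, True] + [False] * 6
pred_anomaly = [False, True, True, True, True] + [False] * 5

counts = confusion_counts(true_anomaly, pred_anomaly)
metrics = point_metrics(*counts)
```

In this toy case one anomaly point is missed and one normal point is falsely flagged, so the counts are TP = 5, TN = 3, FP = 1, and FN = 1.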

Bergmann et al. (2022) adopted the area under the per-region overlap curve (AUPRO) and the area under the receiver operating characteristic curve (AUROC) as the benchmarks for the MVTec 3D-AD data set. Specifically, given the binary prediction $P$, the $k$-th connected component of the ground truth $G_k$, and the total number of components $K$, the per-region overlap (PRO) is defined as:

$$\mathrm{PRO} = \frac{1}{K} \sum_{k=1}^{K} \frac{|P \cap G_k|}{|G_k|}. \tag{4}$$

Notably, recall is also called the true positive rate (TPR). Then, given the upper limit $t$ on the false positive rate, AUPRO and AUROC are formulated as:

$$\mathrm{AUPRO} = \int_{0}^{t} \mathrm{PRO} \, d(\mathrm{FPR}), \qquad \mathrm{AUROC} = \int_{0}^{1} \mathrm{TPR} \, d(\mathrm{FPR}). \tag{5}$$
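The PRO curve and its truncated integral can be computed as in the following minimal sketch. It assumes that each point carries a continuous anomaly score that is thresholded to obtain the binary prediction, that the ground-truth connected components are already labeled (0 for normal, k ≥ 1 for component G_k), that FPR here is the fraction of normal points predicted as anomalous, and that the integral is approximated by the trapezoidal rule; all names and toy values are hypothetical.

```python
def pro_curve(scores, components):
    """Return (FPR, PRO) pairs obtained by sweeping the score threshold.
    components[i] is 0 for a normal point, or k >= 1 for a point in the
    k-th connected ground-truth anomaly component."""
    ks = sorted({c for c in components if c > 0})
    n_normal = sum(1 for c in components if c == 0)
    curve = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    for thr in sorted(set(scores), reverse=True):
        pred = [s >= thr for s in scores]
        fp = sum(1 for p, c in zip(pred, components) if p and c == 0)
        overlaps = []
        for k in ks:
            size = sum(1 for c in components if c == k)
            hit = sum(1 for p, c in zip(pred, components) if p and c == k)
            overlaps.append(hit / size)          # |P ∩ G_k| / |G_k|
        curve.append((fp / n_normal, sum(overlaps) / len(ks)))
    return curve

def aupro(curve, t):
    """Trapezoidal integral of PRO over FPR on [0, t] (not normalized)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        if x0 >= t:
            break
        if x1 > t:  # clip the last segment at the integration limit
            y1 = y0 + (y1 - y0) * (t - x0) / (x1 - x0)
            x1 = t
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical example: 4 normal points and two 2-point anomaly components.
components = [0, 0, 0, 0, 1, 1, 2, 2]
scores = [0.1, 0.2, 0.3, 0.8, 0.9, 0.7, 0.6, 0.4]
curve = pro_curve(scores, components)
area_half = aupro(curve, t=0.5)
area_full = aupro(curve, t=1.0)
```

Since PRO never exceeds 1, the integral is bounded by $t$, so dividing the result by $t$ rescales it to the range [0, 1] when values for different limits need to be compared.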

Finally, given the ground truth $G$, the intersection over union (IoU) is another popular metric for point cloud segmentation, which is defined as:

$$\mathrm{IoU} = \frac{|P \cap G|}{|P \cup G|}. \tag{6}$$
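For point clouds, the prediction and the ground truth can be represented as sets of point indices, which makes IoU a one-line set computation. The sketch below follows that assumption; the function name and the convention for two empty sets are hypothetical choices.

```python
def iou(pred_idx, gt_idx):
    """Intersection over union of predicted and ground-truth anomaly points,
    each given as an iterable of point indices."""
    pred, gt = set(pred_idx), set(gt_idx)
    union = pred | gt
    if not union:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return len(pred & gt) / len(union)

# Hypothetical example: points 3 and 4 are shared, the union has 5 points.
score = iou([1, 2, 3, 4], [3, 4, 5])
```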

Tab.7 presents an overview of the aforementioned evaluation metrics along with their definitions.

5 Conclusions and outlook

5.1 Conclusions

In conclusion, the field of 3D machine vision-based anomaly detection is rapidly evolving and has become a crucial area in the realm of quality control. However, the literature still lacks a survey that both comprehensively covers the methodologies for industrial and manufacturing scenarios and discusses the current challenges and potential research directions. In this paper, we have classified existing anomaly detection methods into three categories: supervised, unsupervised training-based, and untrained methods. For supervised methods, we have reviewed both classical machine learning-based and deep learning-based approaches. For unsupervised methods, we have categorized them into feature-embedding and reconstruction-based techniques. Additionally, we have explored local geometry-based and global methods for untrained anomaly detection. For these methods, we have summarized their strengths, limitations, and applications. Moreover, we have provided a detailed analysis of existing public data sets, including their descriptions and potential research questions. Finally, we have discussed the common evaluation metrics, which serve as a basis for evaluating anomaly detection methods.

5.2 Outlook

Based on the systematic review of 3D machine vision-based methods, from our perspective, there are still some directions worthy of further research.

High-precision 3D anomaly detection. High-precision 3D anomaly detection is a critical area with ongoing challenges. Existing methods can accurately locate anomalies but struggle to delineate their precise boundaries. The primary challenge is that feature-based anomaly detection, due to resolution reduction, blurs the boundaries between anomalous and normal areas, preventing accurate boundary detection. Therefore, developing methods to detect anomaly boundaries precisely and establishing corresponding evaluation metrics are necessary, yet remain an open research question.

Large language model-aided anomaly detection. Harnessing large language models for anomaly detection presents a potential avenue, capitalizing on the abundant textual information from manufacturing processes and standards, which is typically neglected by conventional anomaly detection techniques. However, the prevailing challenge is the lack of 3D foundation models, as current models predominantly process 2D images and text. Therefore, how to develop a 3D foundation model or transfer a pretrained 2D model to 3D anomaly detection is an important research problem.

Anomaly detection based on multi-modality data. In certain manufacturing scenarios, the reliance on a single modality, such as images or 3D point cloud data, is insufficient for the effective detection of anomalies. Under these circumstances, complementary information from both modalities is often required. Therefore, exploring how to efficiently integrate multimodal information for anomaly detection emerges as a crucial area of research. Data fusion can be categorized as data-level, feature-level, and decision-level fusion. Current research has predominantly concentrated on feature-level fusion, indicating a substantial scope for investigation into data-level and decision-level fusion methods.

References

[1]	Bahreini F, Hammad A, (2021). Point cloud semantic segmentation of concrete surface defects using dynamic graph CNN. In: Proceedings of the International Symposium on Automation and Robotics in Construction, 38: 379–386, IAARC Publications

[2]	Bello S A, Yu S, Wang C, Adam J M, Li J, (2020). Deep learning on 3D point clouds. Remote Sensing, 12( 11): 1729

[3]	Bergmann P, Jin X, Sattlegger D, Steger C, (2022). The MVTec 3D-AD dataset for unsupervised 3D anomaly detection and localization. In: Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, 202–213

[4]	Bergmann P, Sattlegger D, (2023). Anomaly detection in 3D point clouds using deep geometric descriptors. In: 2023 IEEE/CVF Winter Conference on Applications of Computer Vision, 2612–2622

[5]	Bolourian N, Nasrollahi M, Bahreini F, Hammad A, (2023). Point cloud–based concrete surface defect semantic segmentation. Journal of Computing in Civil Engineering, 37( 2): 04022056

[6]	Cao X, Tao C, Du J, (2024a). 3D-CSAD: Untrained 3D anomaly detection for complex manufacturing surfaces. arXiv Preprint arXiv:2404.07748

[7]	Cao Y, Xu X, Shen W, (2024b). Complementary pseudo multimodal feature for point cloud anomaly detection. Pattern Recognition, 156: 110761

[8]	Cao Y, Xu X, Zhang J, Cheng Y, Huang X, Pang G, Shen W, (2024c). A survey on visual anomaly detection: Challenge, approach, and prospect. arXiv Preprint arXiv:2401.16402

[9]	Chang A X, Funkhouser T, Guibas L, Hanrahan P, Huang Q, Li Z, Savarese S, Savva M, Song S, Su H, Xiao J, Yi L, Yu F, (2015). ShapeNet: An information-rich 3D model repository. arXiv Preprint arXiv:1512.03012

[10]	Charles R Q, Su H, Kaichun M, Guibas L J, (2017). PointNet: Deep learning on point sets for 3D classification and segmentation. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition, 77–85

[11]	Chen L, Yao X, Xu P, Moon S K, Bi G, (2021). Rapid surface defect identification for additive manufacturing with in-situ point cloud processing and machine learning. Virtual and Physical Prototyping, 16( 1): 50–67

[12]	Chen R, Xie G, Liu J, Wang J, Luo Z, Wang J, Zheng F, (2023). EasyNet: An easy network for 3D industrial anomaly detection. In: Proceedings of the 31st ACM International Conference on Multimedia, 7038–7046

[13]	Cheng L, Tsung F, Wang A, (2017). A statistical transfer learning perspective for modeling shape deviations in additive manufacturing. IEEE Robotics and Automation Letters, 2( 4): 1988–1993

[14]	Cheng L, Wang A, Tsung F, (2018). A prediction and compensation scheme for in-plane shape deviation of additive manufacturing with information on process parameters. IISE Transactions, 50( 5): 394–406

[15]	Chu Y M, Liu C, Hsieh T I, Chen H T, Liu T L, (2023). Shape-guided dual-memory learning for 3D anomaly detection. In: Proceedings of the 40th International Conference on Machine Learning

[16]	Cohen J, Ni J, (2022). Semi-supervised learning for anomaly classification using partially labeled subsets. Journal of Manufacturing Science and Engineering, 144( 6): 061008

[17]	Decker N, Wang Y, Huang Q, (2020). Efficiently registering scan point clouds of 3D printed parts for shape accuracy assessment and modeling. Journal of Manufacturing Systems, 56: 587–597

[18]	Dong Q, Wang S, Chen X, Jiang W, Li R, Gu X, (2023). Pavement crack detection based on point cloud data and data fusion. Philosophical Transactions of the Royal Society of London. Series A, Mathematical and Physical Sciences, 381( 2254): 20220165

[19]	Du J, Yan H, Chang T S, Shi J, (2022). A tensor voting-based surface anomaly classification approach by using 3D point cloud data. Journal of Manufacturing Science and Engineering, 144( 5): 051005

[20]	Gao Y, Cao Z, Qin Y, Ge X, Lian L, Bai J, Yu H, (2023). Railway fastener anomaly detection via multi-sensor fusion and self-driven loss reweighting. IEEE Sensors Journal, 24( 2): 1812–1825

[21]	Guo Y, Sohel F, Bennamoun M, Lu M, Wan J, (2013). Rotational projection statistics for 3D local surface description and object recognition. International Journal of Computer Vision, 105( 1): 63–86

[22]	Haurum J, Allahham M, Lynge M, Henriksen K, Nikolov I, Moeslund T, (2021). Sewer defect classification using synthetic point clouds. In: Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, 891–900

[23]	He K, Zhang X, Ren S, Sun J, (2016). Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition, 770–778

[24]	He Y, Ma W, Li Y, Hao C, Wang Y, Wang Y, (2023). An octree-based two-step method of surface defects detection for remanufacture. International Journal of Precision Engineering and Manufacturing-Green Technology, 10( 2): 311–326

[25]	Hitchcox T, Zhao Y F, (2018). Random walks for unorganized point cloud segmentation with application to aerospace repair. Procedia Manufacturing, 26: 1483–1491

[26]	Horwitz E, Hoshen Y, (2023). Back to the feature: Classical 3D features are (almost) all you need for 3D anomaly detection. In: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2968–2977

[27]	Hu Q, Yang B, Xie L, Rosa S, Guo Y, Wang Z, Trigoni N, Markham A, (2020). RandLA-Net: Efficient semantic segmentation of large-scale point clouds. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 11105–11114

[28]	Hu Q, Yang B, Xie L, Rosa S, Guo Y, Wang Z, Trigoni N, Markham A, (2021). Learning semantic segmentation of large-scale point clouds with random sampling. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44( 11): 8338–8354

[29]	Huo L, Liu Y, Yang Y, Zhuang Z, Sun M, (2023). Review: Research on product surface quality inspection technology based on 3D point cloud. Advances in Mechanical Engineering, 15( 3): 16878132231159523

[30]	Hutton K T, (2023). Anomaly Detection on Partial Point Clouds for the Purpose of Identifying Damage on the Exterior of Spacecrafts. Master’s thesis, The University of Western Ontario, Canada

[31]	Jovančević I, Pham H H, Orteu J J, Gilblas R, Harvent J, Maurice X, Brèthes L, (2017). 3D point cloud analysis for detection and characterization of defects on airplane exterior surface. Journal of Nondestructive Evaluation, 36: 74

[32]	Kaji F, Nguyen-Huu H, Budhwani A, Narayanan J A, Zimny M, Toyserkani E, (2022). A deep-learning-based in-situ surface anomaly detection methodology for laser directed energy deposition via powder feeding. Journal of Manufacturing Processes, 81: 624–637

[33]	Koenderink J J, Van Doorn A J, (1992). Surface shape and curvature scales. Image and Vision Computing, 10( 8): 557–564

[34]	Lee E T, Fan Z, Sencer B, (2023). A new approach to detect surface defects from 3D point cloud data with surface normal Gabor filter. Journal of Manufacturing Processes, 92: 196–205

[35]	Li R, Jin M, Paquit V C, (2021). Geometrical defect detection for additive manufacturing with machine learning models. Materials & Design, 206: 109726

[36]	Li W, Xu X, (2023). Towards scalable 3D anomaly detection and localization: A benchmark via 3D anomaly synthesis and a self-supervised learning network. arXiv Preprint arXiv:2311.14897

[37]	Li Y, Zhang R, Li H, Shao X, (2020). Dynamic attention graph convolution neural network of point cloud segmentation for defect detection. In: 2020 IEEE International Conference on Artificial Intelligence and Information Systems, 18–23

[38]	Lin C H, Chen J Y, Su P L, Chen C H, (2014). Eigen-feature analysis of weighted covariance matrices for LiDAR point cloud classification. ISPRS Journal of Photogrammetry and Remote Sensing, 94: 70–79

[39]	Liu J, Xie G, Chen R, Li X, Wang J, Liu Y, Wang C, Zheng F, (2024). Real3d-ad: A dataset of point cloud anomaly detection. Advances in Neural Information Processing Systems, 36: 30402–30415

[40]	Madrigal C, Branch J, Restrepo A, Mery D, (2017). A method for automatic surface inspection using a model-based 3D descriptor. Sensors, 17( 10): 2262

[41]	Miao Y, Fu R, Wu H, Hao M, Li G, Hao J, Zhou D, (2022). Pipeline of turbine blade defect detection based on local geometric pattern analysis. Engineering Failure Analysis, 133: 105965

[42]	Pang Y, Wang W, Tay F E H, Liu W, Tian Y, Yuan L, (2022). Masked autoencoders for point cloud self-supervised learning. In: Avidan S, Brostow G, Cissé M, Farinella G M, Hassner T, eds. Computer Vision – ECCV 2022, 604–621. Springer Nature Switzerland

[43]	Qi C R, Yi L, Su H, Guibas L J, (2017). PointNet++: Deep hierarchical feature learning on point sets in a metric space. Advances in Neural Information Processing Systems, 30: 5105–5114

[44]	Rani A, Ortiz-Arroyo D, Durdevic P, (2024). Advancements in point cloud-based 3d defect detection and classification for industrial systems: A comprehensive survey. arXiv Preprint arXiv:2402.12923

[45]	Roth K, Pemula L, Zepeda J, Scholkopf B, Brox T, Gehler P, (2022). Towards total recall in industrial anomaly detection. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 14298–14308

[46]	Rudolph M, Wehrbein T, Rosenhahn B, Wandt B, (2023). Asymmetric student-teacher networks for industrial anomaly detection. In: 2023 IEEE/CVF Winter Conference on Applications of Computer Vision, 2591–2601

[47]	Rusu R B, Blodow N, Beetz M, (2009). Fast point feature histograms (FPFH) for 3D registration. In: 2009 IEEE International Conference on Robotics and Automation, 3212–3217

[48]	Rusu R B, Blodow N, Marton Z C, Beetz M, (2008). Aligning point cloud views using persistent feature histograms. In: 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, 3384–3391

[49]	Shao Z, Hao K, Wei B, Tang X S, (2021). Solder joint defect detection based on depth image CNN for 3D shape classification. In: 2021 CAA Symposium on Fault Detection, Supervision, and Safety for Technical Processes (SAFEPROCESS), 1–6

[50]	Sun Y, Abidi M A, (2001). Surface matching by 3D point’s fingerprint. In: Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, 2, 263–269

[51]	Tao C, Du J, (2023). PointSGRADE: Sparse learning with graph representation for anomaly detection by using unstructured 3D point cloud data. IISE Transactions, 57( 2): 131–144

[52]	Tao C, Du J, Chang T S, (2023). Anomaly detection for fabricated artifact by using unstructured 3D point cloud data. IISE Transactions, 55( 11): 1174–1186

[53]	Vaswani A, (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30: 6000–6010

[54]	von Enzberg S, Al-Hamadi A, (2016). A multiresolution approach to model-based 3-D surface quality inspection. IEEE Transactions on Industrial Informatics, 12( 4): 1498–1507

[55]	Wang J, Song K, Zhang D, Niu M, Yan Y, (2022a). Collaborative learning attention network based on RGB image and depth image for surface defect inspection of no-service rail. IEEE/ASME Transactions on Mechatronics, 27( 6): 4874–4884

[56]	Wang J, Zhang S, Xiao Y, Song R, (2021). A review on graph neural network methods in financial applications. arXiv Preprint arXiv:2111.15367

[57]	Wang L, Huang Y, Hou Y, Zhang S, Shan J, (2019a). Graph attention convolution for point cloud semantic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 10296–10305

[58]	Wang Q, Wang X, He Q, Huang J, Huang H, Wang P, Yu T, Zhang M, (2024). 3D tensor-based point cloud and image fusion for robust detection and measurement of rail surface defects. Automation in Construction, 161: 105342

[59]	Wang Y, Peng J, Zhang J, Yi R, Wang Y, Wang C, (2023a). Multimodal industrial anomaly detection via hybrid fusion. In: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 8032–8041

[60]	Wang Y, Sun W, Jin J, Kong Z J, Yue X, (2023b). MVGCN: Multi-view graph convolutional neural network for surface defect identification using three-dimensional point cloud. Journal of Manufacturing Science and Engineering, 145( 3): 031004

[61]	Wang Y, Sun Y, Liu Z, Sarma S E, Bronstein M M, Solomon J M, (2019b). Dynamic graph cnn for learning on point clouds. ACM Transactions on Graphics, 38( 5): 1–12

[62]	Wang Z, Zhang Y, Lv T, Luo L, (2022b). GTAINet: Graph neural network-based two-stage anomaly identification for locking wire point clouds using hierarchical attentive edge convolution. International Journal of Applied Earth Observation and Geoinformation, 115: 103106

[63]	Wei S, Chengyao S, Shilin C, Kailin Z, (2021). A microhardness indentation point cloud segmentation method based on voxel cloud connectivity segmentation. Measurement: Sensors, 18: 100124

[64]	Xu Z, Lin Y, Chen D, Yuan M, Zhu Y, Ai Z, Yuan Y, (2024). Wood broken defect detection with laser profilometer based on Bi-LSTM network. Expert Systems with Applications, 242: 122789

[65]	Yacob F, Semere D, Nordgren E, (2019). Anomaly detection in Skin Model Shapes using machine learning classifiers. International Journal of Advanced Manufacturing Technology, 105( 9): 3677–3689

[66]	Yang J, Zhou W, Wu R, Fang M, (2023). CSANet: Contour and semantic feature alignment fusion network for rail surface defect detection. IEEE Signal Processing Letters, 30: 972–976

[67]	Zavrtanik V, Kristan M, Skočaj D, (2024). Cheating depth: enhancing 3D surface anomaly detection via depth simulation. In: 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2153–2161

[68]	Zhang D, Zou Q, Lin H, Xu X, He L, Gui R, Li Q, (2018). Automatic pavement defect detection using 3D laser profiling technology. Automation in Construction, 96: 350–365

[69]	Zhang S, Tong H, Xu J, Maciejewski R, (2019). Graph convolutional networks: A comprehensive review. Computational Social Networks, 6: 11

[70]	Zhao H, Jiang L, Jia J, Torr P H, Koltun V, (2021). Point transformer. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, 16259–16268

[71]	Zhao X, Del Castillo E, (2021a). An intrinsic geometrical approach for statistical process control of surface and manifold data. Technometrics, 63( 3): 295–312

[72]	Zhao X, Del Castillo E, (2021b). Registration-free localization of defects in 3-D parts from mesh metrology data using functional maps. arXiv Preprint arXiv:2112.14870

[73]	Zhao X, Li Q, Xiao M, He Z, (2023). Defect detection of 3D printing surface based on geometric local domain features. International Journal of Advanced Manufacturing Technology, 125( 1–2): 183–194

[74]	Zhou Y, Ji A, Zhang L, (2022). Sewer defect detection from 3D point clouds using a transformer-based deep learning model. Automation in Construction, 136: 104163