1 Introduction
2 Proposed deep convolutional tree-inspired network
2.1 Learning the neural backbone of the seed nodes
Tab.1 Layer details of the backbone CNN

| Layer | Settings | Output shape |
|---|---|---|
| Input | ‒ | N×R×R×1 |
| 2D convolution layer | Kernel size: 1×1, channels: 16, stride: 1 | N×R×R×16 |
| Batch normalization | Feature number: 16, eps: 10⁻⁵ | N×R×R×16 |
| ReLU activation | ‒ | N×R×R×16 |
| 2D max pooling layer | Kernel size: 2×2 | N×(R/2)×(R/2)×16 |
| 2D convolution layer | Kernel size: 3×3, channels: 32, stride: 1 | N×(R/2)×(R/2)×32 |
| Batch normalization | Feature number: 32, eps: 10⁻⁵ | N×(R/2)×(R/2)×32 |
| ReLU activation | ‒ | N×(R/2)×(R/2)×32 |
| 2D max pooling layer | Kernel size: 2×2 | N×(R/4)×(R/4)×32 |
| 2D convolution layer | Kernel size: 3×3, channels: 64, stride: 1 | N×(R/4)×(R/4)×64 |
| Batch normalization | Feature number: 64, eps: 10⁻⁵ | N×(R/4)×(R/4)×64 |
| ReLU activation | ‒ | N×(R/4)×(R/4)×64 |
| Adaptive average pooling layer | Output size: 1×1 | N×1×1×64 |
| Fully-connected layer | In features: 64, out features: K, no bias | N×K |
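The shape column of Tab.1 can be verified with simple shape arithmetic. The following is a minimal sketch (plain Python; the function name is illustrative), assuming the 3×3 convolutions are padded so the spatial size is preserved, as the table's shapes imply, and that R is divisible by 4:

```python
def backbone_shapes(N, R, K):
    """Trace the output shape after each stage of the backbone CNN in Tab.1.

    Assumes padded 3x3 convolutions (spatial size preserved) and R divisible by 4.
    """
    shapes = [(N, R, R, 1)]              # input TFD matrix
    h = R
    for ch in (16, 32, 64):
        shapes.append((N, h, h, ch))     # conv -> BN -> ReLU keep H x W
        if ch != 64:                     # two 2x2 max-pooling stages halve H and W
            h //= 2
            shapes.append((N, h, h, ch))
    shapes.append((N, 1, 1, 64))         # adaptive average pooling to 1x1
    shapes.append((N, K))                # fully-connected layer, no bias
    return shapes

# Example: batch of 32 TFD images of size 64x64, K = 7 bearing categories
print(backbone_shapes(32, 64, 7)[-1])    # (32, 7)
```

The trace reproduces every row of the Output shape column, ending at the N×K logits fed to the tree-structured decision layer.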
2.2 Interpreting the tree-structured decision layer
2.3 Fine-tuning with decision loss
3 Proposed DCTN-based fault diagnosis approach of bearings
3.1 Aeronautical bearing test rig
Tab.2 Fault set details of aeronautical bearings

| Serial number | Fault location | Fault size/μm | Superclass | Subclass |
|---|---|---|---|---|
| N-1 | No defect | ‒ | N | 1 |
| I-2 | On the inner ring | 450 | I | 2 |
| I-3 | On the inner ring | 250 | I | 3 |
| I-4 | On the inner ring | 150 | I | 4 |
| R-5 | On a roller | 450 | R | 5 |
| R-6 | On a roller | 250 | R | 6 |
| R-7 | On a roller | 150 | R | 7 |
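Tab.2 defines the two-level label hierarchy that the tree-structured decision layer exploits: each subclass (1–7) belongs to exactly one superclass (N, I, or R). A minimal sketch of that mapping (the dictionary and function names are illustrative, not the paper's code):

```python
# Hierarchical fault labels from Tab.2: (location, size in um, superclass, subclass)
FAULT_SET = {
    "N-1": ("No defect",         None, "N", 1),
    "I-2": ("On the inner ring",  450, "I", 2),
    "I-3": ("On the inner ring",  250, "I", 3),
    "I-4": ("On the inner ring",  150, "I", 4),
    "R-5": ("On a roller",        450, "R", 5),
    "R-6": ("On a roller",        250, "R", 6),
    "R-7": ("On a roller",        150, "R", 7),
}

def superclass_of(subclass: int) -> str:
    """Map a subclass label (leaf of the tree) to its superclass node."""
    for _loc, _size, sup, sub in FAULT_SET.values():
        if sub == subclass:
            return sup
    raise ValueError(f"unknown subclass {subclass}")

print(superclass_of(5))  # R
```

Grouping the three fault sizes per location under one superclass is what makes the cross-severity tasks in Section 4.2 possible: an unseen severity still falls under a seen superclass.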
Tab.3 Operating condition details of aeronautical bearings

| Number | Load/N | Speed/(r·min⁻¹) |
|---|---|---|
| C1 | 0 | 6×10³ |
| C2 | 1000 | 6×10³ |
| C3 | 1400 | 6×10³ |
| C4 | 1800 | 6×10³ |
| C5 | 0 | 12×10³ |
| C6 | 1000 | 12×10³ |
| C7 | 1400 | 12×10³ |
| C8 | 1800 | 12×10³ |
| C9 | 0 | 18×10³ |
| C10 | 1000 | 18×10³ |
| C11 | 1400 | 18×10³ |
| C12 | 1800 | 18×10³ |
| C13 | 0 | 24×10³ |
| C14 | 1000 | 24×10³ |
| C15 | 1400 | 24×10³ |
| C16 | 0 | 3×10⁴ |
| C17 | 1000 | 3×10⁴ |
3.2 Time‒frequency analysis based on CWT
3.3 DCTN-based hierarchical multiclass fault diagnosis network
3.4 DCTN-based cross-severity fault diagnosis network
4 Case studies
4.1 Case one: multiclass fault diagnosis of bearings
Tab.4 Fault diagnosis accuracy/% of bearings with different training data ratios (columns give the training data ratio)

| Condition | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | Mean |
|---|---|---|---|---|---|---|---|---|---|---|
| C1 | 96.19 | 93.21 | 99.39 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 98.75 |
| C2 | 96.67 | 93.93 | 96.53 | 100.0 | 99.43 | 98.57 | 100.0 | 100.0 | 100.0 | 98.35 |
| C3 | 96.83 | 88.57 | 96.94 | 96.19 | 99.43 | 99.29 | 100.0 | 100.0 | 100.0 | 97.47 |
| C4 | 97.46 | 89.81 | 95.71 | 92.38 | 99.43 | 98.93 | 100.0 | 100.0 | 100.0 | 97.08 |
| C5 | 99.03 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 99.89 |
| C6 | 97.94 | 96.43 | 94.08 | 100.0 | 99.71 | 100.0 | 100.0 | 100.0 | 100.0 | 98.68 |
| C7 | 93.65 | 95.00 | 96.12 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 98.31 |
| C8 | 93.81 | 95.18 | 95.31 | 99.05 | 98.29 | 100.0 | 100.0 | 100.0 | 100.0 | 97.96 |
| C9 | 97.78 | 99.82 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 99.73 |
| C10 | 97.14 | 97.68 | 99.39 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 99.36 |
| C11 | 93.81 | 98.93 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 99.19 |
| C12 | 93.81 | 98.93 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 99.19 |
| C13 | 99.05 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 99.89 |
| C14 | 93.81 | 95.54 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 98.82 |
| C15 | 94.44 | 98.21 | 100.0 | 100.0 | 99.43 | 100.0 | 100.0 | 100.0 | 100.0 | 99.12 |
| C16 | 93.81 | 99.82 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 99.29 |
| C17 | 97.46 | 98.21 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 99.52 |
| Mean | 96.04 | 96.43 | 98.44 | 99.27 | 99.75 | 99.81 | 100.0 | 100.0 | 100.0 | ‒ |
4.2 Case two: cross-severity fault diagnosis of bearings
Tab.5 Set of cross-severity fault diagnosis tasks

| Task | Categories of training bearings | Categories of test bearings |
|---|---|---|
| 1 | N-1, I-3, I-4, R-6, R-7 | I-2, R-5 |
| 2 | N-1, I-2, I-4, R-5, R-7 | I-3, R-6 |
| 3 | N-1, I-2, I-3, R-5, R-6 | I-4, R-7 |
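Each task in Tab.5 is a leave-one-severity-out split: one fault size per location is withheld for testing while all other categories are used for training. A minimal sketch of the splits (names are illustrative):

```python
# Leave-one-severity-out splits from Tab.5. For each task, one fault size per
# location (inner ring and roller) is held out entirely for testing.
ALL_BEARINGS = ["N-1", "I-2", "I-3", "I-4", "R-5", "R-6", "R-7"]
TEST_SETS = {1: ["I-2", "R-5"], 2: ["I-3", "R-6"], 3: ["I-4", "R-7"]}

def split_task(task: int):
    """Return (training categories, test categories) for a Tab.5 task."""
    test = TEST_SETS[task]
    train = [b for b in ALL_BEARINGS if b not in test]
    return train, test

print(split_task(1))
# (['N-1', 'I-3', 'I-4', 'R-6', 'R-7'], ['I-2', 'R-5'])
```

Because the test categories never appear in training, a model can only succeed by mapping an unseen severity to the correct superclass, which is exactly what the cross-severity case study probes.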
Tab.6 Fault diagnosis accuracies/% of different approaches in cross-severity fault diagnosis tasks

| Approach | I-2 | R-5 | Task 1 | I-3 | R-6 | Task 2 | I-4 | R-7 | Task 3 | Mean |
|---|---|---|---|---|---|---|---|---|---|---|
| TFD-DCTN | 86.00 | 100.0 | 93.00 | 99.00 | 99.00 | 99.00 | 96.00 | 83.00 | 89.50 | 93.83 |
| TFD-CNN | 2.00 | 98.00 | 50.00 | 2.00 | 11.00 | 6.50 | 9.00 | 1.00 | 5.00 | 20.50 |
| TFD-LBCNN | 8.00 | 98.00 | 53.00 | 5.00 | 97.00 | 51.00 | 0.00 | 100.0 | 50.00 | 51.33 |
| TFD-PCA-SVM | 97.00 | 0.00 | 48.50 | 0.00 | 100.0 | 50.00 | 36.00 | 0.00 | 18.00 | 38.83 |
| TFD-PCA-KNN | 19.00 | 93.00 | 56.00 | 0.00 | 96.00 | 48.00 | 0.00 | 0.00 | 0.00 | 34.67 |
| TFD-PCA-ELM | 97.00 | 0.00 | 48.50 | 20.00 | 52.00 | 36.00 | 0.00 | 1.00 | 0.50 | 28.33 |
| Time-features-SVM | 100.0 | 37.00 | 68.50 | 77.00 | 0.00 | 38.50 | 58.00 | 38.00 | 48.00 | 51.67 |
| Time-features-KNN | 92.00 | 22.00 | 57.00 | 100.0 | 0.00 | 50.00 | 92.00 | 22.00 | 57.00 | 54.67 |
| Time-features-ELM | 98.00 | 18.00 | 58.00 | 40.00 | 0.00 | 20.00 | 69.00 | 31.00 | 50.00 | 42.67 |
| Raw-data-WDCNN | 12.00 | 100.0 | 56.00 | 24.00 | 96.00 | 60.00 | 0.00 | 100.0 | 50.00 | 55.33 |
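The aggregate columns of Tab.6 follow directly from the per-category accuracies: each task score is the mean of its two held-out categories, and the final column is the mean over the three tasks. Checking the TFD-DCTN row:

```python
# TFD-DCTN per-category accuracies from Tab.6
i2, r5 = 86.00, 100.0
i3, r6 = 99.00, 99.00
i4, r7 = 96.00, 83.00

task1 = (i2 + r5) / 2   # 93.00
task2 = (i3 + r6) / 2   # 99.00
task3 = (i4 + r7) / 2   # 89.50
mean = (task1 + task2 + task3) / 3

print(round(mean, 2))   # 93.83
```

The same arithmetic reproduces the Task and Mean columns for every row, so the large gap between TFD-DCTN (93.83%) and the next-best approach (55.33%) rests entirely on the per-category numbers.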
5 Conclusions
6 Nomenclature
Abbreviations

| Abbreviation | Meaning |
|---|---|
| CNN | Convolutional neural network |
| CWT | Continuous wavelet transform |
| DCTN | Deep convolutional tree-inspired network |
| DL | Deep learning |
| DNN | Deep neural network |
| ELM | Extreme learning machine |
| KNN | k-nearest neighbor |
| LBCNN | Local binary convolutional neural network |
| PCA | Principal component analysis |
| SVM | Support vector machine |
| TFD | Time‒frequency distribution |
| WDCNN | Wide deep convolutional neural network |

Variables

| Symbol | Description |
|---|---|
| a | Stretch factor |
| b | Shift factor |
| | CWT time‒frequency function of the signal s(t) |
| d_j (j = 1, 2, …, K) | Distance between the feature and each classification hyperplane |
| H(p, q) | Cross-entropy loss function |
| | Loss function of the tree-structured decision layer |
| K | Number of sample categories |
| | Overall prediction |
| L | Feature dimension of the fully-connected layer |
| N | Number of samples |
| p(·) | Probability distribution of the predicted output |
| | True labels of the pre-trained network |
| | True labels of the tree-structured decision layer |
| | Path probabilities of the tree-structured decision layer |
| P(subclass) | Probability of correct prediction for seed nodes |
| P(superclass) | Probability of correct prediction for leaf nodes |
| q(·) | Probability distribution of the actual output |
| | Predicted probabilities of the pre-trained network |
| | Predicted probabilities of the tree-structured decision layer |
| R | Dimension of the TFD matrix |
| s(t) | Signal at time t |
| sw_j | Weight vector of the jth leaf node |
| w_j | The jth weight vector in the weight matrix W of the fully-connected layer |
| | Weight vector of the jth tree-structured decision layer after fine-tuning |
| W | Weight matrix |
| x | Input features of the Softmax classifier in the cross-entropy loss |
| x | Input feature vector of the tree-structured decision layer |
| | Prediction probabilities from the Softmax classifier |
| | Predicted probability for the jth category |
| | Prediction scope corresponding to the K categories |
| | Weight adjusting between the pre-trained decision and the tree-structured decision |
| | Mother wavelet |