A Grad-CAM and capsule network hybrid method for remote sensing image scene classification

Zhan HE, Chunju ZHANG, Shu WANG, Jianwei HUANG, Xiaoyun ZHENG, Weijie JIANG, Jiachen BO, Yucheng YANG

Front. Earth Sci. ›› 2024, Vol. 18 ›› Issue (3) : 538-553. DOI: 10.1007/s11707-022-1079-x
RESEARCH ARTICLE


Abstract

Remote sensing image scene classification and remote sensing technology applications are active research topics. Although CNN-based models have reached high average accuracy, some classes are still misclassified, such as “freeway,” “sparse residential,” and “commercial_area.” These classes are characterized by typical decisive features, spatial-relation features, or a mixture of both, which limits the quality of image scene classification. To address this issue, this paper proposes a Grad-CAM and capsule network hybrid method (Grad-CAM-CapsNet) for image scene classification. Grad-CAM has the potential to recognize decisive features, and the capsule network has the potential to recognize spatial-relation features. By combining a pre-trained model, a hybrid structure, and structure adjustment, the proposed model can recognize both kinds of features. Experiments are conducted on three popular data sets of increasing classification difficulty. In the most challenging experiment, the model achieves an average accuracy of 92.67%. Specifically, accuracies of 83%, 75%, and 86% are obtained for the classes “church,” “palace,” and “commercial_area,” respectively. These results demonstrate that the hybrid structure can effectively improve performance by considering both decisive and spatial-relation features. Grad-CAM-CapsNet is therefore a promising structure for image scene classification.
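The two building blocks named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it only shows the standard Grad-CAM heatmap (channel weights obtained by global average pooling of the gradients, followed by a ReLU over the weighted sum of feature maps) and the standard capsule "squash" nonlinearity. The function names `grad_cam` and `squash`, the nested-list data layout, and the `eps` parameter are assumptions made for illustration.

```python
import math


def grad_cam(feature_maps, gradients):
    """Standard Grad-CAM heatmap for one class (illustrative sketch).

    feature_maps, gradients: lists of K channels, each an H x W nested list.
    gradients[k][i][j] is assumed to be d(class score) / d(feature_maps[k][i][j]).
    """
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    heatmap = [[0.0] * width for _ in range(height)]
    for fmap, grad in zip(feature_maps, gradients):
        # Channel weight: global average pooling of the gradient map.
        alpha = sum(sum(row) for row in grad) / (height * width)
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += alpha * fmap[i][j]
    # ReLU keeps only regions that support the class.
    return [[max(0.0, v) for v in row] for row in heatmap]


def squash(vector, eps=1e-9):
    """Capsule squash: keeps direction, maps the norm into [0, 1)."""
    sq_norm = sum(v * v for v in vector)
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm + eps)
    return [scale * v for v in vector]
```

In this sketch the heatmap highlights where the class evidence lies (a "decisive" region), while the squashed vector length behaves like a presence probability and its direction can encode pose-like, spatial-relation information. How the paper actually couples the two is not specified in the abstract.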

Keywords

image scene classification / CNN / Grad-CAM / CapsNet / DenseNet

Cite this article

Zhan HE, Chunju ZHANG, Shu WANG, Jianwei HUANG, Xiaoyun ZHENG, Weijie JIANG, Jiachen BO, Yucheng YANG. A Grad-CAM and capsule network hybrid method for remote sensing image scene classification. Front. Earth Sci., 2024, 18(3): 538–553. https://doi.org/10.1007/s11707-022-1079-x


Acknowledgments

This research was funded by the open fund of the Key Laboratory of Jianghuai Arable Land Resources Protection and Eco-restoration (Ministry of Natural Resources) (No. 2022-ARPE-KF04), and the Open Fund of Key Laboratory of Urban Land Resources Monitoring and Simulation (Ministry of Natural Resources) (No. KF-2020-05-084).

Competing interests

The authors declare that they have no competing interests.

RIGHTS & PERMISSIONS

2024 Higher Education Press
Map approval No.: GS京(2024)1973号