Hook, line, and spectra: machine learning for fish species identification and body part classification using rapid evaporative ionization mass spectrometry
Jesse Wood, Bach Nguyen, Bing Xue, Mengjie Zhang, Daniel Killeen
Intelligent Marine Technology and Systems, 2025, Vol. 3, Issue 1: 16
Marine biomass composition analysis traditionally requires time-consuming processes and domain expertise. This study demonstrates that rapid evaporative ionization mass spectrometry (REIMS), combined with advanced machine learning (ML) techniques, can determine marine biomass composition accurately. Using fish species and body parts as model systems representing diverse biochemical profiles, we investigate a range of ML methods, including unsupervised pretraining strategies for transformers. Deep learning approaches consistently outperformed traditional ML across all tasks. The pretrained transformer achieved 99.62% accuracy for fish species classification, and the transformer achieved 84.06% accuracy for fish body part classification. We further examined the explainability of the best-performing models, which are predominantly black boxes, using local interpretable model-agnostic explanations (LIME) and gradient-weighted class activation mapping (Grad-CAM) to identify the features driving each classifier's decisions. REIMS analysis with ML can therefore be an accurate and potentially explainable technique for automated marine biomass composition analysis, with potential applications in quality control, product optimization, and food safety monitoring in marine-based industries.
AI applications / Explainable AI / Machine learning / Marine biomass / Mass spectrometry / Multidisciplinary AI