MetMiner: A user-friendly pipeline for large-scale plant metabolomics data analysis

Xiao Wang, Shuang Liang, Wenqi Yang, Ke Yu, Fei Liang, Bing Zhao, Xiang Zhu, Chao Zhou, Luis A. J. Mur, Jeremy A. Roberts, Junli Zhang, Xuebin Zhang

Journal of Integrative Plant Biology, 2024, Vol. 66, Issue 11: 2329–2345. DOI: 10.1111/jipb.13774
New Technology

Abstract

The utilization of metabolomics approaches to explore the metabolic mechanisms underlying plant fitness and adaptation to dynamic environments is growing, highlighting the need for an efficient and user-friendly toolkit tailored for analyzing the extensive datasets generated by metabolomics studies. Current protocols for metabolome data analysis often struggle with handling large-scale datasets or require programming skills. To address this, we present MetMiner (https://github.com/ShawnWx2019/MetMiner), a user-friendly, full-functionality pipeline specifically designed for plant metabolomics data analysis. Built on R Shiny, MetMiner can be deployed on servers to utilize additional computational resources for processing large-scale datasets. MetMiner ensures transparency, traceability, and reproducibility throughout the analytical process. Its intuitive interface provides robust data interaction and graphical capabilities, enabling users without prior programming skills to engage deeply in data analysis. Additionally, we constructed and integrated a plant-specific mass spectrometry database into the MetMiner pipeline to optimize metabolite annotation. We have also developed MDAtoolkits, which include a complete set of tools for statistical analysis, metabolite classification, and enrichment analysis, to facilitate the mining of biological meaning from the datasets. Moreover, we propose an iterative weighted gene co-expression network analysis strategy for efficient biomarker metabolite screening in large-scale metabolomics data mining. In two case studies, we validated MetMiner’s efficiency in data mining and robustness in metabolite annotation. Together, the MetMiner pipeline represents a promising solution for plant metabolomics analysis, providing a valuable tool for the scientific community to use with ease.

Keywords

data mining / iterative WGCNA / metabolomics / pipeline / shinyapp

Cite this article

Xiao Wang, Shuang Liang, Wenqi Yang, Ke Yu, Fei Liang, Bing Zhao, Xiang Zhu, Chao Zhou, Luis A. J. Mur, Jeremy A. Roberts, Junli Zhang, Xuebin Zhang. MetMiner: A user-friendly pipeline for large-scale plant metabolomics data analysis. Journal of Integrative Plant Biology, 2024, 66(11): 2329–2345. DOI: 10.1111/jipb.13774

RIGHTS & PERMISSIONS

© 2024 The Author(s). Journal of Integrative Plant Biology published by John Wiley & Sons Australia, Ltd on behalf of Institute of Botany, Chinese Academy of Sciences.
