LIBRA: an adaptative integrative tool for paired single-cell multi-omics data

Xabier Martinez-de-Morentin, Sumeer A. Khan, Robert Lehmann, Sisi Qu, Alberto Maillo, Narsis A. Kiani, Felipe Prosper, Jesper Tegner, David Gomez-Cabrero

Quant. Biol., 2023, Vol. 11, Issue 3: 246–259.

DOI: 10.15302/J-QB-022-0318
RESEARCH ARTICLE

Abstract

Background: Single-cell multi-omics technologies enable a deep, system-level understanding of the biology of cells and tissues. However, an integrative, and possibly systems-based, analysis that captures the different modalities is challenging. In response, bioinformatics and machine learning methodologies are being developed for multi-omics single-cell analysis. It remains unclear whether current tools can address both modality integration and prediction across modalities without requiring extensive parameter fine-tuning.

Methods: We designed LIBRA, a neural network-based framework that learns to translate between paired multi-omics profiles so that a shared latent space is constructed. Additionally, we implemented a variant, aLIBRA, that allows automatic fine-tuning by identifying parameter combinations that optimize both the integrative and the predictive tasks. All model parameters and evaluation metrics are made available to users with minimal user iteration. Furthermore, aLIBRA allows experienced users to implement custom configurations. The LIBRA toolbox is freely available as R and Python libraries on GitHub (TranslationalBioinformaticsUnit/LIBRA).
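To make the two ideas in the Methods concrete, the following is a minimal sketch, not the LIBRA implementation. It assumes a purely linear encoder and decoder, synthetic paired data, and a small grid over two hypothetical parameters (`latent_dim`, `lr`); every function and variable name is illustrative and does not come from the LIBRA package. The bottleneck between encoder and decoder plays the role of the shared latent space, and the grid search only scores the prediction task, whereas aLIBRA is described as optimizing both integration and prediction.

```python
import itertools
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(u * v for u, v in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def mse(pred, target):
    """Mean squared error over all entries of two equally shaped matrices."""
    n = len(pred) * len(pred[0])
    return sum((p - t) ** 2 for rp, rt in zip(pred, target) for p, t in zip(rp, rt)) / n


def train_translator(x, y, latent_dim, lr, epochs=300, seed=0):
    """Fit a linear encoder (x -> latent) and decoder (latent -> y) by gradient descent."""
    rng = random.Random(seed)
    p, q = len(x[0]), len(y[0])
    enc = [[rng.uniform(-0.5, 0.5) for _ in range(latent_dim)] for _ in range(p)]
    dec = [[rng.uniform(-0.5, 0.5) for _ in range(q)] for _ in range(latent_dim)]
    scale = 2.0 / (len(x) * q)  # derivative factor of the mean squared error
    for _ in range(epochs):
        z = matmul(x, enc)  # latent representation of each cell
        err = [[yh - yt for yh, yt in zip(rh, rt)] for rh, rt in zip(matmul(z, dec), y)]
        grad_dec = matmul(transpose(z), err)
        grad_enc = matmul(transpose(x), matmul(err, transpose(dec)))
        dec = [[w - lr * scale * g for w, g in zip(rw, rg)] for rw, rg in zip(dec, grad_dec)]
        enc = [[w - lr * scale * g for w, g in zip(rw, rg)] for rw, rg in zip(enc, grad_enc)]
    return enc, dec


def make_paired_cells(n, rng, mix_a, mix_b):
    """Toy paired data: both modalities are noisy views of one hidden cell state."""
    xs, ys = [], []
    for _ in range(n):
        hidden = [[rng.gauss(0, 1), rng.gauss(0, 1)]]
        xs.append([v + rng.gauss(0, 0.1) for v in matmul(hidden, mix_a)[0]])
        ys.append([v + rng.gauss(0, 0.1) for v in matmul(hidden, mix_b)[0]])
    return xs, ys


rng = random.Random(0)
mix_a = [[rng.uniform(-1, 1) for _ in range(6)] for _ in range(2)]  # modality A: 6 features
mix_b = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]  # modality B: 4 features
x_train, y_train = make_paired_cells(60, rng, mix_a, mix_b)
x_val, y_val = make_paired_cells(20, rng, mix_a, mix_b)

# Hypothetical parameter grid; each combination is trained and scored on held-out cells.
grid = {"latent_dim": [1, 2], "lr": [0.01, 0.05]}
scores = {}
for latent_dim, lr in itertools.product(grid["latent_dim"], grid["lr"]):
    enc, dec = train_translator(x_train, y_train, latent_dim, lr)
    latent = matmul(x_val, enc)  # candidate shared latent space
    scores[(latent_dim, lr)] = mse(matmul(latent, dec), y_val)

best_params = min(scores, key=scores.get)
best_loss = scores[best_params]
baseline = mse([[0.0] * 4 for _ in y_val], y_val)  # error of always predicting zero
print(best_params, round(best_loss, 4), round(baseline, 4))
```

The selected combination is simply the one with the lowest held-out prediction error. A closer analogue of the adaptive scheme would also score the latent space with a clustering-based metric and combine the two scores.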

Results: LIBRA was evaluated on eight multi-omic single-cell datasets covering three combinations of omics. LIBRA performed at a state-of-the-art level when evaluated on its ability to increase cell-type (clustering) resolution in the integrated latent space. Furthermore, when assessing predictive power across data modalities, such as predicting chromatin accessibility from gene expression, LIBRA outperformed existing tools. As expected, adaptive parameter optimization (aLIBRA) significantly boosted the performance of predictive models learned from paired datasets.

Conclusion: LIBRA is a versatile tool that performs competitively in both “integration” and “prediction” tasks on single-cell multi-omics data. It is a robust, data-driven platform that includes an adaptive learning scheme.

Keywords

single-cell / multi-omic / autoencoder / auto-finetuning

Cite this article

Xabier Martinez-de-Morentin, Sumeer A. Khan, Robert Lehmann, Sisi Qu, Alberto Maillo, Narsis A. Kiani, Felipe Prosper, Jesper Tegner, David Gomez-Cabrero. LIBRA: an adaptative integrative tool for paired single-cell multi-omics data. Quant. Biol., 2023, 11(3): 246–259. DOI: 10.15302/J-QB-022-0318

RIGHTS & PERMISSIONS

The Author(s). Published by Higher Education Press.

Supplementary files

QB-22318-OF-GCD_suppl_1

QB-22318-OF-GCD_suppl_2
