MEGAnnotator2: a pipeline for the assembly and annotation of microbial genomes

Gabriele Andrea Lugli, Federico Fontana, Chiara Tarracchini, Christian Milani, Leonardo Mancabelli, Francesca Turroni, Marco Ventura

Microbiome Research Reports, 2023, Vol. 2, Issue 2: 15. DOI: 10.20517/mrr.2022.21
Original Article

Abstract

Reconstructing microbial genome sequences with bioinformatic pipelines and functionally annotating their gene repertoires are fundamental steps toward unveiling the biological mechanisms of microorganisms, such as metabolism, virulence factors, and antimicrobial resistance. Here, we describe the development of MEGAnnotator2, a pipeline able to manage data from all next-generation sequencing methodologies that produce short- and long-read DNA sequences. Starting from raw sequencing data, the updated pipeline manages multiple analyses leading to the assembly of high-quality genome sequences and the functional classification of their genetic repertoire, and provides the user with a report of features and statistics related to the microbial genome. The updated pipeline is fully automated from installation to delivery of the output and thus requires minimal bioinformatics knowledge to run.

Keywords

Genomics / bioinformatics / next-generation sequencing

Cite this article

Gabriele Andrea Lugli, Federico Fontana, Chiara Tarracchini, Christian Milani, Leonardo Mancabelli, Francesca Turroni, Marco Ventura. MEGAnnotator2: a pipeline for the assembly and annotation of microbial genomes. Microbiome Research Reports, 2023, 2(2): 15. DOI: 10.20517/mrr.2022.21
