fastp 1.0: An ultra-fast all-round tool for FASTQ data quality control and preprocessing

Shifu Chen

iMeta, 2025, Vol. 4, Issue 5: e70078

DOI: 10.1002/imt2.70078
RESEARCH ARTICLE

Abstract

Fastp has been recognized as one of the most popular FASTQ preprocessors because of its rich functionality and high performance. This paper formally presents fastp 1.0, its first major update, including its new features and the implementation principles behind them. Two other popular FASTQ preprocessors, Trimmomatic and Cutadapt, are compared with fastp to demonstrate its advantages in simplicity, efficiency, and versatility. Some modules, such as the batch processing scripts, are introduced to show how fastp can be applied to process FASTQ files efficiently. Additionally, some software design principles are highlighted to show how to develop successful bioinformatics software.

Keywords

adapter / fastp / FASTQ / filtering / preprocessing / quality control

Cite this article

Shifu Chen. fastp 1.0: An ultra-fast all-round tool for FASTQ data quality control and preprocessing. iMeta, 2025, 4(5): e70078. DOI: 10.1002/imt2.70078

RIGHTS & PERMISSIONS

2025 The Author(s). iMeta published by John Wiley & Sons Australia, Ltd on behalf of iMeta Science.
