How error correction affects polymerase chain reaction deduplication: A survey based on unique molecular identifier datasets of short reads
Pengyao Ping , Tian Lan , Shuquan Su , Wei Liu , Jinyan Li
Quant. Biol. ›› 2025, Vol. 13 ›› Issue (3) : e99
How error correction affects polymerase chain reaction deduplication: A survey based on unique molecular identifier datasets of short reads
Next-generation sequencing data are widely utilised for various downstream applications in bioinformatics and numerous techniques have been developed for PCR-deduplication and error-correction to eliminate bias and errors introduced during the sequencing. This study first-time provides a joint overview of recent advances in PCR-deduplication and error-correction on short reads. In particular, we utilise UMI-based PCR-deduplication strategies and sequencing data to assess the performance of the solely-computational PCR-deduplication approaches and investigate how error correction affects the performance of PCR-deduplication. Our survey and comparative analysis reveal that the deduplicated reads generated by the solely-computational PCR-deduplication and error-correction methods exhibit substantial differences and divergence from the sets of reads obtained by the UMI-based deduplication methods. The existing solely-computational PCR-deduplication and error-correction tools can eliminate some errors but still leave hundreds of thousands of erroneous reads uncorrected. All the error-correction approaches raise thousands or more new sequences after correction which do not have any benefit to the PCR-deduplication process. Based on our findings, we discuss future research directions and make suggestions for improving existing computational approaches to enhance the quality of short-read sequencing data.
error correction / next generation sequencing (NGS) / PCR-deduplication / polymerase chain reaction (PCR) duplicates / unique molecular identifier (UMI)
| [1] |
|
| [2] |
|
| [3] |
|
| [4] |
|
| [5] |
|
| [6] |
|
| [7] |
|
| [8] |
|
| [9] |
|
| [10] |
|
| [11] |
|
| [12] |
|
| [13] |
|
| [14] |
|
| [15] |
|
| [16] |
|
| [17] |
|
| [18] |
|
| [19] |
|
| [20] |
|
| [21] |
|
| [22] |
|
| [23] |
|
| [24] |
|
| [25] |
|
| [26] |
|
| [27] |
|
| [28] |
|
| [29] |
Picard toolkit. 2019. |
| [30] |
|
| [31] |
|
| [32] |
|
| [33] |
|
| [34] |
|
| [35] |
|
| [36] |
|
| [37] |
|
| [38] |
|
| [39] |
|
| [40] |
|
| [41] |
|
| [42] |
|
| [43] |
|
| [44] |
|
| [45] |
|
| [46] |
|
| [47] |
|
| [48] |
|
| [49] |
|
| [50] |
|
| [51] |
|
| [52] |
|
| [53] |
|
| [54] |
|
| [55] |
|
| [56] |
|
| [57] |
|
| [58] |
|
| [59] |
|
| [60] |
|
| [61] |
|
| [62] |
|
| [63] |
|
| [64] |
|
| [65] |
|
| [66] |
|
| [67] |
|
| [68] |
|
| [69] |
|
| [70] |
|
| [71] |
|
| [72] |
|
| [73] |
|
| [74] |
|
| [75] |
|
The Author(s). Quantitative Biology published by John Wiley & Sons Australia, Ltd on behalf of Higher Education Press.
/
| 〈 |
|
〉 |