PhaseAll: a simple tool for read-based allele phasing

Peter M. Zhurbenko , Fedor N. Klimenko

Ecological Genetics: 32-32. DOI: 10.17816/ecogen112363
Genetically modified organism. The History, Achievements, Social and Environmental Risks
oration

Abstract

The genome assembly algorithms currently in use do not provide allele phasing. This can lead to the loss of important information about the genotype of diploid and polyploid individuals. Here we introduce PhaseAll, a simple tool for allele phasing based on short reads obtained by second-generation sequencing. As input, the tool takes paired reads in SAM format. PhaseAll iterates sequentially through each alignment position. When a polymorphic position (SNP, insertion, or deletion) is first encountered, a unique mutation is written to each allele. For each subsequent polymorphic position, the tool checks whether it is covered by the same pair of reads (one DNA fragment) as the previous one. If two mutations are located on the same fragment, they are considered to belong to the same allele. If no fragment is found that connects a pair of neighboring polymorphic positions, an "X" is written in the allele sequences. This means that the alleles can swap at this position.
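The linking rule described above can be illustrated with a minimal sketch. This is not PhaseAll's actual code: the function name `phase_two_alleles`, the input layout (a dictionary of fragments, each mapping polymorphic positions to the variant observed on that read pair), and the handling of ties (treated as "no link") are all assumptions made for illustration.

```python
from collections import Counter


def phase_two_alleles(fragments, positions):
    """Illustrative two-allele phasing (hypothetical helper, not PhaseAll's API).

    fragments: {fragment_name: {position: variant}}, one entry per read pair.
    positions: sorted list of candidate polymorphic positions.
    Returns two strings; "X" marks a point where the alleles may swap.
    """
    allele_a, allele_b = [], []
    prev = last_a = last_b = None
    for pos in positions:
        counts = Counter(obs[pos] for obs in fragments.values() if pos in obs)
        # Two best-supported variants; ties are broken alphabetically.
        top = sorted(counts, key=lambda v: (-counts[v], v))[:2]
        if len(top) < 2:
            continue  # only one variant observed: not polymorphic here
        v1, v2 = top
        if prev is not None:
            same = swapped = 0
            for obs in fragments.values():
                if prev in obs and pos in obs:
                    pair = (obs[prev], obs[pos])
                    if pair in ((last_a, v1), (last_b, v2)):
                        same += 1
                    elif pair in ((last_a, v2), (last_b, v1)):
                        swapped += 1
            if same == swapped:
                # No fragment links the two sites (or the evidence is tied,
                # which this sketch also treats as unlinked).
                allele_a.append("X")
                allele_b.append("X")
            elif swapped > same:
                v1, v2 = v2, v1
        allele_a.append(v1)
        allele_b.append(v2)
        prev, last_a, last_b = pos, v1, v2
    return "".join(allele_a), "".join(allele_b)


# Toy data: positions 10 and 25 share fragments, position 60 is unlinked.
example_fragments = {
    "f1": {10: "A", 25: "C"},
    "f2": {10: "G", 25: "T"},
    "f3": {25: "C"},
    "f4": {60: "T"},
    "f5": {60: "G"},
}
hap_a, hap_b = phase_two_alleles(example_fragments, [10, 25, 60])
print(hap_a, hap_b)
```

In the toy data, fragments f1 and f2 cover both positions 10 and 25, so those variants are joined on the same allele, while no fragment reaches position 60 and an "X" is written before it.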

PhaseAll is written in Python 3. SAM files are processed using the pysam library. PhaseAll is designed to separate only two alleles. To reduce the impact of sequencing errors, the user can set a read depth threshold below which a polymorphic position is skipped. Some indels can cause errors in allele phasing, so PhaseAll has an option to skip indels for more accurate SNP reconstruction. The tool was tested on sequences of agrobacterial origin in the Camellia L. genome in more than 100 samples. PhaseAll is available for download on GitHub: https://github.com/pzhurbenko/PhaseAll
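The two filtering options can be sketched as follows. This is only an illustration under stated assumptions, not PhaseAll's implementation: the function name `select_sites`, the parameter names `min_depth` and `skip_indels`, and the variant encoding ("-" for a deletion, a leading "+" for an insertion) are hypothetical. The abstract does not say whether the threshold applies to total depth or to per-variant support; this sketch uses total depth.

```python
from collections import Counter


def is_indel(variant):
    # Hypothetical encoding: "-" is a deletion, "+ACG" is an insertion.
    return variant == "-" or variant.startswith("+")


def select_sites(pileup, min_depth=10, skip_indels=False):
    """Return the positions that would be passed on to phasing.

    pileup: {position: Counter({variant: number_of_reads})}
    """
    kept = []
    for pos in sorted(pileup):
        counts = pileup[pos]
        if sum(counts.values()) < min_depth:
            continue  # too few reads: likely unreliable, skip the site
        if skip_indels:
            counts = {v: n for v, n in counts.items() if not is_indel(v)}
        if len(counts) < 2:
            continue  # fewer than two variants left: nothing to phase
        kept.append(pos)
    return kept


example_pileup = {
    5: Counter({"A": 12, "G": 9}),    # well-covered SNP
    17: Counter({"C": 3, "T": 2}),    # SNP below the depth threshold
    30: Counter({"T": 11, "-": 10}),  # deletion against a reference base
    42: Counter({"G": 20}),           # monomorphic site
}
print(select_sites(example_pileup, min_depth=10))
print(select_sites(example_pileup, min_depth=10, skip_indels=True))
```

With indels skipped, position 30 drops out because only one non-indel variant remains there, which leaves the SNP sites to be phased on their own.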

The research was supported by the Russian Science Foundation (RSF, project No. 21-14-00050).

Cite this article

Peter M. Zhurbenko, Fedor N. Klimenko. PhaseAll: a simple tool for read-based allele phasing. Ecological Genetics: 32-32. DOI: 10.17816/ecogen112363



RIGHTS & PERMISSIONS

Eco-Vector
