Reconstruction of microbial haplotypes by integration of statistical and physical linkage in scaffolding
Morrissy, A Sorana
Sycuro, Laura K
Jeffares, Daniel C
Permanent link to this record: http://hdl.handle.net/10754/667320
Abstract: DNA sequencing technologies provide unprecedented opportunities to analyze within-host evolution of microorganism populations. Often, within-host populations are analyzed via pooled sequencing of the population, which contains multiple individuals or ‘haplotypes’. However, current next-generation sequencing instruments, in conjunction with single-molecule barcoded linked-reads, cannot distinguish long haplotypes directly. Computational reconstruction of haplotypes from pooled sequencing has been attempted in virology, bacterial genomics, metagenomics and human genetics, using algorithms based on either cross-host genetic sharing or within-host genomic reads. Here we describe PoolHapX, a flexible computational approach that integrates information from both genetic sharing and genomic sequencing. We demonstrated that PoolHapX outperforms state-of-the-art tools tailored to specific organismal systems, and is robust to within-host evolution. Importantly, together with barcoded linked-reads, PoolHapX can infer whole-chromosome-scale haplotypes from 50 pools each containing 12 different haplotypes. By analyzing real data, we uncovered dynamic variations in the evolutionary processes of within-patient HIV populations previously unobserved in single position-based analysis.
Citation: Cao, C., He, J., Mak, L., Perera, D., Kwok, D., Wang, J., … Long, Q. (2021). Reconstruction of microbial haplotypes by integration of statistical and physical linkage in scaffolding. Molecular Biology and Evolution. doi:10.1093/molbev/msab037
Sponsors: Q.L. is supported by an NSERC Discovery Grant (RGPIN-2017-04860), a Canada Foundation for Innovation JELF grant (36605), and an ACHRI Startup grant. C.C., M.L. and L.M. are supported by ACHRI scholarships. L.M. is supported by a QEII award. G.Y. is supported by an NSERC Discovery Grant (RGPIN/04246-2018). Funding for open access charge: NSERC Discovery Grant (RGPIN-2017-04860).
Publisher: Oxford University Press (OUP)
Journal: Molecular Biology and Evolution
Except where otherwise noted, this item's license is described as follows: This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
- Hap10: reconstructing accurate and long polyploid haplotypes using linked reads.
- Authors: Majidian S, Kahaei MH, de Ridder D
- Issue date: 2020 Jun 18
- HapCUT2: robust and accurate haplotype assembly for diverse sequencing technologies.
- Authors: Edge P, Bafna V, Bansal V
- Issue date: 2017 May
- Leveraging reads that span multiple single nucleotide polymorphisms for haplotype inference from sequencing data.
- Authors: Yang WY, Hormozdiari F, Wang Z, He D, Pasaniuc B, Eskin E
- Issue date: 2013 Sep 15
- Evaluation of haplotype callers for next-generation sequencing of viruses.
- Authors: Eliseev A, Gibson KM, Avdeyev P, Novik D, Bendall ML, Pérez-Losada M, Alexeev N, Crandall KA
- Issue date: 2020 Aug