De novo reconstruction of microbial haplotypes by integrating statistical and physical linkage
Type
Preprint
Authors
Cao, Chen
He, Jingni
Mak, Lauren
Perera, Deshan
Kwok, Devin
Wang, Jia
Li, Minghao
Mourier, Tobias
Gavriliuc, Stefan
Greenberg, Matthew
Morrissy, A. Sorana
Sycuro, Laura K.
Yang, Guang
Jeffares, Daniel C.
Long, Quan
Date
2020-03-30
Permanent link to this record
http://hdl.handle.net/10754/662468
Abstract
DNA sequencing technologies provide unprecedented opportunities to analyze within-host evolution of microorganism populations. Often, within-host populations are analyzed via pooled sequencing of the population, which contains multiple individuals or ‘haplotypes’. However, current next-generation sequencing instruments, in conjunction with single-molecule barcoded linked-reads, cannot distinguish long haplotypes directly. Computational reconstruction of haplotypes from pooled sequencing has been attempted in virology, bacterial genomics, metagenomics and human genetics, using algorithms based on either cross-host genetic sharing or within-host genomic reads. Here we describe PoolHapX, a flexible computational approach that integrates information from both genetic sharing and genomic sequencing. We demonstrated that PoolHapX outperforms state-of-the-art tools in the above four fields, and is robust to within-host evolution. Importantly, together with barcoded linked-reads, PoolHapX can infer whole-chromosome-scale haplotypes from 50 pools each containing 12 different haplotypes. By analyzing real data, we uncovered dynamic variations in the evolutionary processes of HIV previously unobserved in single position-based analysis.
Citation
Cao, C., He, J., Mak, L., Perera, D., Kwok, D., Wang, J., … Long, Q. (2020). De novo reconstruction of microbial haplotypes by integrating statistical and physical linkage. doi:10.1101/2020.03.29.014704
Sponsors
Q.L. is supported by an NSERC Discovery Grant (RGPIN-2017-04860), a Canada Foundation for Innovation JELF grant (36605), and an ACHRI Startup grant. C.C., M.L. and L.M. are supported by ACHRI scholarships. L.M. is supported by a QEII award. G.Y. is supported by an NSERC Discovery Grant (RGPIN/04246-2018).
Publisher
Cold Spring Harbor Laboratory
Additional Links
http://biorxiv.org/lookup/doi/10.1101/2020.03.29.014704
10.1101/2020.03.29.014704