Heap: a highly sensitive and accurate SNP detection tool for low-coverage high-throughput sequencing data
Type
Article
Authors
Kobayashi, Masaaki
Ohyanagi, Hajime
Takanashi, Hideki
Asano, Satomi
Kudo, Toru
Kajiya-Kanegae, Hiromi
Nagano, Atsushi J.
Tainaka, Hitoshi
Tokunaga, Tsuyoshi
Sazuka, Takashi
Iwata, Hiroyoshi
Tsutsumi, Nobuhiro
Yano, Kentaro
KAUST Department
Biological and Environmental Sciences and Engineering (BESE) Division
Computational Bioscience Research Center (CBRC)
Date
2017-05-11
Online Publication Date
2017-05-11
Print Publication Date
2017-08-01
Permanent link to this record
http://hdl.handle.net/10754/623635
Abstract
The recent availability of large-scale genomic resources enables us to conduct so-called genome-wide association studies (GWAS) and genomic prediction (GP) studies, particularly with next-generation sequencing (NGS) data. The effectiveness of GWAS and GP depends not only on their mathematical models but also on the quality and quantity of the variants employed in the analysis. In NGS single nucleotide polymorphism (SNP) calling, conventional tools ideally require more reads for higher SNP sensitivity and accuracy. In this study, we aimed to develop a tool, Heap, that enables robustly sensitive and accurate calling of SNPs, particularly with low-coverage NGS data, which must be aligned to the reference genome sequences in advance. To reduce false-positive SNPs, Heap determines genotypes and calls SNPs at each site, except for sites at both ends of reads and sites containing a minor allele supported by only one read. A performance comparison with existing tools showed that Heap achieved the highest F-scores with low-coverage (7X) restriction-site associated DNA sequencing reads of sorghum and rice individuals. This will facilitate cost-effective GWAS and GP studies in this NGS era. Code and documentation of Heap are freely available from https://github.com/meiji-bioinf/heap (29 March 2017, date last accessed) and our web site (http://bioinf.mind.meiji.ac.jp/lab/en/tools.html (29 March 2017, date last accessed)).
Citation
Kobayashi M, Ohyanagi H, Takanashi H, Asano S, Kudo T, et al. (2017) Heap: a highly sensitive and accurate SNP detection tool for low-coverage high-throughput sequencing data. DNA Research. Available: http://dx.doi.org/10.1093/dnares/dsx012.
Sponsors
This work was partially supported by CREST, JST to NT, the Japan Society for the Promotion of Science (JSPS) KAKENHI [Grants-in-Aid for Scientific Research on Innovative Areas (No. 26113716)], MEXT-Supported Program for the Strategic Research Foundation at Private Universities (2014–2018), and Research Funding for Computational Software Supporting Program from Meiji University to KY.
Publisher
Oxford University Press (OUP)
Journal
DNA Research
PubMed ID
28498906
DOI
10.1093/dnares/dsx012
Except where otherwise noted, this item's license is described as This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
Related articles
- A Fast and Scalable Workflow for SNPs Detection in Genome Sequences Using Hadoop Map-Reduce.
- Authors: Tahir M, Sardaraz M
- Issue date: 2020 Feb 5
- ngs_backbone: a pipeline for read cleaning, mapping and SNP calling using next generation sequence.
- Authors: Blanca JM, Pascual L, Ziarsolo P, Nuez F, Cañizares J
- Issue date: 2011 Jun 2
- Review of alignment and SNP calling algorithms for next-generation sequencing data.
- Authors: Mielczarek M, Szyda J
- Issue date: 2016 Feb
- An investigation of causes of false positive single nucleotide polymorphisms using simulated reads from a small eukaryote genome.
- Authors: Ribeiro A, Golicz A, Hackett CA, Milne I, Stephen G, Marshall D, Flavell AJ, Bayer M
- Issue date: 2015 Nov 11
- Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence.
- Authors: You FM, Huo N, Deal KR, Gu YQ, Luo MC, McGuire PE, Dvorak J, Anderson OD
- Issue date: 2011 Jan 25