Heap: a highly sensitive and accurate SNP detection tool for low-coverage high-throughput sequencing data

Handle URI:
http://hdl.handle.net/10754/623635
Title:
Heap: a highly sensitive and accurate SNP detection tool for low-coverage high-throughput sequencing data
Authors:
Kobayashi, Masaaki; Ohyanagi, Hajime; Takanashi, Hideki; Asano, Satomi; Kudo, Toru; Kajiya-Kanegae, Hiromi; Nagano, Atsushi J.; Tainaka, Hitoshi; Tokunaga, Tsuyoshi; Sazuka, Takashi; Iwata, Hiroyoshi; Tsutsumi, Nobuhiro; Yano, Kentaro
Abstract:
The recent availability of large-scale genomic resources enables so-called genome-wide association studies (GWAS) and genomic prediction (GP) studies, particularly with next-generation sequencing (NGS) data. The effectiveness of GWAS and GP depends not only on their mathematical models but also on the quality and quantity of the variants employed in the analysis. In NGS single nucleotide polymorphism (SNP) calling, conventional tools ideally require more reads for higher SNP sensitivity and accuracy. In this study, we aimed to develop a tool, Heap, that enables robustly sensitive and accurate calling of SNPs, particularly with low-coverage NGS data, which must be aligned to the reference genome sequences in advance. To reduce false positive SNPs, Heap determines genotypes and calls SNPs at each site except for sites at both ends of reads or sites containing a minor allele supported by only one read. Performance comparison with existing tools showed that Heap achieved the highest F-scores with low-coverage (7X) restriction-site associated DNA sequencing reads of sorghum and rice individuals. This will facilitate cost-effective GWAS and GP studies in this NGS era. Code and documentation of Heap are freely available from https://github.com/meiji-bioinf/heap (29 March 2017, date last accessed) and our web site (http://bioinf.mind.meiji.ac.jp/lab/en/tools.html (29 March 2017, date last accessed)).
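The two site-level exclusions named in the abstract, and the F-score used for the comparison, can be illustrated with a minimal Python sketch. This is not Heap's code: the function names, the `(base, pos_in_read, read_length)` input layout, treating a "read end" as exactly the first or last base, and using the balanced F1 definition are all assumptions made for illustration. The abstract does not say how Heap determines genotypes, so the sketch does not cover that step.

```python
from collections import Counter


def allele_counts(observations):
    """Count bases at one site, skipping bases at the first or last position of a read.

    observations: iterable of (base, pos_in_read, read_length) tuples (hypothetical layout).
    """
    counts = Counter()
    for base, pos_in_read, read_length in observations:
        # Assumption: "ends of reads" means exactly the first and last base.
        if pos_in_read == 0 or pos_in_read == read_length - 1:
            continue
        counts[base] += 1
    return counts


def passes_minor_allele_filter(counts, min_support=2):
    """Return False if two or more alleles are seen and the rarest has fewer than min_support reads."""
    if len(counts) < 2:
        return True  # only one allele observed, so there is no minor allele to check
    return min(counts.values()) >= min_support


def f_score(true_positives, false_positives, false_negatives):
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Four reads support A; two support G, but one G sits at a read end and is ignored.
site = [("A", 5, 100)] * 4 + [("G", 0, 100), ("G", 50, 100)]
counts = allele_counts(site)
site_is_callable = passes_minor_allele_filter(counts)
```

In this example the remaining G is supported by a single read, so `site_is_callable` is `False` and the site would be skipped rather than reported as a SNP.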
KAUST Department:
Computational Bioscience Research Center (CBRC)
Citation:
Kobayashi M, Ohyanagi H, Takanashi H, Asano S, Kudo T, et al. (2017) Heap: a highly sensitive and accurate SNP detection tool for low-coverage high-throughput sequencing data. DNA Research. Available: http://dx.doi.org/10.1093/dnares/dsx012.
Publisher:
Oxford University Press (OUP)
Journal:
DNA Research
Issue Date:
20-Apr-2017
DOI:
10.1093/dnares/dsx012
Type:
Article
ISSN:
1340-2838; 1756-1663
Sponsors:
This work was partially supported by CREST, JST to NT, the Japan Society for the Promotion of Science (JSPS) KAKENHI [Grants-in-Aid for Scientific Research on Innovative Areas (No. 26113716)], MEXT-Supported Program for the Strategic Research Foundation at Private Universities (2014–2018), and Research Funding for Computational Software Supporting Program from Meiji University to KY.
Additional Links:
https://academic.oup.com/dnaresearch/article-lookup/doi/10.1093/dnares/dsx012
Appears in Collections:
Articles; Computational Bioscience Research Center (CBRC)

Full metadata record

DC Field | Value | Language
dc.contributor.author | Kobayashi, Masaaki | en
dc.contributor.author | Ohyanagi, Hajime | en
dc.contributor.author | Takanashi, Hideki | en
dc.contributor.author | Asano, Satomi | en
dc.contributor.author | Kudo, Toru | en
dc.contributor.author | Kajiya-Kanegae, Hiromi | en
dc.contributor.author | Nagano, Atsushi J. | en
dc.contributor.author | Tainaka, Hitoshi | en
dc.contributor.author | Tokunaga, Tsuyoshi | en
dc.contributor.author | Sazuka, Takashi | en
dc.contributor.author | Iwata, Hiroyoshi | en
dc.contributor.author | Tsutsumi, Nobuhiro | en
dc.contributor.author | Yano, Kentaro | en
dc.date.accessioned | 2017-05-17T07:41:39Z | -
dc.date.available | 2017-05-17T07:41:39Z | -
dc.date.issued | 2017-04-20 | en
dc.identifier.citation | Kobayashi M, Ohyanagi H, Takanashi H, Asano S, Kudo T, et al. (2017) Heap: a highly sensitive and accurate SNP detection tool for low-coverage high-throughput sequencing data. DNA Research. Available: http://dx.doi.org/10.1093/dnares/dsx012. | en
dc.identifier.issn | 1340-2838 | en
dc.identifier.issn | 1756-1663 | en
dc.identifier.doi | 10.1093/dnares/dsx012 | en
dc.identifier.uri | http://hdl.handle.net/10754/623635 | -
dc.description.abstract | The recent availability of large-scale genomic resources enables so-called genome-wide association studies (GWAS) and genomic prediction (GP) studies, particularly with next-generation sequencing (NGS) data. The effectiveness of GWAS and GP depends not only on their mathematical models but also on the quality and quantity of the variants employed in the analysis. In NGS single nucleotide polymorphism (SNP) calling, conventional tools ideally require more reads for higher SNP sensitivity and accuracy. In this study, we aimed to develop a tool, Heap, that enables robustly sensitive and accurate calling of SNPs, particularly with low-coverage NGS data, which must be aligned to the reference genome sequences in advance. To reduce false positive SNPs, Heap determines genotypes and calls SNPs at each site except for sites at both ends of reads or sites containing a minor allele supported by only one read. Performance comparison with existing tools showed that Heap achieved the highest F-scores with low-coverage (7X) restriction-site associated DNA sequencing reads of sorghum and rice individuals. This will facilitate cost-effective GWAS and GP studies in this NGS era. Code and documentation of Heap are freely available from https://github.com/meiji-bioinf/heap (29 March 2017, date last accessed) and our web site (http://bioinf.mind.meiji.ac.jp/lab/en/tools.html (29 March 2017, date last accessed)). | en
dc.description.sponsorship | This work was partially supported by CREST, JST to NT, the Japan Society for the Promotion of Science (JSPS) KAKENHI [Grants-in-Aid for Scientific Research on Innovative Areas (No. 26113716)], MEXT-Supported Program for the Strategic Research Foundation at Private Universities (2014–2018), and Research Funding for Computational Software Supporting Program from Meiji University to KY. | en
dc.publisher | Oxford University Press (OUP) | en
dc.relation.url | https://academic.oup.com/dnaresearch/article-lookup/doi/10.1093/dnares/dsx012 | en
dc.rights | This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com | en
dc.rights.uri | http://creativecommons.org/licenses/by-nc/4.0/ | en
dc.subject | Single Nucleotide Polymorphism (SNP) | en
dc.subject | Genome-wide Association Studies (GWAS) | en
dc.subject | Next-generation Sequencing (NGS) | en
dc.subject | Restriction-site Associated DNA Sequencing (RAD-seq) | en
dc.subject | Genomic Prediction (GP) | en
dc.title | Heap: a highly sensitive and accurate SNP detection tool for low-coverage high-throughput sequencing data | en
dc.type | Article | en
dc.contributor.department | Computational Bioscience Research Center (CBRC) | en
dc.identifier.journal | DNA Research | en
dc.eprint.version | Publisher's Version/PDF | en
dc.contributor.institution | Bioinformatics Laboratory, Department of Life Sciences, School of Agriculture, Meiji University, Kanagawa 214-8571, Japan. | en
dc.contributor.institution | Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo 113-8657, Japan. | en
dc.contributor.institution | Center for Ecological Research, Kyoto University, Shiga 520-2113, Japan. | en
dc.contributor.institution | EARTHNOTE Co. Ltd., Nago, Okinawa 905-1152, Japan. | en
dc.contributor.institution | Bioscience and Biotechnology Center, Nagoya University, Aichi 464-8601, Japan. | en
kaust.author | Ohyanagi, Hajime | en