Handle URI:
http://hdl.handle.net/10754/596766
Title:
An effective filter for IBD detection in large data sets.
Authors:
Huang, Lin; Bercovici, Sivan; Rodriguez, Jesse M; Batzoglou, Serafim
Abstract:
Identity by descent (IBD) inference is the task of computationally detecting genomic segments that are shared between individuals by means of common familial descent. Accurate IBD detection plays an important role in various genomic studies, ranging from mapping disease genes to exploring ancient population histories. The majority of recent work in the field has focused on improving the accuracy of inference, targeting shorter genomic segments that originate from a more ancient common ancestor. The accuracy of these methods, however, is achieved at the expense of high computational cost, resulting in a prohibitively long running time when applied to large cohorts. To enable the study of large cohorts, we introduce SpeeDB, a method that facilitates fast IBD detection in large unphased genotype data sets. Given a target individual and a database of individuals that potentially share IBD segments with the target, SpeeDB applies an efficient opposite-homozygous filter, which excludes chromosomal segments from the database that are highly unlikely to be IBD with the corresponding segments from the target individual. The remaining segments can then be evaluated by any IBD detection method of choice. When examining simulated individuals sharing 4 cM IBD regions, SpeeDB filtered out 99.5% of genomic regions from consideration while retaining 99% of the true IBD segments. Applying the SpeeDB filter prior to detecting IBD in simulated fourth cousins resulted in an overall running time that was 10,000x faster than inferring IBD without the filter and retained 99% of the true IBD segments in the output.
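The opposite-homozygous idea in the abstract can be illustrated with a minimal sketch. It is not the SpeeDB implementation. It assumes unphased genotypes coded as 0, 1 or 2 copies of the alternate allele (any other value is treated as missing), fixed-size windows measured in markers, and a hypothetical `max_mismatches` tolerance for genotyping error. The names `is_opposite_homozygous` and `filter_windows` are invented for this illustration. At a site where one individual is homozygous for one allele and the other is homozygous for the other allele, the two cannot share an allele, so a window containing such sites is unlikely to be IBD and can be dropped before running a full IBD detector.

```python
from typing import List, Sequence, Tuple


def is_opposite_homozygous(a: int, b: int) -> bool:
    """Return True when one genotype is 0 and the other is 2.

    In that case the two individuals cannot share an allele at the site.
    Heterozygous (1) or missing (any other value) calls never count.
    """
    return (a == 0 and b == 2) or (a == 2 and b == 0)


def filter_windows(
    target: Sequence[int],
    candidate: Sequence[int],
    window_size: int = 4,      # assumed window length, in markers
    max_mismatches: int = 0,   # assumed tolerance for genotyping error
) -> List[Tuple[int, int]]:
    """Return the [start, end) marker windows that survive the filter.

    A window is dropped when it contains more than `max_mismatches`
    opposite-homozygous sites; surviving windows would be handed to a
    downstream IBD detection method.
    """
    if len(target) != len(candidate):
        raise ValueError("genotype vectors must have the same length")
    if window_size <= 0:
        raise ValueError("window_size must be positive")

    kept = []
    for start in range(0, len(target), window_size):
        end = min(start + window_size, len(target))
        mismatches = sum(
            is_opposite_homozygous(target[i], candidate[i])
            for i in range(start, end)
        )
        if mismatches <= max_mismatches:
            kept.append((start, end))
    return kept


if __name__ == "__main__":
    target = [0, 1, 2, 2, 0, 0, 1, 2, 2, 1, 0, 0]
    candidate = [0, 2, 1, 2, 2, 0, 1, 0, 1, 1, 1, 0]
    # The middle window has two opposite-homozygous sites and is dropped.
    print(filter_windows(target, candidate))
```

In this toy input only the middle window is excluded. A real filter would choose window length and tolerance from genetic distance and expected error rates, which this sketch does not attempt.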
Citation:
Huang L, Bercovici S, Rodriguez JM, Batzoglou S (2014) An Effective Filter for IBD Detection in Large Data Sets. PLoS ONE 9: e92713. Available: http://dx.doi.org/10.1371/journal.pone.0092713.
Publisher:
Public Library of Science (PLoS)
Journal:
PLoS ONE
Issue Date:
25-Mar-2014
DOI:
10.1371/journal.pone.0092713
PubMed ID:
24667521
PubMed Central ID:
PMC3965454
Type:
Article
ISSN:
1932-6203
Sponsors:
LH is supported by a Pierre and Christine Lamond Stanford Graduate Fellowship. This material is also based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-1147470. This work is also supported by a grant from the Stanford-KAUST alliance for academic excellence. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Appears in Collections:
Publications Acknowledging KAUST Support

Full metadata record

DC Field | Value | Language
dc.contributor.author | Huang, Lin | en
dc.contributor.author | Bercovici, Sivan | en
dc.contributor.author | Rodriguez, Jesse M | en
dc.contributor.author | Batzoglou, Serafim | en
dc.date.accessioned | 2016-02-21T08:50:14Z | en
dc.date.available | 2016-02-21T08:50:14Z | en
dc.date.issued | 2014-03-25 | en
dc.identifier.citation | Huang L, Bercovici S, Rodriguez JM, Batzoglou S (2014) An Effective Filter for IBD Detection in Large Data Sets. PLoS ONE 9: e92713. Available: http://dx.doi.org/10.1371/journal.pone.0092713. | en
dc.identifier.issn | 1932-6203 | en
dc.identifier.pmid | 24667521 | en
dc.identifier.doi | 10.1371/journal.pone.0092713 | en
dc.identifier.uri | http://hdl.handle.net/10754/596766 | en
dc.description.abstract | Identity by descent (IBD) inference is the task of computationally detecting genomic segments that are shared between individuals by means of common familial descent. Accurate IBD detection plays an important role in various genomic studies, ranging from mapping disease genes to exploring ancient population histories. The majority of recent work in the field has focused on improving the accuracy of inference, targeting shorter genomic segments that originate from a more ancient common ancestor. The accuracy of these methods, however, is achieved at the expense of high computational cost, resulting in a prohibitively long running time when applied to large cohorts. To enable the study of large cohorts, we introduce SpeeDB, a method that facilitates fast IBD detection in large unphased genotype data sets. Given a target individual and a database of individuals that potentially share IBD segments with the target, SpeeDB applies an efficient opposite-homozygous filter, which excludes chromosomal segments from the database that are highly unlikely to be IBD with the corresponding segments from the target individual. The remaining segments can then be evaluated by any IBD detection method of choice. When examining simulated individuals sharing 4 cM IBD regions, SpeeDB filtered out 99.5% of genomic regions from consideration while retaining 99% of the true IBD segments. Applying the SpeeDB filter prior to detecting IBD in simulated fourth cousins resulted in an overall running time that was 10,000x faster than inferring IBD without the filter and retained 99% of the true IBD segments in the output. | en
dc.description.sponsorship | LH is supported by a Pierre and Christine Lamond Stanford Graduate Fellowship. This material is also based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-1147470. This work is also supported by a grant from the Stanford-KAUST alliance for academic excellence. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. | en
dc.publisher | Public Library of Science (PLoS) | en
dc.rights | This is an open-access article distributed under the terms of the , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. | en
dc.subject.mesh | Databases, Nucleic Acid | en
dc.title | An effective filter for IBD detection in large data sets. | en
dc.type | Article | en
dc.identifier.journal | PLoS ONE | en
dc.identifier.pmcid | PMC3965454 | en
dc.contributor.institution | Department of Computer Science, Stanford University, Stanford, California, United States of America. | en
kaust.grant.program | Academic Excellence Alliance (AEA) | en

All Items in KAUST are protected by copyright, with all rights reserved, unless otherwise indicated.