An Accurate Method for Inferring Relatedness in Large Datasets of Unphased Genotypes via an Embedded Likelihood-Ratio Test

Handle URI:
http://hdl.handle.net/10754/597505
Title:
An Accurate Method for Inferring Relatedness in Large Datasets of Unphased Genotypes via an Embedded Likelihood-Ratio Test
Authors:
Rodriguez, Jesse M.; Batzoglou, Serafim; Bercovici, Sivan
Abstract:
Studies that map disease genes rely on accurate annotations that indicate whether individuals in the studied cohorts are related to each other or not. For example, in genome-wide association studies, the cohort members are assumed to be unrelated to one another. Investigators can correct for individuals in a cohort with previously-unknown shared familial descent by detecting genomic segments that are shared between them, which are considered to be identical by descent (IBD). Alternatively, elevated frequencies of IBD segments near a particular locus among affected individuals can be indicative of a disease-associated gene. As genotyping studies grow to use increasingly large sample sizes and meta-analyses begin to include many data sets, accurate and efficient detection of hidden relatedness becomes a challenge. To enable disease-mapping studies of increasingly large cohorts, a fast and accurate method to detect IBD segments is required. We present PARENTE, a novel method for detecting related pairs of individuals and shared haplotypic segments within these pairs. PARENTE is a computationally-efficient method based on an embedded likelihood ratio test. As demonstrated by the results of our simulations, our method exhibits better accuracy than the current state of the art, and can be used for the analysis of large genotyped cohorts. PARENTE's higher accuracy becomes even more significant in more challenging scenarios, such as detecting shorter IBD segments or when an extremely low false-positive rate is required. PARENTE is publicly and freely available at http://parente.stanford.edu/. © 2013 Springer-Verlag.
Citation:
Rodriguez JM, Batzoglou S, Bercovici S (2013) An Accurate Method for Inferring Relatedness in Large Datasets of Unphased Genotypes via an Embedded Likelihood-Ratio Test. Research in Computational Molecular Biology: 212–229. Available: http://dx.doi.org/10.1007/978-3-642-37195-0_18.
Publisher:
Springer Science + Business Media
Journal:
Research in Computational Molecular Biology
Issue Date:
2013
DOI:
10.1007/978-3-642-37195-0_18
Type:
Book Chapter
ISSN:
0302-9743; 1611-3349
Sponsors:
This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-1147470. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. This work is also supported by a grant from the Stanford-KAUST alliance for academic excellence. We would like to thank Kelly Gilbert for helpful feedback in preparing the manuscript and two anonymous reviewers for many helpful comments.
Appears in Collections:
Publications Acknowledging KAUST Support

Full metadata record

DC Field | Value | Language
dc.contributor.author | Rodriguez, Jesse M. | en
dc.contributor.author | Batzoglou, Serafim | en
dc.contributor.author | Bercovici, Sivan | en
dc.date.accessioned | 2016-02-25T12:41:03Z | en
dc.date.available | 2016-02-25T12:41:03Z | en
dc.date.issued | 2013 | en
dc.identifier.citation | Rodriguez JM, Batzoglou S, Bercovici S (2013) An Accurate Method for Inferring Relatedness in Large Datasets of Unphased Genotypes via an Embedded Likelihood-Ratio Test. Research in Computational Molecular Biology: 212–229. Available: http://dx.doi.org/10.1007/978-3-642-37195-0_18. | en
dc.identifier.issn | 0302-9743 | en
dc.identifier.issn | 1611-3349 | en
dc.identifier.doi | 10.1007/978-3-642-37195-0_18 | en
dc.identifier.uri | http://hdl.handle.net/10754/597505 | en
dc.description.abstract | Studies that map disease genes rely on accurate annotations that indicate whether individuals in the studied cohorts are related to each other or not. For example, in genome-wide association studies, the cohort members are assumed to be unrelated to one another. Investigators can correct for individuals in a cohort with previously-unknown shared familial descent by detecting genomic segments that are shared between them, which are considered to be identical by descent (IBD). Alternatively, elevated frequencies of IBD segments near a particular locus among affected individuals can be indicative of a disease-associated gene. As genotyping studies grow to use increasingly large sample sizes and meta-analyses begin to include many data sets, accurate and efficient detection of hidden relatedness becomes a challenge. To enable disease-mapping studies of increasingly large cohorts, a fast and accurate method to detect IBD segments is required. We present PARENTE, a novel method for detecting related pairs of individuals and shared haplotypic segments within these pairs. PARENTE is a computationally-efficient method based on an embedded likelihood ratio test. As demonstrated by the results of our simulations, our method exhibits better accuracy than the current state of the art, and can be used for the analysis of large genotyped cohorts. PARENTE's higher accuracy becomes even more significant in more challenging scenarios, such as detecting shorter IBD segments or when an extremely low false-positive rate is required. PARENTE is publicly and freely available at http://parente.stanford.edu/. © 2013 Springer-Verlag. | en
dc.description.sponsorship | This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-1147470. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. This work is also supported by a grant from the Stanford-KAUST alliance for academic excellence. We would like to thank Kelly Gilbert for helpful feedback in preparing the manuscript and two anonymous reviewers for many helpful comments. | en
dc.publisher | Springer Science + Business Media | en
dc.subject | IBD | en
dc.subject | Population genetics | en
dc.subject | relatedness | en
dc.title | An Accurate Method for Inferring Relatedness in Large Datasets of Unphased Genotypes via an Embedded Likelihood-Ratio Test | en
dc.type | Book Chapter | en
dc.identifier.journal | Research in Computational Molecular Biology | en
dc.contributor.institution | Stanford University, Palo Alto, United States | en