PRISM offers a comprehensive genomic approach to transcription factor function prediction

Handle URI:
http://hdl.handle.net/10754/599403
Title:
PRISM offers a comprehensive genomic approach to transcription factor function prediction
Authors:
Wenger, A. M.; Clarke, S. L.; Guturu, H.; Chen, J.; Schaar, B. T.; McLean, C. Y.; Bejerano, G.
Abstract:
The human genome encodes 1500-2000 different transcription factors (TFs). ChIP-seq is revealing the global binding profiles of a fraction of TFs in a fraction of their biological contexts. These data show that the majority of TFs bind directly next to a large number of context-relevant target genes, that most binding is distal, and that binding is context specific. Because of the effort and cost involved, ChIP-seq is seldom used in search of novel TF function. Such exploration is instead done using expression perturbation and genetic screens. Here we propose a comprehensive computational framework for transcription factor function prediction. We curate 332 high-quality nonredundant TF binding motifs that represent all major DNA binding domains, and improve cross-species conserved binding site prediction to obtain 3.3 million conserved, mostly distal, binding site predictions. We combine these with 2.4 million facts about all human and mouse gene functions, in a novel statistical framework, in search of enrichments of particular motifs next to groups of target genes of particular functions. Rigorous parameter tuning and a harsh null are used to minimize false positives. Our novel PRISM (predicting regulatory information from single motifs) approach obtains 2543 TF function predictions in a large variety of contexts, at a false discovery rate of 16%. The predictions are highly enriched for validated TF roles, and 45 of 67 (67%) tested binding site regions in five different contexts act as enhancers in functionally matched cells.
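The abstract does not state which statistic PRISM uses, so the following is only a minimal sketch of the general idea behind "enrichment of a motif next to genes of a particular function" and "false discovery rate". It swaps in a generic hypergeometric over-representation test and the Benjamini-Hochberg procedure, and should not be read as the authors' method. Every function name and number here is hypothetical and chosen for illustration.

```python
from math import comb


def hypergeom_tail(k, K, n, N):
    """P(X >= k) for a hypergeometric draw.

    N: total genes considered (hypothetical background)
    K: genes annotated with the function of interest
    n: genes with a predicted binding site for the motif nearby
    k: genes that are both annotated and near a predicted site
    """
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Toy (motif, function) pairs: (k, K, n, N). These counts are invented.
toy_pairs = {
    ("motif_A", "function_1"): (12, 40, 150, 2000),
    ("motif_A", "function_2"): (3, 40, 150, 2000),
    ("motif_B", "function_1"): (5, 40, 60, 2000),
}

pvals = [hypergeom_tail(*counts) for counts in toy_pairs.values()]
for pair, p, q in zip(toy_pairs, pvals, benjamini_hochberg(pvals)):
    print(pair, f"p={p:.3g}", f"q={q:.3g}")
```

In a screen of this kind, pairs whose q-value falls below a chosen threshold would be reported as candidate predictions. The threshold and the null model are design choices that this sketch does not attempt to reproduce from the paper.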
Citation:
Wenger AM, Clarke SL, Guturu H, Chen J, Schaar BT, et al. (2013) PRISM offers a comprehensive genomic approach to transcription factor function prediction. Genome Research 23: 889–904. Available: http://dx.doi.org/10.1101/gr.139071.112.
Publisher:
Cold Spring Harbor Laboratory Press
Journal:
Genome Research
Issue Date:
4-Feb-2013
DOI:
10.1101/gr.139071.112
PubMed ID:
23382538
PubMed Central ID:
PMC3638144
Type:
Article
ISSN:
1088-9051
Sponsors:
We thank Seung Kim for providing us mPAC cells; Tom Cramer for freeing the PRISM Stanford domain name; Ravi Parikh for improving the user interface of the PRISM resource; Michael Hiller for the mouse 44-way alignment; and Will Talbot, Nadav Ahituv, Betty Booker, and the Bejerano laboratory for helpful comments. This work was supported by a Stanford Graduate Fellowship (A.M.W.), a Bio-X Stanford Interdisciplinary Graduate Fellowship (A.M.W.), an HHMI Gilliam Fellowship (S.L.C.), a National Science Foundation Fellowship DGE-1147470 (H.G.), a Bio-X Graduate Fellowship (C.Y.M.), NIH grants R01HG005058 and R01HD059862, NSF Center for Science of Information (CSoI) grant CCF-0939370, and KAUST (all to G.B.). G.B. is a Packard Fellow and Microsoft Research Fellow.
Appears in Collections:
Publications Acknowledging KAUST Support

Full metadata record

DC Field | Value | Language
dc.contributor.author | Wenger, A. M. | en
dc.contributor.author | Clarke, S. L. | en
dc.contributor.author | Guturu, H. | en
dc.contributor.author | Chen, J. | en
dc.contributor.author | Schaar, B. T. | en
dc.contributor.author | McLean, C. Y. | en
dc.contributor.author | Bejerano, G. | en
dc.date.accessioned | 2016-02-28T05:50:28Z | en
dc.date.available | 2016-02-28T05:50:28Z | en
dc.date.issued | 2013-02-04 | en
dc.identifier.citation | Wenger AM, Clarke SL, Guturu H, Chen J, Schaar BT, et al. (2013) PRISM offers a comprehensive genomic approach to transcription factor function prediction. Genome Research 23: 889–904. Available: http://dx.doi.org/10.1101/gr.139071.112. | en
dc.identifier.issn | 1088-9051 | en
dc.identifier.pmid | 23382538 | en
dc.identifier.doi | 10.1101/gr.139071.112 | en
dc.identifier.uri | http://hdl.handle.net/10754/599403 | en
dc.description.abstract | The human genome encodes 1500-2000 different transcription factors (TFs). ChIP-seq is revealing the global binding profiles of a fraction of TFs in a fraction of their biological contexts. These data show that the majority of TFs bind directly next to a large number of context-relevant target genes, that most binding is distal, and that binding is context specific. Because of the effort and cost involved, ChIP-seq is seldom used in search of novel TF function. Such exploration is instead done using expression perturbation and genetic screens. Here we propose a comprehensive computational framework for transcription factor function prediction. We curate 332 high-quality nonredundant TF binding motifs that represent all major DNA binding domains, and improve cross-species conserved binding site prediction to obtain 3.3 million conserved, mostly distal, binding site predictions. We combine these with 2.4 million facts about all human and mouse gene functions, in a novel statistical framework, in search of enrichments of particular motifs next to groups of target genes of particular functions. Rigorous parameter tuning and a harsh null are used to minimize false positives. Our novel PRISM (predicting regulatory information from single motifs) approach obtains 2543 TF function predictions in a large variety of contexts, at a false discovery rate of 16%. The predictions are highly enriched for validated TF roles, and 45 of 67 (67%) tested binding site regions in five different contexts act as enhancers in functionally matched cells. | en
dc.description.sponsorship | We thank Seung Kim for providing us mPAC cells; Tom Cramer for freeing the PRISM Stanford domain name; Ravi Parikh for improving the user interface of the PRISM resource; Michael Hiller for the mouse 44-way alignment; and Will Talbot, Nadav Ahituv, Betty Booker, and the Bejerano laboratory for helpful comments. This work was supported by a Stanford Graduate Fellowship (A.M.W.), a Bio-X Stanford Interdisciplinary Graduate Fellowship (A.M.W.), an HHMI Gilliam Fellowship (S.L.C.), a National Science Foundation Fellowship DGE-1147470 (H.G.), a Bio-X Graduate Fellowship (C.Y.M.), NIH grants R01HG005058 and R01HD059862, NSF Center for Science of Information (CSoI) grant CCF-0939370, and KAUST (all to G.B.). G.B. is a Packard Fellow and Microsoft Research Fellow. | en
dc.publisher | Cold Spring Harbor Laboratory Press | en
dc.subject.mesh | Computational Biology | en
dc.subject.mesh | Software | en
dc.title | PRISM offers a comprehensive genomic approach to transcription factor function prediction | en
dc.type | Article | en
dc.identifier.journal | Genome Research | en
dc.identifier.pmcid | PMC3638144 | en
dc.contributor.institution | Department of Computer Science, Stanford University, Stanford, California 94305, USA. | en