SignalSpider: Probabilistic pattern discovery on multiple normalized ChIP-Seq signal profiles

Handle URI:
http://hdl.handle.net/10754/563751
Title:
SignalSpider: Probabilistic pattern discovery on multiple normalized ChIP-Seq signal profiles
Authors:
Wong, Kachun; Li, Yue; Peng, Chengbin (0000-0002-7445-2638); Zhang, Zhaolei
Abstract:
Motivation: Chromatin immunoprecipitation (ChIP) followed by high-throughput sequencing (ChIP-Seq) measures the genome-wide occupancy of transcription factors in vivo. Different combinations of DNA-binding protein occupancies may result in a gene being expressed in different tissues or at different developmental stages. To fully understand the functions of genes, it is essential to develop probabilistic models on multiple ChIP-Seq profiles to decipher the combinatorial regulatory mechanisms by multiple transcription factors. Results: In this work, we describe a probabilistic model (SignalSpider) to decipher the combinatorial binding events of multiple transcription factors. Compared with similar existing methods, we found that SignalSpider performs better in clustering promoter and enhancer regions. Notably, SignalSpider can learn higher-order combinatorial patterns from multiple ChIP-Seq profiles. We have applied SignalSpider to the normalized ChIP-Seq profiles from the ENCODE consortium and learned model instances. We observed different higher-order enrichment and depletion patterns across sets of proteins. Those clustering patterns are supported by Gene Ontology (GO) enrichment, evolutionary conservation and chromatin interaction enrichment, offering biological insights for further focused studies. We also proposed a specific enrichment map visualization method to reveal the genome-wide transcription factor combinatorial patterns from the models built, which extend our existing fine-scale knowledge on gene regulation to a genome-wide level. Availability and implementation: The matrix-algebra-optimized executables and source code are available at the authors' websites: http://www.cs.toronto.edu/~wkc/SignalSpider. Supplementary information: Supplementary data are available at Bioinformatics online.
KAUST Department:
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division; Computer Science Program
Publisher:
Oxford University Press (OUP)
Journal:
Bioinformatics
Issue Date:
5-Sep-2014
DOI:
10.1093/bioinformatics/btu604
Type:
Article
ISSN:
1367-4803
Sponsors:
Discovery Grant from Natural Sciences and Engineering Research Council, Canada (NSERC), grant number [327612-2009 RGPIN to Z.Z.]; Acres Inc. - Joseph Yonan Memorial Fellowship, Kwok Sau Po Scholarship, and International Research and Teaching Assistantship from University of Toronto (to K.W.).
Appears in Collections:
Articles; Computer Science Program; Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division

Full metadata record

DC Field | Value | Language
dc.contributor.author | Wong, Kachun | en
dc.contributor.author | Li, Yue | en
dc.contributor.author | Peng, Chengbin | en
dc.contributor.author | Zhang, Zhaolei | en
dc.date.accessioned | 2015-08-03T12:08:53Z | en
dc.date.available | 2015-08-03T12:08:53Z | en
dc.date.issued | 2014-09-05 | en
dc.identifier.issn | 1367-4803 | en
dc.identifier.doi | 10.1093/bioinformatics/btu604 | en
dc.identifier.uri | http://hdl.handle.net/10754/563751 | en
dc.description.abstract | Motivation: Chromatin immunoprecipitation (ChIP) followed by high-throughput sequencing (ChIP-Seq) measures the genome-wide occupancy of transcription factors in vivo. Different combinations of DNA-binding protein occupancies may result in a gene being expressed in different tissues or at different developmental stages. To fully understand the functions of genes, it is essential to develop probabilistic models on multiple ChIP-Seq profiles to decipher the combinatorial regulatory mechanisms by multiple transcription factors. Results: In this work, we describe a probabilistic model (SignalSpider) to decipher the combinatorial binding events of multiple transcription factors. Compared with similar existing methods, we found that SignalSpider performs better in clustering promoter and enhancer regions. Notably, SignalSpider can learn higher-order combinatorial patterns from multiple ChIP-Seq profiles. We have applied SignalSpider to the normalized ChIP-Seq profiles from the ENCODE consortium and learned model instances. We observed different higher-order enrichment and depletion patterns across sets of proteins. Those clustering patterns are supported by Gene Ontology (GO) enrichment, evolutionary conservation and chromatin interaction enrichment, offering biological insights for further focused studies. We also proposed a specific enrichment map visualization method to reveal the genome-wide transcription factor combinatorial patterns from the models built, which extend our existing fine-scale knowledge on gene regulation to a genome-wide level. Availability and implementation: The matrix-algebra-optimized executables and source code are available at the authors' websites: http://www.cs.toronto.edu/~wkc/SignalSpider. Supplementary information: Supplementary data are available at Bioinformatics online. | en
dc.description.sponsorship | Discovery Grant from Natural Sciences and Engineering Research Council, Canada (NSERC), grant number [327612-2009 RGPIN to Z.Z.]; Acres Inc. - Joseph Yonan Memorial Fellowship, Kwok Sau Po Scholarship, and International Research and Teaching Assistantship from University of Toronto (to K.W.). | en
dc.publisher | Oxford University Press (OUP) | en
dc.title | SignalSpider: Probabilistic pattern discovery on multiple normalized ChIP-Seq signal profiles | en
dc.type | Article | en
dc.contributor.department | Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division | en
dc.contributor.department | Computer Science Program | en
dc.identifier.journal | Bioinformatics | en
dc.contributor.institution | Department of Computer Science, University of Toronto, Toronto, ON, Canada | en
dc.contributor.institution | Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada | en
dc.contributor.institution | Banting and Best Department of Medical Research, University of Toronto, Toronto, ON, Canada | en
dc.contributor.institution | Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada | en
kaust.author | Peng, Chengbin | en