SignalSpider: Probabilistic pattern discovery on multiple normalized ChIP-Seq signal profiles
Type
Article
KAUST Department
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Computer Science Program
Date
2014-09-05
Online Publication Date
2014-09-05
Print Publication Date
2015-01-01
Permanent link to this record
http://hdl.handle.net/10754/563751
Abstract
Motivation: Chromatin immunoprecipitation (ChIP) followed by high-throughput sequencing (ChIP-Seq) measures the genome-wide occupancy of transcription factors in vivo. Different combinations of DNA-binding protein occupancies may result in a gene being expressed in different tissues or at different developmental stages. To fully understand the functions of genes, it is essential to develop probabilistic models on multiple ChIP-Seq profiles to decipher the combinatorial regulatory mechanisms of multiple transcription factors.

Results: In this work, we describe a probabilistic model (SignalSpider) to decipher the combinatorial binding events of multiple transcription factors. Compared with similar existing methods, SignalSpider performs better in clustering promoter and enhancer regions. Notably, SignalSpider can learn higher-order combinatorial patterns from multiple ChIP-Seq profiles. We applied SignalSpider to the normalized ChIP-Seq profiles from the ENCODE consortium and learned model instances. We observed different higher-order enrichment and depletion patterns across sets of proteins. Those clustering patterns are supported by Gene Ontology (GO) enrichment, evolutionary conservation and chromatin interaction enrichment, offering biological insights for further focused studies. We also propose a specific enrichment map visualization method to reveal the genome-wide transcription factor combinatorial patterns from the models built, which extend our existing fine-scale knowledge on gene regulation to a genome-wide level.

Availability and implementation: The matrix-algebra-optimized executables and source code are available at the authors' website: http://www.cs.toronto.edu/~wkc/SignalSpider.

Contact:

Supplementary information: Supplementary data are available at Bioinformatics online.
Citation
Wong, K.-C., Li, Y., Peng, C., & Zhang, Z. (2014). SignalSpider: probabilistic pattern discovery on multiple normalized ChIP-Seq signal profiles. Bioinformatics, 31(1), 17–24. doi:10.1093/bioinformatics/btu604
Sponsors
Discovery Grant from Natural Sciences and Engineering Research Council, Canada (NSERC), grant number [327612-2009 RGPIN to Z.Z.]; Acres Inc. - Joseph Yonan Memorial Fellowship, Kwok Sau Po Scholarship, and International Research and Teaching Assistantship from University of Toronto (to K.W.).
Publisher
Oxford University Press (OUP)
Journal
Bioinformatics
PubMed ID
25192742
DOI
10.1093/bioinformatics/btu604
Related articles
- Probabilistic Inference on Multiple Normalized Signal Profiles from Next Generation Sequencing: Transcription Factor Binding Sites.
  - Authors: Wong KC, Peng C, Li Y
  - Issue date: 2015 Nov-Dec
- iTAR: a web server for identifying target genes of transcription factors using ChIP-seq or ChIP-chip data.
  - Authors: Yang CC, Andrews EH, Chen MH, Wang WY, Chen JJ, Gerstein M, Liu CC, Cheng C
  - Issue date: 2016 Aug 12
- A novel statistical method for quantitative comparison of multiple ChIP-seq datasets.
  - Authors: Chen L, Wang C, Qin ZS, Wu H
  - Issue date: 2015 Jun 15
- Using combined evidence from replicates to evaluate ChIP-seq peaks.
  - Authors: Jalili V, Matteucci M, Masseroli M, Morelli MJ
  - Issue date: 2015 Sep 1
- Software for rapid time dependent ChIP-sequencing analysis (TDCA).
  - Authors: Myschyshyn M, Farren-Dai M, Chuang TJ, Vocadlo D
  - Issue date: 2017 Nov 25