A Comparison Study for DNA Motif Modeling on Protein Binding Microarray

Handle URI:
http://hdl.handle.net/10754/584252
Title:
A Comparison Study for DNA Motif Modeling on Protein Binding Microarray
Authors:
Wong, Ka-Chun; Li, Yue; Peng, Chengbin (0000-0002-7445-2638); Wong, Hau-San
Abstract:
Transcription Factor Binding Sites (TFBSs) are relatively short (5-15 bp) and degenerate. Identifying them is a computationally challenging task. In particular, Protein Binding Microarray (PBM) is a high-throughput platform that can measure the DNA binding preference of a protein in a comprehensive and unbiased manner; for instance, a typical PBM experiment can measure binding signal intensities of a protein to all possible DNA k-mers (k=8-10). Since proteins can often bind to DNA with different binding intensities, one of the major challenges is to build motif models which can fully capture the quantitative binding affinity data. To learn DNA motif models from the non-convex objective function landscape, several optimization methods are compared and applied to the PBM motif model building problem. In particular, representative methods from different optimization paradigms have been chosen for modeling performance comparison on hundreds of PBM datasets. The results suggest that the multimodal optimization methods are very effective for capturing the binding preference information from PBM data. In particular, we observe a general performance improvement using di-nucleotide modeling over mono-nucleotide modeling. In addition, the models learned by the best-performing method are applied to two independent applications: PBM probe rotation testing and ChIP-Seq peak sequence prediction, demonstrating its biological applicability.
KAUST Department:
Extreme Computing Research Center
Citation:
A Comparison Study for DNA Motif Modeling on Protein Binding Microarray 2015:1 IEEE/ACM Transactions on Computational Biology and Bioinformatics
Publisher:
Institute of Electrical and Electronics Engineers (IEEE)
Journal:
IEEE/ACM Transactions on Computational Biology and Bioinformatics
Issue Date:
11-Jun-2015
DOI:
10.1109/TCBB.2015.2443782
Type:
Article
ISSN:
1545-5963
Additional Links:
http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=7122289
Appears in Collections:
Articles; Extreme Computing Research Center

Full metadata record

DC Field | Value | Language
dc.contributor.author | Wong, Ka-Chun | en
dc.contributor.author | Li, Yue | en
dc.contributor.author | Peng, Chengbin | en
dc.contributor.author | Wong, Hau-San | en
dc.date.accessioned | 2015-12-21T08:26:26Z | en
dc.date.available | 2015-12-21T08:26:26Z | en
dc.date.issued | 2015-06-11 | en
dc.identifier.citation | A Comparison Study for DNA Motif Modeling on Protein Binding Microarray 2015:1 IEEE/ACM Transactions on Computational Biology and Bioinformatics | en
dc.identifier.issn | 1545-5963 | en
dc.identifier.doi | 10.1109/TCBB.2015.2443782 | en
dc.identifier.uri | http://hdl.handle.net/10754/584252 | en
dc.description.abstract | Transcription Factor Binding Sites (TFBSs) are relatively short (5-15 bp) and degenerate. Identifying them is a computationally challenging task. In particular, Protein Binding Microarray (PBM) is a high-throughput platform that can measure the DNA binding preference of a protein in a comprehensive and unbiased manner; for instance, a typical PBM experiment can measure binding signal intensities of a protein to all possible DNA k-mers (k=8-10). Since proteins can often bind to DNA with different binding intensities, one of the major challenges is to build motif models which can fully capture the quantitative binding affinity data. To learn DNA motif models from the non-convex objective function landscape, several optimization methods are compared and applied to the PBM motif model building problem. In particular, representative methods from different optimization paradigms have been chosen for modeling performance comparison on hundreds of PBM datasets. The results suggest that the multimodal optimization methods are very effective for capturing the binding preference information from PBM data. In particular, we observe a general performance improvement using di-nucleotide modeling over mono-nucleotide modeling. In addition, the models learned by the best-performing method are applied to two independent applications: PBM probe rotation testing and ChIP-Seq peak sequence prediction, demonstrating its biological applicability. | en
dc.language.iso | en | en
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en
dc.relation.url | http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=7122289 | en
dc.rights | (c) 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. | en
dc.subject | Crowding | en
dc.subject | Genetic Algorithm | en
dc.subject | Protein Binding Microarray | en
dc.subject | Ranking | en
dc.subject | Transcription Factor Binding Site | en
dc.title | A Comparison Study for DNA Motif Modeling on Protein Binding Microarray | en
dc.type | Article | en
dc.contributor.department | Extreme Computing Research Center | en
dc.identifier.journal | IEEE/ACM Transactions on Computational Biology and Bioinformatics | en
dc.eprint.version | Post-print | en
dc.contributor.institution | Department of Computer Science, City University of Hong Kong, Hong Kong | en
dc.contributor.institution | Computer Science and Artificial Intelligence Laboratory (CSAIL) at Massachusetts Institute of Technology, Cambridge, MA 02139, United States of America | en
dc.contributor.affiliation | King Abdullah University of Science and Technology (KAUST) | en
kaust.author | Peng, Chengbin | en