Generalizing and learning protein-DNA binding sequence representations by an evolutionary algorithm

Handle URI:
http://hdl.handle.net/10754/561713
Title:
Generalizing and learning protein-DNA binding sequence representations by an evolutionary algorithm
Authors:
Wong, Ka Chun; Peng, Chengbin (0000-0002-7445-2638); Wong, Manhon; Leung, Kwongsak
Abstract:
Protein-DNA bindings are essential activities. Understanding them forms the basis for further deciphering of biological and genetic systems. In particular, the protein-DNA bindings between transcription factors (TFs) and transcription factor binding sites (TFBSs) play a central role in gene transcription. Comprehensive TF-TFBS binding sequence pairs have been found in a recent study. However, they are in one-to-one mappings which cannot fully reflect the many-to-many mappings within the bindings. An evolutionary algorithm is proposed to learn generalized representations (many-to-many mappings) from the TF-TFBS binding sequence pairs (one-to-one mappings). The generalized pairs are shown to be more meaningful than the original TF-TFBS binding sequence pairs. Some representative examples have been analyzed in this study. In particular, it shows that the TF-TFBS binding sequence pairs are not presumably in one-to-one mappings. They can also exhibit many-to-many mappings. The proposed method can help us extract such many-to-many information from the one-to-one TF-TFBS binding sequence pairs found in the previous study, providing further knowledge in understanding the bindings between TFs and TFBSs. © 2011 Springer-Verlag.
KAUST Department:
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division; Computer Science Program
Publisher:
Springer Verlag
Journal:
Soft Computing
Issue Date:
5-Feb-2011
DOI:
10.1007/s00500-011-0692-5
Type:
Article
ISSN:
14327643
Sponsors:
The authors are grateful to the anonymous reviewers for their valuable comments. They would like to thank Tak-Ming Chan for his help on surveying the related works. This research is partially supported by the grants from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project Nos. 414107 and 414708).
Appears in Collections:
Articles; Computer Science Program; Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division

Full metadata record

DC Field | Value | Language
dc.contributor.author | Wong, Ka Chun | en
dc.contributor.author | Peng, Chengbin | en
dc.contributor.author | Wong, Manhon | en
dc.contributor.author | Leung, Kwongsak | en
dc.date.accessioned | 2015-08-03T09:02:55Z | en
dc.date.available | 2015-08-03T09:02:55Z | en
dc.date.issued | 2011-02-05 | en
dc.identifier.issn | 14327643 | en
dc.identifier.doi | 10.1007/s00500-011-0692-5 | en
dc.identifier.uri | http://hdl.handle.net/10754/561713 | en
dc.description.abstract | Protein-DNA bindings are essential activities. Understanding them forms the basis for further deciphering of biological and genetic systems. In particular, the protein-DNA bindings between transcription factors (TFs) and transcription factor binding sites (TFBSs) play a central role in gene transcription. Comprehensive TF-TFBS binding sequence pairs have been found in a recent study. However, they are in one-to-one mappings which cannot fully reflect the many-to-many mappings within the bindings. An evolutionary algorithm is proposed to learn generalized representations (many-to-many mappings) from the TF-TFBS binding sequence pairs (one-to-one mappings). The generalized pairs are shown to be more meaningful than the original TF-TFBS binding sequence pairs. Some representative examples have been analyzed in this study. In particular, it shows that the TF-TFBS binding sequence pairs are not presumably in one-to-one mappings. They can also exhibit many-to-many mappings. The proposed method can help us extract such many-to-many information from the one-to-one TF-TFBS binding sequence pairs found in the previous study, providing further knowledge in understanding the bindings between TFs and TFBSs. © 2011 Springer-Verlag. | en
dc.description.sponsorship | The authors are grateful to the anonymous reviewers for their valuable comments. They would like to thank Tak-Ming Chan for his help on surveying the related works. This research is partially supported by the grants from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project Nos. 414107 and 414708). | en
dc.publisher | Springer Verlag | en
dc.subject | Bioinformatics | en
dc.subject | Crowding | en
dc.subject | DNA | en
dc.subject | Gene transcription | en
dc.subject | PDB | en
dc.subject | Protein | en
dc.subject | Sequence | en
dc.subject | TRANSFAC | en
dc.title | Generalizing and learning protein-DNA binding sequence representations by an evolutionary algorithm | en
dc.type | Article | en
dc.contributor.department | Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division | en
dc.contributor.department | Computer Science Program | en
dc.identifier.journal | Soft Computing | en
dc.contributor.institution | Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong | en
kaust.author | Peng, Chengbin | en
kaust.author | Wong, Ka Chun | en