HOCOMOCO: towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-Seq analysis

Handle URI:
http://hdl.handle.net/10754/626157
Title:
HOCOMOCO: towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-Seq analysis
Authors:
Kulakovskiy, Ivan V.; Vorontsov, Ilya E.; Yevshin, Ivan S.; Sharipov, Ruslan N. (0000-0003-2182-5493); Fedorova, Alla D.; Rumynskiy, Eugene I.; Medvedeva, Yulia A.; Magana-Mora, Arturo (0000-0001-8696-7068); Bajic, Vladimir B. (0000-0001-5435-4750); Papatsenko, Dmitry A.; Kolpakov, Fedor A.; Makeev, Vsevolod J.
Abstract:
We present a major update of the HOCOMOCO collection that consists of patterns describing DNA binding specificities for human and mouse transcription factors. In this release, we profited from a nearly doubled volume of published in vivo experiments on transcription factor (TF) binding to expand the repertoire of binding models, replace low-quality models previously based on in vitro data only and cover more than a hundred TFs with previously unknown binding specificities. This was achieved by systematic motif discovery from more than five thousand ChIP-Seq experiments uniformly processed within the BioUML framework with several ChIP-Seq peak calling tools and aggregated in the GTRD database. HOCOMOCO v11 contains binding models for 453 mouse and 680 human transcription factors and includes 1302 mononucleotide and 576 dinucleotide position weight matrices, which describe primary binding preferences of each transcription factor and reliable alternative binding specificities. An interactive interface and bulk downloads are available on the web: http://hocomoco.autosome.ru and http://www.cbrc.kaust.edu.sa/hocomoco11. In this release, we complement HOCOMOCO by MoLoTool (Motif Location Toolbox, http://molotool.autosome.ru) that applies HOCOMOCO models for visualization of binding sites in short DNA sequences.
KAUST Department:
Computational Bioscience Research Center (CBRC)
Citation:
Kulakovskiy IV, Vorontsov IE, Yevshin IS, Sharipov RN, Fedorova AD, et al. (2017) HOCOMOCO: towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-Seq analysis. Nucleic Acids Research. Available: http://dx.doi.org/10.1093/nar/gkx1106.
Publisher:
Oxford University Press (OUP)
Journal:
Nucleic Acids Research
KAUST Grant Number:
BAS/1/1606-01-01
Issue Date:
31-Oct-2017
DOI:
10.1093/nar/gkx1106
Type:
Article
ISSN:
0305-1048; 1362-4962
Sponsors:
The project was primarily supported by Russian Science Foundation [17-74-10188 to I.V.K.]; A.M.M. and V.B.B. were supported by King Abdullah University of Science and Technology (KAUST) [baseline fund BAS/1/1606-01-01 of V.B.B.]; I.E.V. was personally supported by the Skoltech Systems Biology Fellowship. Funding for open access charge: Russian Science Foundation [17-74-10188 to I.V.K.].
Additional Links:
https://academic.oup.com/nar/article/doi/10.1093/nar/gkx1106/4616875
Appears in Collections:
Articles; Computational Bioscience Research Center (CBRC)

Full metadata record

DC Field | Value | Language
dc.contributor.author | Kulakovskiy, Ivan V. | en
dc.contributor.author | Vorontsov, Ilya E. | en
dc.contributor.author | Yevshin, Ivan S. | en
dc.contributor.author | Sharipov, Ruslan N. | en
dc.contributor.author | Fedorova, Alla D. | en
dc.contributor.author | Rumynskiy, Eugene I. | en
dc.contributor.author | Medvedeva, Yulia A. | en
dc.contributor.author | Magana-Mora, Arturo | en
dc.contributor.author | Bajic, Vladimir B. | en
dc.contributor.author | Papatsenko, Dmitry A. | en
dc.contributor.author | Kolpakov, Fedor A. | en
dc.contributor.author | Makeev, Vsevolod J. | en
dc.date.accessioned | 2017-11-14T12:46:06Z | -
dc.date.available | 2017-11-14T12:46:06Z | -
dc.date.issued | 2017-10-31 | en
dc.identifier.citation | Kulakovskiy IV, Vorontsov IE, Yevshin IS, Sharipov RN, Fedorova AD, et al. (2017) HOCOMOCO: towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-Seq analysis. Nucleic Acids Research. Available: http://dx.doi.org/10.1093/nar/gkx1106. | en
dc.identifier.issn | 0305-1048 | en
dc.identifier.issn | 1362-4962 | en
dc.identifier.doi | 10.1093/nar/gkx1106 | en
dc.identifier.uri | http://hdl.handle.net/10754/626157 | -
dc.description.abstract | We present a major update of the HOCOMOCO collection that consists of patterns describing DNA binding specificities for human and mouse transcription factors. In this release, we profited from a nearly doubled volume of published in vivo experiments on transcription factor (TF) binding to expand the repertoire of binding models, replace low-quality models previously based on in vitro data only and cover more than a hundred TFs with previously unknown binding specificities. This was achieved by systematic motif discovery from more than five thousand ChIP-Seq experiments uniformly processed within the BioUML framework with several ChIP-Seq peak calling tools and aggregated in the GTRD database. HOCOMOCO v11 contains binding models for 453 mouse and 680 human transcription factors and includes 1302 mononucleotide and 576 dinucleotide position weight matrices, which describe primary binding preferences of each transcription factor and reliable alternative binding specificities. An interactive interface and bulk downloads are available on the web: http://hocomoco.autosome.ru and http://www.cbrc.kaust.edu.sa/hocomoco11. In this release, we complement HOCOMOCO by MoLoTool (Motif Location Toolbox, http://molotool.autosome.ru) that applies HOCOMOCO models for visualization of binding sites in short DNA sequences. | en
dc.description.sponsorship | The project was primarily supported by Russian Science Foundation [17-74-10188 to I.V.K.]; A.M.M. and V.B.B. were supported by King Abdullah University of Science and Technology (KAUST) [baseline fund BAS/1/1606-01-01 of V.B.B.]; I.E.V. was personally supported by the Skoltech Systems Biology Fellowship. Funding for open access charge: Russian Science Foundation [17-74-10188 to I.V.K.]. | en
dc.publisher | Oxford University Press (OUP) | en
dc.relation.url | https://academic.oup.com/nar/article/doi/10.1093/nar/gkx1106/4616875 | en
dc.rights | This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com | en
dc.rights.uri | http://creativecommons.org/licenses/by-nc/4.0/ | en
dc.title | HOCOMOCO: towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-Seq analysis | en
dc.type | Article | en
dc.contributor.department | Computational Bioscience Research Center (CBRC) | en
dc.identifier.journal | Nucleic Acids Research | en
dc.eprint.version | Publisher's Version/PDF | en
dc.contributor.institution | Center for Data-Intensive Biomedicine and Biotechnology, Skolkovo Institute of Science and Technology, 143026 Moscow, Russia | en
dc.contributor.institution | Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia | en
dc.contributor.institution | Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991, GSP-1, Vavilova 32, Moscow, Russia | en
dc.contributor.institution | BIOSOFT.RU Ltd, 630058, Russkaya 41/1, Novosibirsk, Russia | en
dc.contributor.institution | Novosibirsk State University, 630090, Pirogova 2, Novosibirsk, Russia | en
dc.contributor.institution | Institute of Computational Technologies, Siberian Branch of the Russian Academy of Sciences, 630090, Akad. Rzhanova 6, Novosibirsk, Russia | en
dc.contributor.institution | Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 119234, Leninskiye Gory 1–73, Moscow, Russia | en
dc.contributor.institution | Moscow Institute of Physics and Technology (State University), 141700, 9 Institutskiy per, Dolgoprudny, Russia | en
dc.contributor.institution | Institute of Bioengineering, Research Center of Biotechnology of the Russian Academy of Sciences, 119071, 2 Leninsky Ave. 33, Moscow, Russia | en
dc.contributor.institution | National Institute of Advanced Industrial Science and Technology (AIST), Com. Bio Big-Data Open Innovation Lab. (CBBD-OIL), AIST Tokyo Waterfront Main Bldg. #323, 2-3-26 Aomi, Tokyo 135-0064, Japan | en
kaust.author | Magana-Mora, Arturo | en
kaust.author | Bajic, Vladimir B. | en
kaust.grant.number | BAS/1/1606-01-01 | en