Structure-aided prediction of mammalian transcription factor complexes in conserved non-coding elements
Abstract: Mapping the DNA-binding preferences of transcription factor (TF) complexes is critical for deciphering the functions of cis-regulatory elements. Here, we developed a computational method that compares co-occurring motif spacings in conserved versus unconserved regions of the human genome to detect evolutionarily constrained binding sites of rigid TF complexes. Structural data were used to estimate TF complex physical plausibility, explore overlapping motif arrangements seldom tackled by non-structure-aware methods, and generate and analyse three-dimensional models of the predicted complexes bound to DNA. Using this approach, we predicted 422 physically realistic TF complex motifs at 18% false discovery rate, the majority of which (326, 77%) contain some sequence overlap between binding sites. The set of mostly novel complexes is enriched in known composite motifs, predictive of binding site configurations in TF-TF-DNA crystal structures, and supported by ChIP-seq datasets. Structural modelling revealed three cooperativity mechanisms: direct protein-protein interactions, potentially indirect interactions and 'through-DNA' interactions. Indeed, 38% of the predicted complexes were found to contain four or more bases in which TF pairs appear to synergize through overlapping binding to the same DNA base pairs in opposite grooves or strands. Our TF complex and associated binding site predictions are available as a web resource at http://bejerano.stanford.edu/complex.
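The core idea of comparing motif spacings between conserved and unconserved regions can be illustrated with a small sketch. This is not the authors' implementation: the function names (`spacing_counts`, `hypergeom_sf`, `enriched_spacings`), the input format (pairs of motif start positions), the one-sided hypergeometric test and the `alpha` threshold are all assumptions chosen for illustration, and no multiple-testing correction or structural filtering is attempted.

```python
from collections import Counter
from math import comb


def spacing_counts(pairs):
    """Count how often each spacing (pos_b - pos_a) occurs.

    `pairs` is an iterable of (pos_a, pos_b) start positions for
    co-occurring matches of two motifs (hypothetical input format).
    """
    return Counter(b - a for a, b in pairs)


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n items from N, of which K are 'successes'."""
    upper = min(K, n)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def enriched_spacings(conserved_pairs, unconserved_pairs, alpha=0.01):
    """Return {spacing: p_value} for spacings over-represented in conserved regions.

    Assumption: a spacing is called enriched if a one-sided hypergeometric
    test against the pooled (conserved + unconserved) counts gives p < alpha.
    """
    cons = spacing_counts(conserved_pairs)
    uncons = spacing_counts(unconserved_pairs)
    n_cons = sum(cons.values())
    n_total = n_cons + sum(uncons.values())

    results = {}
    for spacing, k in cons.items():
        K = k + uncons.get(spacing, 0)  # all occurrences of this spacing
        p = hypergeom_sf(k, n_total, K, n_cons)
        if p < alpha:
            results[spacing] = p
    return results


if __name__ == "__main__":
    # Toy data: spacing 3 dominates the conserved set but is rare elsewhere.
    conserved = [(0, 3)] * 10 + [(0, 8)] * 2
    unconserved = [(0, s) for s in range(1, 21)]
    print(enriched_spacings(conserved, unconserved))
```

In this toy input only spacing 3 passes the threshold, which mirrors the intuition that a rigid complex should favour one fixed arrangement of its two binding sites under evolutionary constraint.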
Citation: Guturu H, Doxey AC, Wenger AM, Bejerano G (2013) Structure-aided prediction of mammalian transcription factor complexes in conserved non-coding elements. Philosophical Transactions of the Royal Society B: Biological Sciences 368: 20130029–20130029. Available: http://dx.doi.org/10.1098/rstb.2013.0029.
Sponsors: H.G. is supported by National Science Foundation Fellowship DGE-1147470. A.C.D. is supported by a Natural Sciences and Engineering Research Council of Canada Postdoctoral Fellowship (PDF). A.M.W. is supported by a Bio-X Stanford Interdisciplinary Graduate Fellowship. G.B. is supported by NIH grants R01HG005058 and R01HD059862 and KAUST. G.B. is a Packard Fellow and Microsoft Research Fellow. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Publisher: The Royal Society
PubMed Central ID: PMC3826502
Collections: Publications Acknowledging KAUST Support
- A biophysical model for analysis of transcription factor interaction and binding site arrangement from genome-wide binding data.
- Authors: He X, Chen CC, Hong F, Fang F, Sinha S, Ng HH, Zhong S
- Issue date: 2009 Dec 1
- Non-targeted transcription factors motifs are a systemic component of ChIP-seq datasets.
- Authors: Worsley Hunt R, Wasserman WW
- Issue date: 2014 Jul 29
- De novo prediction of cis-regulatory elements and modules through integrative analysis of a large number of ChIP datasets.
- Authors: Niu M, Tabari ES, Su Z
- Issue date: 2014 Dec 2
- PhyloGibbs: a Gibbs sampling motif finder that incorporates phylogeny.
- Authors: Siddharthan R, Siggia ED, van Nimwegen E
- Issue date: 2005 Dec
- An equilibrium partitioning model connecting gene expression and cis-motif content.
- Authors: Mellor J, DeLisi C
- Issue date: 2006 Jul 15