Structural analysis and dimerization profile of the SCAN domain of the pluripotency factor Zfp206
Huimei Hong, Felicia
Stanton, Lawrence W.
Kolatkar, Prasanna R.
KAUST Department: Bioscience Core Lab
Permanent link to this record: http://hdl.handle.net/10754/334484
Abstract: Zfp206 (also known as Zscan10) belongs to the subfamily of C2H2 zinc finger transcription factors that is characterized by an N-terminal SCAN domain. The SCAN domain mediates self-association and association between members of the SCAN family of transcription factors, but the structural basis and selectivity determinants for complex formation are unknown. Zfp206 is important for maintaining the pluripotency of embryonic stem cells, presumably by combinatorial assembly of itself or other SCAN family members on enhancer regions. To gain insights into the folding topology and selectivity determinants for SCAN dimerization, we solved the 1.85 Å crystal structure of the SCAN domain of Zfp206. In vitro binding studies using a panel of 20 SCAN proteins indicate that the SCAN domain of Zfp206 can selectively associate with other members of the SCAN family of transcription factors. Deletion mutations showed that the N-terminal helix 1 is critical for heterodimerization. Double and multiple mutations based on the Zfp206SCAN-Zfp110SCAN model suggested that a domain-swapped topology is possibly preferred for the Zfp206SCAN-Zfp110SCAN heterodimer. Together, we demonstrate that Zfp206SCAN constitutes a protein module that enables C2H2 transcription factor dimerization in a highly selective manner using a domain-swapped interface architecture, and we identify novel partners for Zfp206 during embryonal development. © 2012 The Author(s).
Citation: Liang Y, Huimei Hong F, Ganesan P, Jiang S, Jauch R, et al. (2012) Structural analysis and dimerization profile of the SCAN domain of the pluripotency factor Zfp206. Nucleic Acids Research 40: 8721-8732. doi:10.1093/nar/gks611.
Publisher: Oxford University Press (OUP)
Journal: Nucleic Acids Research
PubMed Central ID: PMC3458555
Except where otherwise noted, this item's license is described as This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
- Crystal optimization and preliminary diffraction data analysis of the SCAN domain of Zfp206.
- Authors: Liang Y, Choo SH, Rossbach M, Baburajendran N, Palasingam P, Kolatkar PR
- Issue date: 2012 Apr 1
- Zfp206 is a transcription factor that controls pluripotency of embryonic stem cells.
- Authors: Wang ZX, Kueh JL, Teh CH, Rossbach M, Lim L, Li P, Wong KY, Lufkin T, Robson P, Stanton LW
- Issue date: 2007 Sep
- Zfp206, Oct4, and Sox2 are integrated components of a transcriptional regulatory network in embryonic stem cells.
- Authors: Yu HB, Kunarso G, Hong FH, Stanton LW
- Issue date: 2009 Nov 6
- The structures of transcription factor CGL2947 from Corynebacterium glutamicum in two crystal forms: a novel homodimer assembling and the implication for effector-binding mode.
- Authors: Gao YG, Yao M, Itou H, Zhou Y, Tanaka I
- Issue date: 2007 Sep
- The dimerization domain of HNF-1alpha: structure and plasticity of an intertwined four-helix bundle with application to diabetes mellitus.
- Authors: Narayana N, Hua Q, Weiss MA
- Issue date: 2001 Jul 13
Showing items related by title, author, creator and subject.
CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction
Cui, Xuefeng; Lu, Zhiwu; Wang, Sheng; Wang, Jim Jing-Yan; Gao, Xin (Bioinformatics, Oxford University Press (OUP), 2016-06-15) [Article]
Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment, threading and alignment-free methods, protein homology detection remains a challenging open problem. Recently, network methods that try to find transitive paths in the protein structure space demonstrate the importance of incorporating network information of the structure space. Yet, current methods merge the sequence space and the structure space into a single space, and thus introduce inconsistency in combining different sources of information. Method: We present a novel network-based protein homology detection method, CMsearch, based on cross-modal learning. Instead of exploring a single network built from the mixture of sequence and structure space information, CMsearch builds two separate networks to represent the sequence space and the structure space. It then learns sequence–structure correlation by simultaneously taking sequence information, structure information, sequence space information and structure space information into consideration. Results: We tested CMsearch on two challenging tasks, protein homology detection and protein structure prediction, by querying all 8332 PDB40 proteins. Our results demonstrate that CMsearch is insensitive to the similarity metrics used to define the sequence and the structure spaces. By using HMM–HMM alignment as the sequence similarity metric, CMsearch clearly outperforms state-of-the-art homology detection methods and the CASP-winning template-based protein structure prediction methods.
Protein secondary structure appears to be robust under in silico evolution while protein disorder appears not to be.
Schaefer, Christian; Schlessinger, Avner; Rost, Burkhard (Bioinformatics, Oxford University Press (OUP), 2010-01-16) [Article]
MOTIVATION: The mutation of amino acids often impacts protein function and structure. Mutations without negative effect sustain evolutionary pressure. We study a particular aspect of structural robustness with respect to mutations: regular protein secondary structure and natively unstructured (intrinsically disordered) regions. Is the formation of regular secondary structure an intrinsic feature of amino acid sequences, or is it a feature that is lost upon mutation and is maintained by evolution against the odds? Similarly, is disorder an intrinsic sequence feature or is it difficult to maintain? To tackle these questions, we in silico mutated native protein sequences into random sequence-like ensembles and monitored the change in predicted secondary structure and disorder. RESULTS: We established that by our coarse-grained measures for change, predictions and observations were similar, suggesting that our results were not biased by prediction mistakes. Changes in secondary structure and disorder predictions were linearly proportional to the change in sequence. Surprisingly, neither the content nor the length distribution for the predicted secondary structure changed substantially. Regions with long disorder behaved differently in that significantly fewer such regions were predicted after a few mutation steps. Our findings suggest that the formation of regular secondary structure is an intrinsic feature of random amino acid sequences, while the formation of long-disordered regions is not an intrinsic feature of proteins with disordered regions. Put differently, helices and strands appear to be maintained easily by evolution, whereas maintaining disordered regions appears difficult. Neutral mutations with respect to disorder are therefore very unlikely.
Applications in computational structural biology: the generation of a protein modelling pipeline and the structural analysis of patient-derived mutations
Guzmán-Vega, Francisco J. (2019-04) [Thesis]
Advisor: Arold, Stefan T.
Committee members: Gao, Xin; Jaremko, Łukasz
Besides helping us advance the understanding of the physicochemical principles governing the three-dimensional folding of proteins and their mechanisms of action, the ability to build, evaluate, and optimize reliable 3D protein models has provided valuable tools for the development of different applications in the fields of biotechnology, medicine, and synthetic biology. The development of automated algorithms has made many of the current methodologies for protein modelling and visualization available to researchers from all backgrounds, without the need to be familiarized with the inner workings of their statistical and biophysical principles. However, there is still a lack in some areas where the learning curves are too steep for the methods to be widely used by the average non-programmer molecular biologist, or the implementation of the methods lacks key features to improve the interpretability and impact of their results. Throughout this work, I will focus on two different applications in the field of structural biology where computational methods provide useful tools to aid in synthetic biology or medical research. The first application is the implementation of a pipeline to build models of protein complexes by joining structured domains with disordered linkers, in individual or multiple chains, and with the possibility of building symmetric structures. Its capabilities and performance for the generation of complex constructs are evaluated, and possible areas of improvement described. The second application, but not less important, involves the structural analysis of patient-derived protein mutants using protein modelling techniques and visualization tools, to elucidate the potential molecular basis for the patient’s phenotype. The methodology for these analyses is described, along with the results and observations from 22 such cases in 13 different proteins. Finally, the need for a dedicated pipeline for the structure-based prediction of the effect of different types of mutations on the stability and function of proteins, complementary to available sequence-based approaches, is highlighted.