Simplified method to predict mutual interactions of human transcription factors based on their primary structure
KAUST Department: Computational Bioscience Research Center (CBRC)
Applied Mathematics and Computational Science Program
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Permanent link to this record: http://hdl.handle.net/10754/325289
Abstract
Background: Physical interactions between transcription factors (TFs) are necessary for forming regulatory protein complexes and thus play a crucial role in gene regulation. Currently, knowledge about the mechanisms of these TF interactions is incomplete and the number of known TF interactions is limited. Computational prediction of such interactions can help identify potential new TF interactions as well as contribute to a better understanding of the complex machinery involved in gene regulation.
Methodology: We propose here such a method for the prediction of TF interactions. The method uses only the primary sequence information of the interacting TFs, resulting in a much greater simplicity of the prediction algorithm. Through an advanced feature selection process, we determined a subset of 97 model features that constitute the optimized model in the subset we considered. The model, based on quadratic discriminant analysis, achieves a prediction accuracy of 85.39% on a blind set of interactions. This result is achieved even though the negative data set was restricted to TFs from the same type of proteins, i.e. TFs that function in the same cellular compartment (nucleus) and in the same type of molecular process (transcription initiation). Such selection poses significant challenges for developing models with high specificity, but at the same time better reflects real-world problems.
Conclusions: The performance of our predictor compares well to those of much more complex approaches for predicting TF and general protein-protein interactions, particularly when taking the reduced complexity of model utilisation into account. © 2011 Schmeier et al.
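As a rough illustration of the kind of pipeline the abstract describes (sequence-derived features for a TF pair, classified with a quadratic discriminant), a minimal sketch might look like the following. Everything in it is an assumption made for illustration: the amino-acid composition features, the `pair_features` encoding, and the `DiagonalQDA` class are hypothetical and are not the 97 features or the model reported in the paper. The classifier also restricts each class covariance to its diagonal, a simplification of full quadratic discriminant analysis that keeps the example free of matrix algebra.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Fraction of each standard amino acid in a protein sequence."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS) or 1  # avoid division by zero
    return [counts[a] / total for a in AMINO_ACIDS]


def pair_features(seq_a, seq_b):
    """Hypothetical, order-independent encoding of a protein pair:
    element-wise sum and absolute difference of the two compositions."""
    ca, cb = composition(seq_a), composition(seq_b)
    sums = [x + y for x, y in zip(ca, cb)]
    diffs = [abs(x - y) for x, y in zip(ca, cb)]
    return sums + diffs


class DiagonalQDA:
    """Quadratic discriminant with one diagonal covariance per class.

    Assumption: off-diagonal covariances are ignored, so this is a
    simplified variant of QDA (equivalent to Gaussian naive Bayes).
    """

    def __init__(self, var_floor=1e-6):
        self.var_floor = var_floor  # keeps variances strictly positive
        self.params = {}

    def fit(self, X, y):
        n = len(y)
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            m = len(rows)
            columns = list(zip(*rows))
            means = [sum(col) / m for col in columns]
            variances = [
                max(sum((v - mu) ** 2 for v in col) / m, self.var_floor)
                for col, mu in zip(columns, means)
            ]
            self.params[label] = (math.log(m / n), means, variances)
        return self

    def score(self, x, label):
        """Log prior plus Gaussian log-likelihood, up to a shared constant."""
        log_prior, means, variances = self.params[label]
        quad = sum(
            math.log(var) + (v - mu) ** 2 / var
            for v, mu, var in zip(x, means, variances)
        )
        return log_prior - 0.5 * quad

    def predict(self, x):
        """Return the class label with the highest discriminant score."""
        return max(self.params, key=lambda label: self.score(x, label))
```

In use, each known interacting or non-interacting TF pair would be encoded with `pair_features`, the classifier fitted on those vectors with labels such as 1 and 0, and `predict` called on the encoding of an unseen pair. A feature selection step like the one the abstract mentions would sit between encoding and fitting; it is not sketched here.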
Citation: Schmeier S, Jankovic B, Bajic VB (2011) Simplified Method to Predict Mutual Interactions of Human Transcription Factors Based on Their Primary Structure. PLoS ONE 6: e21887. doi:10.1371/journal.pone.0021887.
Publisher: Public Library of Science (PLoS)
PubMed Central ID: PMC3130058
- Simplified method for predicting a functional class of proteins in transcription factor complexes.
- Authors: Piatek MJ, Schramm MC, Burra DD, Binshbreen A, Jankovic BR, Chowdhary R, Archer JA, Bajic VB
- Issue date: 2013
- Prediction of TF target sites based on atomistic models of protein-DNA complexes.
- Authors: Angarica VE, Pérez AG, Vasconcelos AT, Collado-Vides J, Contreras-Moreira B
- Issue date: 2008 Oct 16
- Identifying cooperative transcription factors in yeast using multiple data sources.
- Authors: Lai FJ, Jhu MH, Chiu CC, Huang YM, Wu WS
- Issue date: 2014
- Genome-wide prediction and characterization of interactions between transcription factors in Saccharomyces cerevisiae.
- Authors: Yu X, Lin J, Masuda T, Esumi N, Zack DJ, Qian J
- Issue date: 2006
Showing items related by title, author, creator and subject.
TcoF-DB: dragon database for human transcription co-factors and transcription factor interacting proteins
Schaefer, Ulf; Schmeier, Sebastian; Bajic, Vladimir B. (Nucleic Acids Research, Oxford University Press (OUP), 2010-10-21) [Article]
The initiation and regulation of transcription in eukaryotes is complex and involves a large number of transcription factors (TFs), which are known to bind to the regulatory regions of eukaryotic DNA. Apart from TF-DNA binding, protein-protein interaction involving TFs is an essential component of the machinery facilitating transcriptional regulation. Proteins that interact with TFs in the context of transcription regulation but do not bind to the DNA themselves, we consider transcription co-factors (TcoFs). The influence of TcoFs on transcriptional regulation and initiation, although indirect, has been shown to be significant with the functionality of TFs strongly influenced by the presence of TcoFs. While the role of TFs and their interaction with regulatory DNA regions has been well-studied, the association between TFs and TcoFs has so far been given less attention. Here, we present a resource that is comprised of a collection of human TFs and the TcoFs with which they interact. Other proteins that have a proven interaction with a TF, but are not considered TcoFs are also included. Our database contains 157 high-confidence TcoFs and additionally 379 hypothetical TcoFs. These have been identified and classified according to the type of available evidence for their involvement in transcriptional regulation and their presence in the cell nucleus. We have divided TcoFs into four groups, one of which contains high-confidence TcoFs and three others contain TcoFs which are hypothetical to different extents. We have developed the Dragon Database for Human Transcription Co-Factors and Transcription Factor Interacting Proteins (TcoF-DB). A web-based interface for this resource can be freely accessed at http://cbrc.kaust.edu.sa/tcof/ and http://apps.sanbi.ac.za/tcof/. © The Author(s) 2010.
A Protein Complex Required for Polymerase V Transcripts and RNA-Directed DNA Methylation in Arabidopsis
Law, Julie A.; Ausín, Israel; Johnson, Lianna M.; Vashisht, Ajay A Amar; Zhu, Jian-Kang; Wohlschlegel, James A A.; Jacobsen, Steven E. (Current Biology, Elsevier BV, 2010-05) [Article]
DNA methylation is an epigenetic modification associated with gene silencing. In Arabidopsis, DNA methylation is established by DOMAINS REARRANGED METHYLTRANSFERASE 2 (DRM2), which is targeted by small interfering RNAs through a pathway termed RNA-directed DNA methylation (RdDM) [1, 2]. Recently, RdDM was shown to require intergenic noncoding (IGN) transcripts that are dependent on the Pol V polymerase. These transcripts are proposed to function as scaffolds for the recruitment of downstream RdDM proteins, including DRM2, to loci that produce both siRNAs and IGN transcripts. However, the mechanism(s) through which Pol V is targeted to specific genomic loci remains largely unknown. Through affinity purification of two known RdDM components, DEFECTIVE IN RNA-DIRECTED DNA METHYLATION 1 (DRD1) and DEFECTIVE IN MERISTEM SILENCING 3 (DMS3) [5, 6], we found that they copurify with each other and with a novel protein, RNA-DIRECTED DNA METHYLATION 1 (RDM1), forming a complex we term DDR. We also found that DRD1 copurified with Pol V subunits and that RDM1, like DRD1 and DMS3, is required for the production of Pol V-dependent transcripts. These results suggest that the DDR complex acts in RdDM at a step upstream of the recruitment or activation of Pol V. © 2010 Elsevier Ltd. All rights reserved.
Evolving Transcription Factor Binding Site Models From Protein Binding Microarray Data
Wong, Ka-Chun; Peng, Chengbin; Li, Yue (IEEE Transactions on Cybernetics, Institute of Electrical and Electronics Engineers (IEEE), 2016-02-02) [Article]
Protein binding microarray (PBM) is a high-throughput platform that can measure the DNA binding preference of a protein in a comprehensive and unbiased manner. In this paper, we describe the PBM motif model building problem. We apply several evolutionary computation methods and compare their performance with the interior point method, demonstrating their performance advantages. In addition, given the PBM domain knowledge, we propose and describe a novel method called kmerGA which makes domain-specific assumptions to exploit PBM data properties to build more accurate models than the other models built. The effectiveness and robustness of kmerGA is supported by comprehensive performance benchmarking on more than 200 datasets, time complexity analysis, convergence analysis, parameter analysis, and case studies. To demonstrate its utility further, kmerGA is applied to two real world applications: 1) PBM rotation testing and 2) ChIP-Seq peak sequence prediction. The results support the biological relevance of the models learned by kmerGA, and thus its real world applicability.