Protein domain recurrence and order can enhance prediction of protein functions
KAUST Department: Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Computational Bioscience Research Center (CBRC)
Abstract:
Motivation: Burgeoning sequencing technologies have generated massive amounts of genomic and proteomic data. Annotating the functions of proteins identified in this data has become a big and crucial problem. Various computational methods have been developed to infer the protein functions based on either the sequences or domains of proteins. The existing methods, however, ignore the recurrence and the order of the protein domains in this function inference.
Results: We developed two new methods to infer protein functions based on protein domain recurrence and domain order. Our first method, DRDO, calculates the posterior probability of the Gene Ontology terms based on domain recurrence and domain order information, whereas our second method, DRDO-NB, relies on the naïve Bayes methodology using the same domain architecture information. Our large-scale benchmark comparisons show strong improvements in the accuracy of the protein function inference achieved by our new methods, demonstrating that domain recurrence and order can provide important information for inference of protein functions.
© The Author(s) 2012. Published by Oxford University Press.
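To make the idea in the abstract concrete, the following is a minimal sketch of how domain recurrence and domain order could be turned into features for a naïve Bayes scorer over function terms. It is not the authors' DRDO or DRDO-NB implementation: the feature encoding (per-domain counts plus adjacent ordered pairs), the Laplace smoothing, the class and function names, and the toy domain and term labels are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def architecture_features(domains):
    """Encode an ordered list of domain names as a set of features.

    Hypothetical encoding (an assumption, not taken from the paper):
      ("count", d, n): domain d occurs n times (recurrence)
      ("order", a, b): domain a is immediately followed by b (order)
    """
    feats = {("count", d, n) for d, n in Counter(domains).items()}
    feats |= {("order", a, b) for a, b in zip(domains, domains[1:])}
    return feats


class NaiveBayesTermScorer:
    """Scores each function term independently with a smoothed naive Bayes model."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # Laplace smoothing constant
        self.term_counts = Counter()            # proteins annotated with each term
        self.feat_counts = defaultdict(Counter) # per-term feature counts
        self.vocab = set()                      # features seen in training
        self.n_proteins = 0

    def fit(self, proteins):
        """proteins: iterable of (ordered domain list, set of term labels)."""
        for domains, terms in proteins:
            feats = architecture_features(domains)
            self.vocab |= feats
            self.n_proteins += 1
            for term in terms:
                self.term_counts[term] += 1
                for feat in feats:
                    self.feat_counts[term][feat] += 1
        return self

    def score(self, domains):
        """Return unnormalized log-scores: log P(term) + sum of log P(feature | term).

        Simplification: only features present in the query (and seen in
        training) contribute; absent features are ignored.
        """
        feats = architecture_features(domains) & self.vocab
        scores = {}
        for term, c in self.term_counts.items():
            logp = math.log(c / self.n_proteins)
            for feat in feats:
                p = (self.feat_counts[term][feat] + self.alpha) / (c + 2 * self.alpha)
                logp += math.log(p)
            scores[term] = logp
        return scores


# Toy data with placeholder domain names ("A", "B") and term labels.
training = [
    (["A", "B"], {"term_X"}),
    (["A", "A", "B"], {"term_X"}),
    (["B", "A"], {"term_Y"}),
    (["B", "A", "A"], {"term_Y"}),
]
model = NaiveBayesTermScorer().fit(training)
ranking = sorted(model.score(["A", "B"]).items(), key=lambda kv: -kv[1])
print(ranking)
```

In this toy data both terms are associated with the same domains in the same numbers, so a model that looked only at domain presence could not separate them; the ordered-pair features are what let the query `["A", "B"]` rank `term_X` above `term_Y`.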
Citation: Messih MA, Chitale M, Bajic VB, Kihara D, Gao X (2012) Protein domain recurrence and order can enhance prediction of protein functions. Bioinformatics 28: i444-i450. doi:10.1093/bioinformatics/bts398.
Publisher: Oxford University Press (OUP)
PubMed Central ID: PMC3436825
Except where otherwise noted, this item's license is described as This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
- CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction.
- Authors: Cui X, Lu Z, Wang S, Jing-Yan Wang J, Gao X
- Issue date: 2016 Jun 15
- SECOM: a novel hash seed and community detection based-approach for genome-scale protein domain identification.
- Authors: Fan M, Wong KC, Ryu T, Ravasi T, Gao X
- Issue date: 2012
- FuncPatch: a web server for the fast Bayesian inference of conserved functional patches in protein 3D structures.
- Authors: Huang YF, Golding GB
- Issue date: 2015 Feb 15
- DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier.
- Authors: Kulmanov M, Khan MA, Hoehndorf R, Wren J
- Issue date: 2018 Feb 15
- A naive Bayes model to predict coupling between seven transmembrane domain receptors and G-proteins.
- Authors: Cao J, Panetta R, Yue S, Steyaert A, Young-Bellido M, Ahmad S
- Issue date: 2003 Jan 22
Showing items related by title, author, creator and subject.
Interaction between the triglyceride lipase ATGL and the arf1 activator GBF1
Ellong, Emy Njoh; Soni, Krishnakant G.; Bui, Quynh-Trang; Sougrat, Rachid; Golinelli-Cohen, Marie-Pierre; Jackson, Catherine L. (Public Library of Science (PLoS), 2011-07-18)
The Arf1 exchange factor GBF1 (Golgi Brefeldin A resistance factor 1) and its effector COPI are required for delivery of ATGL (adipose triglyceride lipase) to lipid droplets (LDs). Using yeast two hybrid, co-immunoprecipitation in mammalian cells and direct protein binding approaches, we report here that GBF1 and ATGL interact directly and in cells, through multiple contact sites on each protein. The C-terminal region of ATGL interacts with N-terminal domains of GBF1, including the catalytic Sec7 domain, but not with full-length GBF1 or its entire N-terminus. The N-terminal lipase domain of ATGL (called the patatin domain) interacts with two C-terminal domains of GBF1, HDS (Homology downstream of Sec7) 1 and HDS2. These two domains of GBF1 localize to lipid droplets when expressed alone in cells, but not to the Golgi, unlike the full-length GBF1 protein, which localizes to both. We suggest that interaction of GBF1 with ATGL may be involved in the membrane trafficking pathway mediated by GBF1, Arf1 and COPI that contributes to the localization of ATGL to lipid droplets.
Dissecting the interactions of SERRATE with RNA and DICER-LIKE 1 in Arabidopsis microRNA precursor processing
Iwata, Yuji; Takahashi, Masateru; Fedoroff, Nina V.; Hamdan, Samir (Oxford University Press (OUP), 2013-08-05)
Efficient and precise microRNA (miRNA) biogenesis in Arabidopsis is mediated by the RNaseIII-family enzyme DICER-LIKE 1 (DCL1), double-stranded RNA-binding protein HYPONASTIC LEAVES 1 and the zinc-finger (ZnF) domain-containing protein SERRATE (SE). In the present study, we examined primary miRNA precursor (pri-miRNA) processing by highly purified recombinant DCL1 and SE proteins and found that SE is integral to pri-miRNA processing by DCL1. SE stimulates DCL1 cleavage of the pri-miRNA in an ionic strength-dependent manner. SE uses its N-terminal domain to bind to RNA and requires both N-terminal and ZnF domains to bind to DCL1. However, when DCL1 is bound to RNA, the interaction with the ZnF domain of SE becomes indispensable and stimulates the activity of DCL1 without requiring SE binding to RNA. Our results suggest that the interactions among SE, DCL1 and RNA are a potential point for regulating pri-miRNA processing. © The Author(s) 2013.
Solution Structure of the Tandem Acyl Carrier Protein Domains from a Polyunsaturated Fatty Acid Synthase Reveals Beads-on-a-String Configuration
Trujillo, Uldaeliz; Vázquez-Rosa, Edwin; Oyola-Robles, Delise; Stagg, Loren J.; Vassallo, David A.; Vega, Irving E.; Arold, Stefan T.; Baerga-Ortiz, Abel (Public Library of Science (PLoS), 2013-02-28)
The polyunsaturated fatty acid (PUFA) synthases from deep-sea bacteria invariably contain multiple acyl carrier protein (ACP) domains in tandem. This conserved tandem arrangement has been implicated in both amplification of fatty acid production (additive effect) and in structural stabilization of the multidomain protein (synergistic effect). While the more accepted model is one in which domains act independently, recent reports suggest that ACP domains may form higher oligomers. Elucidating the three-dimensional structure of tandem arrangements may therefore give important insights into the functional relevance of these structures, and hence guide bioengineering strategies. In an effort to elucidate the three-dimensional structure of tandem repeats from deep-sea anaerobic bacteria, we have expressed and purified a fragment consisting of five tandem ACP domains from the PUFA synthase from Photobacterium profundum. Analysis of the tandem ACP fragment by analytical gel filtration chromatography showed a retention time suggestive of a multimeric protein. However, small angle X-ray scattering (SAXS) revealed that the multi-ACP fragment is an elongated monomer which does not form a globular unit. Stokes radii calculated from atomic monomeric SAXS models were comparable to those measured by analytical gel filtration chromatography, showing that in the gel filtration experiment, the molecular weight was overestimated due to the elongated protein shape. Thermal denaturation monitored by circular dichroism showed that unfolding of the tandem construct was not cooperative, and that the tandem arrangement did not stabilize the protein. Taken together, these data are consistent with an elongated beads-on-a-string arrangement of the tandem ACP domains in PUFA synthases, and speak against synergistic biocatalytic effects promoted by quaternary structuring. Thus, it is possible to envision bioengineering strategies which simply involve the artificial linking of multiple ACP domains for increasing the yield of fatty acids in bacterial cultures. © 2013 Trujillo et al.