Blind prediction of homo- and hetero- protein complexes: The CASP13-CAPRI experiment.
Authors
Lensink, Marc F.
Chaleil, Raphaël A G
Bates, Paul A
Rodríguez-Lumbreras, Luis Angel
Raghavendra Maddhuri Venkata Subraman, Sai
Moal, Iain H
Ritchie, David W
Chauvot de Beauchêne, Isaure
Echartea, Maria Elisa Ruiz
Barradas Bautista, Didier
Kundrotas, Petras J
Badal, Varsha D
Vakser, Ilya A
Guest, Johnathan D
Pierce, Brian G
Ryan Merideth, Benjamin
Koukos, Panos I
Trellet, Mikael E
Melquiond, Adrien S J
van Noort, Charlotte W
Honorato, Rodrigo V
Bonvin, Alexandre M.J.J.
Wodak, Shoshana J
KAUST Department
KAUST Catalysis Center (KCC)
Chemical Science Program
Physical Sciences and Engineering (PSE) Division
Embargo End Date
2020-10-16
Permanent link to this record
http://hdl.handle.net/10754/658652
Abstract
We present the results for CAPRI Round 46, the 3rd joint CASP-CAPRI protein assembly prediction challenge. The Round comprised a total of 20 targets including 14 homo-oligomers and 6 hetero-complexes. Eight of the homo-oligomer targets and one hetero-dimer comprised proteins that could be readily modeled using templates from the Protein Data Bank, often available for the full assembly. The remaining 11 targets comprised 5 homo-dimers, 3 hetero-dimers and two higher-order assemblies. These were more difficult to model, as their prediction mainly involved 'ab-initio' docking of subunit models derived from distantly related templates. A total of ~30 CAPRI groups, including 9 automatic servers, submitted on average ~2000 models per target. About 17 groups participated in the CAPRI scoring rounds, offered for most targets, submitting ~170 models per target. The prediction performance, measured by the fraction of models of acceptable quality or higher submitted across all predictor groups, was very good to excellent for the 9 easy targets. Performance was poorer for the 11 difficult targets, with medium and high quality models submitted for only 3 of these targets. A similar performance 'gap' was displayed by scorer groups, highlighting yet again the unmet challenge of modeling the conformational changes of the protein components that occur upon binding or that must be accounted for in template-based modeling. Our analysis also indicates that residues in binding interfaces were less well predicted in this set of targets than in previous Rounds, providing useful insights for directions of future improvements. This article is protected by copyright. All rights reserved.
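The performance measure named in the abstract, the fraction of submitted models of acceptable quality or higher, can be illustrated with a minimal sketch. It assumes each model has already been assigned one of the four CAPRI quality categories (incorrect, acceptable, medium, high); the function name and the example labels are hypothetical and are not data from the paper.

```python
# CAPRI quality categories, ordered from worst to best.
QUALITY_ORDER = ["incorrect", "acceptable", "medium", "high"]


def fraction_acceptable_or_better(labels):
    """Return the fraction of models rated 'acceptable' or higher.

    `labels` is a list of category names, one per submitted model.
    An empty list yields 0.0 rather than raising.
    """
    if not labels:
        return 0.0
    threshold = QUALITY_ORDER.index("acceptable")
    hits = sum(1 for q in labels if QUALITY_ORDER.index(q) >= threshold)
    return hits / len(labels)


# Hypothetical labels for one target (illustrative only).
example_labels = ["incorrect", "acceptable", "medium", "incorrect", "high"]
example_fraction = fraction_acceptable_or_better(example_labels)
```

Pooling the labels from all predictor groups for a target before calling the function gives a per-target figure of the kind the abstract describes.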
Citation
Lensink, M. F., Brysbaert, G., Nadzirin, N., Velankar, S., Chaleil, R. A. G., Gerguri, T., … Grudinin, S. (2019). Blind prediction of homo- and hetero- protein complexes: The CASP13-CAPRI experiment. Proteins: Structure, Function, and Bioinformatics. doi:10.1002/prot.25838
Sponsors
We thank the CASP Management, and in particular Andriy Kryshtafovych, for valuable help and support in running the assembly prediction challenge. We also express gratitude to the structural biologists who provided the targets for this challenge and to the CAPRI management team and predictor groups for stimulating discussion, valuable input and cooperation.
Showing items related by title, author, creator and subject.
ProDis-ContSHC: Learning protein dissimilarity measures and hierarchical context coherently for protein-protein comparison in protein database retrieval
Wang, Jim Jing-Yan; Gao, Xin; Wang, Quanquan; Li, Yongping (BMC Bioinformatics, Springer Nature, 2012-05-08) [Article]
Background: The need to retrieve or classify protein molecules using structure or sequence-based similarity measures underlies a wide range of biomedical applications. Traditional protein search methods rely on a pairwise dissimilarity/similarity measure for comparing a pair of proteins. Pairwise measures of this kind neglect the distribution of other proteins and thus cannot satisfy the need for high accuracy in retrieval systems. Recent work in the machine learning community has shown that exploiting the global structure of the database and learning contextual dissimilarity/similarity measures can improve retrieval performance significantly. However, most existing contextual dissimilarity/similarity learning algorithms work in an unsupervised manner, which does not utilize the known class labels of proteins in the database.
Results: In this paper, we propose a novel protein-protein dissimilarity learning algorithm, ProDis-ContSHC. ProDis-ContSHC regularizes an existing dissimilarity measure d_ij by considering the contextual information of the proteins. The context of a protein is defined by its neighboring proteins. The basic idea is that, for a pair of proteins (i, j), if their contexts N(i) and N(j) are similar to each other, the two proteins should also have a high similarity. We implement this idea by regularizing d_ij by a factor learned from the contexts N(i) and N(j). Moreover, we divide the context into hierarchical sub-contexts and obtain a contextual dissimilarity vector for each protein pair. Using the class label information of the proteins, we select the relevant (a pair of proteins that has the same class labels) and irrelevant (with different labels) protein pairs, and train an SVM model to distinguish between their contextual dissimilarity vectors. The SVM model is further used to learn a supervised regularizing factor. Finally, with the new Supervised learned Dissimilarity measure, we update the Protein Hierarchical Context Coherently in an iterative algorithm, ProDis-ContSHC. We test the performance of ProDis-ContSHC on two benchmark sets, i.e., the ASTRAL 1.73 database and the FSSP/DALI database. Experimental results demonstrate that plugging our supervised contextual dissimilarity measures into the retrieval systems significantly outperforms the context-free dissimilarity/similarity measures and other unsupervised contextual dissimilarity measures that do not use the class label information.
Conclusions: Using the contextual proteins with their class labels in the database, we can improve the accuracy of the pairwise dissimilarity/similarity measures dramatically for the protein retrieval tasks. In this work, for the first time, we propose the idea of supervised contextual dissimilarity learning, resulting in the ProDis-ContSHC algorithm. Among different contextual dissimilarity learning approaches that can be used to compare a pair of proteins, ProDis-ContSHC provides the highest accuracy. Finally, ProDis-ContSHC compares favorably with other methods reported in the recent literature. 2012 Wang et al.; licensee BioMed Central Ltd.
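The core idea in the ProDis-ContSHC abstract, shrinking the dissimilarity of a pair whose neighborhoods are similar, can be illustrated with a toy sketch. This is not the published algorithm: the factor here is a fixed, unsupervised one based on the Jaccard overlap of k-nearest-neighbor sets, whereas the paper learns a supervised factor with an SVM over hierarchical sub-contexts. The function names and the parameters `k` and `alpha` are assumptions made for illustration.

```python
def k_nearest(D, i, k):
    """Indices of the k items closest to item i (excluding i itself)."""
    others = [j for j in range(len(D)) if j != i]
    return set(sorted(others, key=lambda j: D[i][j])[:k])


def contextual_dissimilarity(D, k=2, alpha=0.5):
    """Regularize a dissimilarity matrix D using neighborhood overlap.

    Assumed toy rule: d'_ij = d_ij * (1 - alpha * J(N(i), N(j))),
    where J is the Jaccard overlap of the two k-nearest-neighbor sets.
    """
    n = len(D)
    contexts = [k_nearest(D, i, k) for i in range(n)]
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            union = contexts[i] | contexts[j]
            overlap = len(contexts[i] & contexts[j]) / len(union) if union else 0.0
            out[i][j] = D[i][j] * (1.0 - alpha * overlap)
    return out


# Hypothetical symmetric dissimilarity matrix for four proteins.
D = [
    [0, 1, 2, 9],
    [1, 0, 2, 9],
    [2, 2, 0, 9],
    [9, 9, 9, 0],
]
regularized = contextual_dissimilarity(D, k=2, alpha=0.5)
```

With `alpha` between 0 and 1, a pair with identical contexts has its dissimilarity scaled by `1 - alpha`, and a pair with disjoint contexts is left unchanged.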
Auxin efflux by PIN-FORMED proteins is activated by two different protein kinases, D6 PROTEIN KINASE and PINOID
Zourelidou, Melina; Absmanner, Birgit; Weller, Benjamin; Barbosa, Inês CR; Willige, Björn C; Fastner, Astrid; Streit, Verena; Port, Sarah A; Colcombet, Jean; de la Fuente van Bentem, Sergio; Hirt, Heribert; Kuster, Bernhard; Schulze, Waltraud X; Hammes, Ulrich Z; Schwechheimer, Claus (eLife, eLife Sciences Publications, Ltd, 2014-06-19) [Article]
The development and morphology of vascular plants are critically determined by synthesis and proper distribution of the phytohormone auxin. The directed cell-to-cell distribution of auxin is achieved through a system of auxin influx and efflux transporters. PIN-FORMED (PIN) proteins are proposed auxin efflux transporters, and auxin fluxes can seemingly be predicted based on the plasma membrane distribution of PINs, which is asymmetric in many cells. Here, we show in a heterologous Xenopus oocyte system as well as in Arabidopsis thaliana inflorescence stems that PIN-mediated auxin transport is directly activated by D6 PROTEIN KINASE (D6PK) and PINOID (PID)/WAG kinases of the Arabidopsis AGCVIII kinase family. At the same time, we reveal that D6PKs and PID have differential phosphosite preferences. Our study suggests that PIN activation by protein kinases is a crucial component of auxin transport control that must be taken into account to understand auxin distribution within the plant.
CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction
Cui, Xuefeng; Lu, Zhiwu; Wang, Sheng; Wang, Jim Jing-Yan; Gao, Xin (Bioinformatics, Oxford University Press (OUP), 2016-06-15) [Article]
Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment, threading and alignment-free methods, protein homology detection remains a challenging open problem. Recently, network methods that try to find transitive paths in the protein structure space demonstrate the importance of incorporating network information of the structure space. Yet, current methods merge the sequence space and the structure space into a single space, and thus introduce inconsistency in combining different sources of information.
Method: We present a novel network-based protein homology detection method, CMsearch, based on cross-modal learning. Instead of exploring a single network built from the mixture of sequence and structure space information, CMsearch builds two separate networks to represent the sequence space and the structure space. It then learns sequence–structure correlation by simultaneously taking sequence information, structure information, sequence space information and structure space information into consideration.
Results: We tested CMsearch on two challenging tasks, protein homology detection and protein structure prediction, by querying all 8332 PDB40 proteins. Our results demonstrate that CMsearch is insensitive to the similarity metrics used to define the sequence and the structure spaces. By using HMM–HMM alignment as the sequence similarity metric, CMsearch clearly outperforms state-of-the-art homology detection methods and the CASP-winning template-based protein structure prediction methods.