Semi-Supervised Transductive Hot Spot Predictor Working on Multiple Assumptions

Handle URI:
http://hdl.handle.net/10754/556612
Title:
Semi-Supervised Transductive Hot Spot Predictor Working on Multiple Assumptions
Authors:
Wang, Jim Jing-Yan; Almasri, Islam; Shi, Yuexiang; Gao, Xin (0000-0002-7108-3574)
Abstract:
Protein-protein interactions are critically dependent on just a few residues (“hot spots”) at the interfaces. Hot spots make a dominant contribution to the binding free energy and, if mutated, can disrupt the interaction. As mutagenesis studies require significant experimental effort, there is a need for accurate and reliable computational hot spot prediction methods. Compared to supervised hot spot prediction algorithms, semi-supervised prediction methods can take into consideration both the labeled and unlabeled residues in the dataset during prediction. The transductive support vector machine has been used for this task and demonstrated better prediction performance. To the best of our knowledge, however, none of the transductive semi-supervised algorithms takes all three semi-supervised assumptions, i.e., the smoothness, cluster and manifold assumptions, into account together during learning. In this paper, we propose a novel semi-supervised method for hot spot residue prediction that considers all three semi-supervised assumptions using nonlinear models. Our algorithm, IterPropMCS, works iteratively. In each iteration, the algorithm first propagates the labels of the labeled residues to the unlabeled ones along the shortest paths between them on a graph, assuming that they lie on a nonlinear manifold. It then selects the most confident residues as the labeled ones for the next iteration, according to the cluster and smoothness criteria, which are implemented by a nonlinear density estimator. Experiments on a benchmark dataset, using protein structure-based features, demonstrate that our approach is effective in predicting hot spots and compares favorably to other available methods. The results also show that our method outperforms the state-of-the-art transductive learning methods.
KAUST Department:
Computational Bioscience Research Center (CBRC); Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Citation:
Semi-Supervised Transductive Hot Spot Predictor Working on Multiple Assumptions 2014, 9 (3):258 Current Bioinformatics
Publisher:
Bentham Science Publishers Ltd.
Journal:
Current Bioinformatics
Issue Date:
23-May-2014
DOI:
10.2174/1574893609999140523124421
Type:
Article
ISSN:
15748936
Additional Links:
http://www.eurekaselect.com/openurl/content.php?genre=article&issn=1574-8936&volume=9&issue=3&spage=258
Appears in Collections:
Articles; Computational Bioscience Research Center (CBRC); Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
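
The iterative procedure outlined in the abstract (propagate labels along shortest paths on a graph, then promote the most confident residues to labeled status for the next round) can be illustrated with a minimal sketch. This is not the authors' IterPropMCS implementation. The function names, the k-nearest-neighbour graph, the parameters `k` and `per_round`, and above all the confidence score are assumptions made for illustration: the abstract describes a nonlinear density estimator for the cluster and smoothness criteria, whereas this sketch simply treats a shorter graph distance to a labeled point as higher confidence.

```python
import heapq
import math


def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbour graph with Euclidean edge weights."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        nearest = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in nearest[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def shortest_paths(graph, source):
    """Dijkstra: graph distance from `source` to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def iterative_propagation(points, labels, k=3, per_round=1):
    """Label every reachable point, a few per round.

    `labels` maps point index -> +1 / -1 and must be non-empty.
    Assumption: confidence is the graph distance to the nearest labeled
    point (a stand-in for the density estimator named in the abstract).
    """
    graph = knn_graph(points, k)
    geo = [shortest_paths(graph, i) for i in range(len(points))]
    labels = dict(labels)
    unlabeled = set(range(len(points))) - set(labels)
    while unlabeled:
        candidates = []
        for u in unlabeled:
            # Propagation step: take the label of the closest labeled point.
            d, src = min((geo[u].get(l, math.inf), l) for l in labels)
            if d < math.inf:
                candidates.append((d, u, labels[src]))
        if not candidates:
            break  # the rest are disconnected from every labeled point
        # Selection step: keep only the most confident predictions.
        candidates.sort()
        for _, u, y in candidates[:per_round]:
            labels[u] = y
            unlabeled.discard(u)
    return labels
```

Calling `iterative_propagation(points, {0: +1, 5: -1}, k=2)` on a list of feature tuples returns a dictionary of labels for every point connected to a labeled one. Labeling only `per_round` points per iteration means later predictions can build on earlier, more confident ones, which is the motivation for the iterative scheme.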

Full metadata record

DC Field | Value | Language
dc.contributor.author | Wang, Jim Jing-Yan | en
dc.contributor.author | Almasri, Islam | en
dc.contributor.author | Shi, Yuexiang | en
dc.contributor.author | Gao, Xin | en
dc.date.accessioned | 2015-06-10T11:13:02Z | en
dc.date.available | 2015-06-10T11:13:02Z | en
dc.date.issued | 2014-05-23 | en
dc.identifier.citation | Semi-Supervised Transductive Hot Spot Predictor Working on Multiple Assumptions 2014, 9 (3):258 Current Bioinformatics | en
dc.identifier.issn | 15748936 | en
dc.identifier.doi | 10.2174/1574893609999140523124421 | en
dc.identifier.uri | http://hdl.handle.net/10754/556612 | en
dc.description.abstract | Protein-protein interactions are critically dependent on just a few residues (“hot spots”) at the interfaces. Hot spots make a dominant contribution to the binding free energy and, if mutated, can disrupt the interaction. As mutagenesis studies require significant experimental effort, there is a need for accurate and reliable computational hot spot prediction methods. Compared to supervised hot spot prediction algorithms, semi-supervised prediction methods can take into consideration both the labeled and unlabeled residues in the dataset during prediction. The transductive support vector machine has been used for this task and demonstrated better prediction performance. To the best of our knowledge, however, none of the transductive semi-supervised algorithms takes all three semi-supervised assumptions, i.e., the smoothness, cluster and manifold assumptions, into account together during learning. In this paper, we propose a novel semi-supervised method for hot spot residue prediction that considers all three semi-supervised assumptions using nonlinear models. Our algorithm, IterPropMCS, works iteratively. In each iteration, the algorithm first propagates the labels of the labeled residues to the unlabeled ones along the shortest paths between them on a graph, assuming that they lie on a nonlinear manifold. It then selects the most confident residues as the labeled ones for the next iteration, according to the cluster and smoothness criteria, which are implemented by a nonlinear density estimator. Experiments on a benchmark dataset, using protein structure-based features, demonstrate that our approach is effective in predicting hot spots and compares favorably to other available methods. The results also show that our method outperforms the state-of-the-art transductive learning methods. | en
dc.publisher | Bentham Science Publishers Ltd. | en
dc.relation.url | http://www.eurekaselect.com/openurl/content.php?genre=article&issn=1574-8936&volume=9&issue=3&spage=258 | en
dc.rights | Archived with thanks to Current Bioinformatics | en
dc.subject | Hot spot prediction | en
dc.subject | multiple semi-supervised assumptions | en
dc.subject | nonlinear density estimator | en
dc.subject | nonlinear manifold | en
dc.subject | semisupervised learning | en
dc.title | Semi-Supervised Transductive Hot Spot Predictor Working on Multiple Assumptions | en
dc.type | Article | en
dc.contributor.department | Computational Bioscience Research Center (CBRC) | en
dc.contributor.department | Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division | en
dc.identifier.journal | Current Bioinformatics | en
dc.eprint.version | Post-print | en
dc.contributor.institution | Information Engineering School, Xiangtan University, Xiangtan 411105, China | en
kaust.author | Wang, Jim Jing-Yan | en
kaust.author | Gao, Xin | en
kaust.author | Almasri, Islam | en
All Items in KAUST are protected by copyright, with all rights reserved, unless otherwise indicated.