Accurate prediction of hot spot residues through physicochemical characteristics of amino acid sequences
Type
Article
KAUST Department
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Computational Bioscience Research Center (CBRC)
Computer Science Program
Structural and Functional Bioinformatics Group
KAUST Grant Number
GRP-CF-2011-19-P-Gao-Huang
KUS-CI-016-04
Date
2013-07-23
Online Publication Date
2013-07-23
Print Publication Date
2013-08
Permanent link to this record
http://hdl.handle.net/10754/562868
Abstract
Hot spot residues of proteins are fundamental interface residues that help proteins perform their functions. Detecting hot spots by experimental methods is costly and time-consuming. Sequential and structural information has been widely used in the computational prediction of hot spots. However, structural information is not always available. In this article, we investigated the problem of identifying hot spots using only physicochemical characteristics extracted from amino acid sequences. We first extracted 132 relatively independent physicochemical features from the set of 544 properties in AAindex1, an amino acid index database. Each feature was used to train a classification model, with a novel encoding schema for hot spot prediction, by the IBk algorithm, an extension of the K-nearest neighbor algorithm. Combinations of the individual classifiers were explored, and the classifiers that appeared frequently in the top-performing combinations were selected. The hot spot predictor was built as an ensemble of these classifiers that works in a voting manner. Experimental results demonstrated that our method effectively exploited the feature space and allowed flexible weights of features for different queries. On the commonly used hot spot benchmark sets, our method significantly outperformed other machine learning algorithms and state-of-the-art hot spot predictors. The program is available at http://sfb.kaust.edu.sa/pages/software.aspx. © 2013 Wiley Periodicals, Inc.
Sponsors
Grant sponsor: King Abdullah University of Science and Technology (KAUST); Grant numbers: KUS-CI-016-04; GRP-CF-2011-19-P-Gao-Huang.
Publisher
Wiley
PubMed ID
23504705
DOI
10.1002/prot.24278
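Illustrative sketch
The abstract describes training one nearest-neighbor classifier per physicochemical feature and combining the selected classifiers by voting. The following is a minimal sketch of that general idea only, not the authors' program. Everything in it is an assumption made for illustration: the property tables hold made-up values (not AAindex1 data), the sliding-window encoding is a stand-in because the record does not describe the paper's encoding schema, a plain k-nearest-neighbor vote replaces IBk, and the training examples are toy sequences.

```python
from collections import Counter

# Hypothetical per-residue property tables (made-up values, NOT from AAindex1).
TOY_INDICES = {
    "hydrophobicity": {"A": 0.6, "W": 0.8, "Y": 0.3, "R": -0.9, "D": -0.8, "G": 0.1},
    "volume": {"A": 0.2, "W": 1.0, "Y": 0.8, "R": 0.7, "D": 0.3, "G": 0.0},
    "charge": {"A": 0.0, "W": 0.0, "Y": 0.0, "R": 1.0, "D": -1.0, "G": 0.0},
}


def encode(sequence, position, index, half_window=1):
    """Encode a residue as the property values in a window around it.

    This window encoding is an assumed stand-in, not the paper's schema.
    """
    values = []
    for i in range(position - half_window, position + half_window + 1):
        if 0 <= i < len(sequence):
            values.append(index[sequence[i]])
        else:
            values.append(0.0)  # pad positions beyond the sequence ends
    return values


def knn_predict(train, query, k=3):
    """Plain k-nearest-neighbor vote using squared Euclidean distance."""
    ranked = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def ensemble_predict(examples, sequence, position, index_names, k=3):
    """Majority vote over one kNN classifier per physicochemical property."""
    votes = []
    for name in index_names:
        index = TOY_INDICES[name]
        train = [(encode(seq, pos, index), label) for seq, pos, label in examples]
        votes.append(knn_predict(train, encode(sequence, position, index), k))
    return Counter(votes).most_common(1)[0][0]


# Toy training set: (sequence, residue position, label), 1 = hot spot, 0 = not.
TOY_EXAMPLES = [
    ("AWA", 1, 1),
    ("GWG", 1, 1),
    ("AYA", 1, 1),
    ("AGA", 1, 0),
    ("GDG", 1, 0),
    ("AAA", 1, 0),
]

if __name__ == "__main__":
    names = list(TOY_INDICES)
    print(ensemble_predict(TOY_EXAMPLES, "GWA", 1, names))
    print(ensemble_predict(TOY_EXAMPLES, "ADA", 1, names))
```

Using an odd number of classifiers and an odd k with binary labels avoids tied votes in this sketch. The selection step mentioned in the abstract (keeping classifiers that recur in top-performing combinations) is not shown.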
Related articles
- Protein binding hot spots prediction from sequence only by a new ensemble learning method.
  Authors: Hu SS, Chen P, Wang B, Li J
  Issue date: 2017 Oct
- APIS: accurate prediction of hot spots in protein interfaces by combining protrusion index with solvent accessibility.
  Authors: Xia JF, Zhao XM, Song J, Huang DS
  Issue date: 2010 Apr 8
- Prediction of hot spots in protein interfaces using a random forest model with hybrid features.
  Authors: Wang L, Liu ZP, Zhang XS, Chen L
  Issue date: 2012 Mar
- Predicting hot spots in protein interfaces based on protrusion index, pseudo hydrophobicity and electron-ion interaction pseudopotential features.
  Authors: Xia J, Yue Z, Di Y, Zhu X, Zheng CH
  Issue date: 2016 Apr 5
- Prediction of hot spots in protein interfaces using extreme learning machines with the information of spatial neighbour residues.
  Authors: Wang L, Zhang W, Gao Q, Xiong C
  Issue date: 2014 Aug