RSARF: Prediction of residue solvent accessibility from protein sequence using random forest method

Handle URI:
http://hdl.handle.net/10754/562051
Title:
RSARF: Prediction of residue solvent accessibility from protein sequence using random forest method
Authors:
Ganesan, Pugalenthi; Kandaswamy, Krishna Kumar Umar; Chou, Kuo-Chen; Vivekanandan, Saravanan; Kolatkar, Prasanna R.
Abstract:
Prediction of protein structure from its amino acid sequence is still a challenging problem. A complete physicochemical understanding of protein folding is essential for accurate structure prediction. Knowledge of residue solvent accessibility gives useful insights into protein structure prediction and function prediction. In this work, we propose a random forest method, RSARF, to predict residue accessible surface area from protein sequence information. Training and testing were performed using 120 proteins containing 22006 residues. For each residue, the buried or exposed state was computed using five thresholds (0%, 5%, 10%, 25%, and 50%). The prediction accuracies for the 0%, 5%, 10%, 25%, and 50% thresholds are 72.9%, 78.25%, 78.12%, 77.57%, and 72.07%, respectively. Further, comparison of RSARF with other methods using a benchmark dataset containing 20 proteins shows that our approach is useful for prediction of residue solvent accessibility from protein sequence without using structural information. The RSARF program, datasets and supplementary data are available at http://caps.ncbs.res.in/download/pugal/RSARF/.
KAUST Department:
Biosciences Core Lab; Core Labs
Publisher:
Bentham Science Publishers Ltd.
Journal:
Protein & Peptide Letters
Issue Date:
1-Jan-2012
DOI:
10.2174/092986612798472875
PubMed ID:
21919860
Type:
Article
ISSN:
09298665
Appears in Collections:
Articles; Biosciences Core Lab
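
The abstract states that each residue is assigned a buried or exposed state at five thresholds (0%, 5%, 10%, 25%, and 50%). The record does not give the exact rule, so the following is only a minimal sketch under stated assumptions: accessibility is expressed as a percentage of a reference maximum, and a residue counts as exposed when that percentage is strictly greater than the threshold. The function names and the example values are hypothetical and do not come from the RSARF program.

```python
from typing import Dict, List

# Thresholds (percent relative solvent accessibility) listed in the abstract.
THRESHOLDS = (0.0, 5.0, 10.0, 25.0, 50.0)


def relative_accessibility(asa: float, max_asa: float) -> float:
    """Return accessibility as a percentage of a reference maximum.

    Assumed definition; the record does not state how RSARF normalizes values.
    """
    if max_asa <= 0:
        raise ValueError("max_asa must be positive")
    return 100.0 * asa / max_asa


def label_states(rsa_percent: List[float]) -> Dict[float, List[str]]:
    """Label each residue 'E' (exposed) or 'B' (buried) at every threshold.

    Assumed rule: exposed when the value is strictly greater than the threshold.
    """
    return {
        t: ["E" if rsa > t else "B" for rsa in rsa_percent]
        for t in THRESHOLDS
    }


# Hypothetical values for a four-residue fragment (not data from the paper).
example_rsa = [0.0, 7.5, 30.0, 80.0]
states = label_states(example_rsa)
```

Under these assumptions, the same residue can switch from exposed to buried as the threshold rises, which is why the abstract reports a separate accuracy for each threshold.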

Full metadata record

DC Field | Value | Language
dc.contributor.author | Ganesan, Pugalenthi | en
dc.contributor.author | Kandaswamy, Krishna Kumar Umar | en
dc.contributor.author | Chou, Kuo-Chen | en
dc.contributor.author | Vivekanandan, Saravanan | en
dc.contributor.author | Kolatkar, Prasanna R. | en
dc.date.accessioned | 2015-08-03T09:43:37Z | en
dc.date.available | 2015-08-03T09:43:37Z | en
dc.date.issued | 2012-01-01 | en
dc.identifier.issn | 09298665 | en
dc.identifier.pmid | 21919860 | en
dc.identifier.doi | 10.2174/092986612798472875 | en
dc.identifier.uri | http://hdl.handle.net/10754/562051 | en
dc.description.abstract | Prediction of protein structure from its amino acid sequence is still a challenging problem. A complete physicochemical understanding of protein folding is essential for accurate structure prediction. Knowledge of residue solvent accessibility gives useful insights into protein structure prediction and function prediction. In this work, we propose a random forest method, RSARF, to predict residue accessible surface area from protein sequence information. Training and testing were performed using 120 proteins containing 22006 residues. For each residue, the buried or exposed state was computed using five thresholds (0%, 5%, 10%, 25%, and 50%). The prediction accuracies for the 0%, 5%, 10%, 25%, and 50% thresholds are 72.9%, 78.25%, 78.12%, 77.57%, and 72.07%, respectively. Further, comparison of RSARF with other methods using a benchmark dataset containing 20 proteins shows that our approach is useful for prediction of residue solvent accessibility from protein sequence without using structural information. The RSARF program, datasets and supplementary data are available at http://caps.ncbs.res.in/download/pugal/RSARF/. | en
dc.publisher | Bentham Science Publishers Ltd. | en
dc.title | RSARF: Prediction of residue solvent accessibility from protein sequence using random forest method | en
dc.type | Article | en
dc.contributor.department | Biosciences Core Lab | en
dc.contributor.department | Core Labs | en
dc.identifier.journal | Protein & Peptide Letters | en
dc.contributor.institution | Stem Cell and Developmental Biology, Genome Institute of Singapore, 60 Biopolis Street, Singapore 138672, Singapore | en
dc.contributor.institution | Institute for Neuro and Bioinformatics, University of Lübeck, 23538 Lübeck, Germany | en
dc.contributor.institution | Gordon Life Science Institute, San Diego, CA 92130, United States | en
kaust.author | Ganesan, Pugalenthi | en

All Items in KAUST are protected by copyright, with all rights reserved, unless otherwise indicated.