dc.contributor.author: Chikalov, Igor
dc.contributor.author: Yao, Peggy
dc.contributor.author: Moshkov, Mikhail
dc.contributor.author: Latombe, Jean-Claude
dc.date.accessioned: 2014-08-27T09:52:39Z
dc.date.available: 2014-08-27T09:52:39Z
dc.date.issued: 2011-02-15
dc.identifier.citation: Chikalov I, Yao P, Moshkov M, Latombe J-C (2011) Learning probabilistic models of hydrogen bond stability from molecular dynamics simulation trajectories. BMC Bioinformatics 12: S34. doi:10.1186/1471-2105-12-S1-S34.
dc.identifier.issn: 14712105
dc.identifier.pmid: 21342565
dc.identifier.doi: 10.1186/1471-2105-12-S1-S34
dc.identifier.uri: http://hdl.handle.net/10754/325468
dc.description.abstract: Background: Hydrogen bonds (H-bonds) play a key role in both the formation and stabilization of protein structures. They form and break while a protein deforms, for instance during the transition from a non-functional to a functional state. The intrinsic strength of an individual H-bond has been studied from an energetic viewpoint, but energy alone may not be a very good predictor. Methods: This paper describes inductive learning methods to train protein-independent probabilistic models of H-bond stability from molecular dynamics (MD) simulation trajectories of various proteins. The training data contains 32 input attributes (predictors) that describe an H-bond and its local environment in a conformation c and the output attribute is the probability that the H-bond will be present in an arbitrary conformation of this protein achievable from c within a time duration Δ. We model dependence of the output variable on the predictors by a regression tree. Results: Several models are built using 6 MD simulation trajectories containing over 4000 distinct H-bonds (millions of occurrences). Experimental results demonstrate that such models can predict H-bond stability quite well. They perform roughly 20% better than models based on H-bond energy alone. In addition, they can accurately identify a large fraction of the least stable H-bonds in a conformation. In most tests, about 80% of the 10% H-bonds predicted as the least stable are actually among the 10% truly least stable. The important attributes identified during the tree construction are consistent with previous findings. Conclusions: We use inductive learning methods to build protein-independent probabilistic models to study H-bond stability, and demonstrate that the models perform better than H-bond energy alone. © 2011 Chikalov et al; licensee BioMed Central Ltd.
dc.language.iso: en
dc.publisher: Springer Nature
dc.rights: This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
dc.rights.uri: http://creativecommons.org/licenses/by/2.0
dc.subject: Inductive learning methods
dc.subject: Intrinsic strength
dc.subject: Local environments
dc.subject: Molecular dynamics simulations
dc.subject: Output variables
dc.subject: Probabilistic models
dc.subject: Protein structures
dc.subject: Tree construction
dc.subject: Regression trees
dc.subject: Forestry
dc.subject: Hydrogen bonds
dc.subject: Learning systems
dc.subject: Molecular dynamics
dc.subject: Proteins
dc.subject: Stability
dc.subject: Trajectories
dc.subject: Bioinformatics
dc.subject: Computer simulation
dc.subject: protein
dc.subject: algorithm
dc.subject: chemistry
dc.subject: hydrogen bond
dc.subject: molecular dynamics
dc.subject: protein secondary structure
dc.subject: protein stability
dc.subject: statistical model
dc.subject: Algorithms
dc.subject: Hydrogen Bonding
dc.subject: Models, Statistical
dc.subject: Molecular Dynamics Simulation
dc.subject: Protein Stability
dc.subject: Protein Structure, Secondary
dc.subject: Proteins
dc.title: Learning probabilistic models of hydrogen bond stability from molecular dynamics simulation trajectories
dc.type: Article
dc.contributor.department: Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
dc.identifier.journal: BMC Bioinformatics
dc.identifier.pmcid: PMC3044290
dc.eprint.version: Publisher's Version/PDF
dc.contributor.institution: Biomedical Informatics, Stanford University, Stanford, CA 94305, United States
dc.contributor.institution: Computer Science Department, Stanford University, Stanford, CA 94305, United States
dc.contributor.affiliation: King Abdullah University of Science and Technology (KAUST)
kaust.person: Chikalov, Igor
kaust.person: Moshkov, Mikhail
refterms.dateFOA: 2018-06-13T15:30:15Z
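
The abstract's figure that "about 80% of the 10% H-bonds predicted as the least stable are actually among the 10% truly least stable" can be read as an overlap between two bottom-decile sets. The following is a minimal sketch of that reading only, not the authors' evaluation code: the function name `least_stable_overlap`, its parameters, and the toy stability values are hypothetical and do not come from the paper.

```python
def least_stable_overlap(true_stability, predicted_stability, fraction=0.10):
    """Share of the predicted least-stable H-bonds that are truly least stable.

    Both inputs are equal-length sequences of stability values in [0, 1],
    one entry per H-bond. `fraction` is the size of the "least stable" group
    (0.10 for the bottom 10%). All names here are illustrative assumptions.
    """
    if len(true_stability) != len(predicted_stability):
        raise ValueError("inputs must have the same length")
    n = len(true_stability)
    if n == 0:
        raise ValueError("inputs must not be empty")

    # Size of the bottom group; keep at least one H-bond in it.
    k = max(1, round(n * fraction))

    # Indices of the k H-bonds with the lowest true / predicted stability.
    # sorted() is stable, so ties are broken by original index.
    true_bottom = set(sorted(range(n), key=lambda i: true_stability[i])[:k])
    pred_bottom = set(sorted(range(n), key=lambda i: predicted_stability[i])[:k])

    return len(true_bottom & pred_bottom) / k


# Toy data: 20 H-bonds with made-up stabilities 0.00, 0.05, ..., 0.95.
true_values = [i / 20 for i in range(20)]
predicted_values = list(true_values)
# Swap two predictions so one truly unstable bond is missed.
predicted_values[1], predicted_values[5] = predicted_values[5], predicted_values[1]

overlap = least_stable_overlap(true_values, predicted_values)
```

With the toy data above, the bottom 10% contains two H-bonds and one of the two predicted ones is truly among the least stable, so `overlap` is 0.5. A value near 0.8 would correspond to the result quoted in the abstract.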


Files in this item

Name: Article-BMC_Bioinf-Learning_p-2011.pdf
Size: 279.8Kb
Format: PDF
Description: Article - Full Text
