Assessment of the assessment: Evaluation of the model quality estimates in CASP10

Handle URI:
http://hdl.handle.net/10754/597614
Title:
Assessment of the assessment: Evaluation of the model quality estimates in CASP10
Authors:
Kryshtafovych, Andriy; Barbato, Alessandro; Fidelis, Krzysztof; Monastyrskyy, Bohdan; Schwede, Torsten; Tramontano, Anna
Abstract:
The article presents an assessment of the ability of the 37 model quality assessment (MQA) methods participating in CASP10 to provide a priori estimates of the quality of structural models, and of the 67 tertiary structure prediction groups to provide confidence estimates for their predicted coordinates. The assessment of MQA predictors is based on the measures used in previous CASPs: the correlation between the predicted and observed quality of the models (at both the global and local levels), the accuracy of the methods in distinguishing between good and bad models, as well as between good and bad regions within them, and the ability to identify the best models in the decoy sets. Several numerical evaluations were used in our analysis for the first time, such as the comparison of global and local quality predictors with reference (baseline) predictors and a ROC analysis of the predictors' ability to differentiate between well and poorly modeled regions. For the evaluation of the reliability of self-assessed coordinate errors, we used the correlation between the predicted and observed deviations of the coordinates and a ROC analysis of correctly identified errors in the models. A modified two-stage procedure for testing MQA methods in CASP10, whereby a small number of models spanning the whole range of model accuracy was released first, followed by a larger number of models of more uniform quality, allowed a more thorough analysis of the strengths and weaknesses of the different types of methods. Clustering methods were shown to have an advantage over single- and quasi-single-model methods on the larger datasets. At the same time, the evaluation revealed that the size of the dataset has a smaller influence on the global quality assessment scores (for both clustering and nonclustering methods) than its diversity. Narrowing the quality range of the assessed models caused a significant decrease in ranking accuracy for global quality predictors but left the results for local predictors essentially unchanged. The self-assessment error estimates submitted by the majority of groups were poor overall, with two research groups showing significantly better results than the rest.
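The evaluation measures named in the abstract (correlation between predicted and observed quality, and ROC analysis of well versus poorly modeled regions) reduce to standard statistics. Below is a minimal, self-contained Python sketch of both on synthetic data; the variable names, the noise model, and the 3.8 Å cutoff defining a "well modeled" residue are illustrative assumptions, not values taken from the paper or from the official CASP evaluation scripts.

    # Sketch (not the official CASP code) of two measure types described above:
    # global predicted/observed quality correlation, and a ROC analysis of
    # per-residue error estimates. All data below are synthetic.
    import numpy as np

    def pearson_r(pred, obs):
        # Pearson correlation between predicted and observed quality scores.
        return np.corrcoef(pred, obs)[0, 1]

    def roc_auc(scores, labels):
        # ROC AUC via the rank-sum (Mann-Whitney U) identity; ties are not
        # specially handled in this sketch. Higher score = predicted positive.
        order = np.argsort(scores)
        ranks = np.empty(len(scores))
        ranks[order] = np.arange(1, len(scores) + 1)
        pos = labels.astype(bool)
        n_pos, n_neg = pos.sum(), (~pos).sum()
        u = ranks[pos].sum() - n_pos * (n_pos + 1) / 2
        return u / (n_pos * n_neg)

    rng = np.random.default_rng(0)

    # Global level: noisy predictions of the observed GDT_TS of a decoy set.
    obs_gdt = rng.uniform(20, 90, 150)
    pred_gdt = obs_gdt + rng.normal(0, 10, 150)
    print(f"global quality correlation r = {pearson_r(pred_gdt, obs_gdt):.2f}")

    # Local level: per-residue distance errors; a residue counts as "well
    # modeled" when its observed error falls below the assumed 3.8 A cutoff.
    obs_err = rng.gamma(2.0, 2.0, 300)
    pred_err = obs_err + rng.normal(0, 1.5, 300)
    well_modeled = (obs_err < 3.8).astype(int)
    # Negate predicted error so a larger score means "predicted well modeled".
    print(f"local ROC AUC = {roc_auc(-pred_err, well_modeled):.2f}")

The same machinery extends to the baseline comparison the abstract mentions: one would compute the correlation for a reference predictor on the same decoys and report the difference.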
Citation:
Kryshtafovych A, Barbato A, Fidelis K, Monastyrskyy B, Schwede T, et al. (2013) Assessment of the assessment: Evaluation of the model quality estimates in CASP10. Proteins: Structure, Function, and Bioinformatics 82: 112–126. Available: http://dx.doi.org/10.1002/prot.24347.
Publisher:
Wiley-Blackwell
Journal:
Proteins: Structure, Function, and Bioinformatics
KAUST Grant Number:
KUKI1-012-43
Issue Date:
31-Aug-2013
DOI:
10.1002/prot.24347
PubMed ID:
23780644
PubMed Central ID:
PMC4406045
Type:
Article
ISSN:
0887-3585
Sponsors:
Grant sponsor: US National Institute of General Medical Sciences (NIGMS/NIH); Grant number: R01GM100482. Grant sponsor: KAUST Award; Grant number: KUKI1-012-43. Grant sponsor: EMBO.
Appears in Collections:
Publications Acknowledging KAUST Support

Full metadata record

DC Field | Value | Language
dc.contributor.author | Kryshtafovych, Andriy | en
dc.contributor.author | Barbato, Alessandro | en
dc.contributor.author | Fidelis, Krzysztof | en
dc.contributor.author | Monastyrskyy, Bohdan | en
dc.contributor.author | Schwede, Torsten | en
dc.contributor.author | Tramontano, Anna | en
dc.date.accessioned | 2016-02-25T12:43:05Z | en
dc.date.available | 2016-02-25T12:43:05Z | en
dc.date.issued | 2013-08-31 | en
dc.identifier.citation | Kryshtafovych A, Barbato A, Fidelis K, Monastyrskyy B, Schwede T, et al. (2013) Assessment of the assessment: Evaluation of the model quality estimates in CASP10. Proteins: Structure, Function, and Bioinformatics 82: 112–126. Available: http://dx.doi.org/10.1002/prot.24347. | en
dc.identifier.issn | 0887-3585 | en
dc.identifier.pmid | 23780644 | en
dc.identifier.doi | 10.1002/prot.24347 | en
dc.identifier.uri | http://hdl.handle.net/10754/597614 | en
dc.description.abstract | The article presents an assessment of the ability of the 37 model quality assessment (MQA) methods participating in CASP10 to provide a priori estimates of the quality of structural models, and of the 67 tertiary structure prediction groups to provide confidence estimates for their predicted coordinates. The assessment of MQA predictors is based on the measures used in previous CASPs: the correlation between the predicted and observed quality of the models (at both the global and local levels), the accuracy of the methods in distinguishing between good and bad models, as well as between good and bad regions within them, and the ability to identify the best models in the decoy sets. Several numerical evaluations were used in our analysis for the first time, such as the comparison of global and local quality predictors with reference (baseline) predictors and a ROC analysis of the predictors' ability to differentiate between well and poorly modeled regions. For the evaluation of the reliability of self-assessed coordinate errors, we used the correlation between the predicted and observed deviations of the coordinates and a ROC analysis of correctly identified errors in the models. A modified two-stage procedure for testing MQA methods in CASP10, whereby a small number of models spanning the whole range of model accuracy was released first, followed by a larger number of models of more uniform quality, allowed a more thorough analysis of the strengths and weaknesses of the different types of methods. Clustering methods were shown to have an advantage over single- and quasi-single-model methods on the larger datasets. At the same time, the evaluation revealed that the size of the dataset has a smaller influence on the global quality assessment scores (for both clustering and nonclustering methods) than its diversity. Narrowing the quality range of the assessed models caused a significant decrease in ranking accuracy for global quality predictors but left the results for local predictors essentially unchanged. The self-assessment error estimates submitted by the majority of groups were poor overall, with two research groups showing significantly better results than the rest. | en
dc.description.sponsorship | Grant sponsor: US National Institute of General Medical Sciences (NIGMS/NIH); Grant number: R01GM100482. Grant sponsor: KAUST Award; Grant number: KUKI1-012-43. Grant sponsor: EMBO. | en
dc.publisher | Wiley-Blackwell | en
dc.subject | Protein structure modeling | en
dc.subject | Protein structure prediction | en
dc.subject | QA | en
dc.subject | CASP | en
dc.subject | Model quality assessment | en
dc.subject.mesh | Models, Statistical | en
dc.subject.mesh | Protein Conformation | en
dc.title | Assessment of the assessment: Evaluation of the model quality estimates in CASP10 | en
dc.type | Article | en
dc.identifier.journal | Proteins: Structure, Function, and Bioinformatics | en
dc.identifier.pmcid | PMC4406045 | en
dc.contributor.institution | Genome Center, University of California, Davis, California 95616, USA | en
kaust.grant.number | KUKI1-012-43 | en
