Prediction of Metabolic Pathway Involvement in Prokaryotic UniProtKB Data by Association Rule Mining

Handle URI:
http://hdl.handle.net/10754/617797
Title:
Prediction of Metabolic Pathway Involvement in Prokaryotic UniProtKB Data by Association Rule Mining
Authors:
Boudellioua, Imane; Saidi, Rabie; Hoehndorf, Robert (0000-0001-8149-5890); Martin, Maria J.; Solovyev, Victor
Abstract:
The widening gap between known proteins and their functions has encouraged the development of methods to automatically infer annotations. Automatic functional annotation of proteins is expected to meet the conflicting requirements of maximizing annotation coverage while minimizing erroneous functional assignments. This trade-off poses a great challenge in designing intelligent systems to tackle the problem of automatic protein annotation. In this work, we present a system that utilizes rule mining techniques to predict metabolic pathways in prokaryotes. The resulting knowledge represents predictive models that assign pathway involvement to UniProtKB entries. We carried out an evaluation study of our system's performance using cross-validation. We found that it achieved very promising results in pathway identification with an F1-measure of 0.982 and an AUC of 0.987. Our prediction models were then successfully applied to 6.2 million UniProtKB/TrEMBL reference proteome entries of prokaryotes. As a result, 663,724 entries were covered, of which 436,510 lacked any previous pathway annotations.
KAUST Department:
Computational Bioscience Research Center (CBRC); Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division
Citation:
Prediction of Metabolic Pathway Involvement in Prokaryotic UniProtKB Data by Association Rule Mining 2016, 11 (7):e0158896 PLOS ONE
Publisher:
Public Library of Science (PLoS)
Journal:
PLoS ONE
Issue Date:
8-Jul-2016
DOI:
10.1371/journal.pone.0158896
Type:
Article
ISSN:
1932-6203
Sponsors:
IB, RH and VS were supported by funding provided by the King Abdullah University of Science and Technology.
Additional Links:
http://dx.plos.org/10.1371/journal.pone.0158896
Appears in Collections:
Articles

Full metadata record

DC Field | Value | Language
dc.contributor.author | Boudellioua, Imane | en
dc.contributor.author | Saidi, Rabie | en
dc.contributor.author | Hoehndorf, Robert | en
dc.contributor.author | Martin, Maria J. | en
dc.contributor.author | Solovyev, Victor | en
dc.date.accessioned | 2016-08-01T10:33:26Z | -
dc.date.available | 2016-08-01T10:33:26Z | -
dc.date.issued | 2016-07-08 | -
dc.identifier.citation | Prediction of Metabolic Pathway Involvement in Prokaryotic UniProtKB Data by Association Rule Mining 2016, 11 (7):e0158896 PLOS ONE | en
dc.identifier.issn | 1932-6203 | -
dc.identifier.doi | 10.1371/journal.pone.0158896 | -
dc.identifier.uri | http://hdl.handle.net/10754/617797 | -
dc.description.abstract | The widening gap between known proteins and their functions has encouraged the development of methods to automatically infer annotations. Automatic functional annotation of proteins is expected to meet the conflicting requirements of maximizing annotation coverage while minimizing erroneous functional assignments. This trade-off poses a great challenge in designing intelligent systems to tackle the problem of automatic protein annotation. In this work, we present a system that utilizes rule mining techniques to predict metabolic pathways in prokaryotes. The resulting knowledge represents predictive models that assign pathway involvement to UniProtKB entries. We carried out an evaluation study of our system's performance using cross-validation. We found that it achieved very promising results in pathway identification with an F1-measure of 0.982 and an AUC of 0.987. Our prediction models were then successfully applied to 6.2 million UniProtKB/TrEMBL reference proteome entries of prokaryotes. As a result, 663,724 entries were covered, of which 436,510 lacked any previous pathway annotations. | en
dc.description.sponsorship | IB, RH and VS were supported by funding provided by the King Abdullah University of Science and Technology. | en
dc.language.iso | en | en
dc.publisher | Public Library of Science (PLoS) | en
dc.relation.url | http://dx.plos.org/10.1371/journal.pone.0158896 | en
dc.rights | This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. http://creativecommons.org/licenses/by/4.0/ | en
dc.title | Prediction of Metabolic Pathway Involvement in Prokaryotic UniProtKB Data by Association Rule Mining | en
dc.type | Article | en
dc.contributor.department | Computational Bioscience Research Center (CBRC) | en
dc.contributor.department | Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division | en
dc.identifier.journal | PLoS ONE | en
dc.eprint.version | Publisher's Version/PDF | en
dc.contributor.institution | European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom | en
dc.contributor.institution | Softberry Inc., 116 Radio Circle, Suite 400, Mount Kisco, NY 10549, United States of America | en
dc.contributor.affiliation | King Abdullah University of Science and Technology (KAUST) | en
kaust.author | Boudellioua, Imane | en
kaust.author | Saidi, Rabie | en
kaust.author | Hoehndorf, Robert | en