Rule Mining Techniques to Predict Prokaryotic Metabolic Pathways

Handle URI:
http://hdl.handle.net/10754/625999
Title:
Rule Mining Techniques to Predict Prokaryotic Metabolic Pathways
Authors:
Saidi, Rabie; Boudellioua, Imene; Martin, Maria J.; Solovyev, Victor
Abstract:
It is becoming more evident that computational methods are needed for the identification and mapping of pathways in new genomes. We introduce an automatic annotation system, ARBA4Path (Association Rule-Based Annotator for Pathways), that uses rule mining techniques to predict metabolic pathways across a wide range of prokaryotes. It was demonstrated that specific combinations of protein domains (recorded in our rules) strongly determine the pathways in which proteins are involved, and thus provide information that lets us assign pathway membership very accurately (with a precision of 0.999 and a recall of 0.966) to proteins of a given prokaryotic taxon. Our system can be used to enhance the quality of automatically generated annotations as well as to annotate proteins of unknown function. The prediction models are represented as human-readable rules, and they can be used effectively to add missing pathway information to many proteins in the UniProtKB/TrEMBL database.
KAUST Department:
Computational Bioscience Research Center (CBRC); Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Citation:
Saidi R, Boudellioua I, Martin MJ, Solovyev V (2017) Rule Mining Techniques to Predict Prokaryotic Metabolic Pathways. Biological Networks and Pathway Analysis: 311–331. Available: http://dx.doi.org/10.1007/978-1-4939-7027-8_12.
Publisher:
Springer New York
Journal:
Biological Networks and Pathway Analysis
Issue Date:
28-Aug-2017
DOI:
10.1007/978-1-4939-7027-8_12
Type:
Book Chapter
ISSN:
1064-3745; 1940-6029
Sponsors:
The second author conducted this work as part of a research internship with the UniProt team at the European Bioinformatics Institute. Funding for this internship was provided by King Abdullah University of Science and Technology. The authors would also like to thank the UniProt Consortium for its valuable support and feedback on the development of this work.
Additional Links:
https://link.springer.com/protocol/10.1007%2F978-1-4939-7027-8_12
Appears in Collections:
Computational Bioscience Research Center (CBRC); Book Chapters; Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division

Full metadata record

DC Field | Value | Language
dc.contributor.author | Saidi, Rabie | en
dc.contributor.author | Boudellioua, Imene | en
dc.contributor.author | Martin, Maria J. | en
dc.contributor.author | Solovyev, Victor | en
dc.date.accessioned | 2017-10-30T08:39:49Z | -
dc.date.available | 2017-10-30T08:39:49Z | -
dc.date.issued | 2017-08-28 | en
dc.identifier.citation | Saidi R, Boudellioua I, Martin MJ, Solovyev V (2017) Rule Mining Techniques to Predict Prokaryotic Metabolic Pathways. Biological Networks and Pathway Analysis: 311–331. Available: http://dx.doi.org/10.1007/978-1-4939-7027-8_12. | en
dc.identifier.issn | 1064-3745 | en
dc.identifier.issn | 1940-6029 | en
dc.identifier.doi | 10.1007/978-1-4939-7027-8_12 | en
dc.identifier.uri | http://hdl.handle.net/10754/625999 | -
dc.description.abstract | It is becoming more evident that computational methods are needed for the identification and mapping of pathways in new genomes. We introduce an automatic annotation system, ARBA4Path (Association Rule-Based Annotator for Pathways), that uses rule mining techniques to predict metabolic pathways across a wide range of prokaryotes. It was demonstrated that specific combinations of protein domains (recorded in our rules) strongly determine the pathways in which proteins are involved, and thus provide information that lets us assign pathway membership very accurately (with a precision of 0.999 and a recall of 0.966) to proteins of a given prokaryotic taxon. Our system can be used to enhance the quality of automatically generated annotations as well as to annotate proteins of unknown function. The prediction models are represented as human-readable rules, and they can be used effectively to add missing pathway information to many proteins in the UniProtKB/TrEMBL database. | en
dc.description.sponsorship | The second author conducted this work as part of a research internship with the UniProt team at the European Bioinformatics Institute. Funding for this internship was provided by King Abdullah University of Science and Technology. The authors would also like to thank the UniProt Consortium for its valuable support and feedback on the development of this work. | en
dc.publisher | Springer New York | en
dc.relation.url | https://link.springer.com/protocol/10.1007%2F978-1-4939-7027-8_12 | en
dc.subject | Automatic annotation | en
dc.subject | Functional genomics | en
dc.subject | Machine learning | en
dc.subject | Pathway prediction | en
dc.subject | Proteomics | en
dc.subject | Rule mining | en
dc.title | Rule Mining Techniques to Predict Prokaryotic Metabolic Pathways | en
dc.type | Book Chapter | en
dc.contributor.department | Computational Bioscience Research Center (CBRC) | en
dc.contributor.department | Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division | en
dc.identifier.journal | Biological Networks and Pathway Analysis | en
dc.contributor.institution | European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, CB10 1SD, United Kingdom | en
dc.contributor.institution | Softberry Inc., 116 Radio Circle, Suite 400, Mount Kisco, NY, 10549, United States | en
kaust.author | Boudellioua, Imene | en
kaust.author | Martin, Maria J. | en
All Items in KAUST are protected by copyright, with all rights reserved, unless otherwise indicated.