DEEPre: sequence-based enzyme EC number prediction by deep learning
Type
Article
KAUST Department
Computational Bioscience Research Center (CBRC)
Computer Science Program
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
KAUST Grant Number
URF/1/1976-04
URF/1/3007-01
Date
2017-10-23
Online Publication Date
2017-10-23
Print Publication Date
2018-03-01
Permanent link to this record
http://hdl.handle.net/10754/625965
Abstract
Annotation of enzyme function has a broad range of applications, such as metagenomics, industrial biotechnology, and diagnosis of enzyme deficiency-caused diseases. However, the time and resources required make it prohibitively expensive to experimentally determine the function of every enzyme. Therefore, computational enzyme function prediction has become increasingly important. In this paper, we develop such an approach, determining the enzyme function by predicting the Enzyme Commission number.
We propose an end-to-end feature selection and classification model training approach, as well as an automatic and robust feature dimensionality uniformization method, DEEPre, in the field of enzyme function prediction. Instead of extracting manually crafted features from enzyme sequences, our model takes the raw sequence encoding as inputs, extracting convolutional and sequential features from the raw encoding based on the classification result to directly improve the prediction performance. The thorough cross-fold validation experiments conducted on two large-scale datasets show that DEEPre improves the prediction performance over the previous state-of-the-art methods. In addition, our server outperforms five other servers in determining the main class of enzymes on a separate low-homology dataset. Two case studies demonstrate DEEPre's ability to capture the functional difference of enzyme isoforms.
The server can be accessed freely at http://www.cbrc.kaust.edu.sa/DEEPre.
Citation
Li Y, Wang S, Umarov R, Xie B, Fan M, et al. (2017) DEEPre: sequence-based enzyme EC number prediction by deep learning. Bioinformatics. Available: http://dx.doi.org/10.1093/bioinformatics/btx680.
Sponsors
We would like to thank Prof. Kuo-Chen Chou for kindly providing the KNN dataset. This publication is based upon work supported by the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award Nos. URF/1/1976-04 and URF/1/3007-01, and by the National Natural Science Foundation of China (61401131 and 61731008).
Publisher
Oxford University Press (OUP)
Journal
Bioinformatics
PubMed ID
29069344
Additional Links
https://academic.oup.com/bioinformatics/article/doi/10.1093/bioinformatics/btx680/4562505/DEEPre-sequencebased-enzyme-EC-number-prediction
DOI
10.1093/bioinformatics/btx680
Except where otherwise noted, this item's license is described as: This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com.
Related articles
- deepNEC: a novel alignment-free tool for the identification and classification of nitrogen biochemical network-related enzymes using deep learning.
  - Authors: Duhan N, Norton JM, Kaundal R
  - Issue date: 2022 May 13
- ECPred: a tool for the prediction of the enzymatic functions of protein sequences based on the EC nomenclature.
  - Authors: Dalkiran A, Rifaioglu AS, Martin MJ, Cetin-Atalay R, Atalay V, Doğan T
  - Issue date: 2018 Sep 21
- GOLabeler: improving sequence-based large-scale protein function prediction by learning to rank.
  - Authors: You R, Zhang Z, Xiong Y, Sun F, Mamitsuka H, Zhu S
  - Issue date: 2018 Jul 15
- CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction.
  - Authors: Cui X, Lu Z, Wang S, Jing-Yan Wang J, Gao X
  - Issue date: 2016 Jun 15
- Hum-mPLoc 3.0: prediction enhancement of human protein subcellular localization through modeling the hidden correlations of gene ontology and functional domain features.
  - Authors: Zhou H, Yang Y, Shen HB
  - Issue date: 2017 Mar 15