Type
Article
KAUST Department
Computational Bioscience Research Center (CBRC)
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Computer Science Program
Applied Mathematics and Computational Science Program
Date
2014-11-05
Online Publication Date
2014-11-05
Print Publication Date
2015-01-09
Permanent link to this record
http://hdl.handle.net/10754/333891
Abstract
Transcription regulation in multicellular eukaryotes is orchestrated by a number of DNA functional elements located at gene regulatory regions. Some regulatory regions (e.g. enhancers) are located far away from the gene they affect. Identifying distal regulatory elements is a challenge for bioinformatics research. Although existing methodologies have increased the number of computationally predicted enhancers, several key issues require further examination: inconsistent performance of computational models across different cell lines, class imbalance within the learning sets, and ad hoc rules for selecting enhancer candidates for supervised learning. In this study, we developed DEEP, a novel ensemble prediction framework. DEEP integrates three components with diverse characteristics that streamline the analysis of enhancer properties across a wide variety of cellular conditions. In our method, we train many individual classification models and combine them to classify DNA regions as enhancers or non-enhancers. DEEP uses features derived from histone modification marks or from sequence characteristics. Experimental results indicate that DEEP performs better than four state-of-the-art methods on the ENCODE data. We report the first computational enhancer prediction results on FANTOM5 data, where DEEP achieves 90.2% accuracy and a 90% geometric mean (GM) of specificity and sensitivity across 36 different tissues. We further present results obtained using in vivo-derived enhancer data from the VISTA database. DEEP-VISTA, when tested on an independent test set, achieved a GM of 80.1% and an accuracy of 89.64%. The DEEP framework is publicly available at http://cbrc.kaust.edu.sa/deep/.
Citation
Dimitrios Kleftogiannis, Panos Kalnis, and Vladimir B. Bajic. DEEP: a general computational framework for predicting enhancers. Nucl. Acids Res. doi:10.1093/nar/gku1058
Sponsors
KAUST Research Funds via AEA KAUST-Stanford Round 3 Global Collaborative Research Program [to P.K., V.B.B.]; KAUST Base Research Funds [to P.K., V.B.B.]. Funding for open access charge: KAUST Research Funds via AEA KAUST-Stanford Round 3 Global Collaborative Research Program [to V.B.B.].
Publisher
Oxford University Press (OUP)
Journal
Nucleic Acids Research
PubMed ID
25378307
PubMed Central ID
PMC4288148
Additional Links
http://nar.oxfordjournals.org/lookup/doi/10.1093/nar/gku1058
DOI
10.1093/nar/gku1058
Related articles
- Opening up the blackbox: an interpretable deep neural network-based classifier for cell-type specific enhancer predictions.
  - Authors: Kim SG, Theera-Ampornpunt N, Fang CH, Harwani M, Grama A, Chaterji S
  - Issue date: 2016 Aug 1
- Enhancer prediction in the human genome by probabilistic modelling of the chromatin feature patterns.
  - Authors: Osmala M, Lähdesmäki H
  - Issue date: 2020 Jul 20
- Genome-wide enhancer prediction from epigenetic signatures using genetic algorithm-optimized support vector machines.
  - Authors: Fernández M, Miranda-Saavedra D
  - Issue date: 2012 May
- TELS: A Novel Computational Framework for Identifying Motif Signatures of Transcribed Enhancers.
  - Authors: Kleftogiannis D, Ashoor H, Bajic VB
  - Issue date: 2018 Oct
- Active enhancer positions can be accurately predicted from chromatin marks and collective sequence motif data.
  - Authors: Podsiadło A, Wrzesień M, Paja W, Rudnicki W, Wilczyński B
  - Issue date: 2013