DeepGSR: An optimized deep-learning structure for the recognition of genomic signals and regions
KAUST Department: Computational Bioscience Research Center (CBRC)
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Applied Mathematics and Computational Science Program
Permanent link to this record: http://hdl.handle.net/10754/656591
Description: Recognizing the different genomic signals and regions (GSRs) in DNA helps in understanding genome organization, gene regulation, and gene function. Accurate recognition of GSRs enables better genome and gene annotation. Although many methods have been developed to recognize GSRs, their purely computational identification remains challenging. Moreover, different GSRs usually require specialized sets of features for developing robust recognition models. Recently, deep learning (DL) methods have been shown to generate more accurate prediction models than ‘shallow’ methods, without the need to develop specialized features for the problems in question. Here, we explore the potential use of DL for the recognition of GSRs. We developed DeepGSR, an optimized DL architecture for the prediction of different types of GSRs. The performance of the DeepGSR structure is evaluated on the recognition of polyadenylation signals (PAS) and translation initiation sites (TIS) in different organisms: human, mouse, bovine and fruit fly. The results show that DeepGSR outperformed the state-of-the-art methods, reducing the classification error rate of PAS and TIS prediction in the human genome by up to 29% and 86%, respectively. Moreover, the cross-organism and genome-wide analyses we performed confirmed the robustness of DeepGSR and provided new insights into the conservation of the examined GSRs across species.
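The error-rate figures in the description are most naturally read as relative reductions against a baseline method, although the record itself does not spell out the formula. The sketch below shows that reading as plain arithmetic. The function name `relative_error_reduction` and the two input error rates are hypothetical and chosen only for illustration; they are not DeepGSR results.

```python
def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Return the fractional drop in error rate relative to the baseline.

    Assumption: "reducing the classification error rate by X%" means
    (baseline_error - new_error) / baseline_error.
    """
    if not 0.0 < baseline_error <= 1.0:
        raise ValueError("baseline_error must be in (0, 1]")
    if not 0.0 <= new_error <= 1.0:
        raise ValueError("new_error must be in [0, 1]")
    return (baseline_error - new_error) / baseline_error


# Hypothetical error rates, NOT taken from the DeepGSR evaluation.
baseline = 0.20   # a baseline classifier misclassifies 20% of samples
improved = 0.15   # a new classifier misclassifies 15% of samples

reduction = relative_error_reduction(baseline, improved)
print(f"Relative error reduction: {reduction:.0%}")
```

Under this reading, a five-point drop in absolute error from a 20% baseline counts as a 25% reduction, which is why relative figures can look large even when the absolute error rates are already small.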
Citation: Kalkatawi, Manal, Magana-Mora, Arturo, Jankovic, Boris, & Bajic, Vladimir. (2017). DeepGSR: An optimized deep-learning structure for the recognition of genomic signals and regions [Software]. Zenodo. http://doi.org/10.5281/zenodo.1117159
Sponsors: This work was supported by the King Abdullah University of Science and Technology (KAUST) through the baseline research fund BAS/1/1606-01-01 to Vladimir B. Bajic.
Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States