
dc.contributor.author: Xia, Zhihao
dc.contributor.author: Li, Yu
dc.contributor.author: Zhang, Bin
dc.contributor.author: Li, Zhongxiao
dc.contributor.author: Hu, Yuhui
dc.contributor.author: Chen, Wei
dc.contributor.author: Gao, Xin
dc.date.accessioned: 2018-12-05T08:07:56Z
dc.date.available: 2018-12-05T08:07:56Z
dc.date.issued: 2018-11-29
dc.identifier.citation: Xia Z, Li Y, Zhang B, Li Z, Hu Y, et al. (2018) DeeReCT-PolyA: a robust and generic deep learning method for PAS identification. Bioinformatics. Available: http://dx.doi.org/10.1093/bioinformatics/bty991.
dc.identifier.issn: 1367-4803
dc.identifier.issn: 1460-2059
dc.identifier.doi: 10.1093/bioinformatics/bty991
dc.identifier.uri: http://hdl.handle.net/10754/630192
dc.description.abstract: Motivation: Polyadenylation is a critical step for gene expression regulation during the maturation of mRNA. An accurate and robust method for poly(A) signal (PAS) identification is not only desired for better annotation of transcript ends, but can also help us gain deeper insight into the underlying regulatory mechanism. Although many methods have been proposed for PAS recognition, most of them are PAS motif-specific and human-specific, which leads to high risks of overfitting, low generalization power, and inability to reveal the connections between the underlying mechanisms of different mammals. Results: In this work, we propose a robust, PAS motif-agnostic, highly interpretable, and transferable deep learning model for accurate PAS recognition, which requires no prior knowledge or human-designed features. We show that our single model trained over all human PAS motifs not only outperforms the state-of-the-art methods trained on specific motifs, but also generalizes well to two mouse data sets. Moreover, we further increase the prediction accuracy by transferring the deep learning model trained on the data of one species to the data of a different species. Several novel underlying poly(A) patterns are revealed through the visualization of important oligomers and positions in our trained models. Finally, we interpret the deep learning models by converting the convolutional filters into sequence logos and quantitatively compare the sequence logos between human and mouse data sets.
dc.description.sponsorship: We would like to thank Jeffery Jung and Min Zhang for insightful discussion. This work was supported by the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Awards No. FCC/1/1976-04, URF/1/2602-01, URF/1/3007-01, URF/1/3412-01, URF/1/3450-01 and URF/1/3454-01. Y.H. was supported by the International Cooperation Research Grant (No. GJHZ20170310161947503) from Science and Technology Innovation Commission of Shenzhen Municipal Government. W.C. was supported by Basic Research Grant (JCYJ20170307105752508) from Science and Technology Innovation Commission of Shenzhen Municipal Government.
dc.publisher: Oxford University Press (OUP)
dc.relation.url: https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/bty991/5221014
dc.rights: This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/
dc.title: DeeReCT-PolyA: a robust and generic deep learning method for PAS identification
dc.type: Article
dc.contributor.department: Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
dc.contributor.department: Computer Science Program
dc.contributor.department: Computational Bioscience Research Center (CBRC)
dc.identifier.journal: Bioinformatics
dc.eprint.version: Post-print
dc.contributor.institution: Department of Computer Science and Engineering (CSE), Washington University in St. Louis, St. Louis, MO, USA
dc.contributor.institution: Department of Biology, Southern University of Science and Technology (SUSTC), Shenzhen, China
kaust.person: Li, Yu
kaust.person: Li, Zhongxiao
kaust.person: Gao, Xin
kaust.grant.number: FCC/1/1976-04
kaust.grant.number: URF/1/2602-01
kaust.grant.number: URF/1/3007-01
kaust.grant.number: URF/1/3412-01
kaust.grant.number: URF/1/3450-01
kaust.grant.number: URF/1/3454-01
refterms.dateFOA: 2018-12-05T08:15:20Z


Files in this item

Name: bty991.pdf
Size: 2.871Mb
Format: PDF
Description: Accepted Manuscript

Name: bty991_supp.pdf
Size: 2.234Mb
Format: PDF
Description: Supplemental files

