dc.contributor.author: Umarov, Ramzan
dc.contributor.author: Li, Yu
dc.contributor.author: Arakawa, Takahiro
dc.contributor.author: Takizawa, Satoshi
dc.contributor.author: Gao, Xin
dc.contributor.author: Arner, Erik
dc.date.accessioned: 2021-09-08T11:08:23Z
dc.date.available: 2021-09-08T11:08:23Z
dc.date.issued: 2021-09-07
dc.date.submitted: 2021-06-22
dc.identifier.citation: Umarov, R., Li, Y., Arakawa, T., Takizawa, S., Gao, X., & Arner, E. (2021). ReFeaFi: Genome-wide prediction of regulatory elements driving transcription initiation. PLOS Computational Biology, 17(9), e1009376. doi:10.1371/journal.pcbi.1009376
dc.identifier.issn: 1553-7358
dc.identifier.doi: 10.1371/journal.pcbi.1009376
dc.identifier.uri: http://hdl.handle.net/10754/671122
dc.description.abstract: Regulatory elements control gene expression through transcription initiation (promoters) and by enhancing transcription at distant regions (enhancers). Accurate identification of regulatory elements is fundamental for annotating genomes and understanding gene expression patterns. While there are many attempts to develop computational promoter and enhancer identification methods, reliable tools to analyze long genomic sequences are still lacking. Prediction methods often perform poorly on the genome-wide scale because the number of negatives is much higher than that in the training sets. To address this issue, we propose a dynamic negative set updating scheme with a two-model approach, using one model for scanning the genome and the other one for testing candidate positions. The developed method achieves good genome-level performance and maintains robust performance when applied to other vertebrate species, without re-training. Moreover, the unannotated predicted regulatory regions made on the human genome are enriched for disease-associated variants, suggesting them to be potentially true regulatory elements rather than false positives. We validated high scoring “false positive” predictions using reporter assay and all tested candidates were successfully validated, demonstrating the ability of our method to discover novel human regulatory regions.
dc.description.sponsorship: The author(s) received no specific funding for this work.
dc.description.sponsorship: We would like to thank Andrew Tae-Jun Kwon and Bogumil Kaczkowski for insightful comments on the manuscript.
dc.publisher: Public Library of Science (PLoS)
dc.relation.url: https://dx.plos.org/10.1371/journal.pcbi.1009376
dc.rights: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.title: ReFeaFi: Genome-wide prediction of regulatory elements driving transcription initiation
dc.type: Article
dc.contributor.department: Computational Bioscience Research Center (CBRC)
dc.contributor.department: Computer Science Program
dc.contributor.department: Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division
dc.contributor.department: Structural and Functional Bioinformatics Group
dc.identifier.journal: PLOS Computational Biology
dc.eprint.version: Publisher's Version/PDF
dc.contributor.institution: Graduate School of Integrated Sciences for Life, Hiroshima University, Higashi-Hiroshima, Japan.
dc.contributor.institution: Department of Computer Science and Engineering (CSE), The Chinese University of Hong Kong (CUHK), Hong Kong, People’s Republic of China.
dc.contributor.institution: Laboratory for Applied Regulatory Genomics Network Analysis, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan.
dc.identifier.volume: 17
dc.identifier.issue: 9
dc.identifier.pages: e1009376
kaust.person: Gao, Xin
dc.date.accepted: 2021-08-23
refterms.dateFOA: 2021-09-08T11:09:31Z


Files in this item

Name: journal.pcbi.1009376.pdf
Size: 7.726Mb
Format: PDF
Description: Publisher's version
