dc.contributor.author: Kafkas, Senay
dc.contributor.author: Hoehndorf, Robert
dc.date.accessioned: 2019-03-11T07:13:54Z
dc.date.available: 2019-03-11T07:13:54Z
dc.date.issued: 2019-02-27
dc.identifier.citation: Kafkas Ş, Hoehndorf R (2019) Ontology based text mining of gene-phenotype associations: application to candidate gene prediction. Database 2019. Available: http://dx.doi.org/10.1093/database/baz019.
dc.identifier.issn: 1758-0463
dc.identifier.doi: 10.1093/database/baz019
dc.identifier.uri: http://hdl.handle.net/10754/631523
dc.description.abstract: Gene–phenotype associations play an important role in understanding the disease mechanisms which is a requirement for treatment development. A portion of gene–phenotype associations are observed mainly experimentally and made publicly available through several standard resources such as MGI. However, there is still a vast amount of gene–phenotype associations buried in the biomedical literature. Given the large amount of literature data, we need automated text mining tools to alleviate the burden in manual curation of gene–phenotype associations and to develop comprehensive resources. In this study, we present an ontology-based approach in combination with statistical methods to text mine gene–phenotype associations from the literature. Our method achieved AUC values of 0.90 and 0.75 in recovering known gene–phenotype associations from HPO and MGI respectively. We posit that candidate genes and their relevant diseases should be expressed with similar phenotypes in publications. Thus, we demonstrate the utility of our approach by predicting disease candidate genes based on the semantic similarities of phenotypes associated with genes and diseases. To the best of our knowledge, this is the first study using an ontology based approach to extract gene–phenotype associations from the literature. We evaluated our disease candidate prediction model on the gene–disease associations from MGI. Our model achieved AUC values of 0.90 and 0.87 on OMIM (human) and MGI (mouse) datasets of gene–disease associations respectively. Our manual analysis on the text mined data revealed that our method can accurately extract gene–phenotype associations which are not currently covered by the existing public gene–phenotype resources. Overall, results indicate that our method can precisely extract known as well as new gene–phenotype associations from literature. All the data and methods are available at https://github.com/bio-ontology-research-group/genepheno.
dc.description.sponsorship: This work was supported by funding from King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award No. URF/1/3454-01-01 and FCC/1/1976-08-01.
dc.publisher: Oxford University Press (OUP)
dc.relation.url: https://academic.oup.com/database/article/doi/10.1093/database/baz019/5365528
dc.rights: This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.title: Ontology based text mining of gene-phenotype associations: application to candidate gene prediction
dc.type: Article
dc.contributor.department: Bio-Ontology Research Group (BORG)
dc.contributor.department: Computational Bioscience Research Center (CBRC)
dc.contributor.department: Computer Science Program
dc.contributor.department: Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
dc.identifier.journal: Database
dc.eprint.version: Publisher's Version/PDF
kaust.person: Kafkas, Senay
kaust.person: Hoehndorf, Robert
kaust.grant.number: URF/1/3454-01-01
kaust.grant.number: FCC/1/1976-08-01
dc.relation.issupplementedby: github:bio-ontology-research-group/genepheno
refterms.dateFOA: 2019-03-12T13:47:50Z
display.relations: Is Supplemented By: [Software] Title: bio-ontology-research-group/genepheno: this repository contains text mined gene-phenotype data. Publication Date: 2018-10-07. github: https://github.com/bio-ontology-research-group/genepheno Handle: http://hdl.handle.net/10754/668122
dc.date.published-online: 2019-02-27
dc.date.published-print: 2019-01-01


Files in this item

Name: baz019.pdf
Size: 2.060Mb
Format: PDF
Description: Published version
