Semantic Disease Gene Embeddings (SmuDGE): phenotype-based disease gene prioritization without phenotypes
Type
Article
Authors
AlShahrani, Mona
Hoehndorf, Robert

KAUST Department
Bio-Ontology Research Group (BORG)
Computational Bioscience Research Center (CBRC)
Computer Science Program
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
KAUST Grant Number
URF/1/3454-01-01
FCC/1/1976-08-01
Date
2018-09-08
Online Publication Date
2018-09-08
Print Publication Date
2018-09-01
Permanent link to this record
http://hdl.handle.net/10754/627767
Abstract
Motivation: In the past years, several methods have been developed to incorporate information about phenotypes into computational disease gene prioritization methods. These methods commonly compute the similarity between a disease's (or patient's) phenotypes and a database of gene-to-phenotype associations to find the phenotypically most similar match. A key limitation of these methods is their reliance on knowledge about phenotypes associated with particular genes which is highly incomplete in humans as well as in many model organisms such as the mouse.
Results: We developed SmuDGE, a method that uses feature learning to generate vector-based representations of phenotypes associated with an entity. SmuDGE can be used as a trainable semantic similarity measure to compare two sets of phenotypes (such as between a disease and gene, or a disease and patient). More importantly, SmuDGE can generate phenotype representations for entities that are only indirectly associated with phenotypes through an interaction network; for this purpose, SmuDGE exploits background knowledge in interaction networks comprised of multiple types of interactions. We demonstrate that SmuDGE can match or outperform semantic similarity in phenotype-based disease gene prioritization, and furthermore significantly extends the coverage of phenotype-based methods to all genes in a connected interaction network.
Availability and implementation: https://github.com/bio-ontology-research-group/SmuDGE
Citation
Alshahrani M, Hoehndorf R (2018) Semantic Disease Gene Embeddings (SmuDGE): phenotype-based disease gene prioritization without phenotypes. Bioinformatics 34: i901–i907. Available: http://dx.doi.org/10.1093/bioinformatics/bty559.
Sponsors
This work was supported by funding from King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award No. URF/1/3454-01-01 and FCC/1/1976-08-01.
Publisher
Oxford University Press (OUP)
Journal
Bioinformatics
Additional Links
https://academic.oup.com/bioinformatics/article/34/17/i901/5093225
Relations
Is Supplemented By:
[Software] Title: bio-ontology-research-group/SMUDGE: SmuDGE: Semantic Disease Gene Embeddings. Publication Date: 2017-10-31. GitHub: bio-ontology-research-group/SMUDGE. Handle: 10754/667032
10.1093/bioinformatics/bty559
Except where otherwise noted, this item's license is described as follows: This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.