    Machine Learning Models for Biomedical Ontology Integration and Analysis

    Name:
    FatimaZohraSmailiThesis.pdf
    Size:
    15.56Mb
    Format:
    PDF
    Type
    Dissertation
    Authors
    Smaili, Fatima Z.
    Advisors
    Gao, Xin
    Committee members
    Rzhetsky, Andrey
    Hoehndorf, Robert
    Arold, Stefan T.
    Program
    Computer Science
    KAUST Department
    Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division
    Date
    2020-09-13
    Permanent link to this record
    http://hdl.handle.net/10754/665189
    Abstract
    Biological knowledge is widely represented in the form of ontologies and ontology-based annotations. Biomedical ontologies describe known phenomena in biology using formal axioms, and the annotations associate an entity (e.g. a gene, disease or chemical) with a set of biological concepts. In addition to formally structured axioms, ontologies contain meta-data in the form of annotation properties, expressed mostly in natural language, which provide valuable information that characterizes ontology concepts. The structure and information contained in ontologies and their annotations make them valuable for use in machine learning, data analysis and knowledge extraction tasks. I develop the first approaches that can exploit all of the information encoded in ontologies, both formal and informal, to learn feature embeddings of biological concepts and biological entities based on their annotations to ontologies. Notably, I develop the first approach to use all the formal content of ontologies, in the form of logical axioms and entity annotations, to generate feature vectors of biological entities using neural language models. I extend the proposed algorithm by enriching the obtained feature vectors through representing the natural language annotation properties within the ontology meta-data as axioms. Transfer learning is then applied to learn from the biomedical literature and transfer what is learned to the formal knowledge of ontologies. To optimize learning that combines the formal content of biomedical ontologies and natural language data such as the literature, I also propose a new approach that uses self-normalization with a deep Siamese neural network to improve learning from both the formal knowledge within ontologies and textual data.
I validate the proposed algorithms by applying them to the Gene Ontology to generate feature vectors of proteins based on their functions, and to the PhenomeNet ontology to generate features of genes and diseases based on the phenotypes they are associated with. The generated features are then used to train a variety of machine learning-based classifiers to perform different prediction tasks, including the prediction of protein interactions, gene–disease associations and the toxicological effects of chemicals. I also use the proposed methods to conduct the first quantitative evaluation of the quality of the axioms and meta-data included in ontologies, to show that including axioms as background knowledge improves ontology-based prediction. The proposed approaches can be applied to a wide range of other bioinformatics research problems, including similarity-based prediction, classification of interaction types using supervised learning, and clustering.
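As a rough illustration of the similarity-based prediction mentioned above, the following minimal sketch ranks candidate diseases for a gene by the cosine similarity of their feature vectors. It is not taken from the dissertation: the vectors are made-up toy values, and the names `cosine`, `rank_by_similarity`, `gene_vec` and `disease_vecs` are hypothetical. In practice the vectors would come from an embedding method such as the ones the abstract describes.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, candidates):
    """Return (name, score) pairs sorted from most to least similar to `query`."""
    scored = [(name, cosine(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy, hand-written vectors (assumption: real ones would be learned embeddings).
gene_vec = [0.9, 0.1, 0.3]
disease_vecs = {
    "disease_A": [0.8, 0.2, 0.4],
    "disease_B": [-0.5, 0.9, 0.0],
    "disease_C": [0.1, 0.1, 0.9],
}

ranking = rank_by_similarity(gene_vec, disease_vecs)
for name, score in ranking:
    print(f"{name}\t{score:.3f}")
```

With these toy values, the candidate whose vector points in nearly the same direction as the gene vector is ranked first. A supervised classifier trained on pairs of vectors would be an alternative to this unsupervised ranking.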
    Citation
    Smaili, F. Z. (2020). Machine Learning Models for Biomedical Ontology Integration and Analysis. KAUST Research Repository. https://doi.org/10.25781/KAUST-V156A
    DOI
    10.25781/KAUST-V156A
    Collections
    PhD Dissertations; Computer Science Program; Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division