    Learning from Scholarly Attributed Graphs for Scientific Discovery

    Name:
    Final_Dissertation_Uchenna_Final.pdf
    Size:
    7.149Mb
    Format:
    PDF
    Type
    Dissertation
    Authors
    Akujuobi, Uchenna Thankgod
    Advisors
    Zhang, Xiangliang
    Committee members
    Moshkov, Mikhail
    Hoehndorf, Robert
    Zhang, Min
    Program
    Computer Science
    KAUST Department
    Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
    Date
    2020-10-18
    Permanent link to this record
    http://hdl.handle.net/10754/665605
    
    Abstract
    Research and experimentation in various scientific fields build on the knowledge and ideas in scholarly literature. The advancement of research and development has thus strengthened the importance of literature analysis and understanding. However, in recent years, researchers have faced a massive volume of scholarly documents published at an exponentially increasing rate. Analyzing this vast number of publications is far beyond the capability of individual researchers. This dissertation is motivated by the need for large-scale analysis of the rapidly growing body of scholarly literature for scientific knowledge discovery. In the first part of this dissertation, the interdependencies between scholarly literature are studied. First, I develop Delve, a data-driven search engine supported by a semi-supervised edge classification method that we designed. This system enables users to search for and analyze the relationships between datasets and scholarly literature. Based on the Delve system, I propose to study information extraction as a node classification problem in attributed networks. Specifically, if we can learn the research topics of documents (nodes in a network), we can aggregate documents by topic and retrieve information specific to each topic (e.g., the top-k popular datasets). Node classification in attributed networks has several challenges: a limited number of labeled nodes, effective fusion of topological structure and node/edge attributes, and the co-existence of multiple labels for one node. Existing node classification approaches address, or only partially address, a few of these challenges. This dissertation addresses these challenges by proposing semi-supervised multi-class/multi-label node classification models that integrate node/edge attributes and topological relationships.
    The second part of this dissertation examines the problem of analyzing the interdependencies between terms in scholarly literature. I present two algorithms for the automatic hypothesis generation (HG) problem, which refers to the discovery of meaningful implicit connections between scientific terms, including but not limited to diseases, drugs, and genes extracted from databases of biomedical publications. The automatic hypothesis generation problem is modeled as future connectivity prediction in a dynamic attributed graph. The key is to capture the temporal evolution of node-pair (term-pair) relations. Experimental results and case study analyses highlight the effectiveness of the proposed algorithms compared with extensions of the baseline methods.
    Citation
    Akujuobi, U. T. (2020). Learning from Scholarly Attributed Graphs for Scientific Discovery. KAUST Research Repository. https://doi.org/10.25781/KAUST-6IIR9
    DOI
    10.25781/KAUST-6IIR9
    Collections
    Dissertations; Computer Science Program; Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division


     

    Related items

    Showing items related by title, author, creator and subject.

    • SupportNet: a novel incremental learning framework through deep learning and support data

      Li, Yu; Li, Zhongxiao; Ding, Lizhong; Hu, Yuhui; Chen, Wei; Gao, Xin (Cold Spring Harbor Laboratory, 2018-05-08) [Preprint]
      Motivation: In most biological data sets, the amount of data is regularly growing and the number of classes is continuously increasing. To deal with the new data from the new classes, one approach is to train a classification model, e.g., a deep learning model, from scratch based on both old and new data. This approach is highly computationally costly and the extracted features are likely very different from the ones extracted by the model trained on the old data alone, which leads to poor model robustness. Another approach is to fine-tune the model trained on the old data using the new data. However, this approach often cannot learn new knowledge without forgetting previously learned knowledge, which is known as the catastrophic forgetting problem. To our knowledge, this problem has not been studied in the field of bioinformatics despite its existence in many bioinformatic problems. Results: Here we propose a novel method, SupportNet, to solve the catastrophic forgetting problem efficiently and effectively. SupportNet combines the strengths of deep learning and the support vector machine (SVM), where the SVM is used to identify the support data from the old data, which are fed to the deep learning model together with the new data for further training so that the model can review the essential information of the old data when learning the new information. Two powerful consolidation regularizers are applied to ensure the robustness of the learned model. Comprehensive experiments on various tasks, including enzyme function prediction, subcellular structure classification and breast tumor classification, show that SupportNet drastically outperforms the state-of-the-art incremental learning methods and reaches performance similar to that of the deep learning model trained from scratch on both old and new data. Availability: Our program is accessible at: https://github.com/lykaust15/SupportNet.
    • Learning from Your Network of Friends: A Trajectory Representation Learning Model Based on Online Social Ties

      Alharbi, Basma Mohammed; Zhang, Xiangliang (2016 IEEE 16th International Conference on Data Mining (ICDM), Institute of Electrical and Electronics Engineers (IEEE), 2017-02-07) [Conference Paper]
      Location-Based Social Networks (LBSNs) capture individuals' whereabouts for a large portion of the population. To utilize this data for user (location)-similarity based tasks, one must map the raw data into a low-dimensional uniform feature space. However, due to the nature of LBSNs, many users have sparse and incomplete check-ins. In this work, we propose to overcome this issue by leveraging the network of friends when learning the new feature space. We first analyze the impact of friends on individuals' mobility, and show that individuals' trajectories are correlated with those of their friends and friends of friends (2-hop friends) in an online setting. Based on our observation, we propose a mixed-membership model that infers global mobility patterns from users' check-ins and their network of friends, without impairing the model's complexity. Our proposed model infers global patterns and learns new representations for both users and locations simultaneously. We evaluate the inferred patterns and compare the quality of the new user representation against baseline methods on a social link prediction problem.
    • Representation learning with deep extreme learning machines for efficient image set classification

      Uzair, Muhammad; Shafait, Faisal; Ghanem, Bernard; Mian, Ajmal (Neural Computing and Applications, Springer Nature, 2016-12-09) [Article]
      Efficient and accurate representation of a collection of images that belong to the same class is a major research challenge for practical image set classification. Existing methods either make prior assumptions about the data structure or perform heavy computations to learn structure from the data itself. In this paper, we propose an efficient image set representation that does not make any prior assumptions about the structure of the underlying data. We learn the nonlinear structure of image sets with deep extreme learning machines that are very efficient and generalize well even on a limited number of training samples. Extensive experiments on a broad range of public datasets for image set classification show that the proposed algorithm consistently outperforms state-of-the-art image set classification methods in terms of both speed and accuracy.