
dc.contributor.advisor: Zhang, Xiangliang
dc.contributor.author: Akujuobi, Uchenna Thankgod
dc.date.accessioned: 2016-05-03T12:40:19Z
dc.date.available: 2016-05-03T12:40:19Z
dc.date.issued: 2016-04-12
dc.identifier.citation: Akujuobi, U. T. (2016). Link Label Prediction in Signed Citation Network. KAUST Research Repository. https://doi.org/10.25781/KAUST-TZW78
dc.identifier.doi: 10.25781/KAUST-TZW78
dc.identifier.uri: http://hdl.handle.net/10754/607760
dc.description.abstract: Link label prediction is the problem of predicting the missing labels or signs of all the unlabeled edges in a network. For signed networks, these labels can be either positive or negative. In recent years, various algorithms have been proposed, based on regression, trust propagation, and matrix factorization. These approaches draw on ideas from social theories, and most of them predict a single missing label given that the labels of the other edges are known. However, in most real-world social graphs, the number of labeled edges is usually smaller than the number of unlabeled edges. Predicting a single edge label at a time would therefore require multiple runs and be more computationally demanding. In this thesis, we study the link label prediction problem on a signed citation network with missing edge labels. Our citation network consists of papers from three major machine learning and data mining conferences together with their references, and edges showing the relationships between them. An edge in our network is labeled positive (dataset-relevant) if the reference is based on the dataset used in the paper, and negative otherwise. We present three approaches to predict the missing labels. The first approach converts the label prediction problem into a standard classification problem: we generate a set of features for each edge and then use Support Vector Machines to solve the classification problem. In the second approach, we formalize the graph such that the edges are represented as nodes, with links showing the similarities between them. We then adopt a label propagation method to propagate the labels of known nodes to those with unknown labels. In the third approach, we adopt a PageRank approach in which we rank the nodes according to the number of incoming positive and negative edges, after which we set a threshold. Based on the ranks, we infer that an edge is positive if it points to a node ranked above the threshold. Experimental results on our citation network corroborate the efficacy of these approaches. With each edge having a label, we also performed additional network analysis in which we extracted a subnetwork of the dataset-relevant edges and nodes in our citation network, and then detected different communities in this extracted subnetwork. To understand the detected communities, we performed a case study on several dataset communities. The study shows a relationship between the major topic areas in a dataset community and the data sources in the community.
dc.language.iso: en
dc.subject: link label prediction
dc.subject: citation network
dc.subject: signed network
dc.subject: label propagation
dc.subject: pageRank
dc.subject: SVM
dc.title: Link Label Prediction in Signed Citation Network
dc.type: Thesis
dc.contributor.department: Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division
thesis.degree.grantor: King Abdullah University of Science and Technology
dc.contributor.committeemember: Moshkov, Mikhail
dc.contributor.committeemember: Gao, Xin
thesis.degree.discipline: Computer Science
thesis.degree.name: Master of Science
refterms.dateFOA: 2018-06-13T12:21:52Z
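
The second approach in the abstract (citation edges treated as nodes, with labels propagated along similarity links) can be illustrated with a small sketch. This is not the thesis's implementation: the function name `propagate_labels`, the similarity weights, the iteration count, and the toy edge names are all assumptions made for illustration, and the abstract does not say how similarity between edges is computed.

```python
def propagate_labels(similarity, known, iterations=50):
    """Spread +1/-1 labels over a similarity graph whose nodes are citation edges.

    similarity: dict mapping each item to {neighbor: non-negative weight}
                (hypothetical weights; the similarity measure is an assumption).
    known:      dict mapping labeled items to +1 or -1.
    Returns a dict mapping every item to +1 or -1.
    """
    # Unlabeled items start at a neutral score of 0.
    scores = {item: float(known.get(item, 0.0)) for item in similarity}

    for _ in range(iterations):
        updated = {}
        for item, neighbors in similarity.items():
            if item in known:
                # Known labels are clamped and never overwritten.
                updated[item] = float(known[item])
                continue
            total = sum(neighbors.values())
            if total == 0:
                # Isolated item: nothing to propagate, keep its score.
                updated[item] = scores[item]
                continue
            # Weighted average of the neighbors' current scores.
            updated[item] = sum(
                w * scores.get(n, 0.0) for n, w in neighbors.items()
            ) / total
        scores = updated

    # Sign of the final score gives the label; ties at 0 default to +1.
    return {item: (1 if s >= 0 else -1) for item, s in scores.items()}


# Toy example: four citation edges, two of them labeled.
similarity = {
    "A->D1": {"B->D1": 1.0},
    "B->D1": {"A->D1": 1.0, "C->P2": 0.1},
    "C->P2": {"B->D1": 0.1, "E->P2": 1.0},
    "E->P2": {"C->P2": 1.0},
}
known = {"A->D1": 1, "E->P2": -1}

labels = propagate_labels(similarity, known)
```

In this toy graph, the unlabeled edge `B->D1` is strongly similar to the positive edge `A->D1`, and `C->P2` is strongly similar to the negative edge `E->P2`, so each takes the label of its closest labeled neighbor. All unlabeled edges are resolved in one pass, rather than one edge per run.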


Files in this item

Name: Thesis.pdf
Size: 3.522Mb
Format: PDF
