Graph embedding with rich information through heterogeneous graph

Handle URI:
http://hdl.handle.net/10754/626207
Title:
Graph embedding with rich information through heterogeneous graph
Authors:
Sun, Guolei ( 0000-0001-8667-9656 )
Abstract:
Graph embedding, which aims to learn low-dimensional representations for the nodes of a graph, has attracted increasing attention because of its critical applications, including node classification, link prediction, and clustering in social network analysis. Most existing graph embedding algorithms rely only on topology information and fail to use the copious information in nodes and edges. As a result, their performance on many tasks may be unsatisfactory. In this thesis, we propose a novel and general framework for graph embedding with rich text information (GERI) that constructs a heterogeneous network in which node and edge content information is integrated with the graph topology. Specifically, we design a novel biased random walk that explores the constructed heterogeneous network using the notion of a flexible neighborhood. Our sampling strategy can trade off between BFS and DFS local search on the heterogeneous graph. To further improve the algorithm, we propose semi-supervised GERI (SGERI), which learns graph embeddings in a discriminative manner through a heterogeneous network with label information. The efficacy of our method is demonstrated by extensive comparison experiments against 9 baselines on multi-label and multi-class classification over various datasets, including Citeseer, Cora, DBLP, and Wiki. The results show that GERI improves the Micro-F1 and Macro-F1 of node classification by up to 10%, and that SGERI improves on GERI by 5% on Wiki.
Advisors:
Zhang, Xiangliang ( 0000-0002-3574-5665 )
Committee Member:
Gao, Xin ( 0000-0002-7108-3574 ) ; Moshkov, Mikhail ( 0000-0003-0085-9483 )
KAUST Department:
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Program:
Computer Science
Issue Date:
12-Nov-2017
Type:
Thesis
Appears in Collections:
Theses

Full metadata record

DC Field | Value | Language
dc.contributor.advisor | Zhang, Xiangliang | en
dc.contributor.author | Sun, Guolei | en
dc.date.accessioned | 2017-11-26T08:45:01Z | -
dc.date.available | 2017-11-26T08:45:01Z | -
dc.date.issued | 2017-11-12 | -
dc.identifier.uri | http://hdl.handle.net/10754/626207 | -
dc.description.abstract | (identical to the Abstract field above) | en
dc.language.iso | en | en
dc.subject | Graph embedding | en
dc.subject | heterogeneous graph | en
dc.subject | rich information | en
dc.subject | random walk | en
dc.title | Graph embedding with rich information through heterogeneous graph | en
dc.type | Thesis | en
dc.contributor.department | Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division | en
thesis.degree.grantor | King Abdullah University of Science and Technology | en
dc.contributor.committeemember | Gao, Xin | en
dc.contributor.committeemember | Moshkov, Mikhail | en
thesis.degree.discipline | Computer Science | en
thesis.degree.name | Master of Science | en
dc.person.id | 146204 | en
All Items in KAUST are protected by copyright, with all rights reserved, unless otherwise indicated.