Type
Dissertation
Authors
Pei, Shichao
Advisors
Zhang, Xiangliang
Committee members
Moshkov, Mikhail
Hoehndorf, Robert

Zhuang, Fuzhen
Program
Computer Science
Date
2021-11-17
Embargo End Date
2022-11-17
Permanent link to this record
http://hdl.handle.net/10754/673406
Access Restrictions
At the time of archiving, the student author of this dissertation opted to temporarily restrict access to it. The full text of this dissertation will become available to the public after the expiration of the embargo on 2022-11-17.
Abstract
The next generation of artificial intelligence is based on human knowledge and experience, which can help artificial intelligence evolve toward the capability of planning and reasoning. Although knowledge collection and organization have achieved tremendous progress, it is non-trivial to construct a comprehensive knowledge graph because of different data sources, various construction methods, and alternate entity surface forms. This difficulty motivates the study of knowledge association. Knowledge association has attracted the attention of researchers, and some solutions have been proposed, yet current solutions still suffer from two primary shortcomings: generalization and robustness. Specifically, most knowledge association methods require a sufficient amount of labeled data and ignore the effective exploration and utilization of complex relationships between entities. Besides, prevailing approaches rely on clean labeled data as the training set, making the model vulnerable to noise in the given labeled data. These drawbacks motivate the research on generalization and robustness of knowledge association in this dissertation.

This dissertation explores two kinds of knowledge association tasks, i.e., entity alignment and entity synonym discovery, and makes innovative contributions to address the above drawbacks. First, semi-supervised entity alignment frameworks, which take advantage of both labeled and unlabeled entities, are proposed. One employs an entity-level loss that is based on the cycle-consistency translation loss. The other jointly minimizes both entity-level and group-level losses by utilizing optimal transport theory to ease the strict constraint imposed by the cycle-consistency loss and to match the whole picture of labeled and unlabeled data in different data sources. Second, robust entity alignment methods are proposed to address the lack of robustness. One is designed by following the adversarial training principle and leveraging graph neural networks, and is optimized by a unified reinforced training strategy that combines its two components, i.e., noise detection and noise-aware entity alignment. The other resorts to non-sampling and curriculum learning to address the negative sampling issue and the positive data selection issue remaining in the previous method. Lastly, a set-aware entity synonym discovery model is proposed to explore the complex relationships between entities; it enables a flexible receptive field by making use of entity synonym set information. The contextual information of entities and entity synonym sets is arranged by a two-level network, through which both can be mapped into the same space to facilitate synonym discovery by encoding the high-order contexts from flexible receptive fields.
Citation
Pei, S. (2021). Towards Generalized and Robust Knowledge Association. KAUST Research Repository. https://doi.org/10.25781/KAUST-06414
DOI
10.25781/KAUST-06414