Embargo End Date: 2022-11-17
Permanent link to this record: http://hdl.handle.net/10754/673406
Access Restrictions: At the time of archiving, the student author of this dissertation opted to temporarily restrict access to it. The full text of this dissertation will become available to the public after the expiration of the embargo on 2022-11-17.
Abstract: The next generation of artificial intelligence is based on human knowledge and experience, which can help artificial intelligence evolve toward the capability of planning and reasoning. Although knowledge collection and organization have achieved tremendous progress, it is non-trivial to construct a comprehensive knowledge graph because of different data sources, various construction methods, and alternate entity surface forms. This difficulty motivates the study of knowledge association. Knowledge association has attracted the attention of researchers, and some solutions have been proposed, yet current solutions still suffer from two primary shortcomings: generalization and robustness. Specifically, most knowledge association methods require a sufficient amount of labeled data and ignore the effective exploration and utilization of complex relationships between entities. In addition, prevailing approaches rely on clean labeled data as the training set, making the model vulnerable to noise in the given labeled data. These drawbacks motivate the research on generalization and robustness of knowledge association in this dissertation. This dissertation explores two kinds of knowledge association tasks, entity alignment and entity synonym discovery, and makes innovative contributions to address the above drawbacks. First, semi-supervised entity alignment frameworks, which take advantage of both labeled and unlabeled entities, are proposed. One employs an entity-level loss based on the cycle-consistency translation loss; the other jointly minimizes both entity-level and group-level losses, using optimal transport theory to ease the strict constraint imposed by the cycle-consistency loss and to match the whole picture of labeled and unlabeled data in different data sources. Second, robust entity alignment methods are proposed to address the robustness drawback. One follows the adversarial training principle and leverages graph neural networks, and is optimized by a unified reinforced training strategy that combines its two components, noise detection and noise-aware entity alignment. The other resorts to non-sampling and curriculum learning to address the negative sampling issue and the positive data selection issue remaining in the previous method. Lastly, a set-aware entity synonym discovery model, which enables a flexible receptive field by making a breakthrough in using entity synonym set information, is proposed to explore the complex relationships between entities. The contextual information of entities and entity synonym sets is arranged by a two-level network, from which both can be mapped into the same space to facilitate synonym discovery by encoding the high-order contexts from flexible receptive fields.
Citation: Pei, S. (2021). Towards Generalized and Robust Knowledge Association. KAUST Research Repository. https://doi.org/10.25781/KAUST-06414