Keywords and Instances: A Hierarchical Contrastive Learning Framework Unifying Hybrid Granularities for Text Generation
KAUST Department: KAUST, Ant Grp, Thuwal, Saudi Arabia
Computational Bioscience Research Center (CBRC)
Permanent link to this record: http://hdl.handle.net/10754/681592
Abstract: Contrastive learning has achieved impressive success in generation tasks, where it is used to mitigate the "exposure bias" problem and to discriminatively exploit references of differing quality. Existing work mostly applies contrastive learning at the instance level without discriminating the contribution of each word, even though keywords are the gist of the text and dominate the constrained mapping relationships. Hence, in this work, we propose a hierarchical contrastive learning mechanism that can unify semantic meaning at hybrid granularities in the input text. Concretely, we first propose a keyword graph built from contrastive correlations of positive-negative pairs to iteratively polish the keyword representations. Then, we construct intra-contrasts at the instance level and the keyword level, where we assume words are nodes sampled from a sentence distribution. Finally, to bridge the gap between the independent contrast levels and tackle the common contrast vanishing problem, we propose an inter-contrast mechanism that measures the discrepancy between contrastive keyword nodes with respect to the instance distribution. Experiments demonstrate that our model outperforms competitive baselines on paraphrasing, dialogue generation, and storytelling tasks.
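For readers unfamiliar with contrastive objectives, the following is a minimal sketch of a generic instance-level contrastive loss in the InfoNCE style. It is not the hierarchical objective proposed in the paper; the function names (`cosine`, `info_nce`), the temperature value, and the toy two-dimensional vectors are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style loss (illustrative, not the paper's objective).

    The loss is small when the anchor is closer to the positive
    than to any of the negatives, and large otherwise.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# Toy embeddings (hypothetical): the positive points in nearly the same
# direction as the anchor, the negatives do not.
anchor = [1.0, 0.0]
positive = [0.9, 0.1]
negatives = [[0.0, 1.0], [-1.0, 0.2]]

loss_good = info_nce(anchor, positive, negatives)
# Swap roles: treat an unrelated vector as the "positive".
loss_bad = info_nce(anchor, negatives[0], [positive, negatives[1]])
```

In this toy setting `loss_good` is much smaller than `loss_bad`, which is the behavior a contrastive objective rewards. The abstract describes applying contrasts of this general kind at both the instance and keyword levels, plus an inter-contrast term linking the two.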
Citation: Li, M., Lin, X., Chen, X., Chang, J., Zhang, Q., Wang, F., Wang, T., Liu, Z., Chu, W., Zhao, D., & Yan, R. (2022). Keywords and Instances: A Hierarchical Contrastive Learning Framework Unifying Hybrid Granularities for Text Generation. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). https://doi.org/10.18653/v1/2022.acl-long.304
Sponsors: We would like to thank the anonymous reviewers for their constructive comments. This work was supported by the National Key Research and Development Program of China (No. 2020AAA0106600), the National Natural Science Foundation of China (NSFC Grant No. 62122089, No. 61832017 & No. 61876196), and the Beijing Outstanding Young Scientist Program (No. BJJWZYJH012019100020098). This work was also supported by Alibaba Group through the Alibaba Research Intern Program.
Conference/Event name: 60th Annual Meeting of the Association for Computational Linguistics (ACL)
Except where otherwise noted, this item's license is described as Archived with thanks to Association for Computational Linguistics under a Creative Commons license, details at: https://creativecommons.org/licenses/by/4.0/