CULG: Commercial Universal Language Generation
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track
  • Haonan Li, School of Computing and Information Systems, The University of Melbourne, Australia
  • Yameng Huang, Microsoft, United States
  • Yeyun Gong, Microsoft Research Asia, China
  • Jian Jiao, Microsoft, United States
  • Ruofei Zhang, Microsoft, United States
  • Timothy Baldwin, School of Computing and Information Systems, The University of Melbourne, Australia & Mohamed bin Zayed University of Artificial Intelligence
  • Nan Duan, Microsoft Research Asia, China
Document Type
Conference Proceeding
Abstract

Pre-trained language models (PLMs) have dramatically improved performance for many natural language processing (NLP) tasks in domains such as finance and healthcare. However, the application of PLMs in the domain of commerce, especially marketing and advertising, remains less studied. In this work, we adapt pretraining methods to the domain of commerce, by proposing CULG, a large-scale commercial universal language generation model which is pre-trained on a corpus drawn from 10 markets across 7 languages. We propose 4 commercial generation tasks and a two-stage training strategy for pre-training, and demonstrate that the proposed strategy yields performance improvements on three generation tasks as compared to single-stage pre-training. Extensive experiments show that our model outperforms other models by a large margin on commercial generation tasks. © 2022 Association for Computational Linguistics.

DOI
10.18653/v1/2022.naacl-industry.14
Publication Date
7-1-2022
Keywords
  • Computational linguistics
  • Marketing
  • Natural language processing systems
Comments

IR Deposit conditions: non-described

Citation Information
H. Li et al., "CULG: Commercial Universal Language Generation", in Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track (NAACL 2022), July 2022, pp. 112–120, doi:10.18653/v1/2022.naacl-industry.14