Article
Extracting Paraphrases of Technical Terms from Noisy Parallel Software Corpus
CLShort '09: Proceedings of the ACL-IJCNLP 2009 Conference Short Papers
  • Xiaoyin WANG
  • David LO, Singapore Management University
  • Jing JIANG, Singapore Management University
  • LU ZHANG
  • Hong Mei
Publication Type
Conference Paper
Version
Publisher’s Version
Publication Date
8-2009
Abstract

In this paper, we study the problem of extracting technical paraphrases from a parallel software corpus, namely, a collection of duplicate bug reports. Paraphrase acquisition is a fundamental task in the emerging area of text mining for software engineering. Existing paraphrase extraction methods are not entirely suitable here due to the noisy nature of bug reports. We propose a number of techniques to address the noisy data problem. The empirical evaluation shows that our method significantly improves on an existing method by up to 58%.

City or Country
ACM
Creative Commons License
Creative Commons Attribution-Noncommercial-No Derivative Works 4.0
Additional URL
http://dl.acm.org/citation.cfm?id=1667583.1667644
Citation Information
Xiaoyin WANG, David LO, Jing JIANG, LU ZHANG, et al. "Extracting Paraphrases of Technical Terms from Noisy Parallel Software Corpus" CLShort '09: Proceedings of the ACL-IJCNLP 2009 Conference Short Papers (2009) pp. 197-200
Available at: http://works.bepress.com/david_lo/45/