Article
Text is Software Too
MSR 2004: International Workshop on Mining Software Repositories at ICSE’04: Edinburgh, Scotland
  • Alexander Dekhtyar, University of Kentucky
  • Jane Huffman Hayes, University of Kentucky
  • Tim Menzies, Portland State University
Publication Date
5-1-2004
Abstract

Software compiles and therefore is characterized by a parseable grammar. Natural language text rarely conforms to prescriptive grammars and therefore is much harder to parse. Mining parseable structures is easier than mining less structured entities. Therefore, most work on mining repositories focuses on software, not natural language text. Here, we report experiments with mining natural language text (requirements documents) suggesting that: (a) mining natural language is not too difficult, so (b) software repositories should routinely be augmented with all the natural language text used to develop that software.

Publisher statement
Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Citation Information
Alexander Dekhtyar, Jane Huffman Hayes and Tim Menzies. "Text is Software Too." MSR 2004: International Workshop on Mining Software Repositories at ICSE’04: Edinburgh, Scotland (2004), pp. 22-26
Available at: http://works.bepress.com/dekhtyar/62/