Software compiles and is therefore characterized by a parseable grammar. Natural language text rarely conforms to prescriptive grammars and is therefore much harder to parse. Mining parseable structures is easier than mining less structured entities. Therefore, most work on mining repositories focuses on software, not natural language text. Here, we report experiments with mining natural language text (requirements documents) suggesting that: (a) mining natural language is not too difficult, so (b) software repositories should routinely be augmented with all the natural language text used to develop that software.
Available at: http://works.bepress.com/dekhtyar/62/