OneStopEnglish corpus: A new corpus for automatic readability assessment and text simplification
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications
  • Sowmya Vajjala, Iowa State University
  • Ivana Lucic, Iowa State University
Document Type
Conference Proceeding
Conference
Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications
Publication Version
Published Version
Publication Date
1-1-2018
Conference Date
June 5, 2018
Geolocation
(29.95106579999999, -90.0715323)
Abstract

This paper describes the collection and compilation of the OneStopEnglish corpus of texts written at three reading levels, and demonstrates its usefulness through two applications: automatic readability assessment and automatic text simplification. The corpus consists of 189 texts, each in three versions (567 in total). The corpus is now freely available under a CC BY-SA 4.0 license, and we hope that it will foster further research on readability assessment and text simplification.

Comments

This proceeding is published as Vajjala, Sowmya, and Ivana Lucic. "OneStopEnglish corpus: A new corpus for automatic readability assessment and text simplification." In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications (2018): 297-304.

Creative Commons License
Creative Commons Attribution-Share Alike 4.0
Copyright Owner
Association for Computational Linguistics
Language
en
File Format
application/pdf
Citation Information
Sowmya Vajjala and Ivana Lucic. "OneStopEnglish corpus: A new corpus for automatic readability assessment and text simplification." Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, New Orleans, LA (2018) p. 297-304
Available at: http://works.bepress.com/sowmya-vajjala/18/