Unpublished Paper
Adapting BLSTM Neural Network Based Keyword Spotting Trained on Modern Data to Historical Documents
(2010)
  • Volkmar Frinken
  • Andreas Fischer
  • Horst Bunke
  • R. Manmatha, University of Massachusetts - Amherst
Abstract

Being able to search for words or phrases in historic handwritten documents is of paramount importance when preserving cultural heritage. Storing scanned pages of written text can save the information from degradation, but it does not make the textual information readily available. Automatic keyword spotting systems for handwritten historic documents can fill this gap. However, most such systems have trouble with the great variety of writing styles. It is not uncommon for handwriting processing systems to be built for just a single book. In this paper we show that neural network based keyword spotting systems are flexible enough to be used successfully on historic data, even when they are trained on a modern handwriting database. We also demonstrate that adding a small amount of transcribed historic text to the training set improves performance further.

Keywords
  • Keyword Spotting
  • Historical Data
  • Handwriting Recognition
  • Neural Networks
  • Adaptation
Publication Date
2010
Comments
This is the pre-published version harvested from CIIR.
Citation Information
Volkmar Frinken, Andreas Fischer, Horst Bunke and R. Manmatha. "Adapting BLSTM Neural Network Based Keyword Spotting Trained on Modern Data to Historical Documents" (2010)
Available at: http://works.bepress.com/r_manmatha/45/