
Unpublished Paper
Cryptogram Decoding for OCR using Numerization Strings
(2007)
Abstract
OCR systems for printed documents typically require large numbers of font styles and character models to work well. When given an unseen font, performance degrades even in the absence of noise. In this paper, we perform OCR in an unsupervised fashion, without any character models, by applying a cryptogram decoding algorithm. We present results on real and artificial OCR data.
Publication Date
2007
Comments
This is the pre-published version harvested from CIIR.
Citation Information
Gary Huang, Erik Learned-Miller and Andrew McCallum. "Cryptogram Decoding for OCR using Numerization Strings" (2007) Available at: http://works.bepress.com/andrew_mccallum/111/