OCR often performs poorly on degraded documents. One way to improve performance is to choose a filter that improves the appearance of the document image before it is sent to the OCR engine. Quality metrics have previously been measured on document images to determine what type of filtering would most likely improve the OCR response for a given image. In this paper, those same quality metrics are measured on several word images degraded with known parameters of a document degradation model, and the correlation between the degradation model parameters and the quality metrics is measured. High correlations appear in many of the places where they were expected, but they are absent in some others. The results also offer a comparison of quality metric definitions proposed by different authors.
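As a rough illustration of the analysis described above, the sketch below correlates each degradation parameter with each quality metric across a set of word images. It assumes the Pearson coefficient, which the abstract does not specify; the function names `pearson` and `correlation_table` and the dictionary layout are hypothetical and are not taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def correlation_table(params, metrics):
    """Correlate every degradation parameter with every quality metric.

    params, metrics: dict mapping a name to a list of values, one value per
    degraded word image (hypothetical layout, assumed for this sketch).
    Returns a dict keyed by (parameter name, metric name).
    """
    return {
        (p_name, m_name): pearson(p_vals, m_vals)
        for p_name, p_vals in params.items()
        for m_name, m_vals in metrics.items()
    }
```

A table built this way makes it easy to see which metric tracks which parameter: an entry near +1 or -1 means the metric follows that parameter closely, while an entry near 0 means it carries little linear information about it.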
Copyright 2008 Society of Photo-Optical Instrumentation Engineers. One print or electronic copy may be made for personal use only. Systematic reproduction and distribution, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper are prohibited. DOI: 10.1117/12.766784
Available at: http://works.bepress.com/elisa_barney_smith/34/