"Scene text recognition with bilateral regression" by Jacqueline Feild

Selected Works of Erik G Learned-Miller


Article

Scene text recognition with bilateral regression

UMass Amherst Technical Report (2012)

Jacqueline Feild
Erik G Learned-Miller, University of Massachusetts - Amherst


Abstract

This paper focuses on improving the recognition of text in images of natural scenes, such as storefront signs or street signs. This is a difficult problem due to lighting conditions, variation in font shape and color, and complex backgrounds. We present a word recognition system that addresses these difficulties using an innovative technique to extract and recognize foreground text in an image. First, we develop a new method, called bilateral regression, for extracting and modeling one coherent (although not necessarily contiguous) region from an image. The method models smooth color changes across an image region without being corrupted by neighboring image regions. Second, rather than making a hard decision early in the pipeline about which region is foreground, we generate a set of possible foreground hypotheses, and choose among these using feedback from a recognition system. We show increased recognition performance using our segmentation method compared to the current state of the art. Overall, using our system we also show a substantial increase in word accuracy on the word spotting task over the current state of the art on the ICDAR 2003 word recognition data set.
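The abstract does not spell out how bilateral regression is formulated, so the following is only an illustrative sketch of the general idea it describes: fitting a smooth surface to one region while keeping pixels from neighboring regions from corrupting the fit. It assumes, by analogy with bilateral filtering, an iteratively reweighted least-squares fit of a plane to grayscale intensity, where each pixel's weight falls off with its distance in intensity from the current model. The function names, the planar model, the Gaussian weighting, and the parameters `sigma` and `iterations` are all assumptions made for illustration and are not taken from the report.

```python
import math


def solve3(A, b):
    """Solve a 3x3 linear system with Gaussian elimination and partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def bilateral_plane_fit(image, seed, sigma=10.0, iterations=3):
    """Fit intensity ~ a*x + b*y + c to the region containing `seed`.

    Hypothetical sketch: pixels are weighted by how close their intensity
    is to the current model, so pixels from other regions get weights
    near zero and barely influence the fit.
    """
    height, width = len(image), len(image[0])
    seed_x, seed_y = seed
    # Start from a flat surface at the seed pixel's intensity.
    coeffs = [0.0, 0.0, float(image[seed_y][seed_x])]
    for _ in range(iterations):
        a, b, c = coeffs
        normal = [[0.0] * 3 for _ in range(3)]
        rhs = [0.0] * 3
        for y in range(height):
            for x in range(width):
                value = image[y][x]
                residual = value - (a * x + b * y + c)
                weight = math.exp(-residual * residual / (2.0 * sigma * sigma))
                feats = (float(x), float(y), 1.0)
                # Accumulate the weighted normal equations.
                for i in range(3):
                    rhs[i] += weight * feats[i] * value
                    for j in range(3):
                        normal[i][j] += weight * feats[i] * feats[j]
        coeffs = solve3(normal, rhs)
    return coeffs


def make_test_image(size=8):
    """Checkerboard: a non-contiguous 'foreground' with a smooth gradient
    (100 + 2x + y) interleaved with a constant 'background' of 20."""
    return [
        [100 + 2 * x + y if (x + y) % 2 == 0 else 20 for x in range(size)]
        for y in range(size)
    ]


if __name__ == "__main__":
    a, b, c = bilateral_plane_fit(make_test_image(), seed=(0, 0))
    print(round(a, 3), round(b, 3), round(c, 3))
```

In this toy image the foreground pixels are not contiguous, yet the fit seeded at a foreground pixel should recover a gradient close to the one used to generate them, because the background pixels receive negligible weight. A real system would presumably fit each color channel and might use a richer surface than a plane; those choices are left open here.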

Disciplines

Computer Sciences

Publication Date

2012

Citation Information

Jacqueline Feild and Erik G Learned-Miller. "Scene text recognition with bilateral regression." UMass Amherst Technical Report (2012).
Available at: http://works.bepress.com/erik_learned_miller/41/