Face and Lip Localization in Unconstrained Imagery
Copyright © 2008 ACTA Press.
When combined with acoustic speech information, visual speech information (lip movement) significantly improves Automatic Speech Recognition (ASR) in acoustically noisy environments. Previous research has demonstrated that the visual modality is a viable tool for identifying speech. However, visual information has yet to be utilized in mainstream ASR systems due to the difficulty of accurately tracking lips in real-world conditions. This paper presents our current progress in addressing this issue. We derive several algorithms based on a modified HSI color space to successfully locate the face, eyes, and lips. These algorithms are then tested on imagery collected in visually challenging environments.
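The abstract does not specify how the HSI space is modified, but the approach builds on the standard RGB-to-HSI conversion, which separates chromatic information (hue, saturation) from brightness (intensity) and so is less sensitive to lighting than raw RGB. A minimal sketch of that standard conversion (not the paper's modified variant) might look like:

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert normalized RGB values in [0, 1] to (H in degrees, S, I).

    Standard HSI conversion: intensity is the channel average,
    saturation measures distance from gray, and hue is the angle
    of the color on the chromatic plane.
    """
    i = (r + g + b) / 3.0
    m = min(r, g, b)
    s = 0.0 if i == 0 else 1.0 - m / i

    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0:
        h = 0.0  # hue is undefined for grays; use 0 by convention
    else:
        # Clamp guards against tiny floating-point overshoot in acos.
        h = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
        if b > g:
            h = 360.0 - h
    return h, s, i

# Pure red maps to hue 0 degrees with full saturation.
print(rgb_to_hsi(1.0, 0.0, 0.0))
```

Skin and lip pixels tend to cluster in a narrow hue band, which is why hue-based thresholding is a common starting point for face and lip localization.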
Brandon Crow and Jane Zhang. "Face and Lip Localization in Unconstrained Imagery." Proceedings of the 10th IASTED International Conference on Signal and Image Processing, Kailua-Kona, HI, Aug. 2008.
Available at: http://works.bepress.com/jzhang/14