
Building the digital libraries of the future will require a number of component technologies, including the ability to retrieve multimedia information. This paper describes progress in this area at the Center for Intelligent Information Retrieval (CIIR). This includes: 1) Multi-modal retrieval using appearance-based image retrieval and text retrieval. This work has been applied to a large database of trademarks containing image and text data from the US Patent and Trademark Office. 68,000 trademarks may be searched using either image retrieval alone or combined image and text retrieval, while 615,000 trademarks may be searched using text retrieval. 2) Indexing handwritten manuscripts. Recently we have developed a scale-space technique for word segmentation in handwritten manuscripts. 3) Other projects, including color-based image retrieval and the extraction of text from images.
Available at: http://works.bepress.com/r_manmatha/12/