Professor Allan's research focuses on the automatic organization of textual information. He is interested in statistical methods for extracting "concepts" from text and for finding relationships between those concepts that can be used to impose structure that is otherwise hidden. For example, the Topic Detection and Tracking work aims to add topic-based structure to broadcast news as it evolves over time. He is also interested in information visualization: methods for showing people the relationships between texts or concepts, often when those relationships are not easily described by rules.
Articles
Topic Detection and Tracking Pilot Study Final Report (with Jaime G. Carbonell, George Doddington, Jonathan Yamron, and Yiming Yang), Computer Science Department (1998)
Topic Detection and Tracking (TDT) is a DARPA-sponsored initiative to investigate the state of the...
Other
Find-Similar: Similarity Browsing as a Search Tool (with Mark D. Smucker), Computer Science Department Faculty Publication Series (2006)
Search systems have for some time provided users with the ability to request documents similar...
Incremental Test Collections (with Ben Carterette), Computer Science Department Faculty Publication Series (2005)
Corpora and topics are readily available for information retrieval research. Relevance judgments, which are necessary...
When Will Information Retrieval Be “Good Enough”?, Computer Science Department Faculty Publication Series (2005)
We describe a user study that examined the relationship between the quality of an Information...
Dynamic Composition of Information Retrieval Techniques (with Andrew Arnt and Shlomo Zilberstein), Computer Science Department Faculty Publication Series (2004)
This paper presents a new approach to information retrieval (IR) based on run-time selection of...
HARD Track Overview in TREC 2004: High Accuracy Retrieval from Documents, Computer Science Department Faculty Publication Series (2004)
The HARD track of TREC 2004 aims to improve the accuracy of information retrieval through...