Measuring Importance and Query Relevance in Topic-Focused Multi-Document Summarization
Departmental Papers (CIS)
Date of this Version: 6-1-2007
Document Type: Conference Paper
Abstract: The increasing complexity of summarization systems makes it difficult to analyze exactly which modules make a difference in performance. We carried out a principled comparison between the two most commonly used schemes for assigning importance to words in the context of query-focused multi-document summarization: raw frequency (word probability) and log-likelihood ratio (LLR). We demonstrate that the advantages of LLR come from its known distributional properties, which allow for the identification of a set of words that in its entirety defines the aboutness of the input. We also find that LLR is more suitable for query-focused summarization since, unlike raw frequency, it is more sensitive to the integration of the information need defined by the user.
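The two word-importance schemes named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard binomial log-likelihood ratio statistic, pre-tokenized input, and a background corpus for comparison. The function names and the cutoff of 10.83 (the chi-square critical value for p = 0.001 with one degree of freedom) are illustrative assumptions, not details taken from the abstract.

```python
import math
from collections import Counter


def word_probabilities(tokens):
    """Raw-frequency importance: p(w) = count(w) / number of input tokens."""
    counts = Counter(tokens)
    total = len(tokens)
    return {w: c / total for w, c in counts.items()}


def _log_likelihood(k, n, p):
    """Binomial log-likelihood k*log(p) + (n-k)*log(1-p), taking 0*log(0) as 0."""
    ll = 0.0
    if k > 0:
        ll += k * math.log(p)
    if n - k > 0:
        ll += (n - k) * math.log(1 - p)
    return ll


def llr(k1, n1, k2, n2):
    """-2 log(lambda) for a word seen k1 times in n1 input tokens
    and k2 times in n2 background tokens."""
    p1 = k1 / n1               # rate in the input
    p2 = k2 / n2               # rate in the background
    p = (k1 + k2) / (n1 + n2)  # pooled rate under the null hypothesis
    return 2.0 * (
        _log_likelihood(k1, n1, p1) + _log_likelihood(k2, n2, p2)
        - _log_likelihood(k1, n1, p) - _log_likelihood(k2, n2, p)
    )


def topic_words(input_tokens, background_tokens, threshold=10.83):
    """Words significantly over-represented in the input (assumed cutoff)."""
    in_counts = Counter(input_tokens)
    bg_counts = Counter(background_tokens)
    n1, n2 = len(input_tokens), len(background_tokens)
    selected = set()
    for w, k1 in in_counts.items():
        k2 = bg_counts.get(w, 0)
        # Keep only words that are more frequent in the input than in the background.
        if k1 / n1 > k2 / n2 and llr(k1, n1, k2, n2) > threshold:
            selected.add(w)
    return selected
```

Under raw frequency every word receives a graded score, so a common function word can rank as highly as a content word. The LLR statistic is asymptotically chi-square distributed, so a significance cutoff yields a discrete set of words that characterize the input as a whole.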
Citation Information: Surabhi Gupta, Ani Nenkova, and Dan Jurafsky. "Measuring Importance and Query Relevance in Topic-Focused Multi-Document Summarization" (2007)
Available at: http://works.bepress.com/ani_nenkova/17/