Contribution to Book
Promoting Diversity in Top Hits for Biomedical Passage Retrieval
Advances in Data Management (2009)
  • Bill Andreopoulos, York University
  • Xiangji Huang, York University
  • Aijun An, York University
  • Dirk Labudde, York University
  • Qinmin Hu, York University
Abstract
With the volume of biomedical literature in collections such as BMC and PubMed growing rapidly, scalable passage retrieval systems that let researchers quickly find the information they need are of paramount importance. While topical relevance is the most important factor in biomedical text retrieval, an effective retrieval system must also cover diverse aspects of the topic. Aspect-level performance measures how well the top-ranked passages for a topic cover diverse aspects. Aspect-level retrieval methods often cluster the retrieved passages on the basis of textual similarity. We propose HIERDENC, a text retrieval system that ranks the retrieved passages while achieving scalability and better aspect-level performance than other clustering methods. HIERDENC runtimes scale to large datasets such as PubMed and BMC. Its aspect-level performance is consistently better than that of clustering methods based on cosine similarity and Hamming distance. HIERDENC is comparable to biclustering in separating relevant passages, and it improves on topics that involve many aspects. Converting textual passages to GO/MeSH ontological terms improves HIERDENC's aspect-level performance.
Keywords
  • Singular Vector
  • Cosine Similarity
  • Ontological Term
  • Query Expansion
  • Mean Average Precision
Publication Date
2009
Editor
Zbigniew W. Ras & Agnieszka Dardzinska
Publisher
Springer
Series
Studies in Computational Intelligence
ISBN
978-3-642-02189-3
DOI
10.1007/978-3-642-02190-9_18
Citation Information
Bill Andreopoulos, Xiangji Huang, Aijun An, Dirk Labudde, et al. "Promoting Diversity in Top Hits for Biomedical Passage Retrieval" Advances in Data Management. Berlin, Heidelberg: Springer, Vol. 223 (2009), p. 371-393
Available at: http://works.bepress.com/william-andreopoulos/12/