Article
Translingual Information Retrieval: A Comparative Evaluation
Computer Science Department
  • Jaime G. Carbonell, Carnegie Mellon University
  • Yiming Yang, Carnegie Mellon University
  • Robert Frederking, Carnegie Mellon University
  • Ralf D Brown, Carnegie Mellon University
  • Yibing Geng, Carnegie Mellon University
  • Danny Lee, Carnegie Mellon University
Date of Original Version
8-1-1997
Type
Conference Proceeding
Abstract or Description

Translingual information retrieval (TIR) consists of providing a query in one language and searching document collections in one or more different languages. This paper introduces new TIR methods and reports on comparative TIR experiments with these new methods and with previously reported ones in a realistic setting. Methods fall into two categories: query-translation-based approaches and statistical-IR approaches that establish translingual associations. The results show that using bilingual corpora for automated extraction of term equivalences in context outperforms other methods. Translingual versions of the Generalized Vector Space Model (GVSM) and Latent Semantic Indexing (LSI) perform relatively well, as does translingual pseudo relevance feedback (PRF). All showed relatively small performance loss between monolingual and translingual versions. Query translation based on a general machine-readable bilingual dictionary, heretofore the most popular method, did not match the performance of other, more sophisticated methods. Also, the previous very high LSI results in the literature were disconfirmed by more realistic relevance-based evaluations.
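
As a rough illustration of the dictionary-based query translation baseline mentioned in the abstract, the following minimal sketch replaces each query term with every translation listed in a bilingual dictionary. It is not the authors' implementation: the function name `translate_query`, the toy English-to-Spanish dictionary, and the choice to keep untranslatable terms unchanged are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical toy English-to-Spanish dictionary (an assumption for
# illustration; it is not taken from the paper).
TOY_DICTIONARY: Dict[str, List[str]] = {
    "bank": ["banco", "orilla"],  # financial institution / river bank
    "money": ["dinero"],
}


def translate_query(query: str, dictionary: Dict[str, List[str]]) -> List[str]:
    """Replace each query term with all of its dictionary translations.

    Terms that are missing from the dictionary are kept unchanged.
    """
    translated: List[str] = []
    for term in query.lower().split():
        # Context-free lookup: every listed sense is added to the query.
        translated.extend(dictionary.get(term, [term]))
    return translated


if __name__ == "__main__":
    print(translate_query("bank money transfer", TOY_DICTIONARY))
```

In this toy case the ambiguous term "bank" contributes both of its senses to the translated query. Context-free lookup of this kind can add unrelated terms, which may be one reason a method that extracts term equivalences in context from bilingual corpora could do better.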

Citation Information
Jaime G. Carbonell, Yiming Yang, Robert Frederking, Ralf D Brown, et al. "Translingual Information Retrieval: A Comparative Evaluation" (1997)
Available at: http://works.bepress.com/jaime_carbonell/109/