Unpublished Paper
Penn/Umass/CHOP Biocreative II systems
(2007)
  • Kuzman Ganchev
  • Koby Crammer
  • Fernando Pereira
  • Gideon Mann
  • Kedar Bellare
  • Andrew McCallum, University of Massachusetts - Amherst
  • Steve Carroll
  • Yang Jin
  • Peter White
Abstract
Our team participated in the entity tagging and normalization tasks of Biocreative II. For the entity tagging task, we used a k-best MIRA learning algorithm with lexicons and automatically derived word clusters. MIRA accommodates different training loss functions, which allowed us to exploit gene alternatives in training. We also performed a greedy search over feature templates and the development data, achieving a final F-measure of 86.28%. For the normalization task, we proposed a new specialized on-line learning algorithm and applied it to filter false positives out of a high-recall list of candidates, achieving an F-measure of 69.8%.
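As a rough illustration of how a MIRA-style learner can take an arbitrary training loss, the sketch below shows a single 1-best update in Python. It is not the authors' k-best implementation, which would solve a small quadratic program over k constraints. The function names, the sparse-dictionary feature representation, and the clipping constant `C` are all assumptions made for this example. The caller supplies the `loss` value, so this is the point where a task-specific loss (for instance, one that gives credit for an acceptable alternative gene mention) would plug in.

```python
from typing import Dict

Vector = Dict[str, float]  # sparse feature vector: feature name -> value


def dot(w: Vector, f: Vector) -> float:
    """Sparse dot product of weights and features."""
    return sum(w.get(k, 0.0) * v for k, v in f.items())


def diff(a: Vector, b: Vector) -> Vector:
    """Sparse difference a - b, dropping zero entries."""
    out = dict(a)
    for k, v in b.items():
        out[k] = out.get(k, 0.0) - v
    return {k: v for k, v in out.items() if v != 0.0}


def mira_update(w: Vector, gold_feats: Vector, pred_feats: Vector,
                loss: float, C: float = 1.0) -> Vector:
    """One 1-best MIRA-style step (hypothetical helper, not the paper's code).

    Makes the smallest change to w such that the gold output outscores the
    predicted output by a margin of at least `loss`, with the step size
    clipped to C.
    """
    delta = diff(gold_feats, pred_feats)
    norm_sq = sum(v * v for v in delta.values())
    if norm_sq == 0.0:
        return w  # identical feature vectors: nothing to update
    margin = dot(w, delta)
    tau = min(C, max(0.0, (loss - margin) / norm_sq))
    if tau == 0.0:
        return w  # margin constraint already satisfied
    new_w = dict(w)
    for k, v in delta.items():
        new_w[k] = new_w.get(k, 0.0) + tau * v
    return new_w
```

Because the loss enters only through the required margin, swapping in a different loss changes how far each update moves the weights without changing the update rule itself.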
Keywords
  • aberrant splicing
  • database
  • point mutation
  • scanning model
Publication Date
2007
Comments
This is the pre-published version harvested from CIIR.
Citation Information
Kuzman Ganchev, Koby Crammer, Fernando Pereira, Gideon Mann, et al. "Penn/Umass/CHOP Biocreative II systems" (2007)
Available at: http://works.bepress.com/andrew_mccallum/109/