
Unpublished Paper
Penn/Umass/CHOP Biocreative II systems
(2007)
Abstract
Our team participated in the entity tagging and normalization tasks of Biocreative II. For the entity tagging task, we used a k-best MIRA learning algorithm with lexicons and automatically derived word clusters. MIRA accommodates different training loss functions, which allowed us to exploit gene alternatives in training. We also performed a greedy search over feature templates and the development data, achieving a final F-measure of 86.28%. For the normalization task, we proposed a new specialized on-line learning algorithm and applied it to filter out false positives from a high-recall list of candidates, achieving an F-measure of 69.8%.
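To illustrate how a MIRA-style learner can accept an arbitrary training loss, here is a minimal sketch of a single update in Python. It is not the authors' implementation: it shows only the 1-best special case, which has a closed-form step, whereas the k-best variant named in the abstract solves a small quadratic program over k constraints. The function name `mira_update`, the dict-based feature vectors, and the clipping constant `C` are assumptions made for this example.

```python
from typing import Dict

Vector = Dict[str, float]


def dot(a: Vector, b: Vector) -> float:
    """Sparse dot product over the keys of `a`."""
    return sum(v * b.get(k, 0.0) for k, v in a.items())


def mira_update(weights: Vector, gold_feats: Vector, pred_feats: Vector,
                loss: float, C: float = 1.0) -> Vector:
    """One 1-best MIRA step (hypothetical helper, illustration only).

    Makes the smallest change to `weights` such that the gold output
    outscores the predicted output by a margin of at least `loss`,
    with the step size clipped to `C`. Because `loss` is just a number
    supplied by the caller, any task-specific loss function can be used.
    """
    # Feature difference between the gold and the predicted output.
    diff = dict(gold_feats)
    for k, v in pred_feats.items():
        diff[k] = diff.get(k, 0.0) - v

    norm_sq = sum(v * v for v in diff.values())
    if norm_sq == 0.0:
        return dict(weights)  # identical features: nothing to update

    # How far the current margin falls short of the required loss.
    violation = loss - dot(weights, diff)
    tau = min(C, max(0.0, violation / norm_sq))

    updated = dict(weights)
    for k, v in diff.items():
        updated[k] = updated.get(k, 0.0) + tau * v
    return updated
```

In a tagging setting, `loss` could for example be set lower when the prediction matches an acceptable alternative annotation, which is one way a loss-aware update can make use of alternatives during training; how the paper does this in detail is not stated in the abstract.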
Keywords
- aberrant splicing
- database
- point mutation
- scanning model
Publication Date
2007
Comments
This is the pre-published version harvested from CIIR.
Citation Information
Kuzman Ganchev, Koby Crammer, Fernando Pereira, Gideon Mann, et al. "Penn/Umass/CHOP Biocreative II systems" (2007). Available at: http://works.bepress.com/andrew_mccallum/109/