"Penn/Umass/CHOP Biocreative II systems" by Kuzman Ganchev

Unpublished Paper

Penn/Umass/CHOP Biocreative II systems

(2007)

Kuzman Ganchev
Koby Crammer
Fernando Pereira
Gideon Mann
Kedar Bellare
Andrew McCallum, University of Massachusetts - Amherst
Steve Carroll
Yang Jin
Peter White

Abstract

Our team participated in the entity tagging and normalization tasks of Biocreative II. For the entity tagging task, we used a k-best MIRA learning algorithm with lexicons and automatically derived word clusters. MIRA accommodates different training loss functions, which allowed us to exploit gene alternatives in training. We also performed a greedy search over feature templates and the development data, achieving a final F-measure of 86.28%. For the normalization task, we proposed a new specialized on-line learning algorithm and applied it to filter false positives out of a high-recall list of candidates. On the normalization task we achieved an F-measure of 69.8%.
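
To illustrate how a MIRA-style learner can accommodate different training loss functions, the following is a minimal sketch of a single-constraint (1-best) update, not the k-best system described in the abstract. The function name `mira_update`, the sparse feature dictionaries, and the `max_step` cap are assumptions made for this illustration; a k-best variant would instead solve a small quadratic program over several constraints at once. The task-specific loss enters only through the `loss` argument, which is what makes it possible to swap in a different loss, for example one that does not penalize acceptable alternative gene mentions.

```python
from collections import Counter


def mira_update(weights, gold_feats, pred_feats, loss, max_step=1.0):
    """One 1-best MIRA-style update on sparse feature dicts (illustrative).

    weights     -- dict mapping feature name to weight (updated in place)
    gold_feats  -- features of the reference labeling
    pred_feats  -- features of the model's current best labeling
    loss        -- task loss of the prediction against the reference
    max_step    -- hypothetical cap on the step size (aggressiveness)
    """
    # Difference of feature vectors: f(x, gold) - f(x, predicted).
    diff = Counter(gold_feats)
    diff.subtract(pred_feats)

    norm_sq = sum(v * v for v in diff.values())
    if norm_sq == 0:
        return weights  # identical feature vectors, nothing to learn

    # Current margin between the reference and the prediction.
    margin = sum(weights.get(k, 0.0) * v for k, v in diff.items())

    # Smallest step that makes the margin at least as large as the loss,
    # clipped to the range [0, max_step].
    tau = min(max_step, max(0.0, (loss - margin) / norm_sq))

    for k, v in diff.items():
        weights[k] = weights.get(k, 0.0) + tau * v
    return weights
```

For example, calling `mira_update({}, {"a": 1.0}, {"b": 1.0}, loss=1.0)` moves the weights just far enough that the reference labeling outscores the prediction by the given loss.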

Keywords

aberrant splicing,
database,
point mutation,
scanning model

Disciplines

Computer Sciences

Publication Date

2007

Comments

This is the pre-published version harvested from CIIR.

Citation Information

Kuzman Ganchev, Koby Crammer, Fernando Pereira, Gideon Mann, et al. "Penn/Umass/CHOP Biocreative II systems" (2007)
Available at: http://works.bepress.com/andrew_mccallum/109/