
Unpublished Paper
Rapid Development of Hindi Named Entity Recognition using Conditional Random Fields and Feature Induction
(2003)
Abstract
This paper describes our application of Conditional Random Fields (CRFs) with feature induction to a Hindi named entity recognition task. With only five days of development time and little knowledge of the language, we automatically discover relevant features by providing a large array of lexical tests and using feature induction to construct the features that most increase conditional likelihood. To reduce overfitting, we combine a Gaussian prior with early stopping based on the results of 10-fold cross-validation.
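The two overfitting controls named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes that "early stopping based on 10-fold cross-validation" means choosing the number of training iterations with the best held-out score averaged over folds, and that the Gaussian prior is zero-mean with a single shared variance. All function names and the `sigma2` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple


def kfold_indices(n_items: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n_items-1 into k (train, held_out) pairs."""
    # Fold i holds every k-th index starting at i, so fold sizes differ by at most one.
    folds = [list(range(i, n_items, k)) for i in range(k)]
    splits = []
    for i, held_out in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, held_out))
    return splits


def gaussian_prior_penalty(weights: Sequence[float], sigma2: float) -> float:
    """Term subtracted from the log-likelihood under a zero-mean Gaussian prior.

    Assumes one shared variance sigma2 for all weights (an assumption of this sketch).
    """
    return sum(w * w for w in weights) / (2.0 * sigma2)


def pick_stopping_iteration(fold_scores: Sequence[Sequence[float]]) -> int:
    """Return the 1-based iteration with the best mean held-out score.

    fold_scores[f][t] is the held-out score of fold f after iteration t + 1.
    """
    n_iters = min(len(scores) for scores in fold_scores)
    means = [
        sum(scores[t] for scores in fold_scores) / len(fold_scores)
        for t in range(n_iters)
    ]
    # On ties, max() keeps the earliest iteration.
    best = max(range(n_iters), key=lambda t: means[t])
    return best + 1
```

In such a setup, a model would be trained once per fold while recording its held-out score after each iteration; the final model would then be trained on all data for the number of iterations returned by `pick_stopping_iteration`.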
Keywords
- Artificial intelligence
- Natural language processing
- Text analysis
- Mathematics of computing
- Probabilistic algorithms
Publication Date
2003
Comments
This is the pre-published version harvested from CIIR.
Citation Information
Wei Li and Andrew McCallum. "Rapid Development of Hindi Named Entity Recognition using Conditional Random Fields and Feature Induction" (2003) Available at: http://works.bepress.com/andrew_mccallum/138/