Information extraction using Natural Language Processing (NLP) produces discrete entities along with some of the relationships that may exist among them. To be semantically useful, however, such discrete extractions must be put into context through some form of intelligent analysis. This paper presents a two-part architecture. In the first part, the statistical methods of traditional NLP extract discrete information elements in a relatively domain-agnostic manner. In the second part, these elements are injected into an inference-enabled environment where they can be semantically analyzed within an evolving context. Within this semantic environment, extractions are woven into the contextual fabric of a user-provided, domain-centric ontology, where users, aided by user-provided logic, can analyze them as part of a more contextually complete picture. The Ontology Driven Information eXtraction system (ODIX) has three focal points: 1) a flexible platform that integrates configurable services, 2) access to large volumes of text documents, and 3) user interaction at every step of the process, with the ability to guide and adjust the results of extraction and inference.
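The two-part flow described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the ODIX implementation: the names `Extraction`, `extract`, `Context`, and `run_pipeline` are hypothetical, a toy pattern matcher stands in for the statistical NLP stage, and a pair of dictionaries with a single subclass rule stands in for the user-provided ontology and logic. User interaction at each step is not modeled.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Extraction:
    """A discrete entity produced by the extraction stage (hypothetical type)."""
    text: str
    label: str  # domain-agnostic label such as "PERSON" or "ORG"


def extract(document: str) -> list[Extraction]:
    """Stage 1 stand-in: toy patterns in place of a statistical NLP service."""
    patterns = {
        "PERSON": r"\b(?:Mr|Ms|Dr)\. [A-Z][a-z]+",
        "ORG": r"\b[A-Z][a-z]+ (?:Corp|Inc|Ltd)\b",
    }
    found = []
    for label, pattern in patterns.items():
        for match in re.finditer(pattern, document):
            found.append(Extraction(match.group(0), label))
    return found


@dataclass
class Context:
    """Stage 2 stand-in: an evolving fact set tied to a user-provided ontology."""
    label_to_class: dict[str, str]  # extraction label -> ontology class
    subclass_of: dict[str, str]     # child class -> parent class
    facts: set[tuple[str, str, str]] = field(default_factory=set)

    def inject(self, item: Extraction) -> None:
        """Place an extraction into the ontology as an is_a fact."""
        cls = self.label_to_class.get(item.label)
        if cls is not None:
            self.facts.add((item.text, "is_a", cls))

    def infer(self) -> None:
        """Apply subclass inheritance until no new facts appear."""
        changed = True
        while changed:
            changed = False
            for subject, predicate, obj in list(self.facts):
                parent = self.subclass_of.get(obj)
                if predicate == "is_a" and parent is not None:
                    new_fact = (subject, "is_a", parent)
                    if new_fact not in self.facts:
                        self.facts.add(new_fact)
                        changed = True


def run_pipeline(document: str, context: Context) -> set[tuple[str, str, str]]:
    """Extract, inject into the context, then run inference."""
    for item in extract(document):
        context.inject(item)
    context.infer()
    return context.facts
```

Keeping the extractor unaware of the ontology mirrors the separation the abstract describes: the domain-specific knowledge lives entirely in the mapping and rules supplied to `Context`, so the same extraction stage could in principle serve different domains.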
- information extraction
- Natural Language Processing
- service-oriented architecture
Available at: http://works.bepress.com/fkurfess/27/