Ontology-guided extraction of structured information from unstructured text: Identifying and capturing complex relationships
Graduate Theses and Dissertations
  • Sushain Pandit, Iowa State University
Degree Type
Thesis
Date of Award
2010
Degree Name
Master of Science
Department
Computer Science
First Advisor
Vasant Honavar
Abstract

Many applications call for methods to enable automatic extraction of structured information from unstructured natural language text. Due to the inherent challenges of natural language processing, most of the existing methods for information extraction from text tend to be domain specific. This thesis explores a modular ontology-based approach to information extraction that decouples domain-specific knowledge from the rules used for information extraction. Specifically, the thesis describes:

1. A framework for ontology-driven extraction of a subset of nested complex relationships (e.g., Joe reports that Jim is a reliable employee) from free text. The extracted relationships are represented semantically as RDF (Resource Description Framework) graphs, which can be stored in RDF knowledge bases and queried using RDF query languages.

2. An open source implementation of SEMANTIXS, a system for ontology-guided extraction and semantic representation of structured information from unstructured text.

3. Results of experiments that offer evidence of the utility of the proposed ontology-based approach to extract complex relationships from text.
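
To make the notion of a nested relationship concrete, the example sentence from item 1 can be sketched as a small set of triples. This is an illustration only: the abstract does not say how SEMANTIXS encodes nesting, so the reification-style layout, the "ex:" names, and the helper functions below are all assumptions, and plain Python tuples stand in for a real RDF store.

```python
# Illustrative sketch only: plain tuples stand in for RDF triples.
# The "ex:" names and the reification-style layout are assumptions,
# not the vocabulary or encoding used by SEMANTIXS.

RDF_TYPE = "rdf:type"
RDF_STATEMENT = "rdf:Statement"
RDF_SUBJECT = "rdf:subject"
RDF_PREDICATE = "rdf:predicate"
RDF_OBJECT = "rdf:object"


def reify(graph, node, subject, predicate, obj):
    """Describe the triple (subject, predicate, obj) as a resource named `node`."""
    graph.add((node, RDF_TYPE, RDF_STATEMENT))
    graph.add((node, RDF_SUBJECT, subject))
    graph.add((node, RDF_PREDICATE, predicate))
    graph.add((node, RDF_OBJECT, obj))
    return node


def match(graph, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def build_example_graph():
    """Encode "Joe reports that Jim is a reliable employee" (hypothetical names)."""
    graph = set()
    # Inner relationship: "Jim is a reliable employee".
    claim = reify(graph, "ex:claim1", "ex:Jim", RDF_TYPE, "ex:ReliableEmployee")
    # Outer relationship: "Joe reports <claim>".
    graph.add(("ex:Joe", "ex:reports", claim))
    return graph


def reported_claims(graph, reporter):
    """Unpack every statement that `reporter` reports into (s, p, o) tuples."""
    results = []
    for _, _, node in match(graph, s=reporter, p="ex:reports"):
        subject = match(graph, s=node, p=RDF_SUBJECT)[0][2]
        predicate = match(graph, s=node, p=RDF_PREDICATE)[0][2]
        obj = match(graph, s=node, p=RDF_OBJECT)[0][2]
        results.append((subject, predicate, obj))
    return results


if __name__ == "__main__":
    g = build_example_graph()
    print(reported_claims(g, "ex:Joe"))
```

The point of the sketch is that the inner statement becomes a resource in its own right, so the outer relationship can refer to it; a query for what Joe reports then recovers the inner triple about Jim. In an actual RDF knowledge base the same lookup would be written in an RDF query language rather than with the hypothetical match helper.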

DOI
https://doi.org/10.31274/etd-180810-2519
Copyright Owner
Sushain Pandit
Language
en
Date Available
2012-04-30
File Format
application/pdf
File Size
92 pages
Citation Information
Sushain Pandit. "Ontology-guided extraction of structured information from unstructured text: Identifying and capturing complex relationships" (2010)
Available at: http://works.bepress.com/sushain/4/