"Automatic Domain Model Creation Using Pattern-Based Fact Extraction" by Christopher Thomas

Selected Works of Amit P. Sheth

Article

Automatic Domain Model Creation Using Pattern-Based Fact Extraction

Kno.e.sis Publications

Christopher Thomas, Wright State University - Main Campus
Pankaj Mehra
Wenbo Wang, Wright State University - Main Campus
Amit P. Sheth, Wright State University - Main Campus
Gerhard Weikum
Victor Chan

Document Type

Conference Proceeding

Publication Date

6-1-2011

Abstract

This paper describes a minimally guided approach to automatic domain model creation. The first step is to carve an area of interest out of the Wikipedia hierarchy based on a simple query or other starting point. The second step is to connect the concepts in this domain hierarchy with named relationships. A starting point is provided by Linked Open Data, such as DBpedia. Based on these community-generated facts, we train a pattern-based fact-extraction algorithm to augment a domain hierarchy with previously unknown relationship occurrences. The algorithm learns pattern vectors that represent occurrences of relationships between concepts. The process described can be fully automated, and the number of relationships that can be learned grows as the community adds more information. Unlike approaches that aim to find single, highly indicative patterns, we use the cumulative score of many pattern occurrences to increase extraction recall. The relationship identification process itself is based on positive-only classification of training facts.
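
The idea of cumulative pattern scoring can be illustrated with a small sketch. It is not the authors' implementation: the function names, the toy sentences, the seed facts, and the simple frequency weighting are all assumptions made for illustration. Patterns here are just the text between two concept mentions, each relationship's pattern vector is built from positive training facts only, and a candidate pair is scored by summing the weights of every pattern occurrence rather than relying on one indicative pattern.

```python
from collections import Counter, defaultdict


def extract_patterns(sentences, subj, obj):
    """Return the text between subj and obj for each sentence mentioning both (hypothetical helper)."""
    patterns = []
    for s in sentences:
        i = s.find(subj)
        j = s.find(obj)
        # Only consider sentences where the subject mention ends before the object mention starts.
        if i != -1 and j != -1 and i + len(subj) <= j:
            patterns.append(s[i + len(subj):j].strip())
    return patterns


def learn_pattern_vectors(sentences, facts):
    """Build one normalized pattern vector per relationship from positive facts only."""
    counts = defaultdict(Counter)
    for subj, rel, obj in facts:
        counts[rel].update(extract_patterns(sentences, subj, obj))
    vectors = {}
    for rel, c in counts.items():
        total = sum(c.values())
        if total:
            # Assumed weighting: relative frequency of the pattern within the relationship.
            vectors[rel] = {p: n / total for p, n in c.items()}
    return vectors


def score_pair(sentences, vectors, subj, obj):
    """Sum the weights of all pattern occurrences for a candidate pair, per relationship."""
    occurrences = Counter(extract_patterns(sentences, subj, obj))
    return {
        rel: sum(vec.get(p, 0.0) * n for p, n in occurrences.items())
        for rel, vec in vectors.items()
    }


# Toy corpus and seed facts (invented for this example, standing in for Wikipedia text and DBpedia facts).
sentences = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "Berlin , capital of Germany , lies on the Spree.",
    "Goethe was born in Frankfurt.",
    "Rome is the capital of Italy.",
    "Rome , capital of Italy , is ancient.",
]
facts = [
    ("Paris", "capitalOf", "France"),
    ("Berlin", "capitalOf", "Germany"),
    ("Goethe", "birthPlace", "Frankfurt"),
]

vectors = learn_pattern_vectors(sentences, facts)
scores = score_pair(sentences, vectors, "Rome", "Italy")
best_relation = max(scores, key=scores.get)
print(best_relation, scores)
```

In this toy setup the pair (Rome, Italy) is not among the seed facts, yet both of its sentence patterns contribute to the score for the capital relationship. That is the recall benefit the abstract attributes to accumulating many weaker pattern occurrences.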

Comments

Submitted to the Sixth International Conference on Knowledge Capture, Banff, Alberta, Canada, June 25-29, 2011.

Citation Information

Christopher Thomas, Pankaj Mehra, Wenbo Wang, Amit P. Sheth, et al. "Automatic Domain Model Creation Using Pattern-Based Fact Extraction" (2011)
Available at: http://works.bepress.com/amit_sheth/491/