"Parsing Natural Language Queries for Extracting Data from Large-Scale Geospatial Transportation Asset Repositories" by Tuyen Le

Selected Works of Evgeny Chukharev-Hudilainen

Presentation

Parsing Natural Language Queries for Extracting Data from Large-Scale Geospatial Transportation Asset Repositories

Construction Research Congress 2018: Infrastructure and Facility Management

Tuyen Le, Iowa State University
H. David Jeong, Iowa State University
Stephen B. Gilbert, Iowa State University
Evgeny Chukharev-Hudilainen, Iowa State University

Document Type

Conference Proceeding

Publication Version

Accepted Manuscript

Link to Published Version

https://doi.org/10.1061/9780784481295.008

Publication Date

3-29-2018

DOI

10.1061/9780784481295.008

Conference Title

Construction Research Congress 2018: Infrastructure and Facility Management

Conference Date

April 2-4, 2018

Geolocation

(29.9510658, -90.0715323)

Abstract

Recent advances in data and information technologies have made extensive digital datasets available to decision makers throughout the life cycle of a transportation project. However, most of these data are not yet fully reused because extracting the desired data for a specific purpose is challenging and time-consuming. Digital datasets are presented only in computer-readable formats, most of which are complicated, so extracting data from large, complex data sources requires considerable time and expertise. Thus, there is a need for a user-friendly data exploration framework that allows users to express their data interests in human language. To meet that need, this study employs natural language processing (NLP) techniques to develop a natural language interface (NLI) that can understand users’ intent and automatically convert their natural language inputs into formal queries. This paper presents the results of one important task in the development of such an NLI: establishing a method for classifying the tokens of an ad hoc query according to their semantic contribution to the corresponding formal query. The method was validated on a small test set of 30 plain English questions manually annotated by an expert and achieved an accuracy of over 95%. The token classification presented in this paper is expected to provide a fundamental means for developing an effective NLI to transportation asset databases.

Comments

This is a manuscript of a proceeding published as Le, Tuyen, H. David Jeong, Stephen B. Gilbert, and Evgeny Chukharev-Hudilainen. "Parsing Natural Language Queries for Extracting Data from Large-Scale Geospatial Transportation Asset Repositories." In Construction Research Congress 2018: Building Community Partnerships. (2018): 70-79. DOI: 10.1061/9780784481295.008. Posted with permission.

Rights

This material may be downloaded for personal use only. Any other use requires prior permission of the American Society of Civil Engineers.

American Society of Civil Engineers

2018

File Format

application/pdf

Citation Information

Tuyen Le, H. David Jeong, Stephen B. Gilbert and Evgeny Chukharev-Hudilainen. "Parsing Natural Language Queries for Extracting Data from Large-Scale Geospatial Transportation Asset Repositories." Construction Research Congress 2018: Infrastructure and Facility Management, New Orleans, LA (2018), p. 70-79.
Available at: http://works.bepress.com/evgeny-chukharev-hudilainen/15/