"Text Mining for Korean: Characteristics and Application to 2011 Korean Economic Census Data" by Juna Goo

Article

Text Mining for Korean: Characteristics and Application to 2011 Korean Economic Census Data

The Korean Journal of Applied Statistics (2014)

Juna Goo, Samsung Medical Center
Kyunga Kim, Samsung Medical Center

Abstract
The 2011 Korean Economic Census, the first economic census in Korea, contains text data on the menus
served by Korean-food restaurants as well as structured data on restaurant characteristics, including
area, opening year and total sales. In this paper, we applied text mining to the text data and investigated
the statistical and technical issues and characteristics of Korean text mining. Pork belly roast was the most
popular menu item across provinces and across restaurant types in 2010, and the number of restaurants per
10,000 people was especially high in Kangwon-do and Daejeon metropolitan city. Beef tartare and fried pork cutlet are popular menu items in start-up restaurants, while whole chicken soup and maeuntang (spicy fish stew) are popular in long-lived restaurants. These results can serve as a guideline for restaurant owners developing menus, and for government policy-making that helps small restaurants choose suitable menus for a successful business.
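
As a rough illustration of the kind of dictionary-based counting the abstract describes, the sketch below maps free-text menu entries to canonical items, finds the most frequent item per province, and computes restaurants per 10,000 people. It is not the authors' implementation: the dictionary entries, the sample records, the comma-separated input format, and all function names are assumptions made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical synonym dictionary: surface spelling -> canonical menu item.
# A real dictionary for Korean menu text would be far larger.
MENU_DICTIONARY = {
    "삼겹살": "pork belly roast",
    "생삼겹살": "pork belly roast",
    "돈까스": "fried pork cutlet",
    "돈가스": "fried pork cutlet",
    "삼계탕": "whole chicken soup",
    "매운탕": "maeuntang",
}


def normalize_menu(raw):
    """Split a comma-separated menu field and keep only dictionary matches."""
    items = []
    for token in raw.split(","):
        token = token.strip()
        if token in MENU_DICTIONARY:
            items.append(MENU_DICTIONARY[token])
    return items


def top_menu_by_province(records):
    """Return the most frequent canonical menu item for each province."""
    counts = defaultdict(Counter)
    for province, raw in records:
        # Count each item at most once per restaurant.
        counts[province].update(set(normalize_menu(raw)))
    return {p: c.most_common(1)[0][0] for p, c in counts.items() if c}


def restaurants_per_10000(n_restaurants, population):
    """Restaurant density expressed per 10,000 residents."""
    return n_restaurants * 10_000 / population


# Made-up sample records: (province, free-text menu field).
RECORDS = [
    ("Kangwon-do", "삼겹살, 매운탕"),
    ("Kangwon-do", "생삼겹살"),
    ("Daejeon", "돈가스, 삼겹살"),
    ("Daejeon", "삼겹살, 삼계탕"),
]

if __name__ == "__main__":
    print(top_menu_by_province(RECORDS))
    print(restaurants_per_10000(150, 100_000))
```

Exact dictionary lookup is used here only to keep the sketch short; spelling variants that are missing from the dictionary are silently dropped, which is one reason dictionary construction matters for this kind of analysis.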

Keywords

text mining, dictionary construction, big data, Korean economic census

Disciplines

Statistics and Probability

Publication Date

December 2014

DOI

http://dx.doi.org/10.5351/KJAS.2014.27.7.1207

Citation Information

Juna Goo and Kyunga Kim. "Text Mining for Korean: Characteristics and Application to 2011 Korean Economic Census Data." The Korean Journal of Applied Statistics, Vol. 27, Iss. 7 (2014), pp. 1207-1217. ISSN: 1225-066X
Available at: http://works.bepress.com/juna-goo/3/