Mining user-generated content in social media has proven very effective in domains ranging from personalization and recommendation systems to crisis management. Knowing the locations of online users makes their tweets more informative and adds another dimension to their analysis. Existing approaches to predicting the location of Twitter users are purely data-driven and require large training sets of geo-tagged tweets, and collecting and modelling those tweets can be time-intensive. To overcome this drawback, we propose a novel knowledge-based approach that does not require any training data. Our approach uses Wikipedia's information about the cities in our geographical area of interest to score the entities most relevant to each city. By semantically matching the scored entities of a city against the entities a user mentions in their tweets, we predict the user's most likely location. On a publicly available benchmark dataset, we achieve a 3% increase in accuracy and an 80-mile drop in average error distance relative to state-of-the-art approaches.
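The matching step and the error-distance metric described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the entity scores are hand-written placeholders rather than values derived from Wikipedia, exact string matching stands in for the semantic matching, and the names `CITY_ENTITY_SCORES`, `predict_city`, and `error_distance_miles` are hypothetical. The error distance is assumed here to be the great-circle (haversine) distance between predicted and true coordinates.

```python
from collections import Counter
from math import asin, cos, radians, sin, sqrt

# Hypothetical, hand-written scores for illustration only.
# The approach described above derives such scores from Wikipedia.
CITY_ENTITY_SCORES = {
    "Seattle": {"Space Needle": 0.9, "Seahawks": 0.8, "Starbucks": 0.3},
    "Chicago": {"Willis Tower": 0.9, "Cubs": 0.8, "Starbucks": 0.1},
}


def predict_city(user_entities, city_scores=CITY_ENTITY_SCORES):
    """Return the city whose scored entities best match the user's mentions.

    Assumption: exact string matching replaces semantic matching, and each
    mention contributes the entity's score once per occurrence.
    """
    counts = Counter(user_entities)
    totals = {
        city: sum(scores.get(entity, 0.0) * n for entity, n in counts.items())
        for city, scores in city_scores.items()
    }
    best = max(totals, key=totals.get)
    # No matched entity at all means there is no evidence for any city.
    return best if totals[best] > 0 else None


def error_distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in miles between two coordinates."""
    earth_radius_miles = 3958.8
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * earth_radius_miles * asin(sqrt(a))
```

For example, `predict_city(["Seahawks", "Starbucks", "Starbucks"])` favours Seattle because the matched Seattle scores sum higher than the Chicago ones. A fuller version would replace the exact lookup with a semantic similarity measure between entities.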
Available at: http://works.bepress.com/tk_prasad/74/