The main goal of my research is to dramatically increase our ability to mine actionable knowledge from unstructured text. I am especially interested in information extraction from the Web, understanding the connections between people and between organizations, expert finding, social network analysis, and mining the scientific literature and community. Toward this end, my group develops and employs various methods in statistical machine learning, natural language processing, information retrieval, and data mining, tending toward probabilistic approaches and graphical models.
Latent Dirichlet Allocation Models
Topic and Role Discovery in Social Networks with Experiments on Enron and Academic Email (with Xuerui Wang and Andrés Corrada-Emmanuel), Journal of Artificial Intelligence Research (2007)
Previous work in social network analysis (SNA) has modeled the existence of links from one...
Machine Learning
Topic and Role Discovery in Social Networks with Experiments on Enron and Academic Email (with Xuerui Wang and Andrés Corrada-Emmanuel), Journal of Artificial Intelligence Research (2007)
Previous work in social network analysis (SNA) has modeled the existence of links from one...
No subject area
Combining joint models for biomedical event extraction (with David McClosky, Sebastian Riedel, Mihai Surdeanu, and Christopher D. Manning), BMC Bioinformatics (2012)
Background: We explore techniques for performing model combination between the UMass and Stanford biomedical event...
Piecewise Pseudolikelihood for Efficient Training of Conditional Random Fields (with Charles Sutton), Computer Science Department Faculty Publication Series (2007)
Discriminative training of graphical models can be expensive if the variables have large cardinality, even...
A Conditional Random Field for Discriminatively-trained Finite-state String Edit Distance (with Kedar Bellare and Fernando Pereira), Computer Science Department Faculty Publication Series (2005)
The need to measure sequence similarity arises in information extraction, object identity, data mining, biological...
Collective Multi-Label Classification (with Nadia Ghamrawi), Computer Science Department Faculty Publication Series (2005)
Common approaches to multi-label classification learn independent classifiers for each category, and employ ranking or...
The Author-Recipient-Topic Model for Topic and Role Discovery in Social Networks: Experiments with Enron and Academic Email (with Andrés Corrada-Emmanuel and Xuerui Wang), Computer Science Department Faculty Publication Series (2005)
Previous work in social network analysis (SNA) has modeled the existence of links from one...