The main goal of my research is to dramatically increase our ability to mine
actionable knowledge from unstructured text. I am especially interested in information
extraction from the Web, understanding the connections between people and between
organizations, expert finding, social network analysis, and mining the scientific
literature and community. Toward this end, my group develops and employs various methods
in statistical machine learning, natural language processing, information retrieval, and
data mining, tending toward probabilistic approaches and graphical models.

Latent Dirichlet Allocation Models

Topic and Role Discovery in Social Networks with Experiments on Enron and Academic Email (with Xuerui Wang and Andrés Corrada-Emmanuel), Journal of Artificial Intelligence Research (2007)

Previous work in social network analysis (SNA) has modeled the existence of links from one...

 

Machine Learning

Topic and Role Discovery in Social Networks with Experiments on Enron and Academic Email (with Xuerui Wang and Andrés Corrada-Emmanuel), Journal of Artificial Intelligence Research (2007)

Previous work in social network analysis (SNA) has modeled the existence of links from one...

 

No subject area

Combining joint models for biomedical event extraction (with David McClosky, Sebastian Riedel, Mihai Surdeanu, and Christopher D. Manning), BMC Bioinformatics (2012)

Background: We explore techniques for performing model combination between the UMass and Stanford biomedical event...

 

Piecewise Pseudolikelihood for Efficient Training of Conditional Random Fields (with Charles Sutton), Computer Science Department Faculty Publication Series (2007)

Discriminative training of graphical models can be expensive if the variables have large cardinality, even...

 

A Conditional Random Field for Discriminatively-trained Finite-state String Edit Distance (with Kedar Bellare and Fernando Pereira), Computer Science Department Faculty Publication Series (2005)

The need to measure sequence similarity arises in information extraction, object identity, data mining, biological...

 

Collective Multi-Label Classification (with Nadia Ghamrawi), Computer Science Department Faculty Publication Series (2005)

Common approaches to multi-label classification learn independent classifiers for each category, and employ ranking or...

 

The Author-Recipient-Topic Model for Topic and Role Discovery in Social Networks: Experiments with Enron and Academic Email (with Andrés Corrada-Emmanuel and Xuerui Wang), Computer Science Department Faculty Publication Series (2005)

Previous work in social network analysis (SNA) has modeled the existence of links from one...