Spoken dialog systems typically rely on recognition confidence scores to guard against potential misunderstandings. While confidence scores can provide an initial assessment of the reliability of the information obtained from the user, ideally systems should leverage information that is available in subsequent user responses to update and improve the accuracy of their beliefs. We present a machine-learning-based solution to this problem. We use a compressed representation of beliefs that tracks up to k hypotheses for each concept at any given time. We train a generalized linear model to perform the updates. Experimental results show that the proposed approach significantly outperforms heuristic rules used for this task in current systems. Furthermore, a user study with a mixed-initiative spoken dialog system shows that the approach leads to significant gains in task success and in the efficiency of the interaction, across a wide range of recognition error rates.
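As a rough illustration of the idea summarized above, the following sketch keeps up to k hypotheses for one concept and revises the top hypothesis with a logistic model (one instance of a generalized linear model) after a confirmation question. It is not the model from the paper: the feature set, the weights in `WEIGHTS`, and the function names `compress` and `update_top` are all assumptions made for this example, and in practice the weights would be estimated from labeled dialog data.

```python
import math

# Hypothetical weights for illustration only; a real system would fit them to data.
WEIGHTS = {"bias": -0.5, "prior": 2.0, "said_yes": 3.0, "said_no": -4.0}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def compress(belief, k):
    """Keep the k most probable values; the remaining mass is an implicit 'other'."""
    top = sorted(belief.items(), key=lambda item: item[1], reverse=True)[:k]
    return dict(top)


def update_top(belief, said_yes, said_no, k=2):
    """Revise the top hypothesis after a confirmation question, then compress to k."""
    top_value = max(belief, key=belief.get)
    prior = belief[top_value]
    z = (WEIGHTS["bias"]
         + WEIGHTS["prior"] * prior
         + WEIGHTS["said_yes"] * said_yes
         + WEIGHTS["said_no"] * said_no)
    posterior = sigmoid(z)
    # Rescale the other hypotheses so they keep their share of the non-top mass.
    scale = (1.0 - posterior) / (1.0 - prior) if prior < 1.0 else 0.0
    updated = {
        value: (posterior if value == top_value else p * scale)
        for value, p in belief.items()
    }
    return compress(updated, k)


# Example: belief over a hypothetical "destination" concept.
belief = {"Boston": 0.6, "Austin": 0.3, "Houston": 0.05}
print(update_top(belief, said_yes=1, said_no=0))  # user confirms the top value
print(update_top(belief, said_yes=0, said_no=1))  # user rejects the top value
```

The compression step is what keeps the representation small: only k values are stored per concept, and any probability not assigned to them is treated as belonging to all untracked values.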
Available at: http://works.bepress.com/alexander_rudnicky/53/