Jackson Liscombe


2010

How to Drink from a Fire Hose: One Person Can Annoscribe One Million Utterances in One Month
David Suendermann | Jackson Liscombe | Roberto Pieraccini
Proceedings of the SIGDIAL 2010 Conference

WITcHCRafT: A Workbench for Intelligent exploraTion of Human ComputeR conversaTions
Alexander Schmitt | Gregor Bertrand | Tobias Heinroth | Wolfgang Minker | Jackson Liscombe
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

We present Witchcraft, an open-source framework for the evaluation of prediction models for spoken dialogue systems based on interaction logs and audio recordings. The use of Witchcraft is twofold. First, it provides an adaptable user interface to easily manage and browse thousands of logged dialogues (e.g. calls). Second, with the help of the underlying models and the connected machine learning framework RapidMiner, the workbench is able to display at each dialogue turn the probability of the task being completed based on the dialogue history. It also estimates the emotional state, gender and age of the user. While browsing through a logged conversation, the user can directly observe the prediction result of the models at each dialogue step. In this way, Witchcraft allows for spotting problematic dialogue situations and shows where the current system and the prediction models have design flaws. Witchcraft will be made publicly available to the community and will be deployed as an open-source project.

The Influence of the Utterance Length on the Recognition of Aged Voices
Alexander Schmitt | Tim Polzehl | Wolfgang Minker | Jackson Liscombe
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper addresses the recognition of elderly callers based on short, narrow-band utterances, which are typical for Interactive Voice Response (IVR) systems. Our study is based on 2308 short utterances from a deployed IVR application. We show that features such as speaking rate, jitter and shimmer, which are considered the most meaningful ones for identifying elderly users, underperform when used in the IVR context, while pitch and intensity features seem to gain importance. We further demonstrate the influence of the utterance length on the classifier’s performance: for both humans and the classifier, the distinction between aged and non-aged voices becomes increasingly difficult the shorter the utterances get. Our setup based on a Support Vector Machine (SVM) with a linear kernel reaches a comparatively poor performance of 58% accuracy, which can be attributed to an average utterance length of only 1.6 seconds. The automatic distinction between aged and non-aged utterances drops to chance level when the utterance length falls below 1.2 seconds.

2009

On NoMatchs, NoInputs and BargeIns: Do Non-Acoustic Features Support Anger Detection?
Alexander Schmitt | Tobias Heinroth | Jackson Liscombe
Proceedings of the SIGDIAL 2009 Conference

A Handsome Set of Metrics to Measure Utterance Classification Performance in Spoken Dialog Systems
David Suendermann | Jackson Liscombe | Krishna Dayanidhi | Roberto Pieraccini
Proceedings of the SIGDIAL 2009 Conference

2007

Proceedings of the NAACL-HLT 2007 Doctoral Consortium
Jackson Liscombe | Phillip Michalak
Proceedings of the NAACL-HLT 2007 Doctoral Consortium

2006

Detecting Emotion in Speech: Experiments in Three Domains
Jackson Liscombe
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Doctoral Consortium