Vicent Tamarit


2010

Evaluation of HMM-based Models for the Annotation of Unsegmented Dialogue Turns
Carlos-D. Martínez-Hinarejos | Vicent Tamarit | José-M. Benedí
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Corpus-based dialogue systems rely on statistical models whose parameters are inferred from annotated dialogues. The dialogues are usually annotated in terms of Dialogue Acts (DA), and manual annotation is difficult (as annotation rules are hard to define), error-prone and time-consuming. Therefore, several semi-automatic annotation processes have been proposed to speed up the process and consequently obtain a dialogue system in less total time. These processes are usually based on statistical models. The standard statistical annotation model is based on Hidden Markov Models (HMM). In this work, we explore the impact of different types of HMM, with different numbers of states, on annotation accuracy. We performed experiments using these models on two dialogue corpora (Dihana and SwitchBoard) with dissimilar features. The results show that some types of models improve on the standard HMM in a human-computer task-oriented dialogue corpus (Dihana corpus), but their impact is lower in a human-human non-task-oriented dialogue corpus (SwitchBoard corpus).

2009

Improving Unsegmented Dialogue Turns Annotation with N-gram Transducers
Carlos-D. Martínez-Hinarejos | Vicent Tamarit | José-Miguel Benedí
Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, Volume 1

Improving Unsegmented Statistical Dialogue Act Labelling
Vicent Tamarit | Carlos-D. Martínez-Hinarejos | José Miguel Benedí Ruíz
Proceedings of the International Conference RANLP-2009

2008

Evaluation of Different Segmentation Techniques for Dialogue Turns
Carlos D. Martínez-Hinarejos | Vicent Tamarit
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

In dialogue systems, it is necessary to decode the user input into semantically meaningful units. These semantic units, usually Dialogue Acts (DA), are used by the system to produce the most appropriate response. The user turns can be segmented into utterances, which are meaningful segments from the dialogue viewpoint. In this case, a single DA is associated with each utterance. Many previous works have used DA assignment models on segmented dialogue corpora, but only a few have tried to perform segmentation and assignment at the same time. The segmentation of turns into utterances is rarely available in dialogue corpora, so it would be useful to know the quality of the segmentations provided by models that perform segmentation and assignment simultaneously. In this work, we evaluate the accuracy of the segmentation offered by this type of model. The evaluation is done on a Spanish dialogue system for a railway information task. The results reveal that one of these techniques provides a high-quality segmentation for this corpus.