Spyros Matsoukas


2023

Toward More Accurate and Generalizable Evaluation Metrics for Task-Oriented Dialogs
Abishek Komma | Nagesh Panyam Chandrasekarasastry | Timothy Leffel | Anuj Goyal | Angeliki Metallinou | Spyros Matsoukas | Aram Galstyan
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track)

Measurement of interaction quality is a critical task for the improvement of large-scale spoken dialog systems. Existing approaches to dialog quality estimation either focus on evaluating the quality of individual turns, or collect dialog-level quality measurements from end users immediately following an interaction. In contrast to these approaches, we introduce a new dialog-level annotation workflow called Dialog Quality Annotation (DQA). DQA expert annotators evaluate the quality of dialogs as a whole, and also label dialogs for attributes such as goal completion and user sentiment. In this contribution, we show that: (i) while dialog quality cannot be completely decomposed into dialog-level attributes, there is a strong relationship between some objective dialog attributes and judgments of dialog quality; (ii) for the task of dialog-level quality estimation, a supervised model trained on dialog-level annotations outperforms methods based purely on aggregating turn-level features; and (iii) the proposed evaluation model shows better domain generalization ability compared to the baselines. On the basis of these results, we argue that having high-quality human-annotated data is an important component of evaluating interaction quality for large industrial-scale voice assistant platforms.

2021

A Scalable Framework for Learning From Implicit User Feedback to Improve Natural Language Understanding in Large-Scale Conversational AI Systems
Sunghyun Park | Han Li | Ameen Patel | Sidharth Mudgal | Sungjin Lee | Young-Bum Kim | Spyros Matsoukas | Ruhi Sarikaya
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Natural Language Understanding (NLU) is an established component within a conversational AI or digital assistant system, and it is responsible for producing semantic understanding of a user request. We propose a scalable and automatic approach for improving NLU in a large-scale conversational AI system by leveraging implicit user feedback, based on the insight that user interaction data and dialog context contain rich information from which user satisfaction and intention can be inferred. In particular, we propose a domain-agnostic framework for curating new supervision data for improving NLU from live production traffic. With an extensive set of experiments, we show the results of applying the framework and improving NLU for a large-scale production system across 10 domains.

2020

Data Augmentation for Training Dialog Models Robust to Speech Recognition Errors
Longshaokan Wang | Maryam Fazel-Zarandi | Aditya Tiwari | Spyros Matsoukas | Lazaros Polymenakos
Proceedings of the 2nd Workshop on Natural Language Processing for Conversational AI

Speech-based virtual assistants, such as Amazon Alexa, Google Assistant, and Apple Siri, typically convert users’ audio signals to text data through automatic speech recognition (ASR) and feed the text to downstream dialog models for natural language understanding and response generation. The ASR output is error-prone; however, the downstream dialog models are often trained on error-free text data, making them sensitive to ASR errors at inference time. To bridge the gap and make dialog models more robust to ASR errors, we leverage an ASR error simulator to inject noise into the error-free text data, and subsequently train the dialog models with the augmented data. Compared to other approaches for handling ASR errors, such as using ASR lattices or end-to-end methods, our data augmentation approach does not require any modification to the ASR or downstream dialog models; our approach also does not introduce any additional latency at inference time. We perform extensive experiments on benchmark data and show that our approach improves the performance of downstream dialog models in the presence of ASR errors, and it is particularly effective in low-resource situations where there are constraints on model size or the training data is scarce.

Joint Turn and Dialogue level User Satisfaction Estimation on Multi-Domain Conversations
Praveen Kumar Bodigutla | Aditya Tiwari | Spyros Matsoukas | Josep Valls-Vargas | Lazaros Polymenakos
Findings of the Association for Computational Linguistics: EMNLP 2020

Dialogue-level quality estimation is vital for optimizing data-driven dialogue management. Current automated methods to estimate turn- and dialogue-level user satisfaction employ hand-crafted features and rely on complex annotation schemes, which reduce the generalizability of the trained models. We propose a novel user satisfaction estimation approach which minimizes an adaptive multi-task loss function in order to jointly predict turn-level Response Quality labels provided by experts and explicit dialogue-level ratings provided by end users. The proposed BiLSTM-based deep neural net model automatically weighs each turn’s contribution towards the estimated dialogue-level rating, implicitly encodes temporal dependencies, and removes the need to hand-craft features. On dialogues sampled from 28 Alexa domains, two dialogue systems and three user groups, the joint dialogue-level satisfaction estimation model achieved up to an absolute 27% (0.43 -> 0.70) and 7% (0.63 -> 0.70) improvement in linear correlation performance over baseline deep neural net and benchmark gradient boosting regression models, respectively.

2019

Active Learning for New Domains in Natural Language Understanding
Stanislav Peshterliev | John Kearney | Abhyuday Jagannatha | Imre Kiss | Spyros Matsoukas
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Industry Papers)

We explore active learning (AL) for improving the accuracy of new domains in a natural language understanding (NLU) system. We propose an algorithm called Majority-CRF that uses an ensemble of classification models to guide the selection of relevant utterances, as well as a sequence labeling model to help prioritize informative examples. Experiments with three domains show that Majority-CRF achieves 6.6%-9% relative error rate reduction compared to random sampling with the same annotation budget, and statistically significant improvements compared to other AL approaches. Additionally, case studies with human-in-the-loop AL on six new domains show 4.6%-9% improvement on an existing NLU system.

2018

Fast and Scalable Expansion of Natural Language Understanding Functionality for Intelligent Agents
Anuj Kumar Goyal | Angeliki Metallinou | Spyros Matsoukas
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers)

Fast expansion of natural language functionality of intelligent virtual agents is critical for achieving engaging and informative interactions. However, developing accurate models for new natural language domains is a time- and data-intensive process. We propose efficient deep neural network architectures that maximally re-use available resources through transfer learning. Our methods are applied to expand the understanding capabilities of a popular commercial agent and are evaluated on hundreds of new domains, designed by internal or external developers. We demonstrate that our proposed methods significantly increase accuracy in low-resource settings and enable rapid development of accurate models with less data.

The Alexa Meaning Representation Language
Thomas Kollar | Danielle Berry | Lauren Stuart | Karolina Owczarzak | Tagyoung Chung | Lambert Mathias | Michael Kayser | Bradford Snow | Spyros Matsoukas
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers)

This paper introduces a meaning representation for spoken language understanding. The Alexa meaning representation language (AMRL), unlike previous approaches, which factor spoken utterances into domains, provides a common representation for how people communicate in spoken language. AMRL is a rooted graph that links to a large-scale ontology and supports cross-domain queries, fine-grained types, complex utterances and composition. A spoken language dataset has been collected for Alexa, which contains ∼20k examples across eight domains. A version of this meaning representation was released to developers at a trade show in 2016.

2013

Systematic Comparison of Professional and Crowdsourced Reference Translations for Machine Translation
Rabih Zbib | Gretchen Markiewicz | Spyros Matsoukas | Richard Schwartz | John Makhoul
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2012

Machine Translation of Arabic Dialects
Rabih Zbib | Erika Malchiodi | Jacob Devlin | David Stallard | Spyros Matsoukas | Richard Schwartz | John Makhoul | Omar F. Zaidan | Chris Callison-Burch
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Trait-Based Hypothesis Selection For Machine Translation
Jacob Devlin | Spyros Matsoukas
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Review of Hypothesis Alignment Algorithms for MT System Combination via Confusion Network Decoding
Antti-Veikko Rosti | Xiaodong He | Damianos Karakos | Gregor Leusch | Yuan Cao | Markus Freitag | Spyros Matsoukas | Hermann Ney | Jason Smith | Bing Zhang
Proceedings of the Seventh Workshop on Statistical Machine Translation

2011

System Combination Using Discriminative Cross-Adaptation
Jacob Devlin | Antti-Veikko Rosti | Sankaranarayanan Ananthakrishnan | Spyros Matsoukas
Proceedings of 5th International Joint Conference on Natural Language Processing

Improving Low-Resource Statistical Machine Translation with a Novel Semantic Word Clustering Algorithm
Jeff Ma | Spyros Matsoukas | Richard Schwartz
Proceedings of Machine Translation Summit XIII: Papers

Building a Statistical Machine Translation System for Translating Patent Documents
Jeff Ma | Spyros Matsoukas
Proceedings of the 4th Workshop on Patent Translation

Expected BLEU Training for Graphs: BBN System Description for WMT11 System Combination Task
Antti-Veikko Rosti | Bing Zhang | Spyros Matsoukas | Richard Schwartz
Proceedings of the Sixth Workshop on Statistical Machine Translation

2010

BBN System Description for WMT10 System Combination Task
Antti-Veikko Rosti | Bing Zhang | Spyros Matsoukas | Richard Schwartz
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

Decision Trees for Lexical Smoothing in Statistical Machine Translation
Rabih Zbib | Spyros Matsoukas | Richard Schwartz | John Makhoul
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

Statistical Machine Translation with a Factorized Grammar
Libin Shen | Bing Zhang | Spyros Matsoukas | Jinxi Xu | Ralph Weischedel
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

2009

Effective Use of Linguistic and Contextual Information for Statistical Machine Translation
Libin Shen | Jinxi Xu | Bing Zhang | Spyros Matsoukas | Ralph Weischedel
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Discriminative Corpus Weight Estimation for Machine Translation
Spyros Matsoukas | Antti-Veikko I. Rosti | Bing Zhang
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Incremental Hypothesis Alignment with Flexible Matching for Building Confusion Networks: BBN System Description for WMT09 System Combination Task
Antti-Veikko Rosti | Bing Zhang | Spyros Matsoukas | Richard Schwartz
Proceedings of the Fourth Workshop on Statistical Machine Translation

2008

Incremental Hypothesis Alignment for Building Confusion Networks with Application to Machine Translation System Combination
Antti-Veikko Rosti | Bing Zhang | Spyros Matsoukas | Richard Schwartz
Proceedings of the Third Workshop on Statistical Machine Translation

2007

Improved Word-Level System Combination for Machine Translation
Antti-Veikko Rosti | Spyros Matsoukas | Richard Schwartz
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

Combining Outputs from Multiple Machine Translation Systems
Antti-Veikko Rosti | Necip Fazil Ayan | Bing Xiang | Spyros Matsoukas | Richard Schwartz | Bonnie Dorr
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference