Beth Ann Hockey

Also published as: B. A. Hockey, Beth A. Hockey, Beth Hockey


2022

Domain-specific knowledge distillation yields smaller and better models for conversational commerce
Kristen Howell | Jian Wang | Akshay Hazare | Joseph Bradley | Chris Brew | Xi Chen | Matthew Dunn | Beth Hockey | Andrew Maurer | Dominic Widdows
Proceedings of the Fifth Workshop on e-Commerce and NLP (ECNLP 5)

We demonstrate that knowledge distillation can be used not only to reduce model size, but also to simultaneously adapt a contextual language model to a specific domain. We use Multilingual BERT (mBERT; Devlin et al., 2019) as a starting point and follow the knowledge distillation approach of Sanh et al. (2019) to train a smaller multilingual BERT model that is adapted to the domain at hand. We show that for in-domain tasks, the domain-specific model shows an average improvement of 2.3% in F1 score relative to a model distilled on domain-general data. Whereas much previous work with BERT has fine-tuned the encoder weights during task training, we show that the model improvements from distillation on in-domain data persist even when the encoder weights are frozen during task training, allowing a single encoder to support classifiers for multiple tasks and languages.

OpenEL: An Annotated Corpus for Entity Linking and Discourse in Open Domain Dialogue
Wen Cui | Leanne Rolston | Marilyn Walker | Beth Ann Hockey
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Entity linking in dialogue is the task of mapping entity mentions in utterances to a target knowledge base. Prior work on entity linking has mainly focused on well-written articles such as Wikipedia, annotated newswire, or domain-specific datasets. We extend the study of entity linking to open domain dialogue by presenting the OpenEL corpus: an annotated multi-domain corpus for linking entities in natural conversation to Wikidata. Each dialogic utterance in 179 dialogues over 12 topics from the EDINA dataset has been annotated for entities realized by definite referring expressions as well as anaphoric forms such as he, she, it and they. This dataset supports training and evaluation of entity linking in open-domain dialogue, as well as analysis of the effect of using dialogue context and anaphora resolution in model training. It could also be used for fine-tuning a coreference resolution algorithm. To the best of our knowledge, this is the first substantial entity linking corpus publicly available for open-domain dialogue. We also establish baselines for this task using several existing entity linking systems. We found that the Transformer-based system Flair + BLINK has the best performance with a 0.65 F1 score. Our results show that dialogue context is extremely beneficial for entity linking in conversations, with Flair + BLINK achieving an F1 of 0.61 without discourse context. These results also demonstrate the remaining performance gap between the baselines and human performance, highlighting the challenges of entity linking in open-domain dialogue, and suggesting many avenues for future research using OpenEL.

2009

Using Paraphrases of Deep Semantic Representations to Support Regression Testing in Spoken Dialogue Systems
Beth Ann Hockey | Manny Rayner
Proceedings of the Workshop on Software Engineering, Testing, and Quality Assurance for Natural Language Processing (SETQA-NLP 2009)

Using Artificially Generated Data to Evaluate Statistical Machine Translation
Manny Rayner | Paula Estrella | Pierrette Bouillon | Beth Ann Hockey | Yukie Nakao
Proceedings of the 2009 Workshop on Grammar Engineering Across Frameworks (GEAF 2009)

2008

Developing Non-European Translation Pairs in a Medium-Vocabulary Medical Speech Translation System
Pierrette Bouillon | Sonia Halimi | Yukie Nakao | Kyoko Kanzaki | Hitoshi Isahara | Nikos Tsourakis | Marianne Starlander | Beth Ann Hockey | Manny Rayner
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We describe recent work on MedSLT, a medium-vocabulary interlingua-based medical speech translation system, focussing on issues that arise when handling languages of which the grammar engineer has little or no knowledge. We show how we can systematically create and maintain multiple forms of grammars, lexica and interlingual representations, with some versions being used by language informants, and some by grammar engineers. In particular, we describe the advantages of structuring the interlingua definition as a simple semantic grammar, which includes a human-readable surface form. We show how this allows us to rationalise the process of evaluating translations between languages lacking common speakers, and also makes it possible to create a simple generic tool for debugging to-interlingua translation rules. Examples presented focus on the concrete case of translation between Japanese and Arabic in both directions.

Proceedings of the 9th SIGdial Workshop on Discourse and Dialogue
David Schlangen | Beth Ann Hockey
Proceedings of the 9th SIGdial Workshop on Discourse and Dialogue

Zero to Spoken Dialogue System in One Quarter: Teaching Computational Linguistics to Linguists Using Regulus
Beth Ann Hockey | Gwen Christian
Proceedings of the Third Workshop on Issues in Teaching Computational Linguistics

The 2008 MedSLT System
Manny Rayner | Pierrette Bouillon | Jane Brotanek | Glenn Flores | Sonia Halimi | Beth Ann Hockey | Hitoshi Isahara | Kyoko Kanzaki | Elisabeth Kron | Yukie Nakao | Marianne Santaholma | Marianne Starlander | Nikos Tsourakis
Coling 2008: Proceedings of the workshop on Speech Processing for Safety Critical Translation and Pervasive Applications

A Small-Vocabulary Shared Task for Medical Speech Translation
Manny Rayner | Pierrette Bouillon | Glenn Flores | Farzad Ehsani | Marianne Starlander | Beth Ann Hockey | Jane Brotanek | Lukas Biewald
Coling 2008: Proceedings of the workshop on Speech Processing for Safety Critical Translation and Pervasive Applications

Almost Flat Functional Semantics for Speech Translation
Manny Rayner | Pierrette Bouillon | Beth Ann Hockey | Yukie Nakao
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

2007

Adapting a Medical speech to speech translation system (MedSLT) to Arabic
Pierrette Bouillon | Sonia Halimi | Manny Rayner | Beth Ann Hockey
Proceedings of the 2007 Workshop on Computational Approaches to Semitic Languages: Common Issues and Resources

A Bidirectional Grammar-Based Medical Speech Translator
Pierrette Bouillon | Glenn Flores | Marianne Starlander | Nikos Chatzichrisafis | Marianne Santaholma | Nikos Tsourakis | Manny Rayner | Beth Ann Hockey
Proceedings of the Workshop on Grammar-Based Approaches to Spoken Language Processing

2006

REGULUS: A Generic Multilingual Open Source Platform for Grammar-Based Speech Applications
Manny Rayner | Pierrette Bouillon | Beth Ann Hockey | Nikos Chatzichrisafis
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

We present an overview of Regulus, an Open Source platform that supports corpus-based derivation of efficient domain-specific speech recognisers from general linguistically motivated unification grammars. We list available Open Source resources, which include compilers, resource grammars for various languages, documentation and a development environment. The greater part of the paper presents a series of experiments carried out using a medium-vocabulary medical speech translation application and a corpus of 801 recorded domain utterances, designed to investigate the impact on speech understanding performance of vocabulary size, grammatical coverage, presence or absence of various linguistic features, degree of generality of the grammar and use or otherwise of probabilistic weighting in the CFG language model. In terms of task accuracy, the most significant factors were the use of probabilistic weighting, the degree of generality of the grammar and the inclusion of features which model sortal restrictions.

Evaluating Task Performance for a Unidirectional Controlled Language Medical Speech Translation System
Nikos Chatzichrisafis | Pierrette Bouillon | Manny Rayner | Marianne Santaholma | Marianne Starlander | Beth Ann Hockey
Proceedings of the First International Workshop on Medical Speech Translation

MedSLT: A Limited-Domain Unidirectional Grammar-Based Medical Speech Translator
Manny Rayner | Pierrette Bouillon | Nikos Chatzichrisafis | Marianne Santaholma | Marianne Starlander | Beth Ann Hockey | Yukie Nakao | Hitoshi Isahara | Kyoko Kanzaki
Proceedings of the First International Workshop on Medical Speech Translation

2005

A generic multi-lingual open source platform for limited-domain medical speech translation
Pierrette Bouillon | Manny Rayner | Nikos Chatzichrisafis | Beth Ann Hockey | Marianne Santaholma | Marianne Starlander | Yukie Nakao | Kyoko Kanzaki | Hitoshi Isahara
Proceedings of the 10th EAMT Conference: Practical applications of machine translation

Japanese Speech Understanding using Grammar Specialization
Manny Rayner | Nikos Chatzichrisafis | Pierrette Bouillon | Yukie Nakao | Hitoshi Isahara | Kyoko Kanzaki | Beth Ann Hockey | Marianne Santaholma | Marianne Starlander
Proceedings of HLT/EMNLP 2005 Interactive Demonstrations

A Voice Enabled Procedure Browser for the International Space Station
Manny Rayner | Beth A. Hockey | Nikos Chatzichrisafis | Kim Farrell | Jean-Michel Renders
Proceedings of the ACL Interactive Poster and Demonstration Sessions

Practicing Controlled Language through a Help System integrated into the Medical Speech Translation System (MedSLT)
Marianne Starlander | Pierrette Bouillon | Nikos Chatzichrisafis | Marianne Santaholma | Manny Rayner | Beth Ann Hockey | Hitoshi Isahara | Kyoko Kanzaki | Yukie Nakao
Proceedings of Machine Translation Summit X: Papers

In this paper, we present evidence that providing users of a speech to speech translation system for emergency diagnosis (MedSLT) with a tool that helps them to learn the coverage greatly improves their success in using the system. In MedSLT, the system uses a grammar-based recogniser that provides more predictable results to the translation component. The help module aims at addressing the lack of robustness inherent in this type of approach. It takes as input the result of a robust statistical recogniser that performs better for out-of-coverage data and produces a list of in-coverage example sentences. These examples are selected from a defined list using a heuristic that prioritises sentences maximising the number of N-grams shared with those extracted from the recognition result.

2004

Comparing rule-based and statistical approaches to speech understanding in a limited domain speech translation system
Manny Rayner | Pierrette Bouillon | Beth Ann Hockey | Nikos Chatzichrisafis | Marianne Starlander
Proceedings of the 10th Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages

2003

A procedure assistant for astronauts in a functional programming architecture, with step previewing and spoken correction of dialogue moves
Gregory Aist | Manny Rayner | John Dowding | Beth Ann Hockey | Susana Early | Jim Hieronymus
Proceedings of the Fourth SIGdial Workshop of Discourse and Dialogue

Targeted Help for Spoken Dialogue Systems
Beth Ann Hockey | Oliver Lemon | Ellen Campana | Laura Hiatt | Gregory Aist | James Hieronymus | Alexander Gruenstein | John Dowding
10th Conference of the European Chapter of the Association for Computational Linguistics

Transparent combination of rule-based and data-driven approaches in speech understanding
Manny Rayner | Beth Ann Hockey
10th Conference of the European Chapter of the Association for Computational Linguistics

Talking through Procedures: An Intelligent Space Station Procedure Assistant
Greg Aist | J. Dowding | B. A. Hockey | M. Rayner | J. Hieronymus | D. Bohus | B. Boven | N. Blaylock | E. Campana | S. Early | G. Gorrell | S. Phan
Demonstrations

An Open-Source Environment for Compiling Typed Unification Grammars into Speech Recognisers
Manny Rayner | Beth Ann Hockey | John Dowding
Demonstrations

A Limited-Domain English to Japanese Medical Speech Translator Built Using REGULUS 2
Manny Rayner | Pierrette Bouillon | Vol Van Dalsem III | Hitoshi Isahara | Kyoko Kanzaki | Beth Ann Hockey
The Companion Volume to the Proceedings of 41st Annual Meeting of the Association for Computational Linguistics

An Intelligent Procedure Assistant Built Using REGULUS 2 and ALTERF
Manny Rayner | Beth Ann Hockey | Jim Hieronymus | John Dowding | Greg Aist | Susana Early
The Companion Volume to the Proceedings of 41st Annual Meeting of the Association for Computational Linguistics

2001

Do CFG-Based Language Models Need Agreement Constraints?
Manny Rayner | Genevieve Gorrell | Beth Ann Hockey | John Dowding | Johan Boye
Second Meeting of the North American Chapter of the Association for Computational Linguistics

Practical Issues in Compiling Typed Unification Grammars for Speech Recognition
John Dowding | Beth Ann Hockey | Jean Mark Gawron | Christopher Culy
Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics

2000

A Compact Architecture for Dialogue Management Based on Scripts and Meta-Outputs
Manny Rayner | Beth Ann Hockey | Frankie James
Sixth Applied Natural Language Processing Conference

A Compact Architecture for Dialogue Management Based on Scripts and Meta-Outputs
Manny Rayner | Beth Ann Hockey | Frankie James
ANLP-NAACL 2000 Workshop: Conversational Systems

A comparison of the XTAG and CLE Grammars for English
Manny Rayner | Beth Ann Hockey | Frankie James
Proceedings of the Fifth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+5)

Compiling Language Models from a Linguistically Motivated Unification Grammar
Manny Rayner | Beth Ann Hockey | Frankie James | Elizabeth Owen Bratt | Sharon Goldwater | Jean Mark Gawron
COLING 2000 Volume 2: The 18th International Conference on Computational Linguistics

1997

Maintaining the Forest and Burning out the Underbrush in XTAG
Christine Doran | Beth Hockey | Philip Hopely | Joseph Rosenzweig | Anoop Sarkar | B. Srinivas | Fei Xia
Computational Environments for Grammar Development and Linguistic Engineering

1994

XTAG System - A Wide Coverage Grammar for English
Christy Doran | Dania Egedi | Beth Ann Hockey | B. Srinivas | Martin Zaidel
COLING 1994 Volume 2: The 15th International Conference on Computational Linguistics