Emily M. Bender

Also published as: Emily Bender


2022

Building Analyses from Syntactic Inference in Local Languages: An HPSG Grammar Inference System
Kristen Howell | Emily M. Bender
Northern European Journal of Language Technology, Volume 8

We present a grammar inference system that leverages linguistic knowledge recorded in the form of annotations in interlinear glossed text (IGT) and in a meta-grammar engineering system (the LinGO Grammar Matrix customization system) to automatically produce machine-readable HPSG grammars. Building on prior work to handle the inference of lexical classes, stems, affixes and position classes, and preliminary work on inferring case systems and word order, we introduce an integrated grammar inference system that covers a wide range of fundamental linguistic phenomena. System development was guided by 27 genealogically and geographically diverse languages, and we test the system’s cross-linguistic generalizability on an additional 5 held-out languages, using datasets provided by field linguists. Our system outperforms three baseline systems in increasing coverage while limiting ambiguity and producing richer semantic representations, and it also produces richer representations than previous work in grammar inference.

2021

Developing a Shared Task for Speech Processing on Endangered Languages
Gina-Anne Levow | Emily Ahn | Emily M. Bender
Proceedings of the 4th Workshop on the Use of Computational Methods in the Study of Endangered Languages Volume 1 (Papers)

2020

Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data
Emily M. Bender | Alexander Koller
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

The success of the large neural language models on many NLP tasks is exciting. However, we find that these successes sometimes lead to hype in which these models are being described as “understanding” language or capturing “meaning”. In this position paper, we argue that a system trained only on form has a priori no way to learn meaning. In keeping with the ACL 2020 theme of “Taking Stock of Where We’ve Been and Where We’re Going”, we argue that a clear understanding of the distinction between form and meaning will help guide the field towards better science around natural language understanding.

Integrating Ethics into the NLP Curriculum
Emily M. Bender | Dirk Hovy | Alexandra Schofield
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts

To raise awareness among future NLP practitioners and prevent inertia in the field, we need to place ethics in the curriculum for all NLP students—not as an elective, but as a core part of their education. Our goal in this tutorial is to empower NLP researchers and practitioners with tools and resources to teach others about how to ethically apply NLP techniques. We will present both high-level strategies for developing an ethics-oriented curriculum, based on experience and best practices, as well as specific sample exercises that can be brought to a classroom. This highly interactive work session will culminate in a shared online resource page that pools lesson plans, assignments, exercise ideas, reading suggestions, and ideas from the attendees. Though the tutorial will focus particularly on examples for university classrooms, we believe these ideas can extend to company-internal workshops or tutorials in a variety of organizations. In this setting, a key lesson is that there is no single approach to ethical NLP: each project requires thoughtful consideration about what steps can be taken to best support people affected by that project. However, we can learn (and teach) what issues to be aware of, what questions to ask, and what strategies are available to mitigate harm.

2019

Neural Text Generation from Rich Semantic Representations
Valerie Hajdik | Jan Buys | Michael Wayne Goodman | Emily M. Bender
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

We propose neural models to generate high-quality text from structured representations based on Minimal Recursion Semantics (MRS). MRS is a rich semantic representation that encodes more precise semantic detail than other representations such as Abstract Meaning Representation (AMR). We show that a sequence-to-sequence model that maps a linearization of Dependency MRS, a graph-based representation of MRS, to text can achieve a BLEU score of 66.11 when trained on gold data. The performance of the model can be improved further using a high-precision, broad coverage grammar-based parser to generate a large silver training corpus, achieving a final BLEU score of 77.17 on the full test set, and 83.37 on the subset of test data most closely matching the silver data domain. Our results suggest that MRS-based representations are a good choice for applications that need both structured semantics and the ability to produce natural language text as output.

Visualizing Inferred Morphotactic Systems
Haley Lepp | Olga Zamaraeva | Emily M. Bender
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)

We present a web-based system that facilitates the exploration of complex morphological patterns found in morphologically very rich languages. The need for better understanding of such patterns is urgent for linguistics and important for cross-linguistically applicable natural language processing. In this paper we give an overview of the system architecture and describe a sample case study on Abui [abz], a Trans-New Guinea language spoken in Indonesia.

Modeling Clausal Complementation for a Grammar Engineering Resource
Olga Zamaraeva | Kristen Howell | Emily M. Bender
Proceedings of the Society for Computation in Linguistics (SCiL) 2019

Handling cross-cutting properties in automatic inference of lexical classes: A case study of Chintang
Olga Zamaraeva | Kristen Howell | Emily M. Bender
Proceedings of the 3rd Workshop on the Use of Computational Methods in the Study of Endangered Languages Volume 1 (Papers)

2018

Data Statements for Natural Language Processing: Toward Mitigating System Bias and Enabling Better Science
Emily M. Bender | Batya Friedman
Transactions of the Association for Computational Linguistics, Volume 6

In this paper, we propose data statements as a design solution and professional practice for natural language processing technologists, in both research and development. Through the adoption and widespread use of data statements, the field can begin to address critical scientific and ethical issues that result from the use of data from certain populations in the development of technology for other populations. We present a form that data statements can take and explore the implications of adopting them as part of regular practice. We argue that data statements will help alleviate issues related to exclusion and bias in language technology, lead to better precision in claims about how natural language processing research can generalize and thus better engineering results, protect companies from public embarrassment, and ultimately lead to language technology that meets its users in their own preferred linguistic style and furthermore does not misrepresent them to others.

Proceedings of the 27th International Conference on Computational Linguistics
Emily M. Bender | Leon Derczynski | Pierre Isabelle
Proceedings of the 27th International Conference on Computational Linguistics

100 Things You Always Wanted to Know about Semantics & Pragmatics But Were Afraid to Ask
Emily M. Bender
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts

Meaning is a fundamental concept in Natural Language Processing (NLP), given its aim to build systems that mean what they say to you, and understand what you say to them. In order for NLP to scale beyond partial, task-specific solutions, it must be informed by what is known about how humans use language to express and understand communicative intents. The purpose of this tutorial is to present a selection of useful information about semantics and pragmatics, as understood in linguistics, in a way that’s accessible to and useful for NLP practitioners with minimal (or even no) prior training in linguistics. The tutorial content is based on a manuscript in progress I am co-authoring with Prof. Alex Lascarides of the University of Edinburgh.

2017

STREAMLInED Challenges: Aligning Research Interests with Shared Tasks
Gina-Anne Levow | Emily M. Bender | Patrick Littell | Kristen Howell | Shobhana Chelliah | Joshua Crowgey | Dan Garrette | Jeff Good | Sharon Hargus | David Inman | Michael Maxwell | Michael Tjalve | Fei Xia
Proceedings of the 2nd Workshop on the Use of Computational Methods in the Study of Endangered Languages

Inferring Case Systems from IGT: Enriching the Enrichment
Kristen Howell | Emily M. Bender | Michel Lockwood | Fei Xia | Olga Zamaraeva
Proceedings of the 2nd Workshop on the Use of Computational Methods in the Study of Endangered Languages

Computational Support for Finding Word Classes: A Case Study of Abui
Olga Zamaraeva | František Kratochvíl | Emily M. Bender | Fei Xia | Kristen Howell
Proceedings of the 2nd Workshop on the Use of Computational Methods in the Study of Endangered Languages

Proceedings of the First ACL Workshop on Ethics in Natural Language Processing
Dirk Hovy | Shannon Spruit | Margaret Mitchell | Emily M. Bender | Michael Strube | Hanna Wallach
Proceedings of the First ACL Workshop on Ethics in Natural Language Processing

Proceedings of the First Workshop on Building Linguistically Generalizable NLP Systems
Emily Bender | Hal Daumé III | Allyson Ettinger | Sudha Rao
Proceedings of the First Workshop on Building Linguistically Generalizable NLP Systems

Towards Linguistically Generalizable NLP Systems: A Workshop and Shared Task
Allyson Ettinger | Sudha Rao | Hal Daumé III | Emily M. Bender
Proceedings of the First Workshop on Building Linguistically Generalizable NLP Systems

This paper presents a summary of the first Workshop on Building Linguistically Generalizable Natural Language Processing Systems, and the associated Build It Break It, The Language Edition shared task. The goal of this workshop was to bring together researchers in NLP and linguistics with a carefully designed shared task aimed at testing the generalizability of NLP systems beyond the distributions of their training data. We describe the motivation, setup, and participation of the shared task, provide discussion of some highlighted results, and discuss lessons learned.

2016

English Resource Semantics
Dan Flickinger | Emily M. Bender | Woodley Packard
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorial Abstracts

2015

Proceedings of the ACL-IJCNLP 2015 Student Research Workshop
Kuan-Yu Chen | Angelina Ivanova | Ellie Pavlick | Emily Bender | Chin-Yew Lin | Stephan Oepen
Proceedings of the ACL-IJCNLP 2015 Student Research Workshop

Layers of Interpretation: On Grammar and Compositionality
Emily M. Bender | Dan Flickinger | Stephan Oepen | Woodley Packard | Ann Copestake
Proceedings of the 11th International Conference on Computational Semantics

Proceedings of the Grammar Engineering Across Frameworks (GEAF) 2015 Workshop
Emily M. Bender | Lori Levin | Stefan Müller | Yannick Parmentier | Aarne Ranta
Proceedings of the Grammar Engineering Across Frameworks (GEAF) 2015 Workshop

2014

Enriching ODIN
Fei Xia | William Lewis | Michael Wayne Goodman | Joshua Crowgey | Emily M. Bender
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this paper, we describe the expansion of the ODIN resource, a database containing many thousands of instances of Interlinear Glossed Text (IGT) for over a thousand languages harvested from scholarly linguistic papers posted to the Web. A database containing a large number of instances of IGT, which are effectively richly annotated and heuristically aligned bitexts, provides a unique resource for bootstrapping NLP tools for resource-poor languages. To make the data in ODIN more readily consumable by tool developers and NLP researchers, we propose a new XML format for IGT, called Xigt. We call the updated release ODIN-II.

Towards an Encyclopedia of Compositional Semantics: Documenting the Interface of the English Resource Grammar
Dan Flickinger | Emily M. Bender | Stephan Oepen
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

We motivate and describe the design and development of an emerging encyclopedia of compositional semantics, pursuing three objectives. We first seek to compile a comprehensive catalogue of interoperable semantic analyses, i.e., a precise characterization of meaning representations for a broad range of common semantic phenomena. Second, we operationalize the discovery of semantic phenomena and their definition in terms of what we call their semantic fingerprint, a formal account of the building blocks of meaning representation involved and their configuration. Third, we ground our work in a carefully constructed semantic test suite of minimal exemplars for each phenomenon, along with a ‘target’ fingerprint that enables automated regression testing. We work towards these objectives by codifying and documenting the body of knowledge that has been constructed in a long-term collaborative effort, the development of the LinGO English Resource Grammar. Documentation of its semantic interface is a prerequisite to use by non-experts of the grammar and the analyses it produces, but this effort also advances our own understanding of relevant interactions among phenomena, as well as of areas for future work in the grammar.

Language CoLLAGE: Grammatical Description with the LinGO Grammar Matrix
Emily M. Bender
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Language CoLLAGE is a collection of grammatical descriptions developed in the context of a grammar engineering graduate course with the LinGO Grammar Matrix. These grammatical descriptions include testsuites in well-formed interlinear glossed text (IGT) format, high-level grammatical characterizations called ‘choices files’, HPSG grammar fragments (capable of parsing and generation), and documentation. As of this writing, Language CoLLAGE includes resources for 52 typologically and areally diverse languages and this number is expected to grow over time. The resources for each language cover a similar range of core grammatical phenomena and are implemented in a uniform framework, compatible with the DELPH-IN suite of processing tools.

Learning Grammar Specifications from IGT: A Case Study of Chintang
Emily M. Bender | Joshua Crowgey | Michael Wayne Goodman | Fei Xia
Proceedings of the 2014 Workshop on the Use of Computational Methods in the Study of Endangered Languages

Simple Negation Scope Resolution through Deep Parsing: A Semantic Solution to a Semantic Problem
Woodley Packard | Emily M. Bender | Jonathon Read | Stephan Oepen | Rebecca Dridan
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Obituary: Ivan A. Sag
Emily M. Bender
Computational Linguistics, Volume 40, Issue 1 - March 2014

2013

Towards Creating Precision Grammars from Interlinear Glossed Text: Inferring Large-Scale Typological Properties
Emily M. Bender | Michael Wayne Goodman | Joshua Crowgey | Fei Xia
Proceedings of the 7th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities

2012

Getting More from Morphology in Multilingual Dependency Parsing
Matt Hohensee | Emily M. Bender
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

100 Things You Always Wanted to Know about Linguistics But Were Afraid to Ask*
Emily M. Bender
Tutorial Abstracts at the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Deriving a Lexicon for a Precision Grammar from Language Documentation Resources: A Case Study of Chintang
Emily M. Bender | Robert Schikowski | Balthasar Bickel
Proceedings of COLING 2012

2011

Spring Cleaning and Grammar Compression: Two Techniques for Detection of Redundancy in HPSG Grammars
Antske Fokkens | Yi Zhang | Emily M. Bender
Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation

Parser Evaluation over Local and Non-Local Deep Dependencies in a Large Corpus
Emily M. Bender | Dan Flickinger | Stephan Oepen | Yi Zhang
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

Annotating Social Acts: Authority Claims and Alignment Moves in Wikipedia Talk Pages
Emily M. Bender | Jonathan T. Morgan | Meghan Oxley | Mark Zachry | Brian Hutchinson | Alex Marin | Bin Zhang | Mari Ostendorf
Proceedings of the Workshop on Language in Social Media (LSM 2011)

2010

Grammar Prototyping and Testing with the LinGO Grammar Matrix Customization System
Emily M. Bender | Scott Drellishak | Antske Fokkens | Michael Wayne Goodman | Daniel P. Mills | Laurie Poulson | Safiyyah Saleem
Proceedings of the ACL 2010 System Demonstrations

Argument Optionality in the LinGO Grammar Matrix
Safiyyah Saleem | Emily M. Bender
Coling 2010: Posters

2009

Linguistically Naïve != Language Independent: Why NLP Needs Linguistic Typology
Emily M. Bender
Proceedings of the EACL 2009 Workshop on the Interaction between Linguistics and Computational Linguistics: Virtuous, Vicious or Vacuous?

2008

Evaluating a Crosslinguistic Grammar Resource: A Case Study of Wambaya
Emily M. Bender
Proceedings of ACL-08: HLT

Building a Flexible, Collaborative, Intensive Master’s Program in Computational Linguistics
Emily M. Bender | Fei Xia | Erik Bansleben
Proceedings of the Third Workshop on Issues in Teaching Computational Linguistics

Semantic Representations of Syntactically Marked Discourse Status in Crosslinguistic Perspective
Emily M. Bender | David Goss-Grubbs
Semantics in Text Processing. STEP 2008 Conference Proceedings

2007

Validation and Regression Testing for a Cross-linguistic Grammar Resource
Emily M. Bender | Laurie Poulson | Scott Drellishak | Chris Evans
ACL 2007 Workshop on Deep Linguistic Processing

2005

Rapid Prototyping of Scalable Grammars: Towards Modularity in Extensions to a Language-Independent Core
Emily M. Bender | Dan Flickinger
Companion Volume to the Proceedings of Conference including Posters/Demos and tutorial abstracts

2004

Road-testing the English Resource Grammar Over the British National Corpus
Timothy Baldwin | Emily M. Bender | Dan Flickinger | Ara Kim | Stephan Oepen
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2002

Efficient Deep Processing of Japanese
Melanie Siegel | Emily M. Bender
COLING-02: The 3rd Workshop on Asian Language Resources and International Standardization

The Grammar Matrix: An Open-Source Starter-Kit for the Rapid Development of Cross-linguistically Consistent Broad-Coverage Precision Grammars
Emily M. Bender | Dan Flickinger | Stephan Oepen
COLING-02: Grammar Engineering and Evaluation

Parallel Distributed Grammar Engineering for Practical Applications
Stephan Oepen | Emily M. Bender | Uli Callmeier | Dan Flickinger | Melanie Siegel
COLING-02: Grammar Engineering and Evaluation