Emily Chen


2023

pdf bib
Community consultation and the development of an online Akuzipik-English dictionary
Benjamin Hunt | Lane Schwartz | Sylvia Schreiner | Emily Chen
Proceedings of the Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP)

In this paper, we present a new online dictionary of Akuzipik, an Indigenous language of St. Lawrence Island (Alaska) and Chukotka (Russia).We discuss community desires for strengthening language use in the community and in educational settings, and present specific features of an online dictionary designed to serve these community goals.

2020

pdf bib
Improved Finite-State Morphological Analysis for St. Lawrence Island Yupik Using Paradigm Function Morphology
Emily Chen | Hyunji Hayley Park | Lane Schwartz
Proceedings of the Twelfth Language Resources and Evaluation Conference

St. Lawrence Island Yupik is an endangered polysynthetic language of the Bering Strait region. While conducting linguistic fieldwork between 2016 and 2019, we observed substantial support within the Yupik community for language revitalization and for resource development to support Yupik education. To that end, Chen & Schwartz (2018) implemented a finite-state morphological analyzer as a critical enabling technology for use in Yupik language education and technology. Chen & Schwartz (2018) reported a morphological analysis coverage rate of approximately 75% on a dataset of 60K Yupik tokens, leaving considerable room for improvement. In this work, we present a re-implementation of the Chen & Schwartz (2018) finite-state morphological analyzer for St. Lawrence Island Yupik that incorporates new linguistic insights; in particular, in this implementation we make use of the Paradigm Function Morphology (PFM) theory of morphology. We evaluate this new PFM-based morphological analyzer, and demonstrate that it consistently outperforms the existing analyzer of Chen & Schwartz (2018) with respect to accuracy and coverage rate across multiple datasets.

2019

pdf bib
Community lexical access for an endangered polysynthetic language: An electronic dictionary for St. Lawrence Island Yupik
Benjamin Hunt | Emily Chen | Sylvia L.R. Schreiner | Lane Schwartz
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)

In this paper, we introduce a morphologically-aware electronic dictionary for St. Lawrence Island Yupik, an endangered language of the Bering Strait region. Implemented using HTML, Javascript, and CSS, the dictionary is set in an uncluttered interface and permits users to search in Yupik or in English for Yupik root words and Yupik derivational suffixes. For each matching result, our electronic dictionary presents the user with the corresponding entry from the Badten (2008) Yupik-English paper dictionary. Because Yupik is a polysynthetic language, handling of multimorphemic word forms is critical. If a user searches for an inflected Yupik word form, we perform a morphological analysis and return entries for the root word and for any derivational suffixes present in the word. This electronic dictionary should serve not only as a valuable resource for all students and speakers of Yupik, but also for field linguists working towards documentation and conservation of the language.

pdf bib
Bootstrapping a Neural Morphological Analyzer for St. Lawrence Island Yupik from a Finite-State Transducer
Lane Schwartz | Emily Chen | Benjamin Hunt | Sylvia L.R. Schreiner
Proceedings of the 3rd Workshop on the Use of Computational Methods in the Study of Endangered Languages Volume 1 (Papers)

pdf bib
Measuring the Value of Linguistics: A Case Study from St. Lawrence Island Yupik
Emily Chen
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

The adaptation of neural approaches to NLP is a landmark achievement that has called into question the utility of linguistics in the development of computational systems. This research proposal consequently explores this question in the context of a neural morphological analyzer for a polysynthetic language, St. Lawrence Island Yupik. It asks whether incorporating elements of Yupik linguistics into the implementation of the analyzer can improve performance, both in low-resource settings and in high-resource settings, where rich quantities of data are readily available.

2018

pdf bib
A Morphological Analyzer for St. Lawrence Island / Central Siberian Yupik
Emily Chen | Lane Schwartz
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2013

pdf bib
Automated Pyramid Scoring of Summaries using Distributional Semantics
Rebecca J. Passonneau | Emily Chen | Weiwei Guo | Dolores Perin
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)