Stephanie Lukin

Also published as: Stephanie M. Lukin


2022

The Search for Agreement on Logical Fallacy Annotation of an Infodemic
Claire Bonial | Austin Blodgett | Taylor Hudson | Stephanie M. Lukin | Jeffrey Micher | Douglas Summers-Stay | Peter Sutor | Clare Voss
Proceedings of the Thirteenth Language Resources and Evaluation Conference

We evaluate an annotation schema for labeling logical fallacy types, originally developed for a crowd-sourcing annotation paradigm and now applied by two trained linguist annotators. We apply the schema to a variety of genres of text relating to the COVID-19 pandemic. Our linguist (as opposed to crowd-sourced) annotation of logical fallacies allows us to evaluate whether the schema's category labels are sufficiently clear and non-overlapping for both manual and, later, system assignment. We report inter-annotator agreement results over two annotation phases, as well as a preliminary assessment of the corpus for training and testing a machine learning algorithm (Pattern-Exploiting Training) for fallacy detection and recognition. The agreement results and system performance underscore the challenging nature of this annotation task and suggest that the annotation schema and paradigm must be iteratively evaluated and refined in order to arrive at a set of labels that can be reproduced by human annotators and, in turn, provide reliable training data for automatic detection and recognition systems.
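
To make the two-annotator agreement setting concrete: chance-corrected agreement for a setup like this is commonly measured with Cohen's kappa. The sketch below uses scikit-learn; the fallacy label set and the annotations are invented for illustration and are not data from the paper.

from sklearn.metrics import cohen_kappa_score

# Hypothetical fallacy labels assigned by two trained annotators
# to the same ten text spans (invented data for illustration).
annotator_a = ["ad_hominem", "strawman", "none", "false_cause", "none",
               "strawman", "ad_hominem", "none", "false_cause", "none"]
annotator_b = ["ad_hominem", "none", "none", "false_cause", "strawman",
               "strawman", "ad_hominem", "none", "none", "none"]

# Cohen's kappa corrects raw agreement for agreement expected by chance:
# values near 0 indicate chance-level agreement, 1 is perfect agreement.
kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")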

2020

Dialogue-AMR: Abstract Meaning Representation for Dialogue
Claire Bonial | Lucia Donatelli | Mitchell Abrams | Stephanie M. Lukin | Stephen Tratz | Matthew Marge | Ron Artstein | David Traum | Clare Voss
Proceedings of the Twelfth Language Resources and Evaluation Conference

This paper describes a schema that enriches Abstract Meaning Representation (AMR) in order to provide a semantic representation for facilitating Natural Language Understanding (NLU) in dialogue systems. AMR offers a valuable level of abstraction of the propositional content of an utterance; however, it does not capture the illocutionary force or speaker’s intended contribution in the broader dialogue context (e.g., make a request or ask a question), nor does it capture tense or aspect. We explore dialogue in the domain of human-robot interaction, where a conversational robot is engaged in search and navigation tasks with a human partner. To address the limitations of standard AMR, we develop an inventory of speech acts suitable for our domain, and present “Dialogue-AMR”, an enhanced AMR that represents not only the content of an utterance, but the illocutionary force behind it, as well as tense and aspect. To showcase the coverage of the schema, we use both manual and automatic methods to construct the “DialAMR” corpus—a corpus of human-robot dialogue annotated with standard AMR and our enriched Dialogue-AMR schema. Our automated methods can be used to incorporate AMR into a larger NLU pipeline supporting human-robot dialogue.
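
To illustrate the kind of enrichment described, the sketch below contrasts a plain AMR with a Dialogue-AMR-style graph, using the penman Python library for reading AMRs. The graphs, the speech-act frame name, and the role inventory are simplified stand-ins, not the paper's exact schema.

import penman  # general-purpose AMR reader/writer; pip install penman

# Standard AMR for the instruction "Move to the door": propositional
# content only, with no speech act, tense, or aspect.
standard = penman.decode("""
(m / move-01
   :ARG0 (r / robot)
   :ARG2 (d / door)
   :mode imperative)
""")

# An illustrative Dialogue-AMR-style graph: the same content wrapped
# under a speech-act frame identifying speaker and addressee.
enriched = penman.decode("""
(c / command-SA
   :ARG0 (h / human)
   :ARG1 (r / robot)
   :ARG2 (m / move-01
            :ARG0 r
            :ARG2 (d / door)))
""")

print(enriched.triples)  # the graph as (source, role, target) triples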

InfoForager: Leveraging Semantic Search with AMR for COVID-19 Research
Claire Bonial | Stephanie M. Lukin | David Doughty | Steven Hill | Clare Voss
Proceedings of the Second International Workshop on Designing Meaning Representations

This paper examines how Abstract Meaning Representation (AMR) can be utilized for finding answers to research questions in medical scientific documents, in particular, to advance the study of UV (ultraviolet) inactivation of the novel coronavirus that causes the disease COVID-19. We describe the development of a proof-of-concept prototype tool, InfoForager, which uses AMR to conduct a semantic search, targeting the meaning of the user question and matching it to sentences in medical documents that may contain information to answer that question. This work was conducted as a sprint over a period of six weeks, and reveals both promising results and challenges in reducing user search time relating to COVID-19 research and, more generally, in domain adaptation of AMR for this task.
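
As a simplified sketch of the intuition behind AMR-based semantic matching (not InfoForager's actual algorithm), one can compare the role triples of a question's AMR against those of each candidate sentence, scoring sentences by how much of the question's structure they cover:

import penman

def amr_triples(amr_string):
    """Parse an AMR and return its role triples, replacing variables with
    their concepts so that graphs with different variables can be compared."""
    graph = penman.decode(amr_string)
    concept = {src: tgt for src, role, tgt in graph.triples if role == ":instance"}
    return {(concept.get(s, s), role, concept.get(t, t))
            for s, role, t in graph.triples if role != ":instance"}

def overlap_score(question_amr, sentence_amr):
    """Fraction of the question's role triples matched in the sentence's AMR."""
    q, s = amr_triples(question_amr), amr_triples(sentence_amr)
    return len(q & s) / len(q) if q else 0.0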

Workshop on Games and Natural Language Processing
Stephanie M. Lukin

2019

A Research Platform for Multi-Robot Dialogue with Humans
Matthew Marge | Stephen Nogar | Cory J. Hayes | Stephanie M. Lukin | Jesse Bloecker | Eric Holder | Clare Voss
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)

This paper presents a research platform that supports spoken dialogue interaction with multiple robots. The demonstration showcases our crafted MultiBot testing scenario in which users can verbally issue search, navigate, and follow instructions to two robotic teammates: a simulated ground robot and an aerial robot. This flexible language and robotic platform takes advantage of existing tools for speech recognition and dialogue management that are compatible with new domains, and implements an inter-agent communication protocol (tactical behavior specification), where verbal instructions are encoded for tasks assigned to the appropriate robot.
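
The abstract does not spell out the tactical behavior specification format, so the following is purely hypothetical: a minimal sketch of how a parsed verbal instruction might be packaged and routed to the appropriate teammate in a two-robot setup like the one demonstrated.

from dataclasses import dataclass

# Hypothetical message structure for routing a parsed instruction; the
# actual inter-agent protocol is not described in the abstract.
@dataclass
class TaskMessage:
    robot_id: str      # e.g. "ground-1" or "aerial-1" (invented ids)
    behavior: str      # e.g. "search", "navigate", "follow"
    parameters: dict   # behavior-specific arguments, e.g. {"target": "door"}

def route(behavior: str, params: dict) -> TaskMessage:
    """Assign aerial-survey behaviors to the aerial robot, else the ground robot."""
    robot = "aerial-1" if behavior == "search" else "ground-1"
    return TaskMessage(robot_id=robot, behavior=behavior, parameters=params)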

Augmenting Abstract Meaning Representation for Human-Robot Dialogue
Claire Bonial | Lucia Donatelli | Stephanie M. Lukin | Stephen Tratz | Ron Artstein | David Traum | Clare Voss
Proceedings of the First International Workshop on Designing Meaning Representations

We detail refinements made to Abstract Meaning Representation (AMR) that make the representation more suitable for supporting a situated dialogue system, where a human remotely controls a robot for purposes of search and rescue and reconnaissance. We propose 36 augmented AMRs that capture speech acts, tense and aspect, and spatial information. This linguistic information is vital for representing important distinctions, for example whether the robot has moved, is moving, or will move. We evaluate two existing AMR parsers for their performance on dialogue data. We also outline a model for graph-to-graph conversion, in which output from AMR parsers is converted into our refined AMRs. The design scheme presented here, though task-specific, is extendable for broad coverage of speech acts using AMR in future task-independent work.
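
A minimal sketch of the graph-to-graph conversion idea, again using the penman library: take a standard AMR parse and re-root it under a speech-act frame. The frame name, roles, and variable handling are illustrative assumptions, not the authors' conversion model.

import penman
from penman import Graph

def add_speech_act(amr_string, speech_act="command-SA", speaker="human"):
    """Re-root a standard AMR parse under an illustrative speech-act frame.
    Assumes the fresh variables 's0' and 'h0' do not occur in the parse."""
    g = penman.decode(amr_string)
    new_triples = [
        ("s0", ":instance", speech_act),
        ("s0", ":ARG0", "h0"),
        ("h0", ":instance", speaker),
        ("s0", ":ARG2", g.top),  # the original root becomes the content argument
    ] + list(g.triples)
    return penman.encode(Graph(new_triples, top="s0"))

print(add_speech_act("(g / go-02 :ARG0 (r / robot))"))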

Proceedings of the Second Workshop on Storytelling
Francis Ferraro | Ting-Hao ‘Kenneth’ Huang | Stephanie M. Lukin | Margaret Mitchell

2018

A Pipeline for Creative Visual Storytelling
Stephanie Lukin | Reginald Hobbs | Clare Voss
Proceedings of the First Workshop on Storytelling

Computational visual storytelling produces a textual description of events and interpretations depicted in a sequence of images. These texts are made possible by advances and cross-disciplinary approaches in natural language processing, generation, and computer vision. We define a creative visual storyteller as one with the ability to alter the telling of a story along three aspects: speaking about different environments, producing variations based on narrative goals, and adapting the narrative to the audience. These aspects of creative storytelling and their effect on the narrative have yet to be explored in visual storytelling. This paper presents a pipeline of three task-modules (Object Identification, Single-Image Inferencing, and Multi-Image Narration) that serves as a preliminary design for building a creative visual storyteller. We have piloted this design for a sequence of images in an annotation task. We present and analyze the collected corpus and describe plans towards automation.
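
The three module names come from the paper, but their interfaces are not specified there. The stubs below are a hypothetical sketch of how such a pipeline might be typed; all signatures are invented, and the bodies are deliberately left unimplemented.

from dataclasses import dataclass
from typing import List

@dataclass
class SceneObject:
    label: str          # e.g. "door" (invented example)
    image_index: int    # which image in the sequence the object appears in

def identify_objects(images: List[str]) -> List[SceneObject]:
    """Object Identification: detect and label entities in each image."""
    ...

def infer_single_image(objects: List[SceneObject], image_index: int) -> List[str]:
    """Single-Image Inferencing: propose interpretations of one image."""
    ...

def narrate(inferences: List[List[str]], goal: str, audience: str) -> str:
    """Multi-Image Narration: compose a story across the image sequence,
    conditioned on a narrative goal and the intended audience."""
    ...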

Consequences and Factors of Stylistic Differences in Human-Robot Dialogue
Stephanie Lukin | Kimberly Pollard | Claire Bonial | Matthew Marge | Cassidy Henry | Ron Artstein | David Traum | Clare Voss
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue

This paper identifies stylistic differences in instruction-giving observed in a corpus of human-robot dialogue. Differences in verbosity and structure (i.e., single-intent vs. multi-intent instructions) arose naturally without restrictions or prior guidance on how users should speak with the robot. Different styles were found to produce different rates of miscommunication, and correlations were found between style differences and individual user variation, trust, and interaction experience with the robot. Understanding potential consequences and factors that influence style can inform design of dialogue systems that are robust to natural variation from human users.

Controlling Personality-Based Stylistic Variation with Neural Natural Language Generators
Shereen Oraby | Lena Reed | Shubhangi Tandon | Sharath T.S. | Stephanie Lukin | Marilyn Walker
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue

Natural language generators for task-oriented dialogue must effectively realize system dialogue actions and their associated semantics. In many applications, it is also desirable for generators to control the style of an utterance. To date, work on task-oriented neural generation has primarily focused on semantic fidelity rather than achieving stylistic goals, while work on style has been done in contexts where it is difficult to measure content preservation. Here we present three different sequence-to-sequence models and carefully test how well they disentangle content and style. We use a statistical generator, Personage, to synthesize a new corpus of over 88,000 restaurant domain utterances whose style varies according to models of personality, giving us total control over both the semantic content and the stylistic variation in the training data. We then vary the amount of explicit stylistic supervision given to the three models. We show that our most explicit model can simultaneously achieve high fidelity to both semantic and stylistic goals: this model adds a context vector of 36 stylistic parameters as input to the hidden state of the encoder at each time step, showing the benefits of explicit stylistic supervision, even when the amount of training data is large.
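
One plausible reading of the most explicit model described (a 36-parameter stylistic context vector supplied to the encoder at every time step) is sketched below in PyTorch. Apart from the 36 stylistic parameters, all dimensions and names are invented; this is not the authors' implementation.

import torch
import torch.nn as nn

class StyleConditionedEncoder(nn.Module):
    """Concatenate a fixed stylistic parameter vector to the token
    embedding at every encoder time step (illustrative dimensions)."""
    def __init__(self, vocab_size=5000, embed_dim=128, style_dim=36, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim + style_dim, hidden_dim, batch_first=True)

    def forward(self, tokens, style):
        # tokens: (batch, seq_len) token ids
        # style:  (batch, 36) stylistic parameter vector
        emb = self.embed(tokens)                                    # (batch, seq, embed_dim)
        style_rep = style.unsqueeze(1).expand(-1, emb.size(1), -1)  # repeated per step
        return self.rnn(torch.cat([emb, style_rep], dim=-1))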

Dialogue Structure Annotation for Multi-Floor Interaction
David Traum | Cassidy Henry | Stephanie Lukin | Ron Artstein | Felix Gervits | Kimberly Pollard | Claire Bonial | Su Lei | Clare Voss | Matthew Marge | Cory Hayes | Susan Hill
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

ScoutBot: A Dialogue System for Collaborative Navigation
Stephanie M. Lukin | Felix Gervits | Cory J. Hayes | Pooja Moolchandani | Anton Leuski | John G. Rogers III | Carlos Sanchez Amaro | Matthew Marge | Clare R. Voss | David Traum
Proceedings of ACL 2018, System Demonstrations

ScoutBot is a dialogue interface to physical and simulated robots that supports collaborative exploration of environments. The demonstration will allow users to issue unconstrained spoken language commands to ScoutBot. ScoutBot will prompt for clarification if the user’s instruction needs additional input. It is trained on human-robot dialogue collected from Wizard-of-Oz experiments, where robot responses were initiated by a human wizard in previous interactions. The demonstration will show a simulated ground robot (Clearpath Jackal) in a simulated environment supported by ROS (Robot Operating System).

2017

Argument Strength is in the Eye of the Beholder: Audience Effects in Persuasion
Stephanie Lukin | Pranav Anand | Marilyn Walker | Steve Whittaker
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

Americans spend about a third of their time online, with many participating in online conversations on social and political issues. We hypothesize that social media arguments on such issues may be more engaging and persuasive than traditional media summaries, and that particular types of people may be more or less convinced by particular styles of argument, e.g., emotional arguments may resonate with some personalities while factual arguments resonate with others. We report a set of experiments testing at large scale how audience variables interact with argument style to affect the persuasiveness of an argument, an under-researched topic within natural language processing. We show that belief change is affected by personality factors, with conscientious, open, and agreeable people being more convinced by emotional arguments.

2016

PersonaBank: A Corpus of Personal Narratives and Their Story Intention Graphs
Stephanie Lukin | Kevin Bowden | Casey Barackman | Marilyn Walker
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We present a new corpus, PersonaBank, consisting of 108 personal stories from weblogs that have been annotated with their Story Intention Graphs, a deep representation of the content of a story. We describe the topics of the stories and the basis of the Story Intention Graph representation, as well as the process of annotating the stories to produce the Story Intention Graphs and the challenges of adapting the tool to this new personal narrative domain. We also discuss how the corpus can be used in applications that retell a story using different telling styles or co-tellings, or that use it as input to a content planner.

2015

Generating Sentence Planning Variations for Story Telling
Stephanie Lukin | Lena Reed | Marilyn Walker
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue

2014

Getting Reliable Annotations for Sarcasm in Online Dialogues
Reid Swanson | Stephanie Lukin | Luke Eisenberg | Thomas Corcoran | Marilyn Walker
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The language used in online forums differs in many ways from that of traditional language resources such as news. One difference is the use and frequency of nonliteral, subjective dialogue acts such as sarcasm. Whether the aim is to develop a theory of sarcasm in dialogue, or engineer automatic methods for reliably detecting sarcasm, a major challenge is simply the difficulty of getting enough reliably labelled examples. In this paper we describe our work on methods for achieving highly reliable sarcasm annotations from untrained annotators on Mechanical Turk. We explore the use of a number of common statistical reliability measures, such as Kappa, Karger’s, Majority Class, and EM. We show that more sophisticated measures do not appear to yield better results for our data than simple measures such as assuming that the correct label is the one that a majority of Turkers apply.
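
As a point of comparison for the finding that simple measures suffice, the majority-label aggregation the paper describes is a one-liner over crowd judgments; the example data below is invented.

from collections import Counter

def majority_label(labels):
    """Return the label that the most Turkers applied -- the simple
    aggregation the paper found competitive with EM-style methods."""
    label, _ = Counter(labels).most_common(1)[0]
    return label

# Invented example: nine crowd judgments for one utterance.
print(majority_label(["sarcastic"] * 5 + ["not_sarcastic"] * 4))  # -> sarcastic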

2013

Really? Well. Apparently Bootstrapping Improves the Performance of Sarcasm and Nastiness Classifiers for Online Dialogue
Stephanie Lukin | Marilyn Walker
Proceedings of the Workshop on Language Analysis in Social Media