Jon Gillick


2020

Attending to Long-Distance Document Context for Sequence Labeling
Matthew Jörke | Jon Gillick | Matthew Sims | David Bamman
Findings of the Association for Computational Linguistics: EMNLP 2020

We present in this work a method for incorporating global context in long documents when making local decisions in sequence labeling problems like NER. Inspired by work in featurized log-linear models (Chieu and Ng, 2002; Sutton and McCallum, 2004), our model learns to attend to multiple mentions of the same word type in generating a representation for each token in context, extending that work to learning representations that can be incorporated into modern neural models. Attending to broader context at test time provides complementary information to pretraining (Gururangan et al., 2020), yields strong gains over equivalently parameterized models lacking such context, and performs best at recognizing entities with high TF-IDF scores (i.e., those that are important within a document).
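A minimal, hypothetical sketch of the core idea described above: each token attends over the contextual vectors of other tokens sharing its word type, and the attended summary is concatenated to the local representation before tagging. The single-head dot-product attention, the concatenation, and all names here are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SameTypeAttention(nn.Module):
    """Attend to other mentions of the same word type within a document."""

    def __init__(self, hidden_dim):
        super().__init__()
        self.query = nn.Linear(hidden_dim, hidden_dim)
        self.key = nn.Linear(hidden_dim, hidden_dim)
        self.scale = hidden_dim ** 0.5

    def forward(self, token_vecs, word_types):
        # token_vecs: (num_tokens, hidden_dim) contextual vectors for one document
        # word_types: list of lowercased token strings, length num_tokens
        q = self.query(token_vecs)           # (T, H)
        k = self.key(token_vecs)             # (T, H)
        scores = q @ k.t() / self.scale      # (T, T)

        # Mask: position i may only attend to other positions j with the same word type.
        types = torch.tensor([hash(w) for w in word_types], dtype=torch.long)
        same_type = types.unsqueeze(0) == types.unsqueeze(1)        # (T, T)
        not_self = ~torch.eye(len(word_types), dtype=torch.bool)
        mask = same_type & not_self
        scores = scores.masked_fill(~mask, float("-inf"))

        attn = F.softmax(scores, dim=-1)
        # Tokens whose type occurs only once get an all -inf row -> NaN; zero those out.
        attn = torch.nan_to_num(attn, nan=0.0)
        context = attn @ token_vecs          # (T, H) document-level summary per token

        # Concatenate the local and document-level views for the downstream tagger.
        return torch.cat([token_vecs, context], dim=-1)
```

The masking is what makes the context "long-distance": a token near the end of a document can pool information from mentions of the same type hundreds of tokens earlier, complementing whatever the local encoder sees.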

2018

Telling Stories with Soundtracks: An Empirical Analysis of Music in Film
Jon Gillick | David Bamman
Proceedings of the First Workshop on Storytelling

Soundtracks play an important role in carrying the story of a film. In this work, we collect a corpus of movies and television shows matched with subtitles and soundtracks and analyze the relationship between story, song, and audience reception. We look at the content of a film through the lens of its latent topics and at the content of a song through descriptors of its musical attributes. In two experiments, we find first that individual topics are strongly associated with musical attributes, and second, that musical attributes of soundtracks are predictive of film ratings, even after controlling for topic and genre.
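A hypothetical sketch of the setup behind the second finding: regress film rating on soundtrack attributes while controlling for topic and genre, so the attribute coefficients reflect information beyond story content. The column names, toy values, and choice of ordinary least squares are assumptions for illustration, not the paper's pipeline.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Illustrative frame: one row per film with averaged soundtrack attributes,
# a dominant latent topic, a genre label, and an audience rating.
films = pd.DataFrame({
    "rating":  [7.1, 6.4, 8.0, 5.9, 7.5, 6.8, 7.2, 6.1],
    "energy":  [0.62, 0.40, 0.75, 0.33, 0.58, 0.47, 0.66, 0.36],
    "valence": [0.55, 0.30, 0.70, 0.25, 0.60, 0.45, 0.52, 0.38],
    "topic":   ["war", "romance", "war", "romance", "war", "romance", "war", "romance"],
    "genre":   ["drama", "comedy", "drama", "drama", "comedy", "comedy", "drama", "comedy"],
})

# C(...) expands the categorical controls into dummy variables, so the
# energy and valence coefficients are estimated holding topic and genre fixed.
model = smf.ols("rating ~ energy + valence + C(topic) + C(genre)", data=films).fit()
print(model.params)
```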

Please Clap: Modeling Applause in Campaign Speeches
Jon Gillick | David Bamman
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

This work examines the rhetorical techniques that speakers employ during political campaigns. We introduce a new corpus of speeches from campaign events in the months leading up to the 2016 U.S. presidential election and develop new models for predicting moments of audience applause. In contrast to existing datasets, we tackle the challenge of working with transcripts that derive from uncorrected closed captioning, using associated audio recordings to automatically extract and align labels for instances of audience applause. In prediction experiments, we find that lexical features carry the most information, but that a variety of features are predictive, including prosody, long-term contextual dependencies, and theoretically motivated features designed to capture rhetorical techniques.
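A hypothetical baseline in the spirit of the lexical-feature result: predict whether a sentence is followed by applause from bag-of-words features with logistic regression. The toy data and variable names below are illustrative only, not the paper's corpus or model.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Illustrative toy data: (sentence, 1 if applause followed, else 0).
sentences = [
    "We are going to win so much you will get tired of winning",
    "Thank you all for being here tonight",
    "We will make America great again",
    "Let me talk a little bit about the economy",
]
labels = [1, 0, 1, 0]

# Unigram and bigram lexical features feeding a linear classifier.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(sentences, labels)

print(model.predict(["We will bring back our jobs"]))
```

A real system along these lines would add the other feature families the paper mentions, such as prosodic features from the audio and longer-range context from preceding sentences.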