Harsh Agrawal


2023

pdf bib
Multimodal Persona Based Generation of Comic Dialogs
Harsh Agrawal | Aditya Mishra | Manish Gupta | Mausam
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We focus on the novel problem of persona based dialogue generation for comic strips. Dialogs in comic strips is a unique and unexplored area where every strip contains utterances from various characters with each one building upon the previous utterances and the associated visual scene. Previous works like DialoGPT, PersonaGPT and other dialog generation models encode two-party dialogues and do not account for the visual information. To the best of our knowledge we are the first to propose the paradigm of multimodal persona based dialogue generation. We contribute a novel dataset, ComSet, consisting of 54K strips, harvested from 13 popular comics available online. Further, we propose a multimodal persona-based architecture, MPDialog, to generate dialogues for the next panel in the strip which decreases the perplexity score by ~10 points over strong dialogue generation baseline models. We demonstrate that there is still ample opportunity for improvement, highlighting the importance of building stronger dialogue systems that are able to generate persona-consistent dialogues and understand the context through various modalities.

2016

pdf bib
Sort Story: Sorting Jumbled Images and Captions into Stories
Harsh Agrawal | Arjun Chandrasekaran | Dhruv Batra | Devi Parikh | Mohit Bansal
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Human Attention in Visual Question Answering: Do Humans and Deep Networks look at the same regions?
Abhishek Das | Harsh Agrawal | Larry Zitnick | Devi Parikh | Dhruv Batra
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing