Compositional Data and Task Augmentation for Instruction Following

Soham Dan, Xinran Han, Dan Roth


Abstract
Executing natural language instructions in a physically grounded domain requires a model that understands both spatial concepts such as “left of” and “above”, and the compositional language used to identify landmarks and articulate instructions relative to them. In this paper, we study instruction understanding in the blocks world domain. Given an initial arrangement of blocks and a natural language instruction, the system executes the instruction by manipulating selected blocks. The highly compositional instructions are composed of atomic components, and understanding these components is a necessary step toward executing the instruction. We show that while end-to-end training (supervised only by the correct block location) fails to address the challenges of this task and performs poorly on instructions involving a single atomic component, knowledge-free auxiliary signals can be used to significantly improve performance by providing supervision for the instruction’s components. Specifically, we generate signals that help the model gradually understand the components of the compositional instructions, as well as signals that help it better understand spatial concepts, and show their benefit to the overall task for two datasets and two state-of-the-art (SOTA) models, especially when the training data is limited, as is common in such tasks.
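As a rough illustration of the task setup described in the abstract (not the authors' model or data format), the sketch below represents a blocks world state as named 2D positions and resolves a single atomic instruction component, a spatial relation such as “left of” applied to a landmark block, into a target location. The block names, the unit offsets, the coordinate convention, and the `execute` helper are all hypothetical.

```python
from typing import Dict, Tuple

Position = Tuple[float, float]

# Hypothetical unit offsets for a few spatial concepts.
# Assumed convention: x grows to the right, y grows upward.
OFFSETS: Dict[str, Position] = {
    "left of": (-1.0, 0.0),
    "right of": (1.0, 0.0),
    "above": (0.0, 1.0),
    "below": (0.0, -1.0),
}


def execute(
    state: Dict[str, Position], source: str, relation: str, landmark: str
) -> Dict[str, Position]:
    """Return a new state in which `source` sits at `relation` of `landmark`."""
    dx, dy = OFFSETS[relation]
    lx, ly = state[landmark]
    new_state = dict(state)  # copy so the initial arrangement is left untouched
    new_state[source] = (lx + dx, ly + dy)
    return new_state


# Toy instruction: "move block A to the left of block B"
state = {"A": (0.0, 0.0), "B": (3.0, 2.0)}
moved = execute(state, source="A", relation="left of", landmark="B")
```

In this toy version the instruction is already parsed into a source block, a relation, and a landmark; the paper's setting is harder precisely because the model must recover those components from free-form language.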
Anthology ID:
2021.findings-emnlp.178
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2021
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
Findings
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
2076–2081
URL:
https://aclanthology.org/2021.findings-emnlp.178
DOI:
10.18653/v1/2021.findings-emnlp.178
Cite (ACL):
Soham Dan, Xinran Han, and Dan Roth. 2021. Compositional Data and Task Augmentation for Instruction Following. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 2076–2081, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Compositional Data and Task Augmentation for Instruction Following (Dan et al., Findings 2021)
PDF:
https://aclanthology.org/2021.findings-emnlp.178.pdf
Video:
https://aclanthology.org/2021.findings-emnlp.178.mp4