CoQAR: Question Rewriting on CoQA

Quentin Brabant; Gwénolé Lecorvé; Lina M. Rojas Barahona

Abstract

Questions asked by humans during a conversation often contain contextual dependencies, i.e., explicit or implicit references to previous dialogue turns. These dependencies take the form of coreferences (e.g., via pronoun use) or ellipses, and can make understanding difficult for automated systems. One way to facilitate the understanding and subsequent processing of a question is to rewrite it into an out-of-context form, i.e., a form that can be understood without the conversational context. We propose CoQAR, a corpus containing 4.5K conversations from the Conversational Question-Answering dataset CoQA, for a total of 53K follow-up question-answer pairs. Each original question was manually annotated with at least 2 and at most 3 out-of-context rewritings. CoQA originally contains 8K conversations, which sum up to 127K question-answer pairs. CoQAR can be used in the supervised learning of three tasks: question paraphrasing, question rewriting and conversational question answering. In order to assess the quality of CoQAR’s rewritings, we conduct several experiments that consist of training and evaluating models for these three tasks. Our results support the idea that question rewriting can be used as a preprocessing step for (conversational and non-conversational) question answering models, thereby improving their performance.
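
The setup described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the CoQAR repository: the `Turn` class, its field names, the `answer_with_rewriting` function, and the toy rewriter and answerer are all hypothetical, and the example sentences are invented for illustration. The sketch only shows the two ideas stated above: each question carries 2 to 3 out-of-context rewritings, and a rewriter can be placed in front of a question answering model as a preprocessing step.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Turn:
    """One conversational turn. Field names are hypothetical, not CoQAR's schema."""
    question: str  # original, context-dependent question
    answer: str
    rewrites: List[str] = field(default_factory=list)  # out-of-context rewritings

    def has_valid_rewrite_count(self) -> bool:
        # The abstract states at least 2 and at most 3 rewritings per question.
        return 2 <= len(self.rewrites) <= 3


def answer_with_rewriting(
    question: str,
    history: List[str],
    passage: str,
    rewrite: Callable[[str, List[str]], str],
    answer: Callable[[str, str], str],
) -> str:
    """Rewrite the question into a standalone form, then answer it without the history."""
    standalone = rewrite(question, history)
    return answer(standalone, passage)


# Toy stand-ins for trained models (assumptions, for illustration only).
def toy_rewrite(question: str, history: List[str]) -> str:
    # Resolve the pronoun by hand; a real rewriter would be a learned model.
    return question.replace("she", "Anna")


def toy_answer(question: str, passage: str) -> str:
    # Answer only if the question is understandable on its own.
    return "to the market" if "Anna" in question else "unknown"


# Invented example turn, not drawn from the corpus.
turn = Turn(
    question="Where did she go?",
    answer="to the market",
    rewrites=["Where did Anna go?", "To which place did Anna go?"],
)

result = answer_with_rewriting(
    turn.question,
    history=["Who is the story about?", "Anna"],
    passage="Anna went to the market.",
    rewrite=toy_rewrite,
    answer=toy_answer,
)
```

Passing the rewriter and answerer as plain callables keeps the sketch independent of any particular model or library; either one could be swapped for a trained system without changing the pipeline function.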

Anthology ID: 2022.lrec-1.13
Volume: Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month: June
Year: 2022
Address: Marseille, France
Editors: Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue: LREC
Publisher: European Language Resources Association
Pages: 119–126
URL: https://aclanthology.org/2022.lrec-1.13
Cite (ACL): Quentin Brabant, Gwénolé Lecorvé, and Lina M. Rojas Barahona. 2022. CoQAR: Question Rewriting on CoQA. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 119–126, Marseille, France. European Language Resources Association.
Cite (Informal): CoQAR: Question Rewriting on CoQA (Brabant et al., LREC 2022)
PDF: https://aclanthology.org/2022.lrec-1.13.pdf
Code: orange-opensource/coqar
Data: CANARD, CoQA, QuAC, SQuAD
