FREDSum: A Dialogue Summarization Corpus for French Political Debates

Virgile Rennard, Guokan Shang, Damien Grari, Julie Hunter, Michalis Vazirgiannis


Abstract
Recent advances in deep learning, and especially the invention of encoder-decoder architectures, have significantly improved the performance of abstractive summarization systems. While the majority of research has focused on written documents, we have observed an increasing interest in the summarization of dialogues and multi-party conversations over the past few years. In this paper, we present a dataset of French political debates for the purpose of enhancing resources for multilingual dialogue summarization. Our dataset consists of manually transcribed and annotated political debates, covering a range of topics and perspectives. We highlight the importance of high-quality transcription and annotations for training accurate and effective dialogue summarization models, and emphasize the need for multilingual resources to support dialogue summarization in non-English languages. We also provide baseline experiments using state-of-the-art methods, and encourage further research in this area to advance the field of dialogue summarization. Our dataset will be made publicly available for use by the research community, enabling further advances in multilingual dialogue summarization.
Anthology ID:
2023.findings-emnlp.280
Original:
2023.findings-emnlp.280v1
Version 2:
2023.findings-emnlp.280v2
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
4241–4253
URL:
https://aclanthology.org/2023.findings-emnlp.280
DOI:
10.18653/v1/2023.findings-emnlp.280
Cite (ACL):
Virgile Rennard, Guokan Shang, Damien Grari, Julie Hunter, and Michalis Vazirgiannis. 2023. FREDSum: A Dialogue Summarization Corpus for French Political Debates. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 4241–4253, Singapore. Association for Computational Linguistics.
Cite (Informal):
FREDSum: A Dialogue Summarization Corpus for French Political Debates (Rennard et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-emnlp.280.pdf
Video:
https://aclanthology.org/2023.findings-emnlp.280.mp4