Comparing Methods for Segmenting Elementary Discourse Units in a French Conversational Corpus

Laurent Prevot, Julie Hunter, Philippe Muller


Abstract
While discourse parsing has made considerable progress in recent years, discourse segmentation of conversational speech remains difficult. In this paper, we use a French data set that has been manually segmented into discourse units to compare two approaches to discourse segmentation: fine-tuning existing systems on the manual segmentation, and using hand-crafted labelling rules to develop a weakly supervised segmenter. Our results show that both approaches yield similar F-scores, while the rule-based data programming approach requires less manual annotation work. In a second experiment, we vary the amount of training data used for fine-tuning and show that a small amount of hand-labelled data is enough to obtain good results, although these are significantly lower than in the first experiment, which uses all the available annotated data.
Anthology ID:
2023.nodalida-1.44
Volume:
Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)
Month:
May
Year:
2023
Address:
Tórshavn, Faroe Islands
Editors:
Tanel Alumäe, Mark Fishel
Venue:
NoDaLiDa
Publisher:
University of Tartu Library
Pages:
436–446
URL:
https://aclanthology.org/2023.nodalida-1.44
Cite (ACL):
Laurent Prevot, Julie Hunter, and Philippe Muller. 2023. Comparing Methods for Segmenting Elementary Discourse Units in a French Conversational Corpus. In Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), pages 436–446, Tórshavn, Faroe Islands. University of Tartu Library.
Cite (Informal):
Comparing Methods for Segmenting Elementary Discourse Units in a French Conversational Corpus (Prevot et al., NoDaLiDa 2023)
PDF:
https://aclanthology.org/2023.nodalida-1.44.pdf