Discourse Sense Flows: Modelling the Rhetorical Style of Documents across Various Domains

Rene Knaebel, Manfred Stede


Abstract
Recent research on shallow discourse parsing has given renewed attention to the role of discourse relation signals, in particular explicit connectives and so-called alternative lexicalizations. In our work, we first develop new models for extracting signals and classifying their senses, both for explicit connectives and alternative lexicalizations, based on the Penn Discourse Treebank v3 corpus. Thereafter, we apply these models to various raw corpora, and we introduce ‘discourse sense flows’, a new way of modeling the rhetorical style of a document by the linear order of coherence relations, as captured by the PDTB senses. The corpora span several genres and domains, and we undertake comparative analyses of the sense flows, as well as experiments on automatic genre/domain discrimination using discourse sense flow patterns as features. We find that n-gram patterns are indeed stronger predictors than simple sense (unigram) distributions.
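The contrast the abstract draws between unigram sense distributions and n-gram flow patterns can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sense_ngrams` and the example flow are hypothetical, and the labels are the four top-level PDTB sense classes, used here only for illustration. The sketch assumes a document has already been reduced to the ordered list of senses of its detected signals.

```python
from collections import Counter
from typing import Sequence


def sense_ngrams(flow: Sequence[str], n: int) -> Counter:
    """Count contiguous n-grams of sense labels in one document's flow.

    Hypothetical helper: `flow` is assumed to hold the sense labels in the
    order in which their signals occur in the document.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # Slide a window of width n over the flow; flows shorter than n yield nothing.
    return Counter(tuple(flow[i:i + n]) for i in range(len(flow) - n + 1))


# Made-up example flow (not taken from any corpus in the paper).
flow = ["Contingency", "Comparison", "Expansion", "Comparison", "Expansion"]

unigrams = sense_ngrams(flow, 1)  # order-free sense distribution
bigrams = sense_ngrams(flow, 2)   # order-sensitive flow patterns

for pattern, count in bigrams.most_common():
    print(" -> ".join(pattern), count)
```

Counts of this kind could serve as features for a genre or domain classifier. The unigram counts discard ordering, whereas the bigram counts keep local ordering, which is the information the abstract reports as the stronger predictor.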
Anthology ID:
2023.findings-emnlp.964
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
14462–14482
URL:
https://aclanthology.org/2023.findings-emnlp.964
DOI:
10.18653/v1/2023.findings-emnlp.964
Cite (ACL):
Rene Knaebel and Manfred Stede. 2023. Discourse Sense Flows: Modelling the Rhetorical Style of Documents across Various Domains. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 14462–14482, Singapore. Association for Computational Linguistics.
Cite (Informal):
Discourse Sense Flows: Modelling the Rhetorical Style of Documents across Various Domains (Knaebel & Stede, Findings 2023)
PDF:
https://aclanthology.org/2023.findings-emnlp.964.pdf