Multi-doc Hybrid Summarization via Salient Representation Learning

Min Xiao


Abstract
Multi-document summarization has been gaining increasing attention and serves as an invaluable tool for obtaining key facts from a large pool of information. In this paper, we propose a multi-document hybrid summarization approach that simultaneously generates a human-readable summary and extracts the corresponding key evidence from multi-document inputs. To that end, we craft a salient representation learning method that induces latent salient features, which are effective for joint evidence extraction and summary generation. To train the model, we use multi-task learning to optimize a composite loss, constructed hierarchically over extractive and abstractive sub-components. We implement the system on a ubiquitously adopted transformer architecture and conduct experiments on multiple datasets across two domains, achieving superior performance over the baselines.
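The abstract does not spell out the form of the composite loss, so the following is only a minimal sketch of one common way to combine an extractive and an abstractive objective in multi-task training. The weighted sum, the weight `alpha`, and all function names are assumptions made for illustration; they are not taken from the paper, and the hierarchical construction the abstract mentions is not modeled here.

```python
import math

EPS = 1e-12  # guards against log(0); an illustrative choice


def extractive_loss(salience_probs, labels):
    """Mean binary cross-entropy over per-sentence salience probabilities.

    salience_probs: predicted probability that each sentence is key evidence.
    labels: 1 if the sentence is labeled as evidence, else 0.
    """
    total = 0.0
    for p, y in zip(salience_probs, labels):
        total -= y * math.log(p + EPS) + (1 - y) * math.log(1 - p + EPS)
    return total / len(salience_probs)


def abstractive_loss(reference_token_probs):
    """Mean negative log-likelihood of the reference summary tokens.

    reference_token_probs: probability the decoder assigned to each
    reference token.
    """
    nll = -sum(math.log(p + EPS) for p in reference_token_probs)
    return nll / len(reference_token_probs)


def composite_loss(salience_probs, labels, reference_token_probs, alpha=0.5):
    """Hypothetical composite objective: a convex combination of both losses."""
    ext = extractive_loss(salience_probs, labels)
    abs_ = abstractive_loss(reference_token_probs)
    return alpha * ext + (1 - alpha) * abs_


if __name__ == "__main__":
    # Toy values: three candidate sentences and a four-token reference summary.
    loss = composite_loss([0.9, 0.2, 0.7], [1, 0, 1], [0.8, 0.6, 0.9, 0.5])
    print(f"composite loss: {loss:.4f}")
```

In this sketch, `alpha` trades off evidence extraction against summary generation; setting it to 1 or 0 recovers the single-task extractive or abstractive objective, respectively.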
Anthology ID:
2023.acl-industry.37
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Sunayana Sitaram, Beata Beigman Klebanov, Jason D Williams
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
379–389
URL:
https://aclanthology.org/2023.acl-industry.37
DOI:
10.18653/v1/2023.acl-industry.37
Cite (ACL):
Min Xiao. 2023. Multi-doc Hybrid Summarization via Salient Representation Learning. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track), pages 379–389, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Multi-doc Hybrid Summarization via Salient Representation Learning (Xiao, ACL 2023)
PDF:
https://aclanthology.org/2023.acl-industry.37.pdf