Reading between the Lines: Information Extraction from Industry Requirements

Ole Magnus Holter, Basil Ell


Abstract
Industry requirements describe the qualities that a project or a service must provide. Most requirements are, however, only available in natural language format and are embedded in textual documents. To be machine-understandable, a requirement needs to be represented in a logical format. We consider that a requirement consists of a scope, which is the requirement’s subject matter, a condition, which is any condition that must be fulfilled for the requirement to be relevant, and a demand, which is what is required. We introduce a novel task, the identification of the semantic components scope, condition, and demand in a requirement sentence, and establish baselines using sequence labelling and few-shot learning. One major challenge with this task is the implicit nature of the scope, often not stated in the sentence. By including document context information, we improved the average performance for scope detection. Our study provides insights into the difficulty of machine understanding of industry requirements and suggests strategies for addressing this challenge.
Anthology ID:
2023.ranlp-1.76
Volume:
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing
Month:
September
Year:
2023
Address:
Varna, Bulgaria
Editors:
Ruslan Mitkov, Galia Angelova
Venue:
RANLP
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
703–711
URL:
https://aclanthology.org/2023.ranlp-1.76
Cite (ACL):
Ole Magnus Holter and Basil Ell. 2023. Reading between the Lines: Information Extraction from Industry Requirements. In Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing, pages 703–711, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
Reading between the Lines: Information Extraction from Industry Requirements (Holter & Ell, RANLP 2023)
PDF:
https://aclanthology.org/2023.ranlp-1.76.pdf