The Financial Document Structure Extraction Shared Task (FinTOC 2022)

Juyeon Kang, Abderrahim Ait Azzi, Sandra Bellato, Blanca Carbajo Coronado, Mahmoud El-Haj, Ismail El Maarouf, Mei Gan, Ana Gisbert, Antonio Moreno Sandoval


Abstract
This paper describes the FinTOC-2022 Shared Task on structure extraction from financial documents, its participants' results, and their findings. The shared task was organized as part of the 4th Financial Narrative Processing Workshop (FNP 2022), held jointly with the 13th Edition of the Language Resources and Evaluation Conference (LREC 2022), Marseille, France (El-Haj et al., 2022). It aimed to stimulate research on systems that extract a table of contents (TOC) from investment documents (such as financial prospectuses) by detecting the document titles and organizing them hierarchically into a TOC. For the fourth edition of this shared task, three subtasks were presented to the participants: one with English documents, one with French documents, and one with Spanish documents. This year, we proposed a revised dataset for English and French that differs from those of the previous editions of FinTOC, and added a new dataset for Spanish documents. The task attracted 6 submissions for each language from 4 teams. The most successful methods use textual, structural, and visual features extracted from the documents and propose classification models for detecting titles and TOCs in all of the subtasks.
Anthology ID:
2022.fnp-1.12
Volume:
Proceedings of the 4th Financial Narrative Processing Workshop @LREC2022
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Mahmoud El-Haj, Paul Rayson, Nadhem Zmandar
Venue:
FNP
Publisher:
European Language Resources Association
Pages:
83–88
URL:
https://aclanthology.org/2022.fnp-1.12
Cite (ACL):
Juyeon Kang, Abderrahim Ait Azzi, Sandra Bellato, Blanca Carbajo Coronado, Mahmoud El-Haj, Ismail El Maarouf, Mei Gan, Ana Gisbert, and Antonio Moreno Sandoval. 2022. The Financial Document Structure Extraction Shared Task (FinTOC 2022). In Proceedings of the 4th Financial Narrative Processing Workshop @LREC2022, pages 83–88, Marseille, France. European Language Resources Association.
Cite (Informal):
The Financial Document Structure Extraction Shared Task (FinTOC 2022) (Kang et al., FNP 2022)
PDF:
https://aclanthology.org/2022.fnp-1.12.pdf