Ekaterina Borisova


2024

pdf bib
Surveying the FAIRness of Annotation Tools: Difficult to find, difficult to reuse
Ekaterina Borisova | Raia Abu Ahmad | Leyla Garcia-Castro | Ricardo Usbeck | Georg Rehm
Proceedings of The 18th Linguistic Annotation Workshop (LAW-XVIII)

In the realm of Machine Learning and Deep Learning, there is a need for high-quality annotated data to train and evaluate supervised models. An extensive number of annotation tools have been developed to facilitate the data labelling process. However, finding the right tool is a demanding task involving thorough searching and testing. Hence, to effectively navigate the multitude of tools, it becomes essential to ensure their findability, accessibility, interoperability, and reusability (FAIR). This survey addresses the FAIRness of existing annotation software by evaluating 50 different tools against the FAIR principles for research software (FAIR4RS). The study indicates that while being accessible and interoperable, annotation tools are difficult to find and reuse. In addition, there is a need to establish community standards for annotation software development, documentation, and distribution.

2023

pdf bib
RCLN at SemEval-2023 Task 1: Leveraging Stable Diffusion and Image Captions for Visual WSD
Antonina Mijatovic | Davide Buscaldi | Ekaterina Borisova
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This paper describes the participation of the RCLN team at the Visual Word Sense Disambiguation task at SemEval 2023. The participation was focused on the use of CLIP as a base model for the matching between text and images with additional information coming from captions generated from images and the generation of images from the prompt text using Stable Diffusion. The results we obtained are not particularly good, but interestingly enough, we were able to improve over the CLIP baseline in Italian by recurring simply to the generated images.