Fernando Sánchez Vega

Also published as: Fernando Sanchez-Vega, Fernando Sánchez-Vega


2023

Walter Burns at SemEval-2023 Task 5: NLP-CIMAT - Leveraging Model Ensembles for Clickbait Spoiling
Emilio Villa Cueva | Daniel Vallejo Aldana | Fernando Sánchez Vega | Adrián Pastor López Monroy
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This paper describes our participation in the Clickbait challenge at SemEval 2023. In this work, we address the Clickbait classification task using transformer models in an ensemble configuration. We tackle the Spoiler Generation task using a two-level ensemble strategy of models trained for extractive QA, selecting the best K candidates for multi-part spoilers. In the test partitions, our approaches obtained an accuracy of 0.716 for classification and a BLEU-4 score of 0.439 for spoiler generation.

Dynamic Regularization in UDA for Transformers in Multimodal Classification
Ivonne Monter-Aldana | Adrian Pastor Lopez Monroy | Fernando Sanchez-Vega
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Multimodal machine learning is a cutting-edge field that explores ways to incorporate information from multiple sources into models. As more multimodal data becomes available, this field has become increasingly relevant. This work focuses on two key challenges in multimodal machine learning. The first is finding efficient ways to combine information from different data types. The second is that often, one modality (e.g., text) is stronger and more relevant, making it difficult to identify meaningful patterns in the weaker modality (e.g., image). Our approach focuses on more effectively exploiting the weaker modality while dynamically regularizing the loss function. First, we introduce a new two-stream model called Multimodal BERT-ViT, which features a novel intra-CLS token fusion. Second, we devise a dynamic adjustment that maintains a balance between specialization and generalization during training to avoid overfitting. We add this dynamic adjustment to the Unsupervised Data Augmentation (UDA) framework. We evaluate the effectiveness of these proposals on the task of multi-label movie genre classification using the Moviescope and MM-IMDb datasets. The evaluation revealed that our proposal offers substantial benefits, while simultaneously enabling us to harness the weaker modality without compromising the information provided by the stronger one.

CIMAT-NLP@LT-EDI-2023: Finegrain Depression Detection by Multiple Binary Problems Approach
María de Jesús García Santiago | Fernando Sánchez Vega | Adrián Pastor López Monroy
Proceedings of the Third Workshop on Language Technology for Equality, Diversity and Inclusion

This paper describes the work of the CIMAT-NLP team on the shared task of Detecting Signs of Depression from Social Media Text at LT-EDI@RANLP 2023, which consists of classifying social media text into three levels: “not depression”, “moderate” depression, and “severe” depression. We proposed two approaches: (1) a transformer model that can handle long texts without truncation, and (2) an ensemble of six binary Bag of Words models. Our team placed fourth in the competition and found that models trained with our approaches could place second.

2018

INAOE-UPV at SemEval-2018 Task 3: An Ensemble Approach for Irony Detection in Twitter
Delia Irazú Hernández Farías | Fernando Sánchez-Vega | Manuel Montes-y-Gómez | Paolo Rosso
Proceedings of the 12th International Workshop on Semantic Evaluation

This paper describes an ensemble approach to SemEval-2018 Task 3. The proposed method combines two renowned text classification methods with a novel approach for capturing ironic content by exploiting a tailored lexicon for irony detection. We experimented with different ensemble settings. The obtained results show that our method performs well at detecting the presence of ironic content in Twitter.

2013

INAOE_UPV-CORE: Extracting Word Associations from Document Corpora to estimate Semantic Textual Similarity
Fernando Sánchez-Vega | Manuel Montes-y-Gómez | Paolo Rosso | Luis Villaseñor-Pineda
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 1: Proceedings of the Main Conference and the Shared Task: Semantic Textual Similarity