Matheus Camasmie Pavan


2023

BERTabaporu: Assessing a Genre-Specific Language Model for Portuguese NLP
Pablo Botton Costa | Matheus Camasmie Pavan | Wesley Ramos Santos | Samuel Caetano Silva | Ivandré Paraboni
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing

Transformer-based language models such as Bidirectional Encoder Representations from Transformers (BERT) are now mainstream in the NLP field, but extensions to languages other than English, to new domains and/or to more specific text genres are still in demand. In this paper we introduce BERTabaporu, a BERT language model that has been pre-trained on Twitter data in the Brazilian Portuguese language. The model is shown to outperform the best-known general-purpose model for this language in three Twitter-related NLP tasks, making it a potentially useful resource for Portuguese NLP in general.

Stance Prediction from Multimodal Social Media Data
Lais Carraro Leme Cavalheiro | Matheus Camasmie Pavan | Ivandré Paraboni
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing

Stance prediction - the computational task of inferring attitudes towards a given target topic of interest - relies heavily on text data provided by social media or similar sources, but it may also benefit from non-text information such as demographics (e.g., users’ gender, age, etc.), network structure (e.g., friends, followers, etc.), interactions (e.g., mentions, replies, etc.) and other non-text properties (e.g., time information, etc.). However, so-called hybrid (or in some cases multimodal) approaches to stance prediction have only been developed for a small set of target languages, and often make use of count-based text models (e.g., bag-of-words) and time-honoured classification methods (e.g., support vector machines). As a means to further research in the field, in this work we introduce a number of text- and non-text models for stance prediction in the Portuguese language, which make use of more recent methods based on BERT and an ensemble architecture, and ask whether a BERT stance classifier may be enhanced with different kinds of network-related information.