Andriy Kosar


2023

Advancing Topical Text Classification: A Novel Distance-Based Method with Contextual Embeddings
Andriy Kosar | Guy De Pauw | Walter Daelemans
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing

This study introduces a new method for unsupervised, distance-based topical text classification using contextual embeddings. The method adapts sentence embeddings to this task by leveraging the semantic similarity between topic labels and text content and by reinforcing the relationship between them in a shared semantic space. On average, the proposed method outperforms a wide range of existing sentence embeddings by 35%. As an alternative to the commonly used transformer-based, general-purpose zero-shot classifiers for multiclass text classification, it offers significant advantages in computational efficiency and flexibility while maintaining comparable or improved classification results.
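
The general idea of distance-based classification described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `embed` function below is a toy bag-of-words stand-in for a contextual sentence encoder, and the names `embed`, `cosine`, `classify`, and the label descriptions are hypothetical, chosen only for illustration. The sketch places texts and topic labels in the same vector space and assigns each text the label with the highest cosine similarity; it does not attempt the tailoring step that reinforces the label-text relationship.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence encoder: a bag-of-words count vector.

    Assumption: a real system would use contextual sentence embeddings here.
    """
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    norm = norm_u * norm_v
    return dot / norm if norm else 0.0


def classify(text, label_descriptions):
    """Return the label whose description is most similar to the text."""
    doc_vec = embed(text)
    return max(
        label_descriptions,
        key=lambda label: cosine(doc_vec, embed(label_descriptions[label])),
    )


# Hypothetical topic labels, each expanded into a short description.
labels = {
    "sports": "sports football match team player score",
    "finance": "finance bank market stock interest rate",
}

print(classify("the team won the football match", labels))
```

Because no labeled training data is involved, adding or changing a topic only requires editing the `labels` dictionary, which is the kind of flexibility the abstract attributes to distance-based approaches.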