Investigating the Effectiveness of Graph-based Algorithm for Bangla Text Classification

Farhan Dehan; Md Fahim; Amin Ahsan Ali; M Ashraful Amin; Akmmahbubur Rahman

doi:10.18653/v1/2023.banglalp-1.12

Abstract

In this study, we examine and analyze the behavior of several graph-based models for Bangla text classification tasks. Graph-based algorithms create heterogeneous graphs from text data. Each node represents either a word or a document, and each edge indicates a relationship between two words or between a word and a document. We applied the BERT model and several graph-based models, namely TextGCN, GAT, BertGAT, and BertGCN, to five Bangla text datasets: SentNoB, Sarcasm detection, BanFakeNews, Hate speech detection, and Emotion detection. BERT outperformed the TextGCN and GAT models by a large margin in terms of accuracy, macro F1 score, and weighted F1 score. BertGCN and BertGAT are shown to outperform both the standalone graph models and the BERT model. BertGAT excelled on the Emotion detection dataset and achieved a 1%-2% performance boost over BERT on the Sarcasm detection, Hate speech detection, and BanFakeNews datasets. BertGCN, in turn, outperformed BertGAT by 1% on the SentNoB and BanFakeNews datasets and by 2% on the Sarcasm detection, Hate speech detection, and Emotion detection datasets. We also examined different variations in graph structure and analyzed their effects.
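The heterogeneous graph described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `build_text_graph`, the whitespace tokenization, the window size, and the edge weights (raw term frequency for word-document edges, sliding-window co-occurrence counts for word-word edges) are all assumptions chosen for illustration, and the two Bangla sentences are made-up examples rather than dataset samples.

```python
from collections import Counter
from itertools import combinations


def build_text_graph(docs, window=3):
    """Build a word/document graph as a dictionary of weighted edges.

    Nodes are tuples: ("doc", index) or ("word", token).
    Edge weights are illustrative assumptions, not the paper's scheme:
      - word-document edges: raw term frequency of the word in the document
      - word-word edges: number of sliding windows in which both words occur
    """
    edges = Counter()
    for i, text in enumerate(docs):
        tokens = text.split()  # assumption: plain whitespace tokenization

        # One edge between the document node and each word it contains.
        for word, tf in Counter(tokens).items():
            edges[(("doc", i), ("word", word))] += tf

        # Slide a fixed-size window over the tokens; documents shorter
        # than the window are treated as a single window.
        for start in range(max(1, len(tokens) - window + 1)):
            span = sorted(set(tokens[start:start + window]))
            for a, b in combinations(span, 2):
                edges[(("word", a), ("word", b))] += 1
    return edges


# Hypothetical two-document corpus.
docs = ["ভালো খবর", "খারাপ খবর"]
graph = build_text_graph(docs)
for (source, target), weight in graph.items():
    print(source, target, weight)
```

In a full pipeline, an edge dictionary like this would typically be converted to a sparse adjacency matrix before being passed to a graph neural network layer.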

Anthology ID: 2023.banglalp-1.12
Volume: Proceedings of the First Workshop on Bangla Language Processing (BLP-2023)
Month: December
Year: 2023
Address: Singapore
Editors: Firoj Alam, Sudipta Kar, Shammur Absar Chowdhury, Farig Sadeque, Ruhul Amin
Venue: BanglaLP
Publisher: Association for Computational Linguistics
Pages: 104–116
URL: https://aclanthology.org/2023.banglalp-1.12
DOI: 10.18653/v1/2023.banglalp-1.12
Cite (ACL): Farhan Dehan, Md Fahim, Amin Ahsan Ali, M Ashraful Amin, and Akmmahbubur Rahman. 2023. Investigating the Effectiveness of Graph-based Algorithm for Bangla Text Classification. In Proceedings of the First Workshop on Bangla Language Processing (BLP-2023), pages 104–116, Singapore. Association for Computational Linguistics.
Cite (Informal): Investigating the Effectiveness of Graph-based Algorithm for Bangla Text Classification (Dehan et al., BanglaLP 2023)
PDF: https://aclanthology.org/2023.banglalp-1.12.pdf
