Automatic Detection of Machine-Generated Text Using Pre-Trained Language Models

Yunhao Fang


Abstract
In this paper, I describe my approach to the ALTA 2023 shared task, whose objective is to build an automatic detection system that distinguishes human-authored text from text generated by Large Language Models. By fine-tuning several pre-trained language models and combining them in a multi-model ensemble, the system achieved second place on the test set leaderboard in the competition.
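The abstract does not say how the fine-tuned models are combined, so the following is only a minimal sketch of one common ensembling scheme, soft voting, in which the per-model probabilities are averaged before thresholding. The function name `soft_vote`, the input layout, and the 0.5 threshold are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def soft_vote(prob_machine: Sequence[Sequence[float]],
              threshold: float = 0.5) -> List[int]:
    """Combine several detectors by averaging their probabilities.

    prob_machine[m][i] is the probability, according to model m, that
    text i is machine-generated (hypothetical inputs, e.g. the softmax
    outputs of separately fine-tuned classifiers).
    Returns 1 for machine-generated and 0 for human-authored.
    """
    if not prob_machine:
        raise ValueError("need at least one model")
    n_texts = len(prob_machine[0])
    if any(len(p) != n_texts for p in prob_machine):
        raise ValueError("every model must score the same texts")

    labels = []
    for i in range(n_texts):
        # Unweighted mean over models; weighting is a possible variant.
        mean = sum(p[i] for p in prob_machine) / len(prob_machine)
        labels.append(1 if mean >= threshold else 0)
    return labels


# Three hypothetical models scoring three texts.
scores = [
    [0.9, 0.2, 0.6],
    [0.8, 0.4, 0.3],
    [0.7, 0.1, 0.5],
]
print(soft_vote(scores))
```

Averaging probabilities rather than hard labels lets a confident model outweigh two uncertain ones, which is one reason soft voting is often tried first. Whether the system described here uses it, majority voting, or a learned combiner is not stated in the abstract.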
Anthology ID:
2023.alta-1.19
Volume:
Proceedings of the 21st Annual Workshop of the Australasian Language Technology Association
Month:
November
Year:
2023
Address:
Melbourne, Australia
Editors:
Smaranda Muresan, Vivian Chen, Casey Kennington, David Vandyke, Nina Dethlefs, Koji Inoue, Erik Ekstedt, Stefan Ultes
Venue:
ALTA
Publisher:
Association for Computational Linguistics
Pages:
159–163
URL:
https://aclanthology.org/2023.alta-1.19
Cite (ACL):
Yunhao Fang. 2023. Automatic Detection of Machine-Generated Text Using Pre-Trained Language Models. In Proceedings of the 21st Annual Workshop of the Australasian Language Technology Association, pages 159–163, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
Automatic Detection of Machine-Generated Text Using Pre-Trained Language Models (Fang, ALTA 2023)
PDF:
https://aclanthology.org/2023.alta-1.19.pdf