G-Tuning: Improving Generalization of Pre-trained Language Models with Generative Adversarial Network

Rongxiang Weng; Wen Sen Cheng; Min Zhang

doi:10.18653/v1/2023.findings-acl.291

Abstract

The generalization ability of pre-trained language models (PLMs) in downstream tasks is heavily influenced by fine-tuning. The objective of fine-tuning is to transform the latent representation of PLMs from a universal space to a target space, so that the model can be applied to downstream tasks and generalize to unseen samples. However, the effectiveness of PLMs diminishes when training data coverage is insufficient, in which case fine-tuning cannot learn the complete mapping. In this study, we propose a new fine-tuning framework, referred to as G-Tuning, that aims to preserve the generalization ability of PLMs in downstream tasks. Specifically, we integrate a generative adversarial network into the fine-tuning process to aid the transformation of the latent representation over the entire space. Empirical evaluations on the GLUE benchmark, as well as on two additional demanding scenarios involving domain and language generalization, demonstrate that G-Tuning can accurately map the universal representation to the target space, thus effectively enhancing the generalization performance of PLMs across various downstream tasks.

Anthology ID: 2023.findings-acl.291
Volume: Findings of the Association for Computational Linguistics: ACL 2023
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 4747–4755
URL: https://aclanthology.org/2023.findings-acl.291
DOI: 10.18653/v1/2023.findings-acl.291
Cite (ACL): Rongxiang Weng, Wen Sen Cheng, and Min Zhang. 2023. G-Tuning: Improving Generalization of Pre-trained Language Models with Generative Adversarial Network. In Findings of the Association for Computational Linguistics: ACL 2023, pages 4747–4755, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): G-Tuning: Improving Generalization of Pre-trained Language Models with Generative Adversarial Network (Weng et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-acl.291.pdf
