Overcoming Language Priors with Counterfactual Inference for Visual Question Answering

Ren Zhibo, Wang Huizhen, Zhu Muhua, Wang Yichao, Xiao Tong, Zhu Jingbo


Abstract
“Recent years have seen a lot of efforts in attacking the issue of language priors in the field of Visual Question Answering (VQA). Among the extensive efforts, causal inference is regarded as a promising direction to mitigate language bias by weakening the direct causal effect of questions on answers. In this paper, we follow the same direction and attack the issue of language priors by incorporating counterfactual data. Moreover, we propose a two-stage training strategy which is deemed to make better use of counterfactual data. Experiments on the widely used benchmark VQA-CP v2 demonstrate the effectiveness of the proposed approach, which improves the baseline by 21.21% and outperforms most of the previous systems.”
Anthology ID:
2023.ccl-1.52
Volume:
Proceedings of the 22nd Chinese National Conference on Computational Linguistics
Month:
August
Year:
2023
Address:
Harbin, China
Editors:
Maosong Sun, Bing Qin, Xipeng Qiu, Jing Jiang, Xianpei Han
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
600–610
Language:
English
URL:
https://aclanthology.org/2023.ccl-1.52
Cite (ACL):
Ren Zhibo, Wang Huizhen, Zhu Muhua, Wang Yichao, Xiao Tong, and Zhu Jingbo. 2023. Overcoming Language Priors with Counterfactual Inference for Visual Question Answering. In Proceedings of the 22nd Chinese National Conference on Computational Linguistics, pages 600–610, Harbin, China. Chinese Information Processing Society of China.
Cite (Informal):
Overcoming Language Priors with Counterfactual Inference for Visual Question Answering (Zhibo et al., CCL 2023)
PDF:
https://aclanthology.org/2023.ccl-1.52.pdf