Log-linear Guardedness and its Implications

Shauli Ravfogel, Yoav Goldberg, Ryan Cotterell


Abstract
Methods for erasing human-interpretable concepts from neural representations that assume linearity have been found to be tractable and useful. However, the impact of this removal on the behavior of downstream classifiers trained on the modified representations is not fully understood. In this work, we formally define the notion of linear guardedness as the inability of an adversary to predict the concept directly from the representation, and study its implications. We show that, in the binary case, under certain assumptions, a downstream log-linear model cannot recover the erased concept. However, we constructively demonstrate that a multiclass log-linear model can be constructed that indirectly recovers the concept in some cases, pointing to the inherent limitations of linear guardedness as a downstream bias mitigation technique. These findings shed light on the theoretical limitations of linear erasure methods and highlight the need for further research on the connections between intrinsic and extrinsic bias in neural models.
Anthology ID:
2023.acl-long.523
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
9413–9431
URL:
https://aclanthology.org/2023.acl-long.523
DOI:
10.18653/v1/2023.acl-long.523
Cite (ACL):
Shauli Ravfogel, Yoav Goldberg, and Ryan Cotterell. 2023. Log-linear Guardedness and its Implications. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 9413–9431, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Log-linear Guardedness and its Implications (Ravfogel et al., ACL 2023)
PDF:
https://aclanthology.org/2023.acl-long.523.pdf
Video:
https://aclanthology.org/2023.acl-long.523.mp4