Unsupervised Methods for Domain Specific Ambiguity Detection. The Case of German Physics Language

Vitor Fontanella, Christian Wartena, Gunnar Friege


Abstract
Many terms used in physics have a different meaning or usage pattern in general language, which constitutes a learning barrier in physics teaching. The systematic identification of such terms is considered useful for science education as well as for terminology extraction. This article compares three methods based on vector semantics, along with a simple frequency-based baseline, for automatically identifying general-language terms that have a domain-specific use in physics. For evaluation, we use ambiguity scores from a survey among physicists and data on the number of term senses from Wiktionary. We show that the so-called Vector Initialization method obtains the best results.
Anthology ID:
2023.iwcs-1.26
Volume:
Proceedings of the 15th International Conference on Computational Semantics
Month:
June
Year:
2023
Address:
Nancy, France
Editors:
Maxime Amblard, Ellen Breitholtz
Venue:
IWCS
SIG:
SIGSEM
Publisher:
Association for Computational Linguistics
Pages:
252–257
URL:
https://aclanthology.org/2023.iwcs-1.26
Cite (ACL):
Vitor Fontanella, Christian Wartena, and Gunnar Friege. 2023. Unsupervised Methods for Domain Specific Ambiguity Detection. The Case of German Physics Language. In Proceedings of the 15th International Conference on Computational Semantics, pages 252–257, Nancy, France. Association for Computational Linguistics.
Cite (Informal):
Unsupervised Methods for Domain Specific Ambiguity Detection. The Case of German Physics Language (Fontanella et al., IWCS 2023)
PDF:
https://aclanthology.org/2023.iwcs-1.26.pdf