Steffen Staab


2023

Knowledge Graph Embeddings using Neural Ito Process: From Multiple Walks to Stochastic Trajectories
Mojtaba Nayyeri | Bo Xiong | Majid Mohammadi | Mst. Mahfuja Akter | Mirza Mohtashim Alam | Jens Lehmann | Steffen Staab
Findings of the Association for Computational Linguistics: ACL 2023

Knowledge graphs mostly exhibit a mixture of branching relations, e.g., hasFriend, and complex structures, e.g., hierarchy and loop. Most knowledge graph embeddings have problems expressing them, because they model a specific relation r from a head h to tails by starting at the node embedding of h and transitioning deterministically to exactly one other point in the embedding space. We overcome this issue in our novel framework ItôE by modeling relations between nodes by relation-specific, stochastic transitions. Our framework is based on stochastic Itô processes, which operate on low-dimensional manifolds. ItôE is highly expressive and generic, subsuming various state-of-the-art models operating on different, also non-Euclidean, manifolds. Experimental results show the superiority of ItôE over other deterministic embedding models with regard to the KG completion task.
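
A minimal sketch of the idea of relation-specific, stochastic transitions, not the paper's actual ItôE parameterization: a transition from a head embedding can be simulated with Euler-Maruyama steps of an SDE, and sampling several trajectories lets one head reach several plausible tails (branching). All names below (drift_w, drift_b, diffusion, the min-distance score) are illustrative assumptions.

    import numpy as np

    def ito_transition(h, drift_w, drift_b, diffusion, n_steps=10, dt=0.1, rng=None):
        # Euler-Maruyama simulation of dx = f_r(x) dt + g_r dW for a relation r
        rng = np.random.default_rng() if rng is None else rng
        x = h.copy()
        for _ in range(n_steps):
            drift = drift_w @ x + drift_b                   # relation-specific drift f_r(x)
            noise = rng.normal(size=x.shape) * np.sqrt(dt)  # Brownian increment dW
            x = x + drift * dt + diffusion * noise
        return x

    def score(h, t, drift_w, drift_b, diffusion, n_samples=50):
        # Score a (h, r, t) triple by how close any sampled trajectory endpoint lands to t;
        # stochasticity allows one head to be compatible with several tails.
        ends = np.stack([ito_transition(h, drift_w, drift_b, diffusion) for _ in range(n_samples)])
        return -np.min(np.linalg.norm(ends - t, axis=-1))

    d = 8
    rng = np.random.default_rng(0)
    h, t = rng.normal(size=d), rng.normal(size=d)
    W, b = 0.1 * rng.normal(size=(d, d)), np.zeros(d)
    print(score(h, t, W, b, diffusion=0.05))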

Shrinking Embeddings for Hyper-Relational Knowledge Graphs
Bo Xiong | Mojtaba Nayyeri | Shirui Pan | Steffen Staab
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Link prediction on knowledge graphs (KGs) has been extensively studied on binary relational KGs, wherein each fact is represented by a triple. A significant amount of important knowledge, however, is represented by hyper-relational facts, where each fact is composed of a primal triple and a set of qualifiers, each a key-value pair that allows for expressing more complicated semantics. Although some recent works have proposed to embed hyper-relational KGs, these methods fail to capture essential inference patterns of hyper-relational facts such as qualifier monotonicity, qualifier implication, and qualifier mutual exclusion, limiting their generalization capability. To address this, we present ShrinkE, a geometric hyper-relational KG embedding method that explicitly models these patterns. ShrinkE models the primal triple as a spatial-functional transformation from the head into a relation-specific box. Each qualifier “shrinks” the box to narrow down the possible answer set and thus realizes qualifier monotonicity. The spatial relationships between the qualifier boxes allow for modeling core inference patterns of qualifiers such as implication and mutual exclusion. Experimental results demonstrate ShrinkE’s superiority on three benchmarks of hyper-relational KGs.
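
A minimal sketch of the box-shrinking intuition, assuming a simple axis-aligned box representation rather than ShrinkE's actual spatial-functional transformation: each qualifier intersects the current box with its own box, so the answer set can only narrow (qualifier monotonicity), and a candidate tail is scored by its distance to the final box. The helper names and toy boxes below are illustrative.

    import numpy as np

    def shrink(box_min, box_max, q_min, q_max):
        # Intersect the current answer box with a qualifier box; the result can only
        # get smaller, which realizes qualifier monotonicity in this toy setting.
        return np.maximum(box_min, q_min), np.minimum(box_max, q_max)

    def point_in_box_score(t, box_min, box_max):
        # Negative distance of candidate tail embedding t to the box; 0 if t lies inside.
        outside = np.maximum(0.0, np.maximum(box_min - t, t - box_max))
        return -np.linalg.norm(outside)

    d = 4
    rng = np.random.default_rng(1)
    box_min, box_max = -np.ones(d), np.ones(d)      # box produced by the primal triple (toy values)
    box_min, box_max = shrink(box_min, box_max,     # one qualifier narrows the answer set
                              q_min=-0.5 * np.ones(d), q_max=np.ones(d))
    t = rng.normal(size=d)
    print(point_in_box_score(t, box_min, box_max))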

2020

Is Language Modeling Enough? Evaluating Effective Embedding Combinations
Rudolf Schneider | Tom Oberhauser | Paul Grundmann | Felix Alexander Gers | Alexander Loeser | Steffen Staab
Proceedings of the Twelfth Language Resources and Evaluation Conference

Universal embeddings, such as BERT or ELMo, are useful for a broad set of natural language processing tasks like text classification or sentiment analysis. Moreover, specialized embeddings also exist for tasks like topic modeling or named entity disambiguation. We study whether we can complement these universal embeddings with specialized embeddings. We conduct an in-depth evaluation of nine well-known natural language understanding tasks with SentEval. We also extend SentEval with two additional tasks from the medical domain. We present PubMedSection, a novel topic classification dataset focused on the biomedical domain. Our comprehensive analysis covers 11 tasks and combinations of six embeddings. We report that combined embeddings outperform state-of-the-art universal embeddings without any embedding fine-tuning. We observe that adding topic-model-based embeddings helps for most tasks and that differing pre-training tasks encode complementary features. Moreover, we present new state-of-the-art results on the MPQA and SUBJ tasks in SentEval.
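
A minimal sketch of SentEval-style evaluation of combined embeddings, under the assumption that combination is plain concatenation of frozen, precomputed sentence vectors: a simple classifier is trained on top, with no fine-tuning of the embeddings themselves. The random arrays below stand in for real universal and topic-model embeddings.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n, d_universal, d_topic = 200, 768, 50

    # Stand-ins for precomputed sentence embeddings from a universal encoder and a
    # specialized topic model; in practice these would be extracted beforehand.
    X_universal = rng.normal(size=(n, d_universal))
    X_topic = rng.normal(size=(n, d_topic))
    y = rng.integers(0, 2, size=n)                  # binary task labels (toy data)

    # Combine by concatenation and train a linear probe, as in SentEval.
    X_combined = np.concatenate([X_universal, X_topic], axis=1)
    clf = LogisticRegression(max_iter=1000).fit(X_combined, y)
    print("train accuracy:", clf.score(X_combined, y))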

2019

CLEARumor at SemEval-2019 Task 7: ConvoLving ELMo Against Rumors
Ipek Baris | Lukas Schmelzeisen | Steffen Staab
Proceedings of the 13th International Workshop on Semantic Evaluation

This paper describes our submission to SemEval-2019 Task 7: RumourEval: Determining Rumor Veracity and Support for Rumors. We participated in both subtasks. The goal of subtask A is to classify the type of interaction between a rumorous social media post and a reply post as support, query, deny, or comment. The goal of subtask B is to predict the veracity of a given rumor. For subtask A, we implement a CNN-based neural architecture using ELMo embeddings of post text combined with auxiliary features and achieve an F1-score of 44.6%. For subtask B, we employ an MLP neural network leveraging our estimates for subtask A and achieve an F1-score of 30.1% (second place in the competition). We provide results and analysis of our system performance and present ablation experiments.
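
A minimal sketch of a CNN over frozen token embeddings fused with auxiliary per-post features for the four-way stance labels of subtask A; the layer sizes, filter width, and auxiliary dimension below are illustrative assumptions, not the submitted system's configuration.

    import torch
    import torch.nn as nn

    class StanceCNN(nn.Module):
        # Convolve over precomputed (ELMo-like) token embeddings, max-pool over time,
        # concatenate auxiliary features, and classify support/query/deny/comment.
        def __init__(self, emb_dim=1024, aux_dim=10, n_filters=64, n_classes=4):
            super().__init__()
            self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=3, padding=1)
            self.fc = nn.Linear(n_filters + aux_dim, n_classes)

        def forward(self, token_embs, aux):
            # token_embs: (batch, seq_len, emb_dim); aux: (batch, aux_dim)
            x = torch.relu(self.conv(token_embs.transpose(1, 2)))
            x = x.max(dim=2).values                    # max-pool over the sequence
            return self.fc(torch.cat([x, aux], dim=1))

    model = StanceCNN()
    logits = model(torch.randn(2, 20, 1024), torch.randn(2, 10))
    print(logits.shape)  # torch.Size([2, 4])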

2014

A Generalized Language Model as the Combination of Skipped n-grams and Modified Kneser Ney Smoothing
Rene Pickhardt | Thomas Gottron | Martin Körner | Paul Georg Wagner | Till Speicher | Steffen Staab
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2004

Feature Weighting for Co-occurrence-based Classification of Words
Viktor Pekar | Michael Krkoska | Steffen Staab
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

Clustering Concept Hierarchies from Text
Philipp Cimiano | Andreas Hotho | Steffen Staab
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2003

Word classification based on combined measures of distributional and semantic similarity
Viktor Pekar | Steffen Staab
10th Conference of the European Chapter of the Association for Computational Linguistics

2002

Taxonomy Learning - Factoring the Structure of a Taxonomy into a Semantic Classification Decision
Viktor Pekar | Steffen Staab
COLING 2002: The 19th International Conference on Computational Linguistics

2001

Knowledge Portals (invited talk)
Steffen Staab
Proceedings of the ACL 2001 Workshop on Human Language Technology and Knowledge Management

2000

From Manual to Semi-Automatic Semantic Annotation: About Ontology-Based Text Annotation Tools
Michael Erdmann | Alexander Maedche | Hans-Peter Schnurr | Steffen Staab
Proceedings of the COLING-2000 Workshop on Semantic Annotation and Intelligent Content